• PyData Triangle Q3-2019 Meetup

    Valassis Digital

    Speakers:
    • Tal Yifat: Building Automated Data-preparation Pipelines with Sklearn or How to Be a Smooth Data Janitor
    • Andrew Knight: PyCarolinas and You
    • Saravana Srinivasan: Learnings from predicting rare online consumer events

    Lightning Talks:
    • YOU! Sign up for a 5-minute lightning talk slot at the meeting, or pre-sign-up by sending a message to the organizers.

    Schedule:
    • 6:00 - 6:30: Mingle and Food - Thank you Valassis for the great dinner
    • 6:30 - 7:45: Presentations
    • 7:45 - 8:00: Lightning Talks

    The PyData code of conduct ( http://pydata.org/code-of-conduct.html ) is enforced at this Meetup. Attendees violating these rules may be asked to leave the meetup without a refund at the sole discretion of the meetup organizer.

    Propose a talk or tutorial for the meetup: contact any of the organizers (Gene Ferruzza, Eric Dill, Chris Calloway and Ginny Ghezzo) through meetup messages.

    Follow us on twitter at: https://twitter.com/pydatatriangle

    Building Automated Data-preparation Pipelines with Sklearn or How to Be a Smooth Data Janitor: Tal Yifat is a data scientist with MetLife’s R&D group.

    Learnings from predicting rare online consumer events: In digital advertising, a “conversion” refers to the event when the shopper clicks on the ad and performs a valuable action such as a signup, registration, or purchase. While the number of conversion events could range from a few hundred to a few thousand, the number of devices to target is more than half a billion. In other words, conversion events are very rare. In this talk, I will discuss the challenges in predicting rare conversion events and possible means of addressing the problem. Specifically, I will talk about using non-linear feature selection methods, stratification of training samples, monotonic constraints, and model interpretation tools to build better models for predicting rare events.

    Saravana Srinivasan is a senior data scientist with Valassis Digital. Saravana received a B.S. in Electronics and Communication Engineering from Coimbatore Institute of Technology, an M.S. in Electrical Engineering from Colorado State University, and an M.B.A. in Investment Management from University of Colorado, Denver. Saravana's interests are in applying signal & image processing algorithms to detect, track and/or predict signals of interest. Saravana’s expertise is in building applications using ML to solve problems as diverse as tracking enemy tank movements and UAVs and tracking shopper movements within a store.

  • PyData Triangle Q2-2019 Meetup

    Valassis Digital

    Speakers:
    • Shannon Kreps, Director, Product Marketing at Rizing: Turn your Data into a Story
    • Schaun Wheeler, Senior Staff Data Scientist at Valassis Digital: Theto: a Python library for abstracting and automating geospatial data visualization

    Lightning Talks:
    • Peter Baumgartner - Title: five cool things in five minutes
    • YOU! Sign up for a 5-minute lightning talk slot at the meeting.

    Schedule:
    • 6:00 - 6:30: Mingle and Food - Thank you Valassis for the great dinner
    • 6:30 - 7:45: Presentation
    • 7:45 - 8:00: Lightning Talks

    The PyData code of conduct ( http://pydata.org/code-of-conduct.html ) is enforced at this Meetup. Attendees violating these rules may be asked to leave the meetup without a refund at the sole discretion of the meetup organizer.

    Propose a talk or tutorial for the meetup: contact any of the organizers (Gene Ferruzza, Eric Dill, Chris Calloway and Ginny Ghezzo) through meetup messages.

    Follow us on twitter at: https://twitter.com/pydatatriangle

    Shannon Kreps is Director of Product Marketing at Rizing. "Don’t you hate going to a session you think should be interesting, only to find yourself 50 minutes later unimpressed and uninspired? The best presentations, especially those that convey complex concepts and content, tell a story. This talk provides a framework for creating a powerful presentation through storytelling, with practical examples and a framework for success."

    Schaun Wheeler is a Senior Staff Data Scientist at Valassis Digital. His work focuses on the company's "consumer graph" that ties together online and offline behavioral signals. A lot of his time is spent making sense of mobile location data. Schaun will speak about a Python package he recently open-sourced, called "Theto", that abstracts and automates most of the overhead involved in visualizing geospatial data. Theto stores API keys, palettes, and other static resources typically needed throughout a visualization pipeline; loads and formats data sources to stage them for downstream needs; adds widgets for interactivity and infers the appropriate parameters for those widgets based on the source data; determines plot bounds, size, map zoom level, and other parameters; and in many other ways makes the path from "I have location data" to "I can explore my location data" as painless as possible.

  • PyData Triangle Q1-2019 Meetup

    Valassis Digital

    Speakers:
    • Introduction to Keras
      Dhruv Sakalley, Data Scientist at LexisNexis
      The Keras API is one of many deep learning frameworks out there, and it is also the friendliest one for getting started with deep learning. If you have been curious about getting into the deep learning space, this is a nice opportunity to get your hands on some of the cool new developments in that space and possibly build your own deep learning models in the process.
      *** Note: Bring your laptop with conda installed if you want hands-on fun. (GPUs are nice but optional.) ***
    • Stacking Audience Models: Using a Model Ensemble Approach to Predict a Rare Event
      Alice Broadhead, Data Scientist at Valassis Digital
      In the advertising world, we need to predict who will do the action for which we get rewarded (“convert”) after we serve them an ad. Conversion is fairly rare, so we need a robust approach to predicting rare events. We call a list of users, ordered by propensity to convert, an “audience”. In this talk I will describe a model ensemble method that can be used to build an audience: stacking. Stacking not only strengthens the final model by reducing model bias, it also allows parallelization of model development amongst developers. In addition to describing the model itself, I will briefly address how to optimize a stacked model ensemble and pitfalls to avoid. At the end of this talk, the audience will know what stacking is and why it is useful in the context of audience modeling. They should also know some pitfalls to avoid when building their own stacked models.

    Lightning Talks:
    • YOU! Sign up for a 5-minute lightning talk slot at the meeting.
    Pre-sign-ups:
    • Peter Baumgartner, Applied Natural Language Processing in Python
    • Nick Haynes, Recent Advancements in Transfer Learning for NLP

    Schedule:
    • 6:00 - 6:30: Mingle and Food - Thank you Valassis for the great dinner
    • 6:30 - 7:45: Presentation
    • 7:45 - 8:00: Lightning Talks

    https://docs.google.com/presentation/d/e/2PACX-1vTfG0EGdR4JFbdGfK0QI2EDFp9D3idF_p1uNyl-rTpib7HBYbbu_JJ8GoObj755MDNjqzKGYF_qEeYf/pub?start=true&loop=true&delayms=10000

    The PyData code of conduct ( http://pydata.org/code-of-conduct.html ) is enforced at this Meetup. Attendees violating these rules may be asked to leave the meetup without a refund at the sole discretion of the meetup organizer.

    Propose a talk or tutorial for the meetup: contact any of the organizers (Gene Ferruzza, Eric Dill, Chris Calloway and Ginny Ghezzo) through meetup messages.

    Follow us on twitter at: https://twitter.com/pydatatriangle

  • JupyterDay in the Triangle

    Needs a location

    JupyterDay in the Triangle (https://libcce.github.io/TriangleJupyter/) is a single-day conference for Jupyter users in the Southeast. The event takes place at The Carolina Club (http://www.clubcorp.com/Clubs/Carolina-Club) to showcase applications of the open source software created by Project Jupyter and its community. These interactive technologies are reshaping how people interact with code and data in industry and academia.

    Tickets:
    • Student: $10.00
    • Early Bird General Public: $25.00 (Through October 13)
    • General Public: $30.00 (Starting October 13)
    Purchase at: https://www.eventbrite.com/e/jupyter-day-in-the-triangle-tickets-48813059174

    JupyterDay in the Triangle is a chance for Jupyter users of all experience levels to share and learn about the state of the art in open source scientific computing. We expect an exciting lineup of keynotes, curated talks, lightning talks, and birds of a feather sessions. We are accepting talk submissions now (https://docs.google.com/forms/d/e/1FAIpQLSelrtWU2t7h64renxkisXePKAM8PqcAamDLD19Dh6hNZbTCTA/viewform).

    This event is inspired by past and future events hosted by the Jupyter community and hopes to connect people across the Southeast including Atlanta, Athens, Raleigh, Columbia, Charlotte, and more. Visit the conference website for the up-to-date schedule and speaker information.

    Make plans to join us in Chapel Hill on the campus of the University of North Carolina. A limited number of tickets are available; alternatively, submit a talk and, if it is accepted, receive free admission.

    A special thanks to our sponsors:
    • University Libraries at the University of North Carolina at Chapel Hill
    • Advance Auto Parts
    • Valassis Digital
    • Diveplane
    • The NumFOCUS Foundation
    • Project Jupyter
    More information about JupyterDay in the Triangle sponsors at https://libcce.github.io/TriangleJupyter/information/

    Tickets: https://www.eventbrite.com/e/jupyter-day-in-the-triangle-tickets-48813059174
    Call for proposals: https://docs.google.com/forms/d/e/1FAIpQLSelrtWU2t7h64renxkisXePKAM8PqcAamDLD19Dh6hNZbTCTA/viewform

    Agenda
    • 8:00am - 9:00am: Check-in, network, and enjoy breakfast
    • 9:00am - 9:30am: Introductions & logistics
    • 9:30am - 10:30am: Keynote with Q&A
    • 10:30am - 10:45am: Relax, it’s break time
    • 10:45am - 12:00pm: Two brief 25-minute talks
    • 12:00pm - 1:00pm: Lunchtime
    • 1:00pm - 2:00pm: Two brief 25-minute talks
    • 2:00pm - 2:15pm: Relax, it’s time for some PM snacks and a break
    • 2:15pm - 3:30pm: Lightning talks from attendees, 5 minutes each
    • 3:30pm - 5:00pm: Birds of a feather discussions on topics from the community
    Note: Breakfast, lunch, and PM snacks will be served.

  • PyData Triangle Q4-2018 Meetup

    Valassis Digital

    Speakers
    • Christopher West, NCSU Institute for Advanced Analytics - https://analytics.ncsu.edu/
      Overview of the program, their practicums and how their curriculum is where data meets the future.
    • Brian Jones, Proofpoint
      Detecting Entities in Tweets using Deep Learning, Crowdsourcing, and Python

    Lightning Talks
    • YOU! Sign up for a 5 minute presentation slot

    Schedule:
    • 6:00 - 6:15: Mingle and Food - Thank you Valassis
    • 6:15 - 7:45: Presentations
    • Social (after-meeting) - Cedar Fork Bar & Bistro in the Hotel Indigo lobby (151 Tatum Drive, take Miami Blvd towards I-40)

    The PyData code of conduct ( http://pydata.org/code-of-conduct.html ) is enforced at this Meetup. Attendees violating these rules may be asked to leave the meetup without a refund at the sole discretion of the meetup organizer.

    Propose a talk or tutorial for the meetup. Contact any of the organizers Gene Ferruzza, Eric Dill, Chris Calloway and Ginny Ghezzo, through meetup messages.

    Follow us on twitter at: https://twitter.com/pydatatriangle

    Event Details: https://docs.google.com/presentation/d/e/2PACX-1vSMPhqpai1jcN0j7Ql67Fv7A911bIyQZuF12TVqEBM8a6hwgXFm-G3ERkTHoqhPKKuHaQgXj7_g51-l/pub?start=true&loop=true&delayms=10000

  • PyData Triangle Q3-2018 Meetup

    Valassis Digital

    Speakers
    • Gene Ferruzza, Senior Manager of Data Science, Valassis Digital
      Title: Automated Machine Learning (AutoML), Putting AI to Work in Data Science
      Description: The rising need for applied data science in nearly every business, along with the forecasted shortage of data scientists, is fueling the AutoML space. The number of AutoML offerings has grown by 300% over the last 18 months. During this time there has been a wide array of definitions, expectations and skepticism about these AI-driven data science tools and how they might change time-proven approaches to model development and deployment. In this talk we will review the results of 2018 research that explores AutoML, from its history and definition to the approach and capabilities of the leaders in the field. Attendees will gain a better understanding of the “state” of AutoML, why data scientists should embrace this technology, and what is available in the marketplace today.
    • Chris Bizon
      Title: ROBOKOP: Answering Questions using the NCATS Biomedical Translator Platform
      Description: Biological and medical knowledge is published at an ever-increasing rate, but that knowledge is often difficult to access and integrate. The Biomedical Translator program, funded by NCATS, is an attempt to standardize how such data is exposed so that scientists and developers can combine it in novel and interesting ways. One tool based on this platform is ROBOKOP: Reasoning Over Biological Objects in Knowledge Oriented Pathways, which performs distributed graph queries and ranks results based on the strength of literature support for an answer.
      Bio: Chris Bizon is the interim Director of Analytics and Data Science at RENCI. He has a Ph.D. in Physics from the University of Texas at Austin, and has worked in informatics (both chem- and bio-) for over a decade in industry and academia.

    Lightning Talks
    • YOU! Sign up for a 5 minute presentation slot

    Schedule:
    • 6:00 - 6:15: Mingle and Food
    • 6:15 - 7:45: Presentations
    • No Planned Social Tonight but please roll your own!

    The PyData code of conduct ( http://pydata.org/code-of-conduct.html ) is enforced at this Meetup. Attendees violating these rules may be asked to leave the meetup without a refund at the sole discretion of the meetup organizer.

    Propose a talk or tutorial for the meetup: contact any of the organizers (Gene Ferruzza, Eric Dill, Chris Calloway and Ginny Ghezzo) through meetup messages.

    Follow us on twitter at: https://twitter.com/pydatatriangle

  • PyData Triangle Q2-2018 Meetup

    Valassis Digital

    Speakers
    • Peter Baumgartner, Data Scientist, RTI International
      Improving Estimates of Arrest Related Deaths with Machine Learning
      In 2015, the Bureau of Justice Statistics launched a redesign of the Arrest Related Deaths (ARD) program to include multiple methods of identifying and confirming arrest related deaths. In this talk we'll walk through the technical aspects of the redesign, including how natural language processing, machine learning, and the Python ecosystem played a key role in reducing the volume of incoming news articles for review by 99%. This is a “behind the scenes” look at a project FiveThirtyEight declared one of the best data stories of 2016.
    • Amy Nail, Lead Consultant & Owner, Honestat Statistics & Analytics
      Determining Cause and Effect from Observational Data
      For a very specific problem situation (determining the effect of a binary categorical variable such as treatment vs control), I will compare different methods used to achieve the balance that a controlled experiment or randomized controlled trial would typically be used to achieve.

    Lightning Talks
    • YOU! Sign up for a 5 minute presentation slot

    Schedule:
    • 6:00 - 6:15: Mingle and Food
    • 6:15 - 7:45: Presentations
    • 8:00 Buy Your Own Social at TBD

    Misc: The PyData code of conduct ( http://pydata.org/code-of-conduct.html ) is enforced at this Meetup. Attendees violating these rules may be asked to leave the meetup without a refund at the sole discretion of the meetup organizer.

    Propose a talk or tutorial for the meetup: contact any of the organizers (Eric Dill, Chris Calloway and Ginny Ghezzo) through meetup messages.

    Follow us on twitter at: https://twitter.com/pydatatriangle

  • PyData Triangle Q1-2018 Meetup

    MaxPoint Office

    Link to Event Overview including Code of Conduct: https://docs.google.com/presentation/d/e/2PACX-1vR_2wNcNGiRNdnvY7czvrR5m5BDcwKzmTGt8igQKQkJoPG8s8TGYDU_HsmPT-8dcIbwmt__h5jPZ4ah/pub?start=true&loop=true&delayms=10000 (Note: if you are having trouble, try again while logged in to Google.)

    Speakers:
    • Schaun Wheeler, Senior Data Scientist, Valassis Digital: Design intuition, ethnography, and data science
      Data science's future as a profession faces two major challenges: on the one hand, the technical portions of the day-to-day work are being increasingly automated, while on the other hand, increasingly frequent stories of algorithmic bias make people reluctant to trust decision automation. In the interest of future-proofing the profession of data science, this talk will address the need for increased attention to a critical non-technical skill: design. Incorporating lessons from anthropological research methods, and referencing recent professional work on shape validation and geolocation precision calibration, the presentation will make a case for a greater focus on design skills and the need for a question-generating toolset.

    Lightning Talks
    • Matthew Phillips - Vision and Image Processing
    • Jade Vinson - Duke Machine Learning Day
    • YOU

    Schedule:
    • 6:00 - 6:15: Mingle and Food
    • 6:15 - 7:45: Presentations
    • 8:00 Buy Your Own Social at TBD

    Misc: The PyData code of conduct ( http://pydata.org/code-of-conduct.html ) is enforced at this Meetup. Attendees violating these rules may be asked to leave the meetup without a refund at the sole discretion of the meetup organizer.

    Propose a talk or tutorial for the meetup: contact any of the organizers (Eric Dill, Chris Calloway and Ginny Ghezzo) through meetup messages.

    Follow us on twitter at: https://twitter.com/pydatatriangle

  • PyData Triangle Q4-2017 Meetup

    MaxPoint Office

    Speakers:
    • Matthew McCormick, PhD (Kitware, Inc) - An overview of multi-dimensional, multi-modal, image registration methods
    • Nick Bollweg and Tony Fast (Bastille) - powers (of)ten and the relative size of things in the science

    Description: Science, and its application to engineering, policy, and business, ultimately affects people and communities. In solving new problems, while we may make use of ever more data and computers, we must also have greater impact and reach for people. Open science, along with open source software and open data, continues to expand the frontiers of innovation. Join Nick and Tony in traversing the powers of ten with data-driven stories. Experts in visual problem solving with Jupyter, Nick and Tony have explored problems across technology siloes, length scales, and disciplines. Together, we hope to provide a rough map of what might be on open science’s frontier of reusable, computable information, or, failing that, a style guide for any citizen scientist looking to access new powers often from the cutting edges of open science and software.

    In 1977 IBM released Powers of Ten, a nine-minute documentary illustrating the state of the art in science, from the edge of the universe, to early views into complex biological organisms, and into subatomic matter in their cells' nuclei. The last two generations of science and engineering have continued to decode more complex social and physical relationships in the universe. Computers and the open web are redefining modern science, giving diverse communities new access to raw data, information insights, and knowledge sharing. In this culture shift, pixels rather than paper propel science forward. Today’s scientist is less information-constrained. Often, their domain is a data-rich environment with information and approaches adopted from other scientific communities. Nick and Tony will discuss working together to solve multi-scale problems and highlight how the power of accessible, computable information, powered by open source software, has changed narratives in science.

    Schedule:
    • 6:00 - 6:15: Mingle and Food
    • 6:15 - 7:45: Presentations
    • 8:30 Buy Your Own Social at LoneRider Brewery https://goo.gl/maps/btJx5QJigik
      Address: 8816 Gulf Ct #100, Raleigh, NC 27617

    Agenda: https://docs.google.com/presentation/d/e/2PACX-1vSb0waC7s0lvxLcSKnsKynacr7fDzsYhkCTS85M4c9GaiCcf4B94LICqtAfbjBf6Ahr_di59QZPQTN-/pub?start=true&loop=true&delayms=5000

    Misc: The PyData code of conduct ( http://pydata.org/code-of-conduct.html ) is enforced at this Meetup. Attendees violating these rules may be asked to leave the meetup without a refund at the sole discretion of the meetup organizer.

    Propose a talk or tutorial for the meetup: contact any of the organizers (Eric Dill, Chris Calloway and Ginny Ghezzo) through meetup messages.

    Follow us on twitter at: https://twitter.com/pydatatriangle

  • PyData Triangle Q3-2017 Meetup

    MaxPoint Office

    Speakers:
    • PyData Durham 2018 - Melodie Moorefield-Wilson & Chris Calloway
    • Say When? Using Reinforcement Learning and Neural Networks to Schedule Meal Events for Elite Athletes - Matt Phillips
    • Python and Stock Market Data - Conrad D'Cruz

    Lightning Talks - 5 minutes of fame
    • Suzy Stiegelmeyer - How to publish a science package distribution to PyPI and conda-forge
    • You!

    Schedule:
    • 6:00 - 6:15: Mingle and Food
    • 6:15 - 7:45: Presentations
    • 8:30 Buy Your Own Social at LoneRider Brewery https://goo.gl/maps/btJx5QJigik
      Address: 8816 Gulf Ct #100, Raleigh, NC 27617

    Misc: The PyData code of conduct ( http://pydata.org/code-of-conduct.html ) is enforced at this Meetup. Attendees violating these rules may be asked to leave the meetup without a refund at the sole discretion of the meetup organizer.

    Propose a talk or tutorial for the meetup: contact any of the organizers (Eric Dill, Chris Calloway and Ginny Ghezzo) through meetup messages.

    Follow us on twitter at: https://twitter.com/pydatatriangle
