Join us at the IBM Watson office for our kickoff meetup of NYC Data Science! We have three great talks in store for you.
Presentation Lineup:
Henri Dwyer - "Predictive Maintenance"
Kate Shillo - "Venture and Data Science"
Mike Tamir - "Text Understanding, Word2Vec, and Neural Networks"
• 6 pm: Doors open & networking
• 6:20 - 8 pm: Speakers present
• 8 pm: Closing time
*Note on attendance capacity: Due to the nature of the space, we are capped at 58 attendees. Admission is first come, first served, so arrive a few minutes early :)
Henri Dwyer will discuss predictive maintenance: determining the condition of equipment that is currently in use, and predicting when and why that equipment is likely to fail. First, Henri will describe the key concepts in predictive maintenance, as well as some applications. Next, he will build a workflow using data from the PHM Society data challenge, showing how to go from raw data to the final predictions. Henri will highlight how to avoid common pitfalls and give examples of the features that can be engineered and the techniques and models that can be used to ultimately arrive at a robust predictive maintenance model.
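For attendees new to the topic, here is a minimal sketch of what "raw data to features" can look like. It is not taken from the talk. It assumes a hypothetical run-to-failure log of (cycle, sensor value) pairs for a single machine, derives a remaining-useful-life label for each cycle, and computes two simple trailing-window features. The function names, the window size, and the sample readings are all invented for illustration.

```python
from statistics import mean


def add_rul(readings):
    """Attach a remaining-useful-life label to each (cycle, value) reading.

    Assumes the log runs until failure, so the last cycle has RUL 0.
    """
    last_cycle = max(cycle for cycle, _ in readings)
    return [(cycle, value, last_cycle - cycle) for cycle, value in readings]


def rolling_features(values, window=3):
    """Compute a trailing mean and a trailing change for each position."""
    features = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        features.append({"mean": mean(chunk), "delta": chunk[-1] - chunk[0]})
    return features


# Invented sensor readings for one machine that fails after cycle 5.
readings = [(1, 10.0), (2, 10.5), (3, 11.5), (4, 13.0), (5, 15.0)]
labeled = add_rul(readings)
features = rolling_features([value for _, value in readings])
```

In a fuller workflow, features like these would be computed per machine and per sensor, then passed to a regression or classification model together with the remaining-useful-life labels.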
Meet the Speaker:
Henri Dwyer is a data scientist and engineer working on building the best platform for data scientists at Dataiku. Previously, he did physics research on air pollution and solar cells. He received an MSc in Engineering from Columbia University and a BS and an MS in Engineering from Ecole Polytechnique in Paris. Henri now lives in New York City and is always keen on discovering new data science problems to solve.
Text Understanding, Word2Vec, and Neural Networks
Supervised text classification is hampered by the need to acquire expensive labeled training sets. By leveraging algorithms similar to Word2Vec, along with other neural-network-based text embedding algorithms, one can create vector representations of documents that enable a model to be trained successfully with a drastically reduced training set. With this technique, the implementer needs only a small investment to acquire a small volume of labeled examples for training proximity thresholds, rather than devoting significant resources to traditional text classification algorithms, which typically require training sets that are orders of magnitude larger.
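As a rough illustration of the idea, and not the speaker's implementation, the sketch below classifies documents by their proximity to a centroid in embedding space, with the threshold fitted on only four labeled examples. The two-dimensional word vectors are invented for the example; a real system would learn them with Word2Vec or a similar model. Averaging word vectors to embed a document and placing the threshold at the midpoint between the classes are simplifying assumptions as well.

```python
import math

# Toy 2-d word vectors, invented for illustration only.
WORD_VECTORS = {
    "engine": (0.9, 0.1), "turbine": (0.8, 0.2), "sensor": (0.7, 0.3),
    "goal": (0.1, 0.9), "match": (0.2, 0.8), "team": (0.3, 0.7),
}


def embed(text):
    """Average the vectors of the known words to get a document vector."""
    vecs = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vecs:
        raise ValueError("no known words in document")
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def cosine(a, b):
    """Cosine similarity between two 2-d vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def fit_threshold(labeled):
    """Fit a positive-class centroid and a similarity threshold.

    labeled is a small list of (text, is_positive) pairs.
    """
    positives = [embed(text) for text, is_pos in labeled if is_pos]
    centroid = tuple(sum(c) / len(positives) for c in zip(*positives))
    pos_sims = [cosine(embed(t), centroid) for t, is_pos in labeled if is_pos]
    neg_sims = [cosine(embed(t), centroid) for t, is_pos in labeled if not is_pos]
    # Midpoint between the least similar positive and most similar negative.
    threshold = (min(pos_sims) + max(neg_sims)) / 2
    return centroid, threshold


def predict(text, centroid, threshold):
    """Label a document positive if it is close enough to the centroid."""
    return cosine(embed(text), centroid) >= threshold


labeled = [
    ("engine sensor", True),
    ("turbine engine", True),
    ("team goal", False),
    ("match team", False),
]
centroid, threshold = fit_threshold(labeled)
```

Here the only "trained" parameters are the centroid and one threshold, which is why a handful of labeled documents can be enough once the embeddings themselves have been learned from unlabeled text.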
Meet the Speaker:
Mike Tamir serves as Chief Science Officer for Galvanize, supervising Galvanize's Immersive Education and accredited Masters Programming. He has led several teams of data scientists in the Bay Area as Chief Data Scientist for InterTrust and as Director of Data Sciences for the Sears Holding Company. Mike began his career in academia, serving as a mathematics teaching fellow for Columbia University before teaching at the University of Pittsburgh.