Intro to Data Science - 6 Week Night Class (REQUIRES PAYMENT)

Name: Intro to Data Science - 6 Week Night Class (REQUIRES PAYMENT)
Start: 2016-07-12T18:30:00-04:00
End: 2016-07-12T21:30:00-04:00
Location: Metis New York

Hosted By

Metis

Details

July 12 - August 18
Tuesdays & Thursdays
6:30pm - 9:30pm

NOTE: This is a paid evening course at Metis. ENROLL HERE (http://www.thisismetis.com/introduction-to-data-science)

COURSE OVERVIEW

Data science has become a central approach to tackling data-heavy problems in both business and academia. This course exposes students to the data science approach to framing and solving problems, and helps them reason about the data-heavy problems they will encounter in the future. Students learn how data science is done in the wild: data acquisition, cleaning, and aggregation; exploratory data analysis and visualization; feature engineering; and model creation and validation. Students will use the Python scientific stack to work through examples that illustrate all of these concepts, with real-life use cases. Concurrently, students will learn some of the statistical and mathematical foundations that power this approach to problem solving.

DESIGNED AND TAUGHT BY SERGEY FOGELSON

https://vimeo.com/157347764

Sergey Fogelson (https://www.linkedin.com/in/sergeyfogelson) is a data science consultant currently working in the financial industry. He began his career as an academic at Dartmouth College in Hanover, New Hampshire, where he researched the neural bases of visual category learning and obtained his Ph.D. in Cognitive Neuroscience. After leaving academia, Sergey got into the rapidly growing startup scene in the NYC metro area, where he has worked as a data scientist in alternative energy analytics, digital advertising, and cybersecurity. He is heavily involved in the NYC-area teaching community: he has taught courses at various bootcamps and has volunteered as a computer science teacher through TEALSK12. When Sergey is not working or teaching, he is probably hiking. (He thru-hiked the Appalachian Trail before graduate school.)

WHY TAKE AN INTRO TO DATA SCIENCE COURSE?

The practice of data science involves both a collection of skills and a mindset for tackling data-intensive problems (or problems that appear to need data-intensive solutions). Working through this course will give students the tools and background needed to think about the datasets they encounter in meaningful ways, and enough knowledge to continue their own data science learning in a vast, exciting, and rapidly evolving field.

WHO IS THIS COURSE FOR?

This course is intended for people with a basic understanding of data analysis techniques, and for those who want to improve their ability to tackle problems involving multi-dimensional data in a systematic, principled way and to glean actionable, data-driven insights from that data. Familiarity with a programming language is helpful but not required, provided the pre-work for the course is completed. No advanced mathematical training (beyond an introductory statistics course) is necessary.

THE DETAILS

The 36-hour course is held on Tuesday and Thursday evenings from 6:30pm - 9:30pm.
New York City: 27 East 28th Street, 3rd Floor, New York City, NY 10016

THE OUTCOME

Upon completing the Introduction to Data Science course, students will have:

An understanding of problems data science can help to solve, and an ability to attack those problems from a statistical perspective
An understanding of when to use supervised and unsupervised statistical learning methods on labeled and unlabeled data-rich problems
An ability to create data analytical pipelines and applications in Python
A familiarity with the Python data science ecosystem and the various tools one can use to continue to develop as a data scientist

COURSE STRUCTURE AND SYLLABUS

The class is a roughly even mix of lectures/instruction and hands-on programming/lab work. The week-by-week breakdown is as follows:

Week 1 | CS/Statistics/Linear Algebra Short Course

We start with the basics. In the CS portion, we briefly cover basic data structures/types, program control flow, and syntax in Python. In the statistics portion, we go over basic probability and probability distributions, along with general properties of some common distributions. In the linear algebra portion, we cover vectors, matrices, some of their properties, and how to use them in Python.

Week 2 | Exploratory Data Analysis and Visualization

We spend a considerable amount of time using the Pandas Python package to attack a dataset we’ve never seen before and to uncover useful information from it. At this point, students decide on a course project that would benefit from a data science approach. The project must involve public (freely accessible and usable) data and must answer an interesting question, or collection of questions, about that data. Several sources of free data will be provided.

Week 3 | Data Modeling: Supervised/Unsupervised Learning and Model Evaluation

We learn about two basic statistical models that have classically been used for prediction (supervised learning): linear regression and logistic regression. We also look at one way to glean information from unlabeled data (unsupervised learning): clustering using K-Means.
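
As a rough illustration of the supervised half of this week, here is a minimal sketch of single-feature linear regression fit by ordinary least squares. It uses only the Python standard library; the function names (fit_line, predict) and the hours/scores data are made up for this example and are not course material.

```python
# Ordinary least squares for one feature, standard library only.
# All names and data here are hypothetical, chosen for illustration.

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The slope is the co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Apply the fitted line to a new input."""
    return slope * x + intercept


hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [3.0, 5.0, 7.0, 9.0, 11.0]  # constructed to lie exactly on y = 2x + 1
slope, intercept = fit_line(hours, scores)
print(slope, intercept, predict(6.0, slope, intercept))
```

Because the toy data lies exactly on a line, the fit recovers it; real data adds noise, which is where model evaluation comes in.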

Week 4 | Data Modeling: Feature Selection, Engineering, and Data Pipelines

We switch gears from talking about algorithms to talking about features: what they are, how to engineer them, and what can be done (PCA/ICA, regularization) to create and use them given the data at hand. We also cover how to construct complete data pipelines, going from data ingestion and preprocessing to model construction and evaluation.

Week 5 | Data Modeling: Advanced Supervised/Unsupervised Learning

We delve into more advanced supervised learning approaches, during which we get a feel for linear support vector machines, decision trees, and random forest models for regression and classification. We also explore an additional unsupervised learning approach: DBSCAN.

Week 6 | Data Modeling: Advanced Model Evaluation and Data Pipelines; Presentations

We explore more sophisticated model evaluation approaches (cross-validation, bootstrapping) with the goal of understanding how we can make our models as generalizable as possible. Students complete their data science projects and share their findings.
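
To make the idea of cross-validation concrete, the sketch below splits a dataset into k folds and scores a deliberately simple baseline (predict the training mean) on each held-out fold. It uses only the Python standard library; the helper names k_fold_indices and cross_validated_mse and the sample values are assumptions made for this illustration.

```python
# k-fold cross-validation, standard library only.
# Helper names and data are hypothetical, chosen for illustration.

def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validated_mse(ys, k):
    """Average held-out squared error of a baseline that predicts the training mean."""
    fold_errors = []
    for train, test in k_fold_indices(len(ys), k):
        prediction = sum(ys[i] for i in train) / len(train)
        mse = sum((ys[i] - prediction) ** 2 for i in test) / len(test)
        fold_errors.append(mse)
    return sum(fold_errors) / len(fold_errors)


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(len(train), test)

values = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0, 9.0, 9.0, 10.0, 12.0]
print(cross_validated_mse(values, 3))
```

Each observation lands in exactly one test fold, so every score comes from data the model did not see during fitting. In practice the rows are usually shuffled before splitting; that step is omitted here to keep the folds easy to read.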

PREREQUISITES

Students should have some experience with Python and have a passing familiarity with basic statistical and linear algebraic concepts (mean, median, mode, standard deviation, correlation, the difference between a vector and a matrix). In Python, it will be helpful to know basic data structures such as lists, tuples, and dictionaries, and what distinguishes them (that is, when they should be used).
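
As a rough self-check on these prerequisites, the snippet below touches each concept using only the Python standard library. The variable names and numbers are invented for this illustration.

```python
import statistics

# A list is ordered and mutable: suited to a sequence of observations that may grow.
heights_cm = [160.0, 172.5, 168.0, 172.5, 181.0]
heights_cm.append(165.0)

# A tuple is ordered and immutable: suited to a fixed record.
point = (40.7440, -73.9857)  # hypothetical (latitude, longitude) pair

# A dictionary maps keys to values: suited to lookups by name.
summary = {
    "mean": statistics.mean(heights_cm),
    "median": statistics.median(heights_cm),
    "mode": statistics.mode(heights_cm),
    "stdev": statistics.stdev(heights_cm),  # sample standard deviation
}

# A vector is a one-dimensional sequence of numbers; a matrix is two-dimensional
# (rows and columns), written here as a list of equal-length rows.
vector = [1.0, 2.0]
matrix = [[1.0, 2.0], [3.0, 4.0]]

print(summary)
print(len(vector), len(matrix), len(matrix[0]))
```

If every line here reads naturally, the Python and statistics background should be sufficient for the first week.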
