Next Meetup

Intro to Data Science Bootcamp (2 Weekends)
Description

This bootcamp is designed to introduce you to data exploration using the Python programming language. Through this intense program, held over two weekends, you will begin your mastery of the skills necessary to manipulate, visualize, and explore datasets to extract valuable insights.

Expert Instructor

This course is taught by Ted Petrou, an expert in exploring data with Python through Pandas. He is the author of Pandas Cookbook, a thorough step-by-step guide to accomplishing a variety of data analysis tasks with Pandas.

Small Class Size

This is a small class of at most 12 participants, which allows all students to participate fully and have their questions answered quickly.

When

Dec 15, 16, 22, 23: 9 a.m. - 5 p.m.

Structure of Course

Learning is accomplished by working through difficult assignments and then receiving and reviewing modeled solutions. Using a 'flipped classroom', students will read and prepare each day's material before coming to class. In class, students will rotate between instructor-guided lessons and student-focused exercises and projects. The instructor will personally review code and give feedback on course assignments. Approximately 300 short-answer questions with detailed solutions will be available.

Syllabus

Before the Course

Students will need to set aside 10 to 20 hours to set up the programming environment and to complete a thorough overview of the fundamentals of Python. An additional webinar will be held the week before the bootcamp to ensure all students are completing this assignment.

Day 1: Introduction to Pandas - Selecting Subsets of Data

The Pandas library, perhaps the most popular and widely used open-source data wrangling tool today, will be introduced along with its main data structures, the Series and the DataFrame. Selecting subsets of data is a very common yet confusing task that must be mastered in order to be effective with Pandas.

Day 2: Split-Apply-Combine

Insights within datasets are often hidden among different groupings. The split-apply-combine paradigm is the fundamental procedure for exploring differences among distinct groups within datasets.

During the Week: Tidy Data

Real-world data is messy and not immediately ready for aggregation, visualization, or machine learning. Identifying messy data and transforming it into tidy data (as described by Hadley Wickham) gives the data a structure that makes further analysis easier.

Day 3: Exploratory Data Analysis

Exploratory data analysis (EDA) is a process for gaining understanding and intuition about datasets. Visualizations are the foundation of EDA and communicate the discoveries within the data. Matplotlib, the workhorse for building visualizations, will be covered, followed by Pandas' effortless interface to it. Finally, the Seaborn library, which works directly with tidy data, will be used to create elegant visualizations.

Day 4: Applied Machine Learning

After tidying, exploring, and visualizing data, machine learning models can be applied to gain deeper insights into it. Students will build workflows for preparing, modeling, validating, and predicting data with Scikit-Learn, Python's powerful machine learning library.

Post-Course Education

It is vital to continue practicing the skills covered immediately upon course completion. Three additional open-ended data analysis assignments will be given to students to help them practice their new skillset. Additionally, students will receive a specific curriculum to help guide them toward an entry-level position as a data analyst or data scientist.

Instructor

Ted Petrou is the author of Pandas Cookbook and founder of both Dunder Data and the Houston Data Science Meetup group. He worked as a data scientist at Schlumberger, where he spent the vast majority of his time exploring data. Ted received his Master's degree in statistics from Rice University and used his analytical skills to play poker professionally and teach math before becoming a data scientist.

The Cowork Lab

2500 Yale Street, Suite B · Houston, TX


What we're about

Houston Data Science is a Meetup group by and for Houston's data science community.

Our goal is to foster Houston's data science community by providing a forum for learning, teaching, and networking. We welcome both beginners and experts, as well as those simply interested in learning more about data science and the Houston data science community.

Members (3,547)
