Update: This workshop has been rescheduled from April 5th to April 19th.
Data Community DC and District Data Labs are excited to be offering a Building Data Apps with Python workshop on April 19th, 2014.
Python is one of the most popular programming languages for data analysis, so a basic working knowledge of the language is important for approaching more complex topics in data science and natural language processing. The purpose of this one-day course is to introduce the development process in Python using a project-based, hands-on approach.
Note: This course is focused purely on Python development in a data context for those who aren’t familiar with Python. Other courses like Python Data Analysis focus on data analytics with Python, not necessarily on Python development itself.
The main workshop will run from 11am - 6pm with an hour break for lunch around 1pm. For those who are new to programming, there will be an optional introductory session from 9am - 11am aimed at getting you comfortable enough with Python development to follow along in the main session.
The price per attendee is $220.
Introductory Session: Python for New Programmers (9am - 11am)
The morning session will teach the fundamentals of Python to those who are new to programming. Learners will be grouped with a TA to ensure their success in the second session. The goal of this session is to ensure that students can demonstrate basic concepts in a classroom environment through successful completion of hands-on exercises. This beginning session will cover the following basic topics and exercises:
Object Oriented Programming
Write a function to determine if input is even or odd
Read data from a file
Count the words/lines in a file
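As a rough illustration of the kind of exercises listed above, here is a minimal sketch. The function names (is_even, count_lines_and_words, demo) and the sample file contents are assumptions made for this example, not the actual workshop materials.

```python
import os
import tempfile


def is_even(number):
    """Return True if number is even, False if it is odd."""
    return number % 2 == 0


def count_lines_and_words(path):
    """Read a text file and return (line_count, word_count)."""
    line_count = 0
    word_count = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line_count += 1
            # split() with no arguments breaks the line on any whitespace
            word_count += len(line.split())
    return line_count, word_count


def demo():
    """Write a small hypothetical sample file, then count its contents."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "sample.txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("hello data world\nsecond line here\n")
        return count_lines_and_words(path)


if __name__ == "__main__":
    print(is_even(4), is_even(7))
    print(demo())
```

Running the file as a script prints the even/odd checks followed by the line and word counts for the temporary sample file.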
At the end of this session, students should be familiar enough with programming concepts in Python to be able to follow along in the second session. They will have acquired a learning cohort in their classmates and instructors to help them learn Python more thoroughly in the future, and they will have observed Python development in action.
Main Session: Building a Python Application (11am - 6pm)
The afternoon session will focus on Python application development for those who already know how to program and are familiar with Python. In particular, we’ll build a data application from beginning to end in a workshop fashion. This course is a prerequisite for all other DDL courses that use Python.
The following topics will be covered:
Basic project structure
virtualenv & virtualenvwrapper
Building requirements outside the stdlib
Testing with nose
Ingesting data with requests.py
Munging data into SQLite Databases
Some simple computations in Python
Reporting data with JSON
Data visualization with Jinja2 and Highcharts
We will build a Python application using the data science workflow: ingest, munge, compute, report, and visualize. This is a basic, standard workflow that is repeatable and paves the way for more advanced courses using numerical and statistical packages in Python like Pandas and NumPy. In particular, we’ll fetch data from Data.gov, transform it, and store it in a SQLite database, then do some simple computation. Then we will use Python to push our analyses out in JSON format and provide a simple reporting technique with Jinja2 and charting using Highcharts.
Although this is an introductory course, some prerequisites are required. You should have Python installed and have some basic familiarity with it, as well as with the command line. You should also create a GitHub account if you don't already have one.
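The workflow described above might be sketched as follows, using only the standard library. This is an illustrative outline under stated assumptions, not the application built in the workshop: the inline CSV text stands in for data that would be downloaded from Data.gov, its values are made up, and the table, column, and function names are hypothetical. Visualization with Jinja2 and Highcharts is left out.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample standing in for a CSV file downloaded from Data.gov.
# The regions and values are invented for illustration only.
SAMPLE_CSV = """region,year,population
A,2012,100
B,2012,250
C,2012,50
"""


def ingest(raw_text):
    """Parse raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def munge(rows, connection):
    """Convert values to proper types and store them in SQLite."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS population "
        "(region TEXT, year INTEGER, population INTEGER)"
    )
    connection.executemany(
        "INSERT INTO population VALUES (?, ?, ?)",
        [(r["region"], int(r["year"]), int(r["population"])) for r in rows],
    )
    connection.commit()


def compute(connection):
    """Run a simple aggregate computation over the stored data."""
    total, largest = connection.execute(
        "SELECT SUM(population), MAX(population) FROM population"
    ).fetchone()
    return {"total_population": total, "largest_population": largest}


def report(results):
    """Serialize the results as JSON for a reporting or charting layer."""
    return json.dumps(results, indent=2, sort_keys=True)


def run_pipeline(raw_text=SAMPLE_CSV):
    """Chain the steps: ingest, munge, compute, report."""
    connection = sqlite3.connect(":memory:")  # in-memory database for the demo
    try:
        munge(ingest(raw_text), connection)
        return report(compute(connection))
    finally:
        connection.close()


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping each stage in its own function means the ingest step could later be swapped for a real HTTP download without touching the munging, computation, or reporting code.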
The following are suggested tasks to perform before this course:
Installing Python: https://wiki.python.org/moin/BeginnersGuide/Download
Installing virtualenv and virtualenvwrapper: http://docs.python-guide.org/en/latest/dev/virtualenvs/
Getting familiar with Python:
Python Hello World: http://www.learnpython.org/en/Hello,_World!
Python Basics: http://www.codecademy.com/tracks/python
Getting familiar with the command line:
Using the terminal: http://cli.learncodethehardway.org/book/
Getting a GitHub account: https://github.com/
Benjamin is an experienced Data Scientist and Python developer who has worked in the military, industry, and academia for the past eight years. He is currently pursuing his PhD in Computer Science at The University of Maryland, College Park, doing research in Metacognition and Active Logic. He is also a Data Scientist at Cobrain Company in Bethesda, MD, where he builds data products including recommender systems and classifier models. He holds a Master's degree from North Dakota State University, where he taught undergraduate Computer Science courses. He is also adjunct faculty at Georgetown University, where he teaches Data Science and Analytics.
Sarah is a Junior Engineer at Cobrain Company in Bethesda, Maryland, where she works on the data ingestion pipeline. She is a former math teacher with an MA in Education from Seattle University and an aspiring data scientist who seeks to inspire people of diverse, traditionally non-technical backgrounds to learn how to program.
District Data Labs is composed of several Data Community DC members focused on providing data science educational offerings to help others in our community enhance and expand their existing technical and analytical skills.
For those who are driving, the best parking option we have found in the area is the garage behind the SunTrust building on the southeast corner of Glebe Rd. and Fairfax Dr.