David Portnoy, HHS IDEA Lab External Entrepreneur (see Washington Post Article: U.S. Turns to Private Sector), HHS IDEA Lab: Demand-Driven Open Data and Discussion

Brand Niemann, Data Science for EPA Big Data Analytics and Data Science for EPA Fracturing Data

Note: DJ Patil, President's Chief Data Scientist (invited; unavailable due to a conflict; to present at a later date)

President Obama has personally appointed and introduced a new Chief Data Scientist, Dr. DJ Patil, who has outlined his new program focusing on four activities and three priority areas.

In support of his program, I am developing a Data Science for EPA Big Data Analytics Data Product and Meetup, in cooperation with EPA and using EPA Ecosystem Data, not only to answer the excellent questions from EPA's Ethan McMahon (see below) but also to address the broader matter of:

Turning Data Into Value

Organizations do not fear a shortage of data. Systems, applications, and devices all produce and consume exponentially increasing amounts of it. The Federal Big Data Working Group Meetup Data Scientists help organizations use their data to best advantage by integrating, analyzing, and acting on it via event processing as follows:

• Integration provides the right data to the right system or person in real time.

• Analytics lets users develop insights using vast amounts of data to understand the past and anticipate the future.

• Event processing combines the knowledge gained from analytics with real-time information to identify patterns of events and act to bring about the best outcomes.

As you can see, analytics (of Earth Science Data from EPA, etc.) is just one of three elements to support the President’s new Chief Data Scientist.
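
The three elements above can be sketched in a few lines of Python. This is only a minimal illustration, not a description of any EPA or Meetup system: the site names, readings, function names, and the alert threshold are all hypothetical, and the "analytics" step is reduced to a simple mean and standard deviation per site.

```python
from statistics import mean, stdev

# Hypothetical readings from two made-up sources; values are illustrative only.
SOURCE_A = [{"site": "S1", "value": 10.0}, {"site": "S1", "value": 11.0}]
SOURCE_B = [{"site": "S1", "value": 9.0}, {"site": "S1", "value": 10.0}]


def integrate(*sources):
    """Integration: merge records from several sources into one history per site."""
    merged = {}
    for source in sources:
        for record in source:
            merged.setdefault(record["site"], []).append(record["value"])
    return merged


def analyze(history):
    """Analytics: summarize past values per site as (mean, standard deviation)."""
    return {
        site: (mean(values), stdev(values))
        for site, values in history.items()
        if len(values) >= 2  # stdev needs at least two data points
    }


def process_event(event, baselines, threshold=3.0):
    """Event processing: compare a new reading with the baseline and pick an action.

    The threshold (in standard deviations) is an assumed, illustrative value.
    """
    baseline = baselines.get(event["site"])
    if baseline is None:
        return "no-baseline"
    avg, spread = baseline
    if spread > 0 and abs(event["value"] - avg) > threshold * spread:
        return "alert"
    return "ok"


history = integrate(SOURCE_A, SOURCE_B)
baselines = analyze(history)
action = process_event({"site": "S1", "value": 20.0}, baselines)
```

Here `integrate` brings the data together, `analyze` learns from the past, and `process_event` combines that knowledge with a new reading to decide what to do. A real service would replace each step with far more capable tooling.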

DJ Patil and Hilary Mason have recently written an excellent short booklet, Data Driven: Creating a Data Culture.

The excellent questions from EPA's Ethan McMahon are:

EPA is planning to stand up a big data analytics service within the agency. We’d appreciate ideas from the community in a few areas:

1. What problems have you tried to solve using data analytics and/or visualization?

2. Are there any strategies or best practices you used to manage data within or between enterprise data systems?

3. What techniques make sense for integrating large or varied data from multiple sources?

4. What technologies have you used and how did you select them?

5. Did you use any particular training resources for using big data analytics systems, and if so which ones?

6. What lessons would be helpful for us to learn as we set up this service?

We're open to your ideas and we're ready to share what we have learned.

