Analyze US Government Survey Data with R

  • Jan 10, 2013 · 6:30 PM

Please join us for our January meetup where Anthony Damico will talk about analyzing US Gov't survey data with R.

The United States Government spends over a billion dollars annually to collect and distribute information about its population. While much of this can be downloaded at no cost, data researchers have historically either relied on inflexible online data query tools (like American FactFinder) or needed to purchase expensive, proprietary statistical software, such as SAS, SUDAAN, or Stata, in order to correctly account for complex sample survey designs. A new website - - hosts easy-to-use, obsessively documented syntax to analyze government survey data with free, open-source software (the R language and MonetDB) using reproducible techniques.

Since its inception in mid-2012, a new data set has been added every few weeks. The repository currently includes: Area Resource File, American Community Survey, Basic Standalone Medicare Claims Public Use Files, Behavioral Risk Factor Surveillance System, Consumer Expenditure Survey, Current Population Survey, General Social Survey, Medical Expenditure Panel Survey, National Health and Nutrition Examination Survey, National Health Interview Survey, and National Survey on Drug Use and Health.

Each data set posted follows a basic rubric with these core components:

1) Download Automation - no-changes-necessary programs to download every microdata file from every survey year as an R data file onto your local disk.

2) Publication Replication - match published numbers exactly to show that R produces the same results as other statistical languages.

3) Analysis Examples - fully-commented, easy-to-modify examples of how to load, clean, configure, and analyze the most current data sets available.

This presentation will outline why the R language is well-positioned to become the "lingua statistica" of survey methodologists, how the R survey and sqlsurvey packages work, and how to get started using one of the government survey data sets available in the repository. There will also be a brief introduction to the column-oriented database MonetDB and a new method of communication with the R language. In speed tests on a regular desktop computer, MonetDB was able to analyze the 67 million-record Medicare Claims Public Use Files in about one hundred twenty seconds. For large data sets, MonetDB has been integrated into all survey analysis commands with minimal hassle for the user.
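To make "accounting for a complex sample survey design" concrete, here is a minimal sketch of the arithmetic behind a design-based estimate. It is written in Python purely for illustration and is not the code of the R survey or sqlsurvey packages. The function names and toy data are hypothetical. It assumes a stratified design with at least two primary sampling units (PSUs) per stratum and uses the common with-replacement linearization approximation for the standard error.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean(values, weights):
    """Survey-weighted mean: sum(w * y) / sum(w)."""
    total_weight = sum(weights)
    return sum(w * y for y, w in zip(values, weights)) / total_weight


def design_based_se(values, weights, strata, psus):
    """Linearization standard error of the weighted mean.

    Assumes PSUs are sampled with replacement within each stratum,
    a common approximation for public-use survey files.
    """
    total_weight = sum(weights)
    mean = weighted_mean(values, weights)

    # Sum each record's linearized contribution within its stratum and PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for y, w, stratum, psu in zip(values, weights, strata, psus):
        psu_totals[stratum][psu] += w * (y - mean) / total_weight

    variance = 0.0
    for stratum, totals in psu_totals.items():
        n_psus = len(totals)
        if n_psus < 2:
            raise ValueError(f"stratum {stratum!r} needs at least two PSUs")
        center = sum(totals.values()) / n_psus
        spread = sum((z - center) ** 2 for z in totals.values())
        variance += n_psus / (n_psus - 1) * spread
    return sqrt(variance)


# Hypothetical toy data: four respondents, two strata, two PSUs per stratum.
values = [1.0, 2.0, 3.0, 4.0]
weights = [1.0, 1.0, 3.0, 3.0]
strata = ["a", "a", "b", "b"]
psus = [1, 2, 1, 2]

print(weighted_mean(values, weights))  # 3.0, versus an unweighted mean of 2.5
print(design_based_se(values, weights, strata, psus))
```

Ignoring the weights here would shift the estimate from 3.0 to 2.5, and ignoring the strata and PSUs would change the standard error. That is the kind of error that design-aware software is meant to prevent.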

Anthony Damico is a Statistical Analyst at the Henry J. Kaiser Family Foundation, where he conducts data analysis for Marketplace, Medicare, and Medicaid health care policy reports. He has published in peer-reviewed policy and methods journals using the R, SAS, Stata, and SUDAAN statistical programming languages. Prior to joining the Kaiser Family Foundation, Anthony worked as a survey researcher at the Center for the Study of Services in Washington, D.C. Anthony holds a Bachelor's degree in Mathematics from Oberlin College and a Master's degree in Health Policy from Johns Hopkins University.

Tentative Agenda:

6:30 - 7:00: Networking and food/drinks

7:00 - 7:10: Introduction and announcements

7:10 - 7:25: Warm-up act: VOLUNTEERS NEEDED

7:25 - 8:30: Anthony Damico discussing Government Survey Data

8:45ish: Off to beeR.

  • George

    I liked the way Anthony kept away from the annoying projector and power points. I disagree with his assessment that for the newbies, R would not provide instant gratification. I found as a newbie, between CRAN and the help files I could find the tools AND an example of something similar to what I tried to do, and achieved gratification within minutes.

    January 15, 2013

  • Tony O.

    A brief summary of the event can be found at along with a link to a full review of the presentation, a link to download the audio, and a link to a video of the presentation. Enjoy!

    January 14, 2013

  • Chris G.

    Anthony Damico's website is very cool but 45 minutes was a torturously long time to spend introducing R with bad jokes, as if it were a magic trick.

    January 13, 2013

  • Hans E.

    Excellent, my first meetup. Very lively speaker with great stage presence.

    January 11, 2013

  • Paul S.

    Very good. I was grateful to learn about CSV agents for R, and pleased to learn of pending availability for matrix/linear algebra tools. Yes, I am still impressed with what one can do on a portable computer, since I go back to patch-boards, paper tape and punch cards.

    January 11, 2013

  • Jerome Y.

    Entertaining and informative introduction to the features of R. Informative answers during the Q and A period.

    January 11, 2013

  • Lloyd B.

    Excellent presentation. Wish the speaker had spent a bit more time on MonetDB

    January 11, 2013

  • Abhijit

    Some attendees last night were asking about R and big data. Revolution Analytics has a webinar coming up on using R with Hadoop, by the excellent Jeffrey Breen. Follow the link:

    January 11, 2013

    • Sridhar M.

      Thanks Abhijit. I registered. Sri

      January 11, 2013

  • Jon C.

    It was a highly informative and engaging talk on R and the role of open source statistical computing in fostering greater transparency in statistical research.

    January 10, 2013

  • Cid S.

    The speaker was very engaging, but I fear the presentation spent too much time talking about why R is awesome and too little time talking about how to do new things in R. I assume that anyone coming to the R meetup is already converted, even if they're a novice like me.

    1 · January 10, 2013

  • A former member

    Big thanks to Anthony for the enthusiastic encouragement of R and his excellent packages + blog for all of us gov't-survey-data-users!

    January 10, 2013

  • steve g.

    The meetup was awesome. As a newbie to R, I was impressed with the preso, the group and great information.

    January 10, 2013

  • Chad C.

    What is SUDAAN?

    January 9, 2013

    • A former member

      SUDAAN is more than an add-on to SAS. It is the SUrvey DAta ANalysis statistical package, developed by the Research Triangle Institute (RTI). It used to be a premier variance estimation statistical package, in the same league as WesVar (from Westat) and VPLX (from the Census Bureau), and is rapidly becoming overshadowed by both SAS and R.

      January 9, 2013

    • A former member

      as indiana jones would say, "it belongs in a museum"

      January 9, 2013

  • Peter

    Coursera has a free "Computing for Data Analysis" class starting up this week. It should be a great introduction to R:

    2 · January 3, 2013

  • Daniel

    when I try to go to my malware software says it is a malicious site. Is that the correct url? Is it well vetted? thanks!

    December 19, 2012

    • Daniel

      Here is what my security folks said: This URL contained the Zeus trojan, which captures banking account information, among other things.

      As it is still part of the malware domain infrastructure, this page will continue to be blocked.

      Please notify your source that this domain is bad, and, if legitimate, have it changed to another domain.

      December 19, 2012

    • A former member

      i asked mcafee and websense to re-evaluate it the other day, so it should be resolved soon. until then, just pretend it's NSFW ;)

      December 20, 2012

  • Mike M.

    My fed agency blocks the site, calls it "forbidden".

    December 19, 2012
