NYC 311 and Socrata Queries and understanding the data

From: Ralph Y.
Sent on: Wednesday, May 7, 2014 3:00 PM
Thanks Andrew for posting about NYC 311 data.


I believe this is the correct link to start.

But when I sort by created_date on the Socrata site, it shows entries from 2003, even though the title says 2010 to present. More data is better, so I'm only asking because when I download as CSV I receive only about 2 million records (2,117,547, to be exact). But if you run a count(*) query against Socrata using this link:


[{u'count': u'7349826'}]

you receive over 7 million records from Socrata via the API.

Is the download limited?
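One way to check this is to compare the API's count with what paging through the API actually returns. Below is a minimal sketch of building those requests. The endpoint is a placeholder (substitute the dataset's real resource URL), the function names are my own, and the page size and the `:id` sort key are assumptions based on Socrata's paging guidance; the allowed page size depends on the API version.

```python
import json
from urllib.parse import urlencode

# Placeholder endpoint (assumption): substitute the dataset's real resource URL.
BASE_URL = "https://data.example.org/resource/xxxx-xxxx.json"


def count_url(base_url=BASE_URL):
    """Build the URL for a SoQL query that returns the total row count."""
    return base_url + "?" + urlencode({"$select": "count(*)"})


def parse_count(body):
    """Extract the integer count from a response body like [{"count": "7349826"}]."""
    rows = json.loads(body)
    # Take the single value in the single row rather than relying on the key name.
    return int(next(iter(rows[0].values())))


def page_urls(total, page_size=1000, base_url=BASE_URL):
    """Yield one URL per page of results, using $limit and $offset."""
    for offset in range(0, total, page_size):
        # An explicit $order keeps the pages consistent between requests.
        params = {"$order": ":id", "$limit": page_size, "$offset": offset}
        yield base_url + "?" + urlencode(params)
```

Fetching each URL from `page_urls(parse_count(...))` and counting the rows received would show whether the API itself stops short of the reported total or only the CSV export does.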

Also, the Socrata query language seems to be limited. See http://dev.socrata.com/docs/queries.html

For example, I'd like to select year(created_date) and group by that year, but there seems to be no way to do this with the Socrata query language.
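A possible workaround, assuming the API accepts date comparisons in a $where clause, is to run one count(*) query per year with a date range filter. This is only a sketch: the endpoint is a placeholder and the helper names are made up for illustration.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): substitute the dataset's real resource URL.
BASE_URL = "https://data.example.org/resource/xxxx-xxxx.json"


def year_count_url(year, base_url=BASE_URL):
    """Build a count(*) query restricted to rows created within one calendar year."""
    start = f"{year}-01-01T00:00:00"
    end = f"{year + 1}-01-01T00:00:00"
    # Half-open range: includes the first instant of the year, excludes the next year.
    where = f"created_date >= '{start}' AND created_date < '{end}'"
    return base_url + "?" + urlencode({"$select": "count(*)", "$where": where})


def year_count_urls(first_year, last_year):
    """Map each year in the inclusive range to its count query URL."""
    return {year: year_count_url(year) for year in range(first_year, last_year + 1)}
```

This costs one request per year instead of a single grouped query, but it only needs the filtering that the query language does document.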

Since the download from Socrata gave me only 2 million records, those are the only records I was able to load into BigQuery. I made them public here: https://bigquery.cloud.google.com/table/personal-real-estate:nyc.311?pli=1

The YEAR function is supported in BigQuery so a query such as 

SELECT COUNT(Complaint_Type), Complaint_Type, YEAR(Created_Date) AS year1 FROM [nyc.311] GROUP BY Complaint_Type, year1 ORDER BY 1 DESC

is possible and easy.

The results show that heating complaints are the top complaint, and that the data downloaded from Socrata only goes back to [masked]:49:00 UTC.

Query Results, 2:56pm, 7 May 2014

Row  f0_     Complaint_Type          year1
1    156475  HEATING                 2013
2    100650  GENERAL CONSTRUCTION    2013
3    97597   HEATING                 2014
4    84084   Street Light Condition  2013
5    77272   PLUMBING                2013
6    70594   PAINT - PLASTER         2013
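For anyone without BigQuery access, the same grouping can be sketched in plain Python against the downloaded CSV. The column names ("Created Date", "Complaint Type") and the date format are assumptions about the export, so adjust them to match the actual file; the sample rows are made up purely to show the call.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Assumed timestamp format of the CSV export, e.g. "05/07/2014 03:00:00 PM".
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def count_by_type_and_year(csv_file, date_format=DATE_FORMAT):
    """Count rows per (complaint type, year), largest counts first."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        year = datetime.strptime(row["Created Date"], date_format).year
        counts[(row["Complaint Type"], year)] += 1
    return counts.most_common()


# Made-up sample rows standing in for the real download.
sample = io.StringIO(
    "Created Date,Complaint Type\n"
    "01/05/2013 10:15:00 AM,HEATING\n"
    "02/11/2013 08:30:00 PM,HEATING\n"
    "01/20/2014 07:45:00 AM,HEATING\n"
)
for (complaint_type, year), n in count_by_type_and_year(sample):
    print(n, complaint_type, year)
```

For the real file, pass an open file handle, for example `open("311.csv", newline="")` (the filename is hypothetical), instead of the in-memory sample.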


This email message originally included an attachment.
