Re: [betaNYC] A Day in the Life of a New York City Taxi

From: Chris W.
Sent on: Sunday, July 20, 2014 2:10 PM
Related to this thread: Jeff Novich entered a great Taxi Complaint app into Big Apps, and he has an outstanding blog post here about his many experiences filing complaints, FOILing complaint data, and actually following up with phone-based taxi hearings.

The app essentially files a 311 complaint on your behalf and helps you track it. In the post, he draws some great comparisons to Uber, whose drivers must maintain good customer reviews if they want to stay employed.

It's worth a read.  

-Chris


On Thu, Jul 17, 2014 at 4:09 PM, Jeremy Barth <[address removed]> wrote:
Having people with different backgrounds look at these issues is a very good idea.  Reading through the privacy literature, I'm struck by how tricky it is to get things right and how specialized the issues can be.  HIPAA, in particular, has driven a great deal of the recent legal, policy and technical research.  A good resource that's specific to the health field but has broader applicability is Anonymizing Health Data.

Many people (including me) tend to use specialist terms vaguely and inconsistently.  In the open data arena, that can lead to trouble.  Anonymity, pseudonymity, masking and de-identification are commonly used terms that mean different things to different people.  Developers in an IT shop might be speaking with a medical privacy specialist from an institutional review board and not mean the same thing when referring to "anonymization".  One thing I think Beta-NYC can help with is to clearly define some of these terms (and give concrete examples).

Whether to "anonymize", what to "de-identify" and how to do it safely would, superficially, seem to involve different kinds of experts (e.g. lawyers / ethicists, database privacy specialists, programmers).  Not having followed the development of NYC's open data laws closely, I'm curious whether any of this was spelled out or just left to regulatory discretion.  In the biomedical research world, for example, numerous specialists from different disciplines meet in advance to discuss how to deal with data privacy and in some cases (HIPAA) there may actually be specific laws that need to be followed, e.g. Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule.

As we've seen with the NYC taxi data, technical people who aren't expert in crypto continue to make the same mistakes over and over.  Andrew discusses this a bit, and there's an interesting discussion on another blog post by Ed Felten (co-author of the paper cited by Ariel) about whether hashing-based anonymization ever works.  My reading of the comments section is that using a secret key as the salt (as opposed to the plaintext salt that's used in password applications) might work and might be less computationally intensive than "re-hashing the hash" (one of Andrew's suggestions).  Andrew raises other interesting institutional workflow questions that perhaps Beta-NYC can think through and provide suggestions about.
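
To make the hashing point concrete, here is a minimal Python sketch, not anyone's actual pipeline. It assumes a hypothetical identifier format of digit-letter-digit-digit (only 26,000 possible values). The names plain_hash, keyed_hash, candidate_ids and build_lookup are made up for illustration, and SHA-256 and HMAC are my own choices for the unkeyed and "secret key" variants rather than anything specified in the thread. It tries to show why an unkeyed hash over a small identifier space can be reversed by simple enumeration, and why the same enumeration fails without the key.

```python
import hashlib
import hmac
import itertools
import secrets
import string


def plain_hash(identifier: str) -> str:
    """Unkeyed hash: anyone can recompute it for any candidate identifier."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()


def keyed_hash(identifier: str, secret_key: bytes) -> str:
    """Keyed hash (HMAC-SHA256): recomputing it requires the secret key."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def candidate_ids():
    """Enumerate the hypothetical ID space: digit, letter, digit, digit."""
    for d1, letter, d2, d3 in itertools.product(
        string.digits, string.ascii_uppercase, string.digits, string.digits
    ):
        yield f"{d1}{letter}{d2}{d3}"


def build_lookup(hash_fn):
    """Map hash value -> original identifier for every candidate."""
    return {hash_fn(candidate): candidate for candidate in candidate_ids()}


# Unkeyed case: the publisher releases plain_hash(id); an outsider rebuilds
# the full table and reads the original identifier straight back out.
published = plain_hash("5X55")
lookup = build_lookup(plain_hash)
recovered = lookup[published]

# Keyed case: the publisher keeps secret_key private. An outsider who
# enumerates the same ID space with a different key gets unrelated hashes.
secret_key = secrets.token_bytes(32)  # illustrative; key storage is not shown
published_keyed = keyed_hash("5X55", secret_key)
attacker_lookup = build_lookup(lambda s: keyed_hash(s, b"attacker-guess"))
found_without_key = published_keyed in attacker_lookup

print("recovered from unkeyed hash:", recovered)
print("keyed hash found without the key:", found_without_key)
```

Note that the keyed variant only helps for as long as the key stays secret, and it does nothing about re-identification from the other fields in a record, which is a separate problem.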



On Thu, Jul 17, 2014 at 12:24 PM, Joel Natividad <[address removed]> wrote:
Love all the stuff you dug up on privacy Ariel!

As your research and this thread show, privacy is central to the issue of open data - a key pillar of the Civic Tech community.

Perhaps we can invite experts like Arvind Narayanan, along with folks who deal with publishing Open Data on a daily basis, to a BetaNYC privacy working group and a series of talks?

A lot of these experts are already members of BetaNYC - Andrew with OpenNY, Steve as co-chair of W3C Data on the Web Best Practices Working Group, our friends at MODA and GovLab, Philip Ashlock as chief architect of Data.gov and core member of Project Open Data, etc. 

As Andrew and Noel pointed out, there are a couple of open govt laws being proposed in the City Council right now.  Maybe we can even create a formal paper with the group's policy recommendations?

- Joel



=======================================================
Think Different! (http://en.wikipedia.org/wiki/Think_different#Text)
Imagine Different! (http://www.youtube.com/watch?v=H5tOgRD4EqY)


On Wed, Jul 16, 2014 at 5:46 PM, Ariel <[address removed]> wrote:
Great conversation, everyone! Thoughtful, respectful debate is one of the best parts of BetaNYC. I was curious whether cab drivers are "opting in" to having their data collected, so I did a little digging.

The technology system that logs all trips, allows passengers to pay by credit card, etc. is dictated as part of the TLC official rules, which all taxi drivers agree to. See pages 56-59: http://www.nyc.gov/html/tlc/downloads/pdf/rule_book_current_chapter_54.pdf Also on the NYC rules site: http://rules.cityofnewyork.us/content/section-54-24-vehicle-trip-records It explicitly states what data is being collected. 

I had a hard time finding any place that dictates how the data is used, other than stating it goes into a database. Interestingly, cab drivers were manually recording trip data before the GPS devices were added. 

Recently, cab drivers have been upset with the TLC, saying the data is being used to punish them, claiming that this violates the 4th Amendment, and filing a lawsuit against the city. The case is currently being appealed, but the legal documents are an interesting read. The decision from the United States District Court in January: http://www.capitalnewyork.com/sites/default/files/GPS%20Decision.pdf and the appeal document from May: https://www.rutherford.org/files_images/general/05-08-2014_HEN-Appeal-Brief.pdf

Kate Crawford also just tweeted this article by two Princeton professors on re-identification in data: http://randomwalker.info/publications/no-silver-bullet-de-identification.pdf
"Data privacy is a hard problem. Data custodians face a choice between roughly three alternatives: sticking with the old habit of de-identification and hoping for the best; turning to emerging technologies like differential privacy that involve some trade-offs in utility and convenience; and using legal agreements to limit the flow and use of sensitive data. These solutions aren’t fully satisfactory, either individually or in combination, nor is any one approach the best in all circumstances.

Change is difficult. When faced with the challenge of fostering data science while preventing privacy risks, the urge to preserve the status quo is understandable. However, this is incompatible with the reality of re-identification science. If a “best of both worlds” solution exists, deidentification is certainly not that solution. Instead of looking for a silver bullet, policy makers must confront hard choices."



Ariel Kennan
2013 Code for America Fellow
[address removed]







--
Please Note: If you hit "REPLY", your message will be sent to everyone on this mailing list ([address removed])
This message was sent by Ariel ([address removed]) from #betaNYC, a Code for America Brigade for NYC.
To learn more about Ariel, visit his/her member profile

To report this message or block the sender, please click here
Set my mailing list to email me As they are sent | In one daily email | Don't send me mailing list messages

Meetup, POB 4668 #37895 NY NY USA 10163 | [address removed]










