DataPhilly September 2013 - Data Storytime

Welcome back from summer, and now for something completely different! This month we have two awesome speakers lined up who will be discussing their research and the related datasets. Aaron Masino from CHOP will be discussing his work developing a clinical diagnostic pipeline for whole genome testing. Sadia Afroz will be discussing her work with stylometry and anonymity. Thanks to AWeber for providing food this month!


6:30 PM - 7:00 PM - Food, Networking, and a word from our sponsors

7:00 PM - 7:30 PM - Developing a clinical diagnostic pipeline for whole genome testing, by Aaron Masino

7:30 PM - 8:00 PM - Stylometry and anonymity, by Sadia Afroz

8:00 PM - 8:30 PM - Lightning Talks

8:30 PM - Leave for Nodding Head

More Details:

Developing a clinical diagnostic pipeline for whole genome testing, by Aaron Masino


In this talk, I will cover some of our recent efforts at The Children’s Hospital of Philadelphia’s Center for Biomedical Informatics to develop a clinical diagnostic pipeline for whole genome testing. Specifically, I will discuss research on algorithms that prioritize a patient’s genetic variants relative to patient phenotypes. The algorithms utilize semantic similarity metrics to provide a measure of similarity between the Human Phenotype Ontology terms known to annotate a given gene and those terms describing the patient. Genes are then ranked by their similarity scores. P-value tables describing the probability of randomly obtaining a similarity score greater than or equal to the observed score provide a statistical significance measure of the ranking. The p-value tables are estimates of the true distributions and were generated from roughly 9 billion data points computed using a Scala Akka Monte Carlo simulation deployed on Amazon’s EC2.
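The ranking and p-value idea in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline described in the talk: a plain Jaccard overlap stands in for the semantic similarity metric, the gene names and term identifiers are made up, and the p-value is estimated from a small number of random draws rather than from precomputed tables.

```python
import random

def jaccard(a, b):
    """Toy stand-in (assumption) for a semantic similarity metric on term sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def empirical_p_value(observed, gene_terms, vocabulary, query_size,
                      trials=2000, seed=0):
    """Monte Carlo estimate of P(score >= observed) for random term sets
    of the same size as the patient's term set."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        random_query = rng.sample(vocabulary, query_size)
        if jaccard(random_query, gene_terms) >= observed:
            hits += 1
    return hits / trials

def rank_genes(patient_terms, gene_annotations, vocabulary, trials=2000):
    """Score every gene against the patient terms, attach an estimated
    p-value, and sort by descending similarity."""
    results = []
    for gene, terms in gene_annotations.items():
        score = jaccard(patient_terms, terms)
        p = empirical_p_value(score, terms, vocabulary,
                              len(patient_terms), trials)
        results.append((gene, score, p))
    return sorted(results, key=lambda r: r[1], reverse=True)

# Hypothetical data: 50 made-up term identifiers and three made-up genes.
vocabulary = [f"HP:{i:07d}" for i in range(50)]
gene_annotations = {
    "GENE_A": vocabulary[0:5],
    "GENE_B": vocabulary[3:10],
    "GENE_C": vocabulary[20:30],
}
patient_terms = vocabulary[0:4]

ranking = rank_genes(patient_terms, gene_annotations, vocabulary)
for gene, score, p in ranking:
    print(gene, round(score, 3), p)
```

A gene whose annotations overlap heavily with the patient's terms gets a high score and a small estimated p-value, while a gene with no overlap scores zero and gets a p-value of 1, since every random draw scores at least zero.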


Dr. Masino is currently a member of the Center for Biomedical Informatics at the Children's Hospital of Philadelphia where his research includes algorithm development for personalized medicine and intelligent software design and implementation in support of translational research and clinical care. Prior to joining CHOP, Aaron was a senior scientist at SAIC and MZA Associates Corporation, where he developed adaptive optics system concepts, control algorithms, and mathematical atmospheric laser propagation models. He also created innovative simulation and analysis software platforms for adaptive optics system performance prediction. Aaron received a PhD in applied mathematics from the University of Central Florida in 2004. He also holds a master's in aerospace engineering from the University of Colorado and a bachelor's in mathematics from Rutgers University.

Stylometry and anonymity, by Sadia Afroz


In digital forensics, questions often arise about the authors of documents: their identity, demographic background, and whether they can be linked to other documents. The field of stylometry uses linguistic features and machine learning techniques to answer these questions. While stylometry techniques can identify authors with high accuracy in non-adversarial scenarios, their accuracy is reduced to random guessing when faced with authors who intentionally obfuscate their writing style or attempt to imitate that of another author. In my talk I will discuss current authorship attribution techniques and how they can be evaded.
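As a rough illustration of what "linguistic features" can mean here, the sketch below attributes a text to the known author whose function-word frequencies are most similar. It is a toy under stated assumptions, not the technique from the talk: the word list, the sample texts, and the author labels are invented, and cosine similarity over ten word frequencies stands in for a real feature set and classifier.

```python
import math
import re
from collections import Counter

# Small, hypothetical list of function words used as style features.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is", "it", "but"]

def features(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def attribute(unknown_text, known_samples):
    """Return the author label whose sample is most similar to the unknown text."""
    target = features(unknown_text)
    return max(known_samples,
               key=lambda author: cosine(target, features(known_samples[author])))

# Invented writing samples for two hypothetical authors.
known_samples = {
    "author_x": "the cat and the dog and the bird",
    "author_y": "it is that it is but it is",
}
guess = attribute("the fox and the hen", known_samples)
print(guess)
```

An author who deliberately changes which function words they use would shift these frequencies, which hints at why obfuscation and imitation can defeat feature-based attribution.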


Sadia Afroz is a PhD candidate in Computer Science at Drexel University where she works at the Privacy, Security and Automation Laboratory (PSAL) with Rachel Greenstadt. Sadia is also involved with SCRUB at UC Berkeley.

Directions to Nodding Head

Head South down 15th St (past City Hall)
Walk 4 blocks, turn right onto Sansom St
Nodding Head is on your left (230 ft)


  • Aaron M.

    Thanks everyone for taking time out of your busy schedules to attend the meetup. My slides are available at

    1 · September 28, 2013

  • Sadia A.

    Hi all, thanks a lot for yesterday's meetup. I had a lot of fun discussing our research with you. If you are interested, my slides from yesterday are here:

    3 · September 27, 2013

  • Oliver

    Very enjoyable presentations.

    September 27, 2013

  • Russell M.

    Great meetup!
    I didn't get the chance to mention three opportunities with the Masters program at our college (near Atlantic City).
    1. We are interested in finding real data science problems from industry or academia that our graduate students could work on as part of a practicum. If you have a data set and a question but don't have the time/resources, we would like to help you out for free.
    2. We are looking for experienced data scientists and analysts to serve on our program's Advisory Board. This would be service to the data science community on your part and is something that you could put on your resume.
    3. We are also interested in hearing from data scientists and analysts who would like to adjunct in our program. Courses can be offered online so there would be no need to travel to AC. Again something for your resume - a chance to get some teaching experience and in this case cold, hard cash!
    Send me a PM if any of these appeal to you and we can discuss.

    September 27, 2013

    • David J W.

      Russell - Very interested in participating as an Advisor, and would consider the adjunct role for 2014 to 2015. My background is in data and analytics for health care, which applies across payers, providers and patients. I have teaching experience at the session level with Wharton and Sloan. I also have a series of publications and papers.

      1 · September 27, 2013

    • Jeffrey N.

      Hi Russell - I am interested in all of these things, wherever you need help. I lived out there a few years back, didn't know Downbeach had a Data Science community :). Reach out when you can.

      September 27, 2013

  • john A.

    content rich presentations; good discussion; unobtrusive & excellent leadership by Mike

    September 27, 2013

  • Jake

    Great presentation! Really interesting topics & analysis. Thanks for sharing.

    September 26, 2013

  • Russell M.

    Can anyone suggest the best place to park for this meetup?

    September 24, 2013

    • Oliver

      Last time at this building I found a ton of street parking a couple blocks away north of Ben Franklin Parkway.

      1 · September 25, 2013

    • Russell M.

      Cheers guys

      September 25, 2013

  • Jeffrey N.

    wish I could make this meeting, looks great! class that night :(

    September 18, 2013

  • Conrad M.

    Analyst with M.S in Business Intelligence. Interested in data science, informatics and all sorts of data analysis to create value and competitive advantage.

    September 12, 2013
