Data Science at GMU and Elsevier Research Data Services

• Dial in: [masked]

Pass: [masked]


No funds for parking validation tonight! ARRGH

• 6:30 p.m. Welcome and Introduction. Slides

• 6:35 p.m. Continue Data Science Tutorial: Practical Data Science for Data Scientists. Data Science Students and Careers: see Professor Dr. Kirk Borne of George Mason University. Slides. Graduate Students Working on Semantic Medline-YarcData Projects: GMU Updates Master's Program for Data Science, and Sarah Soliman, Rand, and IV MOOC Student Project (invited; rescheduled to June)

• 7:00 p.m. Brief Member Introductions

• 7:10 p.m. Big Data - Forward - Backward, Charles Randall Howard, Adjunct Professor in the Applied IT Department and Sr. Data Scientist at Novetta Solutions. Slides

Professor Howard has a Ph.D. in Information Technology from George Mason University, an M.S. in Information Systems from Virginia Commonwealth University, and a B.S. in Information Systems from Virginia Commonwealth University. As a Sr. Data Scientist at Novetta Solutions, he guides Big Data Science initiatives that solve problems and seize opportunities, helping organizations realize Big Data benefits across the entire organization (vs. just a few data scientists). He focuses on bringing technology back to making businesses more efficient and effective in delivering results. Previous experience includes Principal Data Scientist at Berico Technologies; Principal Semantic and Knowledge Scientist at Boeing-SMSI; Principal Consultant at SRA International, Inc./Raba Technologies; Vice President of Engineering at Tech I2; and Principal Software Engineer at Raytheon.

• 7:45 p.m. Stories that Persuade, Anita de Waard, VP Research Data Collaborations at Elsevier Research Data Services/University of Utrecht. Slides. Also see Looking for Data: Finding New Science and Ten Habits of Highly Effective Data

Anita de Waard has a background in experimental physics. She joined Elsevier as a publisher in physics and neurology in 1988, and since 1997 she has been employed as a Principal Researcher for Disruptive Technologies in the Labs group. Her main focus is the development of innovative product concepts, with a specific interest in establishing collaborations between Elsevier and academic groups in information and computer science. In 2003, Anita founded and ran the Reed-Elsevier Data Standards Group. Her interests include the application of Semantic Web technologies for scientific communication, and the development of a new, semantic form for the scientific article.

• 8:30 p.m. Open Discussion

• 8:45 p.m. Networking

• 9:00 p.m. Depart


  • Brand N.

    David, Thanks for excellent links which are very relevant to our work on data publications:

    Quickly search and analyze billions of public records published by governments, companies and organizations: http://enigma.io/

    Visual document mining for journalists: http://overview.ap.org/

    RMarkdown language used to create 'living' research documents: http://rpubs.com/dabata/17384

    May 22

  • Brand N.

    Orest, Thanks for coming last night. I have high praise for an HP Vertica webcast I participated in recently and wrote about:
    http://semanticommunity.info/Data_Science/Earth_Insights_from_Big_Data
    We would welcome a presentation like that with Conservation International.

    1 · May 22

  • Brand N.

    My summary comments are (continued): In our next Meetup we will talk about the use of ontology as a knowledge representation for organizing and relating concepts and then trying to reason and infer new facts. We already had an example (HealthCare.gov) of knowledge modeling tooling (Be Informed) that "automates" ontology development, and of the essential role of ontology (UMLS) in knowledge discovery in RDF triple stores (Semantic Medline in YarcData). Semantics like ontology have an essential role in Big Data Ecosystems, Federation, and Integration, as we will see in the next two Meetups.

    May 22

  • Brand N.

    My summary comments are (continued): My experience with US EPA Administrator William Ruckelshaus was as follows: You as scientists are to give me the best description of the scientific problem (e.g. acid rain's effects on lakes and streams), and I as the Administrator am to make prescriptions of what to do about that to the President and Congress. You can even tell me we have to collect better data (we did) to accomplish your work and I have to support that, but you as scientists should steer clear of the politics of prescription.

    1 · May 22

  • Brand N.

    Thank you for all of the comments below. My summary comments are:
    Anita said that knowledge is really in the researcher's head and not in the textual papers (their notebooks could be more useful, but there are still problems getting at it). Stories could/should be the best source, especially when based on real facts (data), and on statistics if there is enough data, but only to support statements like "compared to what" and not absolutes. Computers, natural language processing, Watson, etc. all try to help automate this but have limitations.

    1 · May 22

  • Todd S.

    Brand, a 'small data' question: where can I find Anita's slides?

    May 21

  • John Eric H.

    part 3

    Comments:

    On Todd's question regarding Watson as a counter-example to Anita's statement that computers don't have the hardware for dealing with Triples: I suspect Anita (and she can correct me if I am wrong on this) wanted to say computers don't have the context (currently) to handle this sort of approach. I don't think this is a hardware issue.

    There were some other things I would have mentioned last night had there been more time for discourse.

    May 21

  • John Eric H.

    part 2.

    For Anita (or anyone else who wants to chime in) on the mention of the escalation of credulity over time for uncorroborated or less-than-certain relationships cited in publications: do you see this as good, bad or context dependent? It wasn't clear to me from the talk. In context, it seems a case could be made that, over time, the likelihood of a "tentative claim" being true increases if it does not meet with any serious resistance. On the other hand, that seems both a fuzzy problem space and a heuristic that would be hard to generalize.

    Mostly off-topic, I also like to use baseball data for examples but don't have a good analogy in the space for threat (e.g., comparing a facet of baseball analytics with fraud analytics) and was wondering if Orest, Brand (or anyone else) had one that holds up. I'd be happy to share some of the use cases I use in the domain.

    May 21

  • John Eric H.

    I have a handful of questions and comments regarding last night's meet-up. Thanks to Anita and Randy for taking the time to share. Thanks to Brand and Orest for mentioning baseball data multiple times. Thanks to David for pointing me to enigma.io which, at least superficially, looks promising. Thanks to Xcelerate for their part in this.

    Questions:

    This meet-up seemed to focus heavily on data management (with the exception of the first half of Anita's presentation); should I expect this theme in future meet-ups? [I went through the 1000 character limit and need to make multiple posts]

    May 21

  • Orest Roman S.

    I want to thank everyone who attended. As it was my first meeting I was very impressed with the speakers and the venue. I am looking forward to working with everyone and future meetings.

    May 21

  • David M.

    I was looking for Spotfire pricing and found this: http://apandre.wordpress.com/2013/12/14/spotfire-cloud-pricing/. Note the comparisons to Tableau.

    May 21

  • Niels N.

    Yesterday I mentioned the MarkLogic conference coming to Nationals Park on June 24. Here's a link to register for the event. http://www.marklogic.com/events/marklogic-world-tour-2014-washington-d-c/

    May 21

  • David M.

    Another relevant-seeming link is for Overview (http://overview.ap.org/), which is a Visual Document Mining tool.

    May 21

  • k g.

    Anita was just amazing

    1 · May 21

  • David M.

    The meeting was great. I mentioned http://enigma.io/, which collects 70K public datasets into one easy-to-use exploration web interface. I also mentioned RMarkdown, which is a language used to create 'living' research documents. http://rpubs.com/dabata/17384 is an example of its use.

    1 · May 20

  • Andrea W.

    Need slide numbers from the speakers

    May 20

  • Andrea W.

    Finally in... :-)

    1 · May 20

  • Andrea W.

    I just keep getting a busy signal for the dial-in.

    May 20

  • k g.

    dial in[masked]

    pass [masked]

    May 20

  • Brand N.

    Remember parking on Level P2 near 8450, take the ticket, and bring it for validation.

    May 20

  • Brand N.

    The call-in is:[masked] and the PIN is:[masked]

    1 · May 20

  • Brand N.

    I am in discussions with Michael Stonebraker for a June 30th or July 7th Meetup presentation. For background see: http://www.scidb.org/

    Paradigm4, the company that develops, supports and builds SciDB: http://www.paradigm4.com/
    and SciDB Community Forum: Visit the SciDB Forum to download the open-source Community Edition of SciDB and interact with other SciDB users and developers: http://www.scidb.org/forum

    Also see: http://en.wikipedia.org/wiki/SciDB, http://en.wikipedia.org/wiki/Michael_Stonebraker, and: http://www.theregister.co.uk/2010/09/13/michael_stonebraker_interview

    And his newest company: http://www.tamr.com/

    I told him our 200+ members are interested in both academic (NSF, NIH, etc. funding) and consulting activities (Federal Government and Private Industry) which suggests that he could provide an overview and context for all the above and then some specific examples of how to work with both MIT and Tamr.

    May 19

  • Brand N.

    Our Meetup is tomorrow night with excellent presenters:

    1. GMU Professor Howard, Sr. Data Scientist at Novetta Solutions, whose slides have been posted and include an exercise and case studies.

    2. Anita de Waard, VP Research Data Collaborations at Elsevier Research Data Services/University of Utrecht, who has traveled here on business and included us in her itinerary.

    The White House - MIT Final Workshop Report on Big Data and Privacy has been published: http://web.mit.edu/bigdata-priv/images/MITBigDataPrivacyWorkshop2014_final05142014.pdf

    May 19

  • Brand N.

    We passed 200 members this week! Kate and I thank you for your support.

    Next week's Introduction and Tutorial Slides are posted:
    http://semanticommunity.info/@api/deki/files/29355/BrandNiemann05202014.pptx

    The summer schedule shifts to one Meetup a month starting on Monday June 2nd at the same time and place, and the July and August Monday dates will be announced later in the summer.

    May 14
