For our April Meetup, we're thrilled to have Kalev Leetaru, Yahoo! Fellow in Residence at Georgetown University, talk about data mining at a global scale. What does it take to build a system that monitors the entire world, analyzing global news media in realtime, compiling catalogs of everything happening in the world, and making that data accessible for analysis, visualization, forecasting, and operational use? What does it take to support querying of a quarter-billion-record-by-58-column database in near-realtime? How do you visualize networks with hundreds of millions of nodes, tease structure from chaotic real-world observational graphs, or explore networks in the multi-petabyte range? How do you process and geographically visualize the emotion of the live Twitter Decahose in realtime? How do you rethink tone mining from scratch to power a flagship new reality television show? How do you adapt systems to work with machine translation, OCR and closed captioning error, and the messiness of real-world data? How do you process half a million hours of television news, five billion pages of historic books, or 60 million images dating back 500 years?
• 6:30pm -- Networking, Empanadas, and Refreshments
• 7:00pm -- Introduction
• 7:15pm -- Presentation and Discussion
• 8:30pm -- Data Drinks (Tonic, 2036 G St NW, Patio)
This talk will pull back the curtain and present a behind-the-scenes view of what it’s really like to work with really big data. How does one blend the world’s most powerful supercomputers, virtual machines, cloud storage, infrastructure as a service, plus a ton of software, into a single end-to-end environment that supports all of this research? I’ll be deep-diving on the GDELT Project, a catalog of human societal-scale behavior and beliefs across all countries of the world, connecting every person, organization, location, count, theme, news source, and event across the planet into a single massive network that captures what's happening around the world, what its context is and who's involved, and how the world is feeling about it, every single day. What does it take to build and run a system that monitors the entire world each day and delivers a quantitative model that increasingly powers operational conflict watchboards across the world?
Kalev H. Leetaru is the Yahoo! Fellow in Residence for International Values, Communications Technology and the Global Internet at the Institute for the Study of Diplomacy in the Edmund A. Walsh School of Foreign Service at Georgetown University. He holds three US patents (cited by a combined 44 other issued US patents) and his work has been profiled in Nature, the New York Times, The Economist, BBC, Discovery Channel and the media of more than 100 countries. His most recent work includes the first in-depth study of the geography of social media and the changing role of distance and location in online communicative behavior around the world (named by Harvard’s Nieman Lab as the top social media study of 2013), the creation of the GDELT Project, a database of more than a quarter-billion georeferenced global events 1979-present and the people, organizations, locations, and themes connecting the world, and the creation of the SyFy Channel’s Twitter Popularity Index, the first realtime character “leaderboard” created for television. Most recently, he was named as one of Foreign Policy Magazine’s Top 100 Global Thinkers of 2013. More on his latest projects can be found on his website at http://www.kalevleetaru.com/.
This event is sponsored by the GWU Dept. of Decision Sciences, Cloudera, Statistics.com, IBM Analytics Solution Center, Elder Research, and InformIT. Would you like to sponsor too? Please get in touch!