After the successful hackday (by the way, we're still hoping to get the application we built live; can you help?), we're back to the usual evening Meetup. Our first talk is by Tom White:
"Recently Apache Solr has been integrated into the Hadoop ecosystem to provide full-text search at 'big data' scale. This talk will give an overview of how Cloudera has tackled integrating Solr into the Hadoop ecosystem and highlight some of the design decisions and future plans. Learn how Solr is going to get closer to Hadoop, which contributions are going to which project, and how you should consider tackling search at Hadoop scale in the future."
Tom White is a Software Engineer at Cloudera, has been an Apache Hadoop committer since February 2007 (and is a PMC Member), and is a member of the Apache Software Foundation. He is also the author of the best-selling O'Reilly book, "Hadoop: The Definitive Guide." Tom has a Bachelor's degree in Mathematics from the University of Cambridge and a Master's in Philosophy of Science from the University of Leeds, UK.
For our second talk I'll be discussing some of the work we've done recently for media monitoring companies, and in particular how we've developed a way of applying tens of thousands of stored queries to a document in around a second. We'll shortly be open-sourcing some of the core technology behind this idea (based on a branch of Apache Lucene). We're hoping this will be useful not just for those who need to monitor incoming documents, but for automatic classification and categorisation as well.
PLEASE NOTE: we have a new venue this time, as the usual place is booked up. We'll be in the room at the top of the stairs, with a bar tab and some free nibbles as usual.