Past Meetup

Cloudera & Lucidworks: SolrCloud Failover, Testing, and Integration with Hadoop

100 people went


Join us in Palo Alto on Tuesday, July 15th, for an evening of talks by Cloudera's Mark Miller and LucidWorks' Yann Yu on new tools that make SolrCloud replica failover and testing, as well as integrating Solr with Hadoop, easier and more effective. As usual, food and beer will be provided. Hope to see you there!

6:00pm-6:30pm: Networking & Refreshments

6:30pm-7:30pm: Presentations

"A Bit About SolrCloud Replica Failover and SolrCloud Automated Testing", Mark Miller, Cloudera

Summary: In SolrCloud today, when a Solr replica goes down, a user must intervene to get the replication factor back up. In this talk, you will learn about a new feature that is being added that automatically creates new replicas on healthy cluster nodes to enforce the configured collection replication factor. This is especially interesting when using a shared file system like HDFS.

The automated tests for SolrCloud have come a long way over the years from their very humble beginnings. You will be given a few tips on SolrCloud test debugging and development, hear about some of the current testing limitations and history, and get an overview of where the testing needs to go.

Speaker: Mark Miller is a Lucene/Solr committer and PMC chair / member as well as an Apache member. After starting with Lucene in 2006, Mark has spent most of his time getting paid to work on the open source software projects that he loves. Mark has given many talks on Lucene/Solr at various conferences and meetups around the world and is currently learning all about Hadoop as a software engineer at Cloudera.

"Using the Solr Hadoop Connector to Integrate Solr and Hadoop", Yann Yu, LucidWorks

Summary: Yann will discuss the new LucidWorks open-source Hadoop connector for Solr and how to use it to integrate your Hadoop cluster with open-source Solr search. He will cover how the Hadoop connector can leverage MapReduce to quickly modify and process documents to send to Solr, what kinds of additional document parsing and language processing the Hadoop connector is capable of, and how to scale Solr to handle the high volume of documents and data coming from Hadoop. This talk will include a live demonstration of indexing data from Hadoop into Solr, and is intended as a practical introduction and primer for implementing this in your own environment.

Speaker: Yann Yu is a Systems Engineer at LucidWorks and brings several years of experience in natural language processing, internationalization, and search to the LucidWorks team. As part of the Professional Services group at LucidWorks, he has helped to size, deploy, and troubleshoot Solr installations for over fifty customers. Previously a software developer at Basis Technology, Yann worked extensively on the Rosette Entity Extractor and Rosette Language Identifier products.

7:30pm-8:00pm: Open announcements, Wrap-up

See all Meetups from SFBay Apache Lucene/Solr Meetup