
Simple Fuzzy Name Matching in Solr & What's New in Solr 5

Hosted by Basis Technology

Details

We're hosting a meetup (our first in a long time) featuring talks by David Murgatroyd (http://www.linkedin.com/in/dmurga) and Brian Sawyer (http://www.linkedin.com/pub/brian-sawyer/21/6a1/b40) of Basis Technology and Anshum Gupta (http://www.linkedin.com/in/anshumgupta) of LucidWorks.

ABSTRACTS

From Basis Technology:
Simple Fuzzy Name Matching in Solr

We all know normalization is crucial to delivering high-quality search results. We don’t want uninteresting variations between the query and the document to lead to missed hits (e.g., “celebrity” vs. “celebrities”). Normalization of dictionary words is well understood, but what if your application focuses on names? Whether you’re tackling patent examination, sports records, e-commerce, watchlist screening or many other topics, names are often the key. Can your users find “Abdul Jabbar, Karim” if they search for “Kareem AbdalJabar” or “كريم عبد الجبار”? Solr application architects have attempted to address this through custom integration of nickname lists, edit distance, case normalization, phonetic encoding and n-grams (see example #1 (http://stackoverflow.com/questions/5516503/searching-names-with-apache-solr) or example #2 (http://www.searchtechnologies.com/name-searching)), but doing so requires significant effort and may not address all desired variations. A simpler approach is to use a Solr field type for names that handles these linguistic nuances behind the scenes. We’ll talk about how we built this sort of field type via a Solr plug-in for the Rosette Name Indexer. We’ll also discuss examples of use cases this has enabled, how it can be tuned if necessary, and how it connects to the broader trend of entity-centric search.

Presented by:
David Murgatroyd (http://www.linkedin.com/in/dmurga) and Brian Sawyer (http://www.linkedin.com/pub/brian-sawyer/21/6a1/b40) of Basis Technology

From Lucidworks:
What's New in Solr 5

Apache Solr has matured in a variety of ways over the past few releases. Not only has it become more scalable, it has also become much easier to use. While Solr 4.0 introduced a new design and architecture in the form of SolrCloud, Solr 5.0 is largely about hardening and usability improvements. In this presentation, Anshum will talk about what's new in Solr 5.0. He will cover usability improvements ranging from bash scripts to config APIs, scalability improvements like splitting of the clusterstate, and interesting new features like distributed IDF and additions to the Stats component that will be out with the upcoming release.

Presented by:
Anshum Gupta (http://www.linkedin.com/in/anshumgupta) of LucidWorks

Boston Apache Lucene and Solr Meetup
Basis Technology
One Alewife Center · Cambridge, MA