Smarter Search, Big Data, and Machine Learning


Details
For our holiday meeting, we're fortunate to have Lucidworks' Chief Data Engineer Jake Mannix, who tells me he lives at the intersection of search, recommender systems, and applied machine learning, with an eye for horizontal scalability and distributed systems. He currently does research and development of data-driven applications on Lucene/Solr and Spark.
To create "smarter search" in today's search engines, we need to understand a) what our users are looking for, b) what kinds of questions they've generally been interested in in the past, c) what previous attempts to satisfy this search question have been, and d) what others have felt were successful answers to this (and related) questions. These questions are difficult in general, but with our users' collective behavior to guide us, an "80% solution" can be achieved with that most effective "machine learning model": counting. The next 80% relevance gain can be found in a combination of collaborative filtering recommender systems wired through your search engine, and simple supervised machine learning for query intent classification. If you really need to get the remaining 80% of relevance gains, you'll need to roll up your sleeves and do some NLP or knowledge graph mining, but first, let's see if you even need to get that far.
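To make the "counting" idea concrete, here is a minimal sketch of what such a model could look like: tally which documents users clicked for each query, then surface the most-clicked ones. The click log, the document ids, and the function names (`build_click_counts`, `top_documents`) are all hypothetical and chosen for illustration; this is not taken from the talk or from any Lucidworks or Solr API.

```python
from collections import Counter, defaultdict


def build_click_counts(click_log):
    """Count how often each document was clicked for each normalized query."""
    counts = defaultdict(Counter)
    for query, doc_id in click_log:
        # Normalize case and whitespace so trivially different queries share counts.
        counts[query.strip().lower()][doc_id] += 1
    return counts


def top_documents(counts, query, k=3):
    """Return up to k document ids for a query, most-clicked first."""
    per_query = counts.get(query.strip().lower(), Counter())
    return [doc_id for doc_id, _ in per_query.most_common(k)]


# Hypothetical click log of (query, clicked document id) pairs.
click_log = [
    ("solr faceting", "doc-12"),
    ("Solr faceting", "doc-12"),
    ("solr faceting", "doc-7"),
    ("spark streaming", "doc-3"),
]

counts = build_click_counts(click_log)
print(top_documents(counts, "solr faceting"))  # ['doc-12', 'doc-7']
```

In a real system, counts like these would more likely be fed to the search engine as a boosting signal than used to replace its ranking outright, but how that is wired up depends on the engine and is outside this sketch.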
Have a safe holiday this week and see you in December!
