• SolrCloud Autoscaling


    Join us for an evening of networking and Solr Autoscaling discussions at Reddit HQ! Apache Solr Committer Varun Thacker from Lucidworks will share some of the goals of Solr's Autoscaling framework, and Reddit Software Engineer Jerry Bao will share how the Reddit team continues to scale its search infrastructure using features built on Solr. Food & drinks will be provided. Hope to see you there!

    --Talks--

    --Solr's AutoScaling Framework, Presenter: Varun Thacker, Solr Committer, Lucidworks--

    The goal of Solr's AutoScaling framework is for search clusters to be able to grow to a trillion documents without much human intervention. We'll discuss practical use-cases to keep the cluster healthy and performing optimally, complete with fault tolerance. For example, we'll cover how to achieve these scenarios by utilizing the framework:

    - Effectively managing disk space by setting triggers and sending out alerts.
    - Maintaining a minimum replication factor when nodes go down.

    We'll also use rules to make sure the replicas are spread out, thus maximizing fault tolerance. The talk will cover how to use the suggestions end-point to find the violations in your cluster.

    --Search Infra @ Reddit: Challenges in Scaling to Millions of Cat Posts, Presenter: Jerry Bao, Software Engineer, Reddit--

    As Reddit continues to grow year after year, Search has become a vital part of the Reddit experience, both internally and externally. Come learn about the tools we’ve used, challenges we’ve faced, and lessons we’ve learned rebuilding and scaling our search infrastructure to the next million cat posts.

  • Fusion Day San Francisco with Reddit and Uber

    The W Hotel

    ***Please register at the following page to request your spot at this event*** http://www.cvent.com/d/sgqd1g

    Lucidworks is excited to bring you Fusion Day San Francisco, a half-day seminar to learn how companies like Reddit and Uber are innovating more, enabling employees to work smarter, and connecting users to the content they love through personalized digital experiences. Join us on Wednesday, August 8 to learn how to forge data and behavior to connect people through common interests and ideas at work and at play. Breakfast & Lunch will be provided.

    Featured Speakers:
    - Senior Engineering Manager at Reddit
    - Engineering Manager at Uber
    - Director of Fusion Product at Lucidworks
    - CEO of Lucidworks

    Date: Wednesday, August 8, 2018
    Time: 8:00am - 1:30pm
    Location: The W Hotel, 181 3rd St · San Francisco, CA

    Important: Registration for this event is NOT through Meetup. Those interested in attending must register via the event page here: http://www.cvent.com/d/sgqd1g

  • Solr & Machine Learning at Target

    Target Sunnyvale Office

    Join us on Tuesday, January 30th at Target's Sunnyvale Office for an evening of networking, food, and Solr discussions from Target and Lucidworks focused on Learning to Rank and other Machine Learning techniques.

    Agenda--
    6:00pm - 6:30pm: Food, Drinks, Networking
    6:30pm - 7:15pm: Talk #1
    7:15pm - 8:00pm: Talk #2

    Talks--
    Talk #1: Applying Newness Signal to Solr LTR for Improving Search Ranking
    Speakers:
    - Sunil Srinivasan, Lead Search Engineer, Target
    - Satheesh Akkinepally, Senior Search Engineering Manager, Target
    - Sunny Li, Lead Search Product Owner, Target

    Talk #2: AI with Solr and Lucidworks Fusion
    Summary: Learn how to improve and enhance search results on your website using the most common machine learning techniques driven on Solr. We will demonstrate the out of the box capabilities that Lucidworks Fusion offers to help implement these AI techniques.
    Speaker: Lasya Marla, Director of Product for Fusion, Lucidworks

  • Fusion Day San Francisco

    Hotel Nikko

    Lucidworks Fusion is built with the power of Apache Solr & Apache Spark and provides everything you need to build and deploy intelligent search applications that wow your customers and empower your employees. Join us for Fusion Day in San Francisco to see first-hand how Fusion can help you develop powerful search apps and cut months off your development cycle. Stick around after lunch for a hands-on Lucidworks Fusion training that will help you become a Fusion pro!

    Important: Registration for this event is NOT through Meetup. Those interested in attending must register via the event page here: http://www.cvent.com/d/b5qdkn

  • Cloudera Solr Meetup


    Please RSVP here: https://www.eventbrite.com/e/solr-meetup-at-cloudera-tickets-36015248578?ref=estw

    Join us at Cloudera in Palo Alto for an evening of networking and Solr talks. Food & drinks will be provided.

    Talk #1: Solr performance and scalability effort update (lightning talk)
    Summary: As users put more and more workloads onto Solr, performance and scalability become even more critical. In this talk, we will cover our high-level strategy for working with the community to improve them, including a new microbenchmark framework, and how you can use it to improve the performance of your own system.
    Speaker: Michael Sun has been working on performance engineering and system fundamentals for several years. Before joining Cloudera, he helped improve performance and scalability for Microsoft HyperV, Salesforce and Workday.

    Talk #2: Solr consistency and recovery internals
    Summary: How does SolrCloud ensure that replicated data remains consistent? How does Solr avoid data loss when hardware inevitably fails? In this talk, we will cover how Solr addresses failures and what recovery steps the cluster can automatically perform.
    Speaker: Mano Kovacs has been an application developer for more than 15 years. In recent years he has focused on distributed, large-scale services. He worked on an IoT platform before joining the Search/Solr team at Cloudera in 2016.

  • 2017 Kickoff: Cloudera Lightning Talks


    Join us to kick off the new year at Cloudera's HQ in Palo Alto. Don't miss an evening of food, drinks, and a set of lightning talks from the below Solr experts and committers. See you there!

    Topic #1: Backup and Disaster Recovery for Solr
    Speaker: Hrishikesh Gadre

    Topic #2: Analyzing Large Solr Log Files
    Speaker: Mark Miller

    Topic #3: The SolrCloud Recovery Process
    Speaker: Mike Drob

    Topic #4: Graph Traversal and Streaming Expressions
    Speaker: Yonik Seeley

    Hrishikesh Gadre is a software engineer at Cloudera working on Cloudera Search. Prior to Cloudera, Hrishikesh worked for virtualization giant VMware for more than three years building a next-generation network/security virtualization platform. He has a master’s degree in Computer Engineering from Rutgers University, New Jersey, specializing in large-scale distributed systems.

    Mark Miller is a longtime Lucene / Solr committer and Apache member. After starting with Lucene in 2006, Mark has spent most of his time getting paid to work on the open source software projects that he loves. He has given many talks on Lucene/Solr at various conferences and meet-ups around the world and is currently learning all about Hadoop as a software engineer at Cloudera.

    Mike Drob has been immersed in Big Data for over 5 years, previously with the US Government and now with Cloudera. His current role is to provide operational support for Apache Solr, a world-class search engine built on top of Apache Lucene. He is also a hobbyist contributor to several other open source projects including Apache Curator, Apache Accumulo, Apache HTrace-incubating, JUnit, JLine, and JCommander. When not coding, he likes to mentor middle school students in robotics, take his dog running along the Houston bayous, and tend his tragically low-yield vegetable garden.

    Yonik Seeley is the creator of Solr. He works at Cloudera integrating and leveraging "Big Search" technologies into the many components comprising the Cloudera enterprise data hub (EDH). Yonik was previously a co-founder of LucidWorks, and he holds a master's degree from Stanford University.

  • SolrCloud Rebalance API at Bloomreach

    BloomReach Headquarters

    Join us for a Meetup at Bloomreach for the below presentation. Food & drinks will be provided.

    6:00pm - 6:30pm: Food, drinks, networking
    6:30pm: Presentation

    SolrCloud Rebalance API at Bloomreach: Presented by Suruchi Shah and Nitin Sharma, Bloomreach

    SolrCloud deployments with large data sets and collections usually run into unevenly balanced clusters. This causes skewed data and replica distribution across SolrCloud nodes. Automatic Node Discovery in an existing cluster does not cause data to be redistributed. Dynamically scaling collections up or down based on index/config size or cluster size is nontrivial and adds operational overhead.

    The Rebalance API offers a flexible way of redistributing data in SolrCloud while guaranteeing zero downtime. It offers multiple scaling strategies that aid in smarter/faster index manipulation and multiple allocation strategies that offer smarter collection placement in the cluster. It also offers a platform for users to write their own scaling strategy as they see fit. The API has been built in an open-source-friendly fashion.

  • "R for Solr", "Blacklight/GeoBlacklight for Discovery & Spatial Search in Solr"

    Lathrop Library at Stanford University

    Parking and venue map can be found here: http://bit.ly/1eHsJEP

    Join us for an evening of networking, food & drinks, and the below Solr talks from the Stanford Library team.

    6:00pm - 6:30pm: Networking, food & drinks
    6:30pm - 8:00pm: Presentations

    Blacklight: A discovery platform framework: Presented by Chris Beer and Jessie Keck, Stanford Library

    Libraries, archives and museums are custodians of cultural heritage material in a wide variety of formats, metadata standards, and user interaction models. In this presentation, we will discuss how the Stanford University Libraries, along with many other libraries and cultural heritage institutions, use Blacklight [1], an open source discovery platform for Ruby on Rails that helps institutions around the world provide access to their knowledge repositories and assets using Apache Solr.

    [1] projectblacklight.org

    Speaker Bios: Coming soon

    Spatial search in Solr; GeoBlacklight and Solr Facet Heatmaps: Presented by Jack Reed, Stanford Library

    This presentation will talk about spatial search capabilities of Apache Solr and how they are used at Stanford University. GeoBlacklight, an open source spatial discovery application, will be discussed focusing on spatial search and discovery problems. In addition, a fairly new feature in Solr 5, Facet Heatmaps, will be explored and put to task using Leaflet.js and the GeoNames.org corpus.

    Speaker Bio: Jack works on increasing access to geospatial data at Stanford University Libraries. A contributor to open-source software, Jack is active in the GIS, library, and open-data communities. He also serves on the executive committee of The International Association for Geoscience Diversity.

    R client for Solr: Presented by Scott Chamberlain, rOpenSci/UC Berkeley

    Solr is a widely used search engine tool. As such, the R community needs a client to talk to Solr - to simplify searching Solr installations, and even managing configurations, cores, collections, and documents. This talk will go over the R client in development, and run through a couple of use cases that should provide good reasons to interact with Solr from R.

    Speaker Bio: Scott co-founded rOpenSci (http://ropensci.org, housed within UC Berkeley), a developer collective building R tools for open and reproducible science. Scott was an ecologist in a former life, but now makes open source software, mostly in R, and dabbles in Python and Ruby.

  • Solr & Cassandra: Searching Cassandra with DataStax enterprise

    Join us for an evening of networking, food & refreshments, and the below talk from DataStax about search in Cassandra. Hope to see you there!

    Searching Cassandra with DataStax enterprise, presented by Rachel Pedreschi, Lead Technical Evangelist for DataStax Enterprise and Data Science at DataStax

    Wait! Back away from the Cassandra secondary index. It’s ok for some use cases, but it’s not an easy button. “But I need to search through a bunch of columns to look for the data… and I can’t model that in C*, even after watching all of Patrick McFadin's data modeling videos. What do I do?” The answer, dear developer, is in DSE Search. With its easy Solr API, Lucene indexes (and fault tolerance) you can search data stored in your Cassandra database to your heart’s content. Take my hand. I will show you how.

    Speaker bio: Rachel is a Lead Technical Evangelist at DataStax. A "Big Data Geek-ette," Rachel is no stranger to the world of high performance databases and data warehouses. She is a Vertica, Informix and Redbrick certified DBA on top of her work with Cassandra and has more than 15 years of business intelligence and ETL tool experience. Rachel has an MBA from San Francisco State University and a BA in Mathematics from University of California, Santa Cruz. She loves collecting new interesting experiences, including being on a game show when she was a kid, floating down underground rivers in an inner tube, and scuba diving with lemon sharks. Her current new experience is juggling a 2-year-old and a 5-year-old while trying to stay sane.

  • Building Search @ Airbnb & Security in Solr

    Please note, you must RSVP through the Airbnb event page (https://www.airbnb.com/meetups/852mxnezc-building-search-airbnb-security-in-solr) for this Meetup.

    *Check in will start at 6pm and promptly end at 6:45pm in order to start the program. An RSVP per person is required to attend. We will not be able to accommodate any walk-ins.

    Join us for an evening of networking, food & refreshments, and two exciting Lucene and Solr presentations from Airbnb & Lucidworks. Hope to see you there!

    Building Search At Airbnb using Lucene, Presented by Mousom Dhar Gupta, Airbnb

    Airbnb has experienced exponential growth in recent years. Not only have our data and number of users been growing exponentially, but so has the complexity of our search algorithms as we've wanted to do more to better match our Hosts with our Guests. Our team has learned to mix the best features of Lucene with in-house, custom-built technologies to build a scalable, efficient but simple Search infrastructure, all of which will be shared in this talk.

    Speaker Bio: Mousom Dhar Gupta has been an engineer on Airbnb's Search and Marketplace Infrastructure team since last year. He focuses on building and scaling Search and other services like Calendar, Social Connections, etc. Before Airbnb, he was at Facebook, Netflix and Amazon working on various product backends (Java) like Facebook Messenger, Amazon Kindle, etc.

    Introduction to Security in Solr, Presented by Anshum Gupta, Lucidworks

    Apache Solr has evolved into a highly scalable system, capable of handling a lot of data and a high number of queries, but only recently was a mechanism to secure access in Solr provided. Apache Solr 5.2 shipped with pluggable authentication and authorization modules. These modules enable users to write their own plugins for managing security in Solr. This talk will give an overview of both the authentication and authorization frameworks and how they work together within Solr. It will also provide an overview of existing plugins and how to enable them to restrict user access to resources within Solr.

    Speaker Bio: Anshum Gupta is a committer and PMC member on the Apache Lucene / Solr project with over 10 years of experience with search and related technologies. He currently works at Lucidworks and spends most of his time working on SolrCloud, i.e. the distributed feature set of Apache Solr. Prior to joining Lucidworks, he was a member of the team that developed and launched AWS CloudSearch - The AWS search as a service offering. He was also a key contributor in the search teams at various startups.