Apache Lucene: Then and Now By Doug Cutting

Name: Apache Lucene: Then and Now By Doug Cutting
Start: 2013-09-24T18:00:00-04:00
End: 2013-09-24T21:30:00-04:00
Location: Embassy Suites Washington D.C. - Convention Center

Hosted by Ed K. and 2 others

Hadoop-DC

Details

Doug Cutting originally wrote Lucene in 1997-98. It joined the Apache Software Foundation's Jakarta family of open-source Java products in September 2001 and became its own top-level Apache project in February 2005. Until recently it included a number of sub-projects, such as Lucene.NET, Mahout, Tika, Solr and Nutch. Solr has since merged into the Lucene project itself, while Mahout, Nutch, and Tika have moved out to become independent top-level projects. While suitable for any application that requires full-text indexing and searching, Lucene is widely recognized for its utility in implementing Internet search engines and local, single-site search.

At the core of Lucene's logical architecture is the idea of a document containing fields of text. This abstraction keeps Lucene's API independent of the file format. Text from PDFs, HTML, Microsoft Word, and OpenDocument documents, as well as many others (except images), can all be indexed as long as their textual information can be extracted.
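To make the "document containing fields of text" idea concrete, here is a minimal sketch in Python. It is not Lucene's API: the TinyIndex class and its add and search methods are hypothetical names invented for illustration, and the tokenizer is a simple lowercase word split. It only shows why an index built on extracted text does not need to know the original file format.

```python
from collections import defaultdict
import re


class TinyIndex:
    """Toy in-memory inverted index over documents made of named text fields.

    Hypothetical illustration only; this is not how Lucene is implemented.
    """

    def __init__(self):
        # (field name, term) -> set of ids of documents containing that term
        self.postings = defaultdict(set)
        self.docs = []

    def add(self, fields):
        """Index one document given as a dict of field name -> extracted text."""
        doc_id = len(self.docs)
        self.docs.append(fields)
        for name, text in fields.items():
            # Assumed tokenizer: lowercase, split on non-word characters.
            for term in re.findall(r"\w+", text.lower()):
                self.postings[(name, term)].add(doc_id)
        return doc_id

    def search(self, field, term):
        """Return sorted ids of documents whose given field contains the term."""
        return sorted(self.postings.get((field, term.lower()), set()))


index = TinyIndex()
# The source format no longer matters once the text has been extracted.
index.add({"title": "Quarterly report", "body": "Text extracted from a PDF file"})
index.add({"title": "Home page", "body": "Text extracted from an HTML page"})

print(index.search("body", "pdf"))        # [0]
print(index.search("body", "extracted"))  # [0, 1]
print(index.search("title", "pdf"))       # []
```

Both documents are indexed the same way even though one started as a PDF and the other as HTML; only the extraction step, which sits outside the index, would differ.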

In today’s discussion, Doug will share background on the impetus for Lucene and its creation. He will talk about the evolution of the project and explain what the core technology has enabled today. Doug will also share his thoughts on what the future holds for Lucene and Solr.

Schedule

6:00-6:30 - Networking

6:30-6:45 - Announcements

6:45-8:00 - Doug Cutting
