(Virtual) Hands-on Apache Tika tika-pipes workshop

Hosted by Tim Allison

Details

This will be a virtual, hands-on workshop to introduce the capabilities of the tika-pipes module.

The tika-pipes modules improve the robustness, network efficiency, and scalability of parsing with Tika. Developers specify a data source (local file share, S3, GCS) and a target (local file share, S3, Apache Solr, OpenSearch) in a tika-config.xml file. At parse time, they only have to send a path or key to tika-server, which fetches the bytes, safely parses the file, and emits the parsed data to the specified target.
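
As a rough illustration of "send a path/key to tika-server", the Python sketch below builds such a request without sending it. It rests on several assumptions that are not stated in this announcement: that the server exposes a /pipes endpoint accepting a JSON body with fetcher, fetchKey, emitter and emitKey fields (as in Tika 2.x; check the documentation for your version), that tika-server listens on its default port 9998, and that tika-config.xml defines a fetcher and an emitter under the hypothetical names "fsf" and "solr". The file path is also made up.

```python
import json
import urllib.request


def build_pipes_request(server_url, fetcher, fetch_key, emitter, emit_key):
    """Build (but do not send) a POST request asking tika-server to
    fetch one file, parse it, and emit the result.

    The /pipes path and the JSON field names are assumptions based on
    Tika 2.x and may differ in other versions.
    """
    payload = {
        "fetcher": fetcher,      # name of a fetcher defined in tika-config.xml
        "fetchKey": fetch_key,   # path/key of the file, relative to the fetcher
        "emitter": emitter,      # name of an emitter defined in tika-config.xml
        "emitKey": emit_key,     # key/id under which the parsed output is stored
    }
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/pipes",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical names: "fsf" (file system fetcher) and "solr" (Solr emitter).
request = build_pipes_request(
    "http://localhost:9998", "fsf", "docs/report.pdf", "solr", "docs/report.pdf"
)

# To actually send it against a running, configured tika-server:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Only the short key travels over the network from the client; the file bytes move between the data source, tika-server, and the target, which is where the network efficiency comes from.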

Attendees will learn how to configure tika-pipes to have a local tika-server read from a local file share and write the parsed output to Apache Solr and/or OpenSearch.

Attendees should be comfortable running tika-server with a configuration file. See the link below for prerequisites (still a work in progress).

https://cwiki.apache.org/confluence/display/TIKA/Apache+Tika+Meetups

Apache Tika Community (Virtual)