Name: (Virtual) Hands-on Apache Tika tika-eval workshop (Part 1)
Start: 2021-11-09T11:00:00-05:00
End: 2021-11-09T12:00:00-05:00

This virtual, hands-on workshop introduces the capabilities of the tika-eval module. It is designed for those interested in:
1) profiling files (digests, MIME types)
2) profiling text extracted from files (number of tokens, automatic language detection, out-of-vocabulary statistics/junk detection)
3) comparing text extracted by different text extractors

There will be a heavy emphasis on processing PDF files.

Attendees should be comfortable running tika-app from the command line or sending curl requests to a local tika-server. See the link below for prerequisites (still a work in progress).

https://cwiki.apache.org/confluence/display/TIKA/Apache+Tika+Meetups

Tim Allison

Apache Tika Community (Virtual)

Technology

New Technology

Web Technology

Digital Forensics

Text Analytics

Data Science

Data Analytics

Big Data

Open Source

Online event