Tech Talk: Natural Language Processing - OCR/Text Extraction & Data Extraction
This meetup is for intermediate to advanced machine learning engineers (MLEs). We will cover how to use an NLP framework for text/OCR extraction from documents and images, text preparation, and data extraction. The code-along will be in Python.
This should be useful if you work with, or plan to work with, thousands to millions of PDF documents, camera-captured images containing text, or facsimile messages.
Source code and installation instructions for the NLP framework: https://github.com/andrewferlitsch/epipog-nlp
About the Speaker: Andrew Ferlitsch has a master's degree in Artificial Intelligence and is a former Principal Research Scientist for Sharp Corporation, with 115 issued US patents. Andrew was ranked #1 continuously for two years in the StackOverflow category for Open Data (Data Engineering). He is the founder of the shareware product Winnex and the open source project OpenGeoCode. Andrew has run a number of volunteer organizations in Portland, including a local tech incubator (NW Startups) and the Portland Data Science Group, and has overseen the running of an internationally based NGO in Yemen. Currently, Andrew is freelancing as a Data Scientist/Consultant, specializing in Natural Language Processing and Computer Vision in the Health and Life Sciences.
12:45 p.m. Pizza and Refreshments
1:00 p.m. Tech Talk starts
Please RSVP to help us ensure we order enough food for everyone.
We look forward to seeing you!