Hard to Read: Recognizing Text with Neural Networks in Unconstrained Settings

Details

Please join us in June as we learn how neural networks are being used to improve Optical Character Recognition.

Agenda
-------------------------------------------------
6:30 PM -- Networking & Food

7:00 PM -- Greetings

7:05 PM -- Multilingual Optical Character Recognition (OCR) in Unconstrained Image and Video - David Etter

9:00 PM -- Post-event drinks at a nearby bar

Location
-------------------------------------------------
JHU APL
Building[masked] Johns Hopkins Rd
Laurel, MD 20723

Directions
-------------------------------------------------
Building 200 is on the South campus.

Parking
-------------------------------------------------
There is ample free parking near the building.

Food and Drinks
-------------------------------------------------
Complimentary food, such as pizza and chips, and non-alcoholic beverages will be provided.

Talks
-------------------------------------------------
Multilingual Optical Character Recognition (OCR) in Unconstrained Image and Video
Optical Character Recognition (OCR) is the task of detecting and recognizing text in images or video. While current OCR systems perform well on tasks such as scanned books or newspapers, they begin to break down in unconstrained settings. Unconstrained settings include video or images from maps, forms, web pages, and social media. These challenging settings are often multilingual and can include text over complex backgrounds, multiple fonts, lighting changes, and occlusions. In this talk, we will discuss state-of-the-art neural network architectures for training models on the unconstrained OCR problem. We will also discuss approaches to synthetic training data generation that promise unlimited training data at zero annotation cost. The talk will include a detailed code walk-through of a PyTorch neural network solution to train and evaluate a multilingual OCR system.

Speakers
-------------------------------------------------
David Etter
David Etter is a Principal Machine Learning Scientist with over 24 years of experience researching and developing large-scale multimedia solutions for government and industry. His research experience includes Computer Vision, Natural Language Processing (NLP), and Large-Scale Retrieval. David graduated from George Mason University in 2015 with a PhD in Computer Science; his dissertation focused on multimedia search and ranking. He is currently researching large-scale solutions for Face Recognition and Optical Character Recognition (OCR) in unconstrained video and images. David can be found on LinkedIn at https://www.linkedin.com/in/david-etter-3207665/