Hard to Read: Recognizing Text with Neural Networks in Unconstrained Settings

Hosted By
Jason B.

Details

ATTENTION: Please note the change in location to the north side of APL!

Please join us in June as we learn how neural networks are being used to improve Optical Character Recognition.

Agenda

6:30 PM -- Networking & Food

7:00 PM -- Greetings

7:05 PM -- Multilingual Optical Character Recognition (OCR) in Unconstrained Image and Video - David Etter

9:00 PM -- Post event drinks at a nearby bar

Location

Kossiakoff Conference Center
11100 Johns Hopkins Rd,
Laurel, MD 20723, USA

https://goo.gl/maps/1HeLZU1bp2NMN8sD9

Directions

The Kossiakoff Conference Center is on the north side, on the left after turning onto Pond Road.

Parking

There is ample free parking near the building.

Food and Drinks

Complimentary food, such as pizza and chips, and non-alcoholic beverages will be provided.

Talks

Multilingual Optical Character Recognition (OCR) in Unconstrained Image and Video
Optical Character Recognition (OCR) is the task of detecting and recognizing text in images or video. While current OCR systems perform well on tasks such as scanned books or newspapers, these systems begin to break down in unconstrained settings. Unconstrained settings include video or images from maps, forms, web pages, and social media. This challenging setting is often multilingual and can include text over complex backgrounds, multiple fonts, lighting changes, and occlusions. In this talk we will discuss state-of-the-art neural network architectures for training models on the unconstrained OCR problem. We will also discuss approaches to synthetic training data generation that promise unlimited training data at zero annotation cost. The talk will include a detailed code walk-through of a PyTorch neural network solution to train and evaluate a multilingual OCR system.
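For attendees who want a feel for the evaluation side beforehand, here is a minimal sketch of character error rate (CER), a metric commonly used to score OCR output. It is not taken from the talk: the function names `edit_distance` and `character_error_rate` are illustrative assumptions, and the speaker's own evaluation code may differ.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # drop a reference character
                curr[j - 1] + 1,     # add a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Total edits divided by total reference characters (hypothetical helper)."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars
```

Because Python strings are sequences of Unicode code points, the same function works across scripts, although languages with combining marks may call for normalization first.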

Speakers

David Etter
David Etter is a Principal Machine Learning Scientist with over 24 years of experience researching and developing large-scale multimedia solutions for government and industry. His research experience includes Computer Vision, Natural Language Processing (NLP), and Large Scale Retrieval. David graduated from George Mason University in 2015 with a PhD in Computer Science; his dissertation focused on multimedia search and ranking. He is currently researching large-scale solutions for Face Recognition and Optical Character Recognition (OCR) in unconstrained video and images. David can be found on LinkedIn at https://www.linkedin.com/in/david-etter-3207665/

Data Works MD