Hard to Read: Recognizing Text with Neural Networks in Unconstrained Settings


Details
ATTENTION: Please note the change in location to the north side of APL!
Please join us in June as we learn how neural networks are being used to improve Optical Character Recognition.
Agenda
6:30 PM -- Networking & Food
7:00 PM -- Greetings
7:05 PM -- Multilingual Optical Character Recognition (OCR) in Unconstrained Image and Video - David Etter
9:00 PM -- Post event drinks at a nearby bar
Location
Kossiakoff Conference Center
11100 Johns Hopkins Rd,
Laurel, MD 20723, USA
https://goo.gl/maps/1HeLZU1bp2NMN8sD9
Directions
The Kossiakoff Conference Center is on the north side, on the left after turning onto Pond Road.
Parking
There is ample free parking near the building.
Food and Drinks
Complimentary food, such as pizza and chips, and non-alcoholic beverages will be provided.
Talks
Multilingual Optical Character Recognition (OCR) in Unconstrained Image and Video
Optical Character Recognition (OCR) is the task of detecting and recognizing text in images or video. While current OCR systems perform well on sources such as scanned books or newspapers, they begin to break down in unconstrained settings. Unconstrained settings include video or images from maps, forms, web pages, and social media. These challenging settings are often multilingual and can include text over complex backgrounds, multiple fonts, lighting changes, and occlusions. In this talk we will discuss state-of-the-art neural network architectures for training models on the unconstrained OCR problem. We will also discuss approaches to synthetic training data generation that promise unlimited training data at zero annotation cost. The talk will include a detailed code walk-through of a PyTorch neural network solution to train and evaluate a multilingual OCR system.
Speakers
David Etter
David Etter is a Principal Machine Learning Scientist with over 24 years of experience researching and developing large-scale multimedia solutions for government and industry. His research experience includes Computer Vision, Natural Language Processing (NLP), and Large Scale Retrieval. David graduated from George Mason University in 2015 with a PhD in Computer Science; his dissertation focused on multimedia search and ranking. He is currently researching large-scale solutions for Face Recognition and Optical Character Recognition (OCR) in unconstrained video and images. David can be found on LinkedIn at https://www.linkedin.com/in/david-etter-3207665/
