@Scale 2019: Reading text from visual content at scale
Vinaya presents the innovations in modeling, training infrastructure, deployment infrastructure, and efficiency that Facebook has made to build its state-of-the-art optical character recognition (OCR) system, which runs at Facebook scale. Billions of images and videos are posted on Facebook every day, and a significant percentage of them contain text. Understanding the text within visual content is important for providing people with better Facebook product experiences and for removing harmful content. Traditional OCR systems are not effective on the huge diversity of text in different languages, shapes, fonts, sizes, and styles. Beyond the complexity of understanding the text, scaling the system to handle high volumes of production traffic efficiently and in real time creates another set of engineering challenges.