Growing user generated content with content translation tools in Wikipedia


Details
Wikipedia has 287 language editions with almost 32 million articles contributed by editors from across the world. But it is a lop-sided world of user generated content. The top 10 languages account for almost 16 million articles. That’s 50%!
15% of Wikipedia editors edit multiple language editions of Wikipedia. What if an editor could use the best-written articles in English, German, Spanish and other languages as sources, select the target language they edit in, and get a suggested translation to bootstrap their editing effort? A tool-driven workflow could jumpstart and speed up article creation, and increase both the quantity and the quality of articles created by editors in smaller languages as well as long-tail languages.
This presentation will walk through Wikipedia’s ambitious technology project to help editors create high-quality content faster using Wikipedia’s Content Translation (CX) platform. The content translation tool will be enriched with multilingual data sources, machine translation (MT), translation memory, dictionaries, glossaries, and other linked data sources to help a Wikipedia editor create content faster, aided by recommendations from MT engines as well as other language resources. This talk will also cover the technical architecture, features and challenges in building such a platform.
Alolita Sharma is Director of Engineering for Internationalization and Localization at Wikipedia. She is driving the initiative for Wikipedia to build open source tools and technologies to support hundreds of languages.
Alolita Sharma is an engineering manager and software engineer who has been working with open source software and has promoted open source adoption for more than a decade. She is on the board of the Software Freedom Law Center and a passionate advocate of open source and the open Web.
She holds Bachelor’s and Master’s degrees in Computer Science and speaks internationally on the multilingual web, language technologies and standards, open source trends, women in technology and building successful developer communities.
Schedule:
6:30-7:00 Social time with snacks
7:00-8:00 Presentation and discussion
8:00-8:30 Social time
Disclosure policies: Salesforce requires all visitors to their office to sign a non-disclosure agreement in case you see or hear anything Salesforce-confidential while at their office. Note, however, that all information shared at the meet-up itself is considered public and may be used by anyone at the meet-up with no restrictions. Therefore, please do not share proprietary information or intellectual property that you or your organization would not want to become public knowledge.
The Salesforce office at 123 Mission Street (one of several Salesforce offices in that area – check the address!) is within easy walking distance of the Embarcadero BART and Muni subway station, the temporary Transbay Terminal with its plethora of bus lines, the Ferry Plaza with yet more bus lines, the Ferry Terminal, the Market Street streetcar, the California Street cable car, and Muni bus line 1. If you still want to bring a car, note that parking in the area is generally expensive, and many places close at 8pm. Reportedly there’s attended parking at Main and Folsom, which should be open until 10pm, and unattended parking north of Market Street.
