Extreme Localization: Translating Every Word in Every Language

What would it take to localize interfaces into thousands of dialects and standard languages? Or to let a speaker of Zulgo-Gemzek in Cameroon search for “mázlə̀rpə́pa” and retrieve an image tagged with “исленг” by a speaker of Tati in Azerbaijan? In other words, how could we translate instantly across 50 million language pairs?

A critical enabler would be a corpus of lexical translations between arbitrary pairs of languages. PanLex, a project of The Long Now Foundation in San Francisco, is building such a corpus out of thousands of bilingual dictionaries, glossaries, wordlists, thesauri, standards, and other lexical resources, paper and digital. It now documents about 22 million lexemes in about 10,000 languages and dialects. It can supply a billion attested translation pairs, plus 30 billion distance-2 (bridged) translation pairs.
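To make the numbers above concrete, here is a minimal sketch of what "attested" and "distance-2 (bridged)" translation pairs can mean, and of where a figure near 50 million language pairs comes from if one assumes about 10,000 language varieties. The data, the `(language code, text)` representation, and the function names `direct` and `bridged` are illustrative assumptions for this page, not PanLex's actual schema or API.

```python
from collections import defaultdict
from math import comb

# Unordered pairs among roughly 10,000 language varieties.
LANGUAGE_PAIRS = comb(10_000, 2)  # 49,995,000, i.e. about 50 million

# Toy attested translations (made-up sample, not PanLex data).
# Each expression is a (language code, text) tuple.
ATTESTED = [
    (("eng", "dog"), ("fra", "chien")),
    (("eng", "dog"), ("spa", "perro")),
    (("eng", "dog"), ("ita", "cane")),
    (("fra", "chien"), ("deu", "Hund")),
    (("spa", "perro"), ("deu", "Hund")),
    (("fra", "chien"), ("ita", "cane")),
]

# Treat translation as symmetric: build an undirected graph.
GRAPH = defaultdict(set)
for a, b in ATTESTED:
    GRAPH[a].add(b)
    GRAPH[b].add(a)


def direct(expr, target_lang):
    """Texts in target_lang attested as translations of expr (distance 1)."""
    return {text for lang, text in GRAPH[expr] if lang == target_lang}


def bridged(expr, target_lang):
    """Distance-2 translations: reachable through one pivot expression.

    Returns {text: set of pivot expressions}. Directly attested
    translations are left out, since they are already distance 1.
    """
    known = direct(expr, target_lang)
    found = defaultdict(set)
    for pivot in GRAPH[expr]:
        for lang, text in GRAPH[pivot]:
            if lang == target_lang and text not in known and (lang, text) != expr:
                found[text].add(pivot)
    return dict(found)


if __name__ == "__main__":
    print(LANGUAGE_PAIRS)
    print(direct(("eng", "dog"), "deu"))
    print(bridged(("eng", "dog"), "deu"))
```

In this toy graph no English-German pair is attested directly, but "Hund" is reachable from "dog" through two independent pivots (French and Spanish). Counting pivots is one simple way to rank bridged candidates, since a single pivot with several senses can bridge unrelated words.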

The PanLex team will show how it acquires resources, extracts data from them, and provides access to the data. Decisions on database design, operationalizing “word” and “language”, text encoding, polysemy and ambiguity, attribution and provenance, and API and human-interface design will be discussed. You will learn how you can help in the effort.

David Kamholz is PanLex’s Lexical Data Specialist. He has a Ph.D. in linguistics from the University of California, Berkeley. His research focuses on Austronesian languages, computational lexicography, historical linguistics, and language typology.

Alexander DelPriore, Gary Krug, and Benjamin Yang are Source Analysts, and Julie Anderson is a Source Acquisition Specialist, in the PanLex project. DelPriore is completing an MLIS at Rutgers with an emphasis in digital libraries and has worked on electronic publishing and cataloging. Krug is a computational linguist who has worked as a programmer and technical support engineer. Yang has worked as a linguistic engineer and voice interface designer. Anderson has an MA in linguistics from the University of Hawaii and has worked on language and software documentation and nonprofit management.

Jonathan Pool directs the PanLex project. He has taught at SUNY/Stony Brook and the University of Washington in Seattle and has published on the politics and economics of language and on artificial and controlled languages.

Schedule:

6:30-7:00 Social time with snacks 
7:00-8:00 Presentation and discussion 
8:00-8:30 Social time

Disclosure policies: Adobe requires all visitors to its office to sign a non-disclosure agreement in case they see or hear anything Adobe-confidential while there. Note, however, that all information shared at the meet-up itself is considered public and may be used by anyone at the meet-up without restriction. Therefore, please do not share proprietary information or intellectual property that you or your organization would not want to become public knowledge.

This meeting is in a branch office of Adobe Systems that is only one block from the SF Caltrain station, with all its rail and bus lines, and two blocks closer to it than Adobe's main SF office.

  • valeria b.

    Great project!

    November 3, 2015

  • valeria b.

    Loved the PanLex project. Thank you for your presentation and thank you Janice for the organization!

    November 3, 2015

  • Norbert L.

    Slides for the presentation are available at http://panlex.org/pubs/etc/20151102-sfg-all-final.pdf

    November 3, 2015

  • Jose

    Very interesting project. Thanks to Jonathan's team for sharing their knowledge.

    November 3, 2015

  • Jonathan P.

    Audience was engaged and asked informed questions.

    November 2, 2015

  • Merle T.

    This was a very worthwhile meeting. PanLex has very aggressive goals, but they are structuring their work intelligently. They have good results already, but they promise to deliver much more in the mid and long term.

    November 2, 2015

  • Janice

    The meeting room is not equipped for live streaming. We will attempt to record, although this room is not set up for optimal recording.

    November 2, 2015

  • gwen A.

    recording, livestream

    October 31, 2015
