Cross-Lingual Information Retrieval: Towards Text Multilinguality (Shadi Saleh)
Details
Abstract:
English is no longer the dominant language of the Internet: although 55% of online content is available in English, around 75% of the world’s population does not speak English.
Cross-Lingual Information Retrieval (CLIR) is the task of letting Internet users search for information (documents) available in a language that they do not speak, using a query (search question) written in their native language. This helps break a huge language barrier and facilitates access to information.
In this talk, Shadi will describe the methodologies used in CLIR systems, mainly the query-translation and document-translation approaches. We will see how machine translation is used today to move towards a multilingual Internet. The talk will also show how sensitive CLIR is in the medical domain, where people use the Internet to find health-related information that can be crucial to their well-being.
Speaker:
Dr. Shadi Saleh (Microsoft; PhD from UFAL MFF UK; founder of Arabic search engine Shamra)
Those who cannot attend in person can join us on Google Meet: http://meet.google.com/qas-xnan-pap
