Name: Speech to text, for low resource languages (Serbian use case)
Start: 2024-02-15T18:00:00+01:00
End: 2024-02-15T20:00:00+01:00
Location: Haos Community Space

**Speech to text, for low resource languages (Serbian use case)**

**Lecturer: [Andrija Sagić](https://rs.linkedin.com/in/andrija-sagic)**

Automatic speech recognition (ASR) for low-resource languages has long been a major challenge. When OpenAI released its ASR model Whisper, we gained an effective tool for this task. The release was somewhat overshadowed by the ChatGPT euphoria. Still, the growing focus on AI development and, more importantly, on its practical usability in different areas established Whisper as one of the best tools for ASR. Its support for low-resource languages was a huge improvement, but the resulting transcripts were still of low quality. The new challenge was to improve that support, and the community responded strongly, publishing fine-tuned models for various low-resource languages with improved accuracy.

This talk will introduce the Whisper ASR model, explain the process of fine-tuning it for Serbian, and present possible use cases. Problems that arise during fine-tuning for Serbian will also be highlighted and can serve as a good starting point for discussion.

**About the lecturer**

Graduated in Philosophy, specializing in contemporary epistemology. Certified Linux System Administrator and Audio ML expert. Was part of the Islandora 7 release team. Works on improving and adapting the ResCarta Toolkit, a FLOSS digitization workflow set. Co-chair of the EuropeanaTech Community, co-chair of the IIIF A/V Community, and Secretary of the IFLA Audio Video and Multimedia Section. As an early music presenter and explorer, he is a member of one of the oldest early music ensembles in Europe, "*Renaissance*" (founded in 1969).

Get in touch with Andrija: [https://rs.linkedin.com/in/andrija-sagic](https://rs.linkedin.com/in/andrija-sagic)

Goran S. Milovanovic

Data Science Club

