Facebook researchers have introduced a model that can recognize words in 51 languages. In preliminary tests the tool showed record accuracy, and the researchers expect this figure to improve with further training.
Facebook researchers have unveiled the largest automatic speech recognition (ASR) model. It learned to understand 51 languages after being trained on 16 thousand hours of voice recordings. In a paper published on arXiv.org, the authors argue that the system, which contains about a billion parameters, improves speech recognition performance by up to 28.8%.
Before loading the training data, the scientists divided the 51 languages into separate groups and then selected a vocabulary of 10 thousand units for each language group. After that, they manually merged some of the smaller language groups until only 6 remained. This sped up training of the model several times over.
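The merging step can be illustrated with a minimal sketch. The article says the researchers combined small groups by hand, so the greedy rule below (always merge the two smallest groups) is a stand-in for illustration, not their procedure. The function name `merge_small_groups`, the group names, and the hour counts are all hypothetical.

```python
import heapq


def merge_small_groups(group_hours, target=6):
    """Greedily merge the two smallest groups until `target` groups remain.

    `group_hours` maps a group name to its amount of training audio (hours).
    This greedy rule is an assumption; the article describes a manual merge.
    """
    if target < 1:
        raise ValueError("target must be at least 1")
    # Min-heap keyed on hours, so the smallest groups are popped first.
    heap = [(hours, name) for name, hours in group_hours.items()]
    heapq.heapify(heap)
    while len(heap) > target:
        hours_a, name_a = heapq.heappop(heap)
        hours_b, name_b = heapq.heappop(heap)
        heapq.heappush(heap, (hours_a + hours_b, name_a + "+" + name_b))
    return {name: hours for hours, name in heap}


# Hypothetical data: eight language groups with made-up hour counts.
groups = {"g1": 10, "g2": 20, "g3": 300, "g4": 400,
          "g5": 500, "g6": 600, "g7": 700, "g8": 800}
merged = merge_small_groups(groups, target=6)
```

With this made-up input, the three smallest groups end up in one merged group and the total number of hours is unchanged.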
“As far as we know, this is the first work that studies multilingual systems on a massive scale. We got a unified speech recognition architecture for 51 languages, which does not require a lot of resources,” Facebook noted.
The researchers report that, across several experiments, the most effective version of their model improved recognition performance by 28.75%. This figure is several times higher than that of comparable systems and is expected to improve with further training.
In the paper, the scientists also noted that they will soon publish a second version of the system. It has become simpler and achieves the desired results in just 10 minutes. It was trained on 53 thousand hours of “raw” material.