Published 13:46 IST, October 20th 2020
Facebook unveils AI translator for 100 languages without relying on English data
Facebook announced an MMT model that can directly translate between “100×100” language pairs in any direction without relying on English-centric data.
Facebook unveiled machine-learning software that can translate between any pair of 100 languages without relying on English. The multilingual machine translation (MMT) model, which the company calls the first of its kind, is open-source artificial intelligence software that trains directly on data from one language to another, without using English as an intermediary, which helps preserve meaning.
Facebook AI research assistant Angela Fan said in a blog post that advanced multilingual systems can process multiple languages but compromise on accuracy by relying on English data to bridge the gap between the source and target languages. Fan said the MMT model can directly translate between “100×100” language pairs in any direction without relying on English-centric data.
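The model's weights were released as open source. As a quick illustration, here is a minimal sketch of direct Chinese-to-French translation using the checkpoint Facebook later published to Hugging Face; the facebook/m2m100_418M name and the transformers API shown are from that later release, not from the blog post itself:

```python
# pip install transformers sentencepiece torch
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Smallest released checkpoint; larger ones follow the same API.
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

# Translate Chinese directly to French -- no English pivot step.
tokenizer.src_lang = "zh"
encoded = tokenizer("生活就像一盒巧克力。", return_tensors="pt")
generated = model.generate(
    **encoded,
    forced_bos_token_id=tokenizer.get_lang_id("fr"),  # force French output
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```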
“Our model directly trains on Chinese to French data to better preserve meaning. It outperforms English-centric systems by 10 points on the widely used BLEU metric for evaluating machine translations,” wrote Fan.
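BLEU measures n-gram overlap between a system's output and human reference translations, with higher scores indicating closer matches. A minimal sketch of computing it follows; the choice of the sacrebleu library is an assumption, as the article does not say which BLEU implementation Facebook used:

```python
# pip install sacrebleu
import sacrebleu

hypotheses = ["Le chat est assis sur le tapis."]    # system translations
references = [["Le chat est assis sur le tapis."]]  # one reference stream

# corpus_bleu takes the hypothesis list and a list of reference
# streams, each stream parallel to the hypotheses.
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.1f}")  # 100.0 for an exact match
```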
Fan further explained that Facebook built the “many-to-many” data set with 7.5 billion sentences across 100 languages. She said the tech giant used several scaling techniques to build a universal model with 15 billion parameters, which captures information from related languages and reflects a more diverse range of language scripts and morphology.
Bridge languages
Fan said that the team identified a small number of bridge languages, usually one to three major languages per group, to connect the languages of different groups. Giving the example of Hindi, Bengali, and Tamil as bridge languages for Indo-Aryan languages, she said the team mined parallel training data for all possible combinations of these bridge languages.
“Our training data set ended up with 7.5 billion parallel sentences of data, corresponding to 2,200 directions. Since the mined data can be used to train two directions of a given language pair...our mining strategy helps us effectively sparsely mine to best cover all 100×100 directions in one model,” she wrote.
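To make the sparse mining strategy concrete, here is a minimal sketch of the pair-enumeration idea: mine all pairs within a language group, but only bridge-language pairs across groups. The groups and bridge assignments below are illustrative stand-ins, not Facebook's actual clustering:

```python
from itertools import combinations

# Hypothetical grouping for illustration; the "indic" bridges mirror
# the Hindi/Bengali/Tamil example from the article.
groups = {
    "indic":   ["hi", "bn", "ta", "mr", "ur", "gu"],
    "romance": ["fr", "es", "it", "pt", "ro"],
    "sinitic": ["zh", "yue", "wuu"],
}
bridges = {"indic": ["hi", "bn", "ta"], "romance": ["fr", "es"], "sinitic": ["zh"]}

pairs = set()

# 1. Mine every pair *within* a group of related languages.
for langs in groups.values():
    pairs.update(combinations(sorted(langs), 2))

# 2. Across groups, mine only pairs of bridge languages.
all_bridges = sorted({b for bs in bridges.values() for b in bs})
pairs.update(combinations(all_bridges, 2))

# Each mined pair trains both directions, so one pair covers
# two of the 100x100 translation directions.
all_langs = sorted({l for ls in groups.values() for l in ls})
full = len(all_langs) * (len(all_langs) - 1) // 2
print(f"mined {len(pairs)} pairs instead of all {full}")
```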