Google plans to support the world’s 1,000 most spoken languages with a single, giant artificial-intelligence language model.


The Google Expansion after the ‘King of Low Resource Languages,’ or Why We’re All So Close to It, and Why We Need It

Recent advances in machine learning have placed new emphasis on the development of powerful, multi-functional “large language models,” or LLMs, which have been at the heart of Google’s products.

Language models have already been integrated into products like Google Search, even as Google fends off criticism of these systems’ flaws. Those flaws include a tendency to regurgitate harmful societal biases like racism and xenophobia, and an inability to parse language with human sensitivity. Google has also drawn criticism for firing researchers after they published papers outlining these problems.

Speaking to The Verge, Zoubin Ghahramani, vice president of research at Google AI, said the company believes that creating a model of this size will make it easier to bring various AI functionalities to languages that are poorly represented in online spaces and AI training datasets (also known as “low-resource languages”).

“By having a single model that is exposed to and trained on many different languages, we get much better performance on our low resource languages,” says Ghahramani. “The way we get to 1,000 languages is not by building 1,000 different models. Languages have a lot of similarities and have evolved from one another. And we can find some pretty spectacular advances in what we call zero-shot learning when we incorporate data from a new language into our 1,000 language model and get the ability to translate [what it’s learned] from a high-resource language to a low-resource language.”
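Google’s 1,000-language model isn’t something outsiders can run, but the underlying idea — one multilingual model serving many languages, including low-resource ones — can be illustrated with openly available systems. The sketch below is a stand-in, not Google’s model: it assumes the Hugging Face `transformers` library and uses Meta’s open NLLB-200 checkpoint to translate into Yoruba, a comparatively low-resource language.

```python
# Minimal sketch: one multilingual model translating into a comparatively
# low-resource language. NLLB-200 is used here as a stand-in for the kind of
# single multi-language model described above; this is NOT Google's system.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "Access to information should not depend on the language you speak."
inputs = tokenizer(text, return_tensors="pt")

# Force the decoder to start generating in Yoruba ("yor_Latn"); the same
# checkpoint handles hundreds of other target languages.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("yor_Latn"),
    max_length=64,
)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```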

Access to data is a problem when training across so many languages, though, and Google says that in order to support work on the 1,000-language model it will be funding the collection of data for low-resource languages, including audio recordings and written texts.

The company hasn’t yet decided where the model will be applied, saying only that it expects it to show up in more than one product.

“One of the really interesting things about large language models and language research in general is that they can do lots and lots of different tasks,” says Ghahramani. “The same language model can turn commands for a robot into code; it can solve maths problems; it can do translation. The really interesting things about language models is they’re becoming repositories of a lot of knowledge, and by probing them in different ways you can get to different bits of useful functionality.”

The failure to handle negation is one of the known vulnerabilities of LLMs. Allyson Ettinger, for example, demonstrated this years ago with a simple study. When asked to complete a short sentence, a model would answer 100 percent correctly for affirmative statements (e.g., “a robin is…”) and 100 percent incorrectly for negative statements (e.g., “a robin is not…”). In fact, it became clear that the models could not actually distinguish between the two scenarios, providing exactly the same completions (nouns such as “bird”) in both cases. This remains an issue today: as models get bigger and more complex, this aspect of their linguistic ability does not improve. Such errors reflect broader concerns raised by linguists about the extent to which these artificial language models operate via a trick mirror, learning the form of what the English language looks like without possessing any of the inherent linguistic capabilities that would demonstrate actual understanding.
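Ettinger’s exact stimuli and setup aren’t reproduced here, but the general shape of such a cloze probe is easy to sketch with a publicly available masked language model. The snippet below uses `bert-base-uncased` via the Hugging Face `transformers` pipeline purely as an illustration; the model choice and prompts are assumptions, not the original study.

```python
# Illustrative cloze probe in the spirit of the negation study described above.
# bert-base-uncased and these prompts are stand-ins, not Ettinger's original setup.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

for prompt in ["A robin is a [MASK].", "A robin is not a [MASK]."]:
    top = fill(prompt)[0]  # highest-scoring completion for the masked slot
    print(f"{prompt:32s} -> {top['token_str']} ({top['score']:.2f})")

# Typically the model proposes much the same completions (e.g. "bird") for
# both the affirmative and the negated sentence, illustrating how poorly
# negation is handled.
```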

The creators of such models admit the difficulty of addressing inappropriate responses that do not accurately reflect the contents of authoritative external sources. Among the examples: a generated text on how crushed porcelain added to breast milk can help a baby’s digestive system, and a “scientific paper” on the benefits of eating crushed glass. Stack Overflow, in fact, had to temporarily ban ChatGPT-generated answers after it became evident that the LLM produces convincingly wrong answers to coding questions.

There is still plenty of blame and praise to go around in response to this work. Model builders and tech evangelists alike attribute impressive and seemingly flawless output to a mythically autonomous model, a technological marvel. The human decision-making involved in model development is erased, and the model’s feats are treated as independent of the design and implementation choices of its engineers. But without naming and recognizing the engineering choices that contribute to the outcomes of these models, it becomes almost impossible to acknowledge the related responsibilities. As a result, both functional failures and discriminatory outcomes are framed as devoid of engineering choices, blamed on society at large or on supposedly “naturally occurring” datasets, factors that those developing these models claim they have little control over. But they do have control over the models we are seeing right now, and none of them are inevitable. It would have been entirely feasible for different choices to have been made, resulting in an entirely different model being developed and released.