How we taught Google Translate to stop being sexist


Online translation tools have helped us learn new languages, communicate across language boundaries, and view foreign websites in our native language. But the artificial intelligence (AI) behind them is far from perfect, often reproducing rather than rejecting the prejudices that exist within a language or society.

These tools are particularly vulnerable to gender stereotypes because some languages (like English) tend not to gender their nouns, while others (like German) do. When translating from English to German, the translation tools need to decide which gender to assign to English words such as “cleaner”. Overwhelmingly, the tools conform to the stereotype, opting for the feminine word in German.

Prejudice is human: it is part of who we are. But when left unchallenged, prejudices can emerge in the form of concrete negative attitudes towards others. Now our team has found a way to retrain the AI behind translation tools, using targeted training to help them avoid gender stereotypes. Our method could be used in other areas of AI to help technology reject, rather than reproduce, prejudices within society.

Biased algorithms

Much to the dismay of their creators, AI algorithms often develop racist or sexist traits. Google Translate has been accused of gender stereotyping, with its translations assuming that all doctors are men and all nurses are women. Meanwhile, the GPT-3 AI language generator – which wrote an entire article for The Guardian in 2020 – has recently been shown to be alarmingly effective at producing harmful content and disinformation.

These AI failures are not necessarily the fault of their creators. Academics and activists recently drew attention to gender bias in the Oxford English Dictionary, where sexist synonyms for “wife” – such as “slut” or “maid” – show how even a catalog of words constantly revised and edited by academics may contain biases that reinforce stereotypes and perpetuate sexism in everyday life.

AI learns prejudice because it isn’t built in a vacuum: it learns to think and act by reading, analyzing, and categorizing existing data – like that contained in the Oxford English Dictionary. In the case of translation AI, we expose its algorithm to billions of words of text data and ask it to recognize and learn the patterns it detects. We call this process machine learning, and in the course of it the AI picks up patterns of bias alongside those of grammar and syntax.
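As a rough illustration of how bias can be absorbed from data, here is a toy sketch in Python. It is not how Google Translate or any real translation system works: the word pairs, their counts, and the train and translate functions are all invented for this example. The "model" simply picks whichever German word it has seen most often for an English word, so a skewed corpus yields a stereotyped translation.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of (English word, German translation) pairs.
# The words and counts are invented purely for illustration.
toy_corpus = (
    [("cleaner", "Putzfrau")] * 9    # feminine form, over-represented
    + [("cleaner", "Putzmann")] * 1  # masculine form, rare
    + [("doctor", "Arzt")] * 8       # masculine form, over-represented
    + [("doctor", "Ärztin")] * 2     # feminine form, rare
)


def train(pairs):
    """Count how often each translation appears for each source word."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    return counts


def translate(model, word):
    """Return the translation seen most often during training."""
    return model[word].most_common(1)[0][0]


model = train(toy_corpus)
print(translate(model, "cleaner"))  # the majority (feminine) form wins
print(translate(model, "doctor"))   # the majority (masculine) form wins
```

Nothing in this sketch was told to prefer one gender. The preference comes entirely from the imbalance in the examples, and in this toy setting it disappears once the counts are balanced.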

Ideally, the textual data that we show AI would not contain bias. But there is a continuing trend in the field to build larger systems trained on ever-increasing datasets. We are talking about hundreds of billions of words. These are obtained from the internet using indiscriminate text scraping tools such as Common Crawl and WebText2, which maraud the web, gobbling up every word they come across.

The sheer size of the resulting data makes it difficult for any human to know what is in it. But we do know that some of it comes from platforms like Reddit, which made headlines for presenting offensive, false, or conspiratorial information in user posts.
