Services like Google Translate support only 100 languages, give or take. What about the thousands of other languages spoken by people just as vulnerable to this crisis? From a report: If we want to avoid a pandemic spreading to all the humans in the world, this information also has to reach all the humans of the world -- and that means translating Covid PSAs into as many languages as possible, in ways that are accurate and culturally appropriate. It's easy to overlook how important language is for health if you're on the English-speaking internet, where "is this headache actually something to worry about?" is only a quick Wikipedia article or WebMD search away. For over half of the world's population, people can't expect to Google their symptoms, nor even necessarily get a pamphlet from their doctor explaining their diagnosis, because it's not available in a language they can understand.
[...] In a pandemic, the challenge isn't just translating one or a handful of primary languages in a single region -- it's on a scale of perhaps thousands of languages, at least 1,000 to 2,000 of the 7,000-plus languages that exist in the world today, according to the pooled estimates of the experts I spoke with, all of whom emphasized that this number was very uncertain but definitely the largest number they'd ever faced at once. Machine translation might be able to help in some circumstances, but it needs to be approached with caution. [...] That's not to say that machine translation isn't helpful for some tasks, where getting the gist quickly is more important than the nuanced translations humans excel at, such as quickly sorting and triaging requests for help as they come in or keeping an eye on whether a new misconception is bubbling up. But humans need to be kept in the loop, and both human and machine language expertise needs to be invested in during calmer times so that it can be used effectively in a crisis.
The bigger issue with machine translation is that it's not even an option for many of the languages involved. Translators Without Borders is translating Covid information into 89 languages, responding to specific requests of on-the-ground organizations, and 25 of them (more than a quarter) aren't in Google Translate at all. Machine translation disproportionately works for languages with lots of resources, with things like news sites and dictionaries that can be used as training data. Sometimes, like with French and Spanish, the well-resourced languages of former colonial powers also work as lingua francas for translation purposes. In other cases, there's a mismatch between what's easy to translate by machine and what's useful to TWB: The group has been fielding lots of requests for Covid info in Kanuri, Dari, and Tigrinya, none of which are in Google Translate, but hasn't seen any for Dutch or Hebrew (which are in Google Translate but don't need TWB's help -- they have national governments already producing their own materials). Google Translate supports 109 languages, Bing Translate has 71, and even Wikipedia exists in only 309 languages -- figures that pale in comparison to the 500-plus languages on the list from the Endangered Languages Project, all human-created resources.