Creating a universal language

According to linguist and political thinker Noam Chomsky, “A language is not just words. It’s a culture, a tradition, a unification of a community, a whole history that creates what a community is. It’s all embodied in a language.”

So can we have a universal, global language unifying and embodying all of us? Given the diversity of human life, is that even possible? Proto-Indo-European may have come closest, and then Hellenistic Koine, then Latin. What about CHTML or Python? After all, computers talk to each other in 1s and 0s regardless of the language used to program them, and they span the globe. It seems that wherever languages are used, the desire for some form of universal language is identified as a means of circumventing the one-to-one translation process. The idea of a bridge (a koine language) connecting a number of languages, understandable to a large population, does indeed have a strong appeal, especially when a goal of globalism is real-time, multilingual communication. Does a universal language make sense in today’s network-connected world?

Language has many functions. We do not have a universal means of communicating with each other quite simply because we do not have a universal topic to discuss — we have millions. This is excellent news for translators and localizers. Perhaps not such good news for those hoping that computer-assisted translation will be a magic bullet for cross-cultural communication. Yet the idea persists and with a growing appreciation of what characterizes a global community, it is still an idea under investigation.

As Latin fell into decline in the post-Renaissance world of European letters, many thinkers sought to replace its ability to express all manner of subjects in a widely understood form. Mathematicians René Descartes and Gottfried Leibniz attempted to formulate a means of constructing a language capable of expressing conceptual thoughts. In England, John Wilkins, among others, sought to facilitate trade and communication between international scholars using a system of “real characters,” symbols that constituted a lingua franca.

In 2001, Professor Abram de Swaan of the University of Amsterdam described how power and languages are connected in the global community in his book Words of the World: The Global Language System. His accomplishments as a social scientist enabled him to detail how a multilingual world can also be described in hierarchical terms that expose the uneven field upon which languages compete for dominance. In his model, English occupies the “hypercentral” position, whereas other languages exist more diffusely from central to peripheral positions. In the translation community, we work in this arena on a daily basis. There have been critics of de Swaan’s ideas in the academic community, but the work has been highly influential in furthering our understanding of how communication can facilitate human affairs globally.

Theoretical approaches to specifying how a universal language works are essential to understanding how the global, multilingual community might operate using a single or dominant language. But how might this work in practice? As mentioned above, thinkers in the 17th century were interested in using signs and symbols to communicate. This idea is still being explored, notably in the unlikely-sounding Lovers Communication System (LoCoS) devised by Yukio Ota, Professor of Design at Tama Art University in Japan. Ota is world-famous for his design of the green running man used to mark exit doors in millions of buildings. Given the proliferating use of emojis and their incorporation into Unicode, continued interest in pictographic communication is hardly surprising. But their use thus far has been largely confined to text messages and websites. They do, however, represent text to varying degrees, and it is premature to say just what their future is. That said, if a picture is truly worth a thousand words, then they surely must have a bright future. When actor Kyle MacLachlan was asked to explain the plot of the film Dune on Twitter, he managed to describe the entire movie with 41 emoji characters (see Figure 1).

Figure 1: The movie Dune in emojis.

The emoji “language” is already recognizable universally, and this is due to the Unicode Consortium, which has embraced the new language and is diligently defining and approving new emoji characters. Every new version of Unicode includes recommendations for implementation, but companies are free to represent the emoji whichever way they wish, thus growing the range of expressions. With growth comes diversification, confusion and misunderstandings. With representations now covering multiple skin tones and occupations having a female variant, Unicode is doing a spectacular job in providing creative solutions. For example, the gender-diverse emoji for occupations is a combination of the standard “man” or “woman” emoji with a second emoji to represent the occupation. These are joined together by a special invisible character called the “zero-width joiner” (ZWJ). Platforms that support the new emoji recognize the ZWJ and display a single emoji, while others will display two separate ones.

The ability to create new emojis brings its own problems, mainly fragmentation and the inability to include them in the official Unicode version. For example, Twitter has a pirate flag, Windows has added ninja cats and WhatsApp has an Olympic rings emoji, which on other platforms is shown as five plain circles. The potential for confusion and misrepresentation across different platforms can only be avoided by sticking to the official Unicode version. As the emoji language grows and becomes more expressive, its universal nature is what appeals to people.
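
To make the ZWJ mechanism concrete, the short Python sketch below assembles the “woman scientist” emoji from the “woman” emoji, the zero-width joiner (U+200D) and the microscope emoji. The code points are the real Unicode values; whether the result renders as one glyph or as separate emoji depends on the platform.

# Composing an emoji ZWJ sequence: platforms that support the sequence
# render a single glyph, others fall back to the individual emoji.
ZWJ = "\u200D"               # zero-width joiner
woman = "\U0001F469"         # U+1F469 WOMAN
microscope = "\U0001F52C"    # U+1F52C MICROSCOPE

woman_scientist = woman + ZWJ + microscope   # "woman scientist" on supporting platforms

print(woman_scientist)
print([hex(ord(ch)) for ch in woman_scientist])   # ['0x1f469', '0x200d', '0x1f52c']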

With cultural and commercial imperatives driving the world’s need for instant and universal communication, it’s hardly surprising that many place great faith in technology to provide universal, workable solutions to common problems. However, the American Translators Association (ATA) skillfully and properly put the White House in its place when a call was made in 2009 for bigger and better automated translation. In a deftly worded response to President Obama, ATA President Jiri Stejskal asserted that “translation software and qualified human translators are vital to your goal of achieving language security. Today all the leading proponents of computer translation recognize that human beings will always be essential, no matter how sophisticated translation programs become.” I doubt any language professional would disagree, and this tends to suggest that there is assuredly no place for a universal language in our community. But the pace of change at the cutting edge of tech is still blistering. Welcome to the brave new worlds of the Internet of Things and machine learning.

Picking up on Chomsky’s idea that all aspects of a community are embodied in its language, can we say the same for a community’s technology? The ancient Greek word τῆλε (tēle, meaning “afar”), which we find in telephone, television, telecommunication and so on, bridges enormous distances. These devices shrink our world, but they enlarge our communities. The Internet of Things promises to connect us to an even greater degree, if we are to believe the hype. We are promised that just as our phones pack enormous computing power into a hand-held device, billions of gadgets will be similarly empowered. A recent report identified “implementation problems” as a barrier to progress in achieving this ultra-ambitious internet-connected network of devices. Implementation in what respect? Business and tech analysts will cite security and privacy as massive headaches, or the difficulty of achieving robust and reliable connectivity in a massively heterogeneous networked world. But what if your device speaks one language and you speak another? Will we need to localize smart fridges? The answer has to be yes. If Siri, Apple’s virtual assistant, is available in a growing variety of languages; if PayPal’s services are now available in over 200 languages; if Amazon has operations in at least 15 international locations; then to paraphrase H. G. Wells, I’ve seen the future and it’s multilingual.

So what exactly is powering this hugely diverse yet intimately connected world of tech? The computing community, like the language community, is made up of many varied communities, of which artificial intelligence (AI) is one. In turn, AI itself comprises many varied communities. AI used to be regarded by more mainstream computing communities as exotic. That, however, has changed dramatically and AI is now truly mainstream. AI has many areas of application, but one that is of particular interest to the language community is natural language processing. In particular, machine learning is being applied to endow computers with the capability of “understanding” texts and, taking a further leap, of “translating” them.

With a field that draws input from computer science, cognitive psychology, neurolinguistics, data science and numerous theories of education, it is no wonder that many different approaches are taken to automating language acquisition. It would be counter-productive to even attempt to generalize efforts in the field. However, two approaches to training a computer to learn how to translate a language are worth a very brief examination: rule-based systems and statistical systems. We should note that neither of these is a cognitive approach; both involve processing rather than understanding.

Rule-based systems rely upon a set of syntactic and orthographic conventions that are used to analyze the content of a source text, which then provides the input for generating the target language. But the problems with using this approach are obvious. Word order, for example, is anything but universally the same. Indeed, the notion of a core grammar just doesn’t relate to the real world of diverse language families, not to mention accommodating isolates like Basque. The problems can be overcome to the extent that languages can be connected in pairs, but for the present we rely upon the hard work of the poor old human linguist.
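
As a deliberately naive sketch of the rule-based idea (the tiny English-to-French lexicon and the single reordering rule below are invented for illustration), consider a system that looks up each word and applies one syntactic rule, moving adjectives after nouns. Even this toy immediately exposes the problem: it ignores gender agreement, one of the many interacting rules a real system would need.

# Toy rule-based "translation": a bilingual lexicon plus one reordering rule.
LEXICON = {"the": "le", "red": "rouge", "car": "voiture", "is": "est", "fast": "rapide"}
ADJECTIVES = {"red", "fast"}
NOUNS = {"car"}

def translate(sentence: str) -> str:
    words = sentence.lower().split()
    reordered, i = [], 0
    while i < len(words):
        # Rule: English ADJECTIVE NOUN becomes NOUN ADJECTIVE in the target.
        if i + 1 < len(words) and words[i] in ADJECTIVES and words[i + 1] in NOUNS:
            reordered += [words[i + 1], words[i]]
            i += 2
        else:
            reordered.append(words[i])
            i += 1
    return " ".join(LEXICON.get(w, w) for w in reordered)

print(translate("the red car is fast"))
# "le voiture rouge est rapide" -- word order is handled, but the article should
# be "la", showing how quickly further rules (gender agreement) become necessary.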

The other approach involves the use of statistical processing based on bilingual text corpora. Google Translate is a prime example of this approach, which harnesses raw processing power to detect patterns of equivalence in language pairs. Almost all of the texts that are mined in this way are the product of human translators in the first place, and this is what gives proponents of this approach confidence that the target output will be of satisfactory quality. Another benefit of this approach is that it adapts readily to new language pairs, and this gives some researchers hope that a monolinguistic text corpus, the engine of a universal translator, is a future possibility.
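
The statistical idea can be illustrated with a minimal sketch (the four-sentence English-German “corpus” below is invented): simply counting which target words co-occur with which source words across a parallel corpus already suggests translation equivalents, which is the intuition that real alignment models make rigorous at scale.

# Crude word-alignment guesses from co-occurrence counts in a toy parallel corpus.
from collections import Counter, defaultdict

parallel_corpus = [
    ("the house", "das haus"),
    ("the car", "das auto"),
    ("a house", "ein haus"),
    ("a car", "ein auto"),
]

cooccurrence = defaultdict(Counter)
for english, german in parallel_corpus:
    for e in english.split():
        for g in german.split():
            cooccurrence[e][g] += 1

# The most frequently co-occurring target word is a first guess at a translation.
for e, counts in cooccurrence.items():
    print(e, "->", counts.most_common(1)[0][0])
# the -> das, house -> haus, car -> auto, a -> ein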

If computers are not already everywhere, they soon will be. And what of our silicon friends who speak at light-speed in 1s and 0s? Will they ever achieve consciousness as some researchers believe? AI researcher Alan Stewart, who is working on neural networks, says “I am optimistic about the future capabilities of computers and by that I mean that raw power and sophisticated logic will create amazing technology, but unless there are some startling breakthroughs, it will still fall short of nature’s biological capabilities.” However, he speculates that with the learning capabilities that computers are being given, it’s possible that they will begin to look for more efficient ways to achieve the tasks we ask them to do. That’s one of the products of learning algorithms. At the recent DEFCON in Las Vegas, a Cyber Grand Challenge was staged that pitted two computer systems against each other with the aim of discovering weaknesses in the opposing systems. The results fell short of present human standards, but this is just the beginning.

With computers able to learn large amounts of material at high speed, a new communications paradigm is a strong possibility. For example, is it possible that computers will actually create their own language? I know it sounds ridiculously far-fetched, but there was a time not so long ago that we scoffed at even quick-and-dirty machine translation. That universal language may still be out there in the future, but will we be able to understand it?

This article was originally published in Multilingual magazine, December 2016 edition.

Google has designed a free and open source font for all the world’s languages

Five years ago Google set out on an ambitious project to create a font family that encompasses the 800 languages and 110,000 characters found in the Unicode standard. Now available as open source, Noto aims to get rid of the blank boxes that commonly appear when a computer or website isn’t able to display text.

Blank “tofu” boxes (⯐) are often encountered when sending emoji from newer Android and iOS devices to older ones. They appear when a computer or website lacks font support for a particular character. While a blank box in place of emoji is annoying, tofu in languages “can create confusion, a breakdown in communication, and a poor user experience.”
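
A minimal sketch of what “lacking font support” means in practice, assuming the fontTools library and a locally downloaded Noto font file (the file name below is an assumption): a character missing from a font’s character map is exactly the one a renderer would draw as a tofu box unless a fallback font covers it.

# Check whether a font file contains a glyph for a given character.
from fontTools.ttLib import TTFont

def has_glyph(font_path: str, char: str) -> bool:
    font = TTFont(font_path)
    cmap = font["cmap"].getBestCmap()   # maps code points to glyph names
    return ord(char) in cmap

# Usage (hypothetical file name): characters absent from the cmap are the
# ones that show up as blank "tofu" boxes without a fallback font.
print(has_glyph("NotoSans-Regular.ttf", "A"))
print(has_glyph("NotoSans-Regular.ttf", "ઊ"))   # Gujarati letter UU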

Google set out to solve this problem with the Noto (“No more tofu”) font project. At the time, it was a necessity for bringing Android and Chrome OS to international markets. By making Noto freely available for anyone to use and modify under an Open Font License, Google aims to permanently remove blank boxes from the common languages used every day.

The enormous project required design and technical testing in hundreds of languages. Specialists in specific scripts were consulted as characters in many languages often change based on the context of the surrounding message. Additionally, Google partnered with Monotype, Adobe, and a network of volunteer reviewers.

More ambitiously, Noto will be used to preserve the history and culture of rare languages through digitization. As new characters are introduced into the Unicode standard, Google will add them to Noto. The full font family, design source files, and the font building pipeline are available now for free.

Original article published here: https://9to5google.com/2016/10/06/google-noto-font-open-source/

5 Excuses for not Localising a Website

Many organisations planning to enter a new market and attract new customers still choose not to invest time and resources in a multilingual website. Marketing your products or services online with only one version of your website, usually an English one, can be a challenging task.

Below you can see counter-arguments to the most common excuses given by companies that decide not to speak the language of their customers.

1. It is expensive

Localisation doesn’t have to cost a fortune. You don’t have to localise your complete website into multiple languages. A good solution is to begin with the most important items, such as the homepage and several other key pages, to see if the adapted website resonates with your target customers and helps to promote your business abroad. In this way you’ll spread your expenses over a period of time and invest only in languages or website items that are relevant.

2. I don’t need it

It’s hard to expand into new markets with a monolingual website. To enter a new market, you’ll have to present your services or products in the language of your target customers. Adapting your products and online presence to the culture of your target market will help you gain the trust of your prospects. It will also help to create the impression that you treat your customers individually, respecting their traditions, languages and habits. So, yes, you’ll need a localised website to communicate with your customers abroad and, eventually, increase your sales in the new market.

3. Everyone understands English

According to this study, 60% of non-native English speakers never buy from English-language websites and about 50% prefer to buy in their native language. There is also a strong correlation between the amount of time spent on websites with a local language version and ordering products or services from such a website. Even if it seems that nearly everyone understands English, your customers will still prefer to buy products or services from websites in their local language.

4. I don’t have time to maintain it

Do you have time to update your business social media or post news and info about new products? Well, that’s how long it takes to manage a localised website: to add new products, new articles or images. You can also use content management systems or hire a company that offers website management services in multiple languages.

5. It won’t help me

If you think your business has to enter a new market, then website localisation will definitely help you to grow and expand. With a website adapted to the culture and expectations of your foreign customers you’ll be able to increase your sales on the local market, enhance your online presence and gain advantage over competitors.

Your customers around the world might be looking for a business like yours. But you won’t find out about it, nor benefit from it, with only one standard website version. So, research your target market and start adapting your content to the needs, culture and language of your users.

Original article published here: dorotapawlak.eu