Using technology that seems to be taken straight out of a Star Trek episode, Microsoft has demonstrated a ‘universal translator’ that translates spoken English into Chinese in real time and speaks the result back in the speaker’s own voice.
The demonstration, given at a presentation in Tianjin, China, at the end of October, relies on Deep Neural Networks technology developed through a research partnership between Microsoft and the University of Toronto, GigaOM reports.
According to a blog post written by Microsoft chief research officer Rick Rashid after his presentation, the Deep Neural Networks process is patterned after human behaviour, making it possible to better recognize words and structure in speech. Rashid explains that this technology has led to a marked improvement in accuracy: previous methods would get about one word in every four or five wrong, while the new process using Deep Neural Networks gets about one word in every seven or eight wrong.
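To put those figures in perspective, the "one word in every N" rates can be converted to percentages and compared. The short Python sketch below is only a back-of-the-envelope illustration: the function names are made up for this example, and the only inputs are the four "one in N" figures quoted above, so the resulting range is an estimate rather than a number taken from Rashid's post.

```python
from fractions import Fraction

# Error rates quoted in the article, as "one word in every N" figures.
old_rates = [Fraction(1, 4), Fraction(1, 5)]   # previous methods
new_rates = [Fraction(1, 7), Fraction(1, 8)]   # Deep Neural Networks


def as_percent(rate):
    """Convert a fractional rate to a percentage rounded to one decimal."""
    return round(float(rate) * 100, 1)


def relative_reduction(old, new):
    """Share of the old method's errors that the new method avoids."""
    return (old - new) / old


old_pct = [as_percent(r) for r in old_rates]
new_pct = [as_percent(r) for r in new_rates]

# Most conservative reading: best old rate against worst new rate.
low = relative_reduction(Fraction(1, 5), Fraction(1, 7))
# Most generous reading: worst old rate against best new rate.
high = relative_reduction(Fraction(1, 4), Fraction(1, 8))

print("Old error rates (%):", old_pct)
print("New error rates (%):", new_pct)
print(f"Relative reduction: {as_percent(low)}% to {as_percent(high)}%")
```

On these rough figures, the error rate drops from about 20 to 25 per cent to about 12.5 to 14 per cent, which works out to somewhere between roughly 29 and 50 per cent fewer misrecognized words, depending on which ends of the quoted ranges are compared.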
Geek.com explains that this speech translation technology fixes two major problems that existed with translation software. First, it gives the translated audio a more natural sound, since what it outputs is a blend of the original speaker’s voice and samples of the voice of a Chinese speaker who had recorded several hours of vocabulary. Second, it not only translates the English words into Chinese (Mandarin, in this case), which is the basic task every translator performs, but also rearranges the words to better suit the target language.
Rashid admits that the software is nowhere near ready for wide use and needs a substantial amount of work, but it shows a great deal of potential for what is possible with Deep Neural Networks technology.