BTA Hotline: News

Nuance Advances Text-to-Speech Technology Through Deep Learning

Tuesday, February 20, 2018  
Nuance Communications Inc. has announced that it has advanced its text-to-speech (TTS) technology with deep neural networks (DNNs) to deliver a new standard of quality, reducing errors by 40 percent compared with previous speech synthesis techniques.

Nuance's Vocalizer suite of TTS solutions combines advancements in deep learning with knowledge-based developments. The suite, which includes Vocalizer Embedded for embedded platforms, Vocalizer Server for cloud applications and the Vocalizer Studio development tool, enables speech output that is nearly indistinguishable from human speech. This enriches user experiences across automotive, enterprise, health care, IoT and smart home offerings, and results in a more intuitive and conversational interaction between people and machines. The application of artificial intelligence (AI) techniques gives Vocalizer the ability to quickly learn new words, phrases and pronunciations and to communicate with more expressivity and personality across more than 50 languages.

Nuance's approach to using DNNs for speech synthesis is as follows. First, the networks learn the relation between written text and the corresponding voice characteristics from Nuance's vast speech data. Then, the system applies this knowledge to the words and phrases in an unseen text. In addition to learning the relations between the orthographic representation of the words and the acoustic output, Nuance's deep neural nets also use the context of the utterances to ensure that words are spoken in the appropriate expressive manner for the application, with the proper pattern of stress and intonation. For example, street names and driving directions sound clearly intelligible and articulated, whereas dialogs with a virtual assistant sound more fluent and dynamic.
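
The two steps described above (learn a mapping from text and context to voice characteristics, then apply it to unseen text) can be illustrated with a deliberately tiny sketch. It is not Nuance's method: the training data, the feature set, the context labels and the function names are all hypothetical, and a single linear layer trained by gradient descent stands in for a deep network. The choice of per-character duration as the "voice characteristic", with slower speech for navigation prompts, is also an assumption made only for illustration.

```python
# Toy illustration only: every name and number here is hypothetical.
# A single linear layer stands in for a deep neural network.

ALPHABET = "abcdefghijklmnopqrstuvwxyz "
STYLES = ("navigation", "assistant")  # assumed application contexts

# Assumed training data: (text, context, target duration per character in ms).
TRAINING_DATA = [
    ("main street", "navigation", 90.0),
    ("turn left", "navigation", 90.0),
    ("hello there", "assistant", 60.0),
    ("sure thing", "assistant", 60.0),
]


def featurize(char, style):
    """One-hot encode a character together with its application context."""
    vec = [0.0] * (len(ALPHABET) + len(STYLES))
    vec[ALPHABET.index(char)] = 1.0
    vec[len(ALPHABET) + STYLES.index(style)] = 1.0
    return vec


def predict(weights, features):
    """Linear prediction of one character's duration in milliseconds."""
    return sum(w * x for w, x in zip(weights, features))


def train(examples, epochs=500, lr=0.1):
    """Step 1: learn the text-plus-context to duration mapping by SGD."""
    weights = [0.0] * (len(ALPHABET) + len(STYLES))
    for _ in range(epochs):
        for text, style, duration_ms in examples:
            for char in text:
                feats = featurize(char, style)
                error = predict(weights, feats) - duration_ms
                weights = [w - lr * error * x for w, x in zip(weights, feats)]
    return weights


def total_duration(weights, text, style):
    """Step 2: apply the learned mapping to text not seen in training."""
    return sum(predict(weights, featurize(c, style)) for c in text)


if __name__ == "__main__":
    w = train(TRAINING_DATA)
    for style in STYLES:
        ms = total_duration(w, "elm street", style)
        print(f"{style}: {ms:.0f} ms")
```

In this sketch the same unseen phrase receives a longer predicted duration in the navigation context than in the assistant context, which mirrors, in a very reduced form, the idea that context shapes how the words are spoken.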

"The advancements we have made through the application of DNN allow our text-to-speech technology to deliver high-quality, more expressive speech output, enabling more natural interactions between man and machine," said Christophe Couvreur, vice president and general manager of TTS, Nuance. "We're able to create highly tailored and computationally efficient solutions adapted to our customers' unique needs, their application domains and the voice persona they want to realize."

Key applications of Nuance Vocalizer include:
  • Automotive in-dashboard systems and virtual assistants
  • Robotics and autonomous virtual agents
  • Digital television and set-top boxes
  • Omni-channel customer engagement services
Nuance's enhanced text-to-speech solutions are available for the cloud today and will be made available for embedded devices this year.
