Amazon announced that it is offering eight new voices that third-party developers can use in their Alexa skills at no charge. “The new capability can help you enrich your skill’s experience, making it more engaging for customers,” said Amazon in a statement.
When a skill tells a story or plays a game, giving each character a different voice makes the experience more engaging.
For now, the eight new voices will be available in U.S. English only, with a mix of male and female voices. Developers may adopt the new voices in their skills via Amazon Polly.
First introduced at Amazon’s developer event in 2016, Amazon Polly lets developers turn text into speech using the company’s deep learning technology. Today it supports features such as whispering, speech marks, timbre effects and dynamic range compression, which help make voices sound more natural.
To change voices in an Alexa skill, third-party developers can use Speech Synthesis Markup Language (SSML) and add a “voice name” tag that specifies the voice they want. This method is faster than recording an MP3 file.
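As a rough illustration of the SSML approach, the Python sketch below wraps each line of dialogue in a voice tag and joins the results into a single SSML document. The helper names `voiced` and `build_ssml` are hypothetical, and the voice names "Joanna" and "Matthew" are assumed here to be among the supported U.S. English Polly voices; check Amazon's documentation for the current list.

```python
from typing import Iterable, Tuple
from xml.sax.saxutils import escape, quoteattr


def voiced(voice_name: str, text: str) -> str:
    """Wrap text in an SSML <voice> element (hypothetical helper)."""
    # quoteattr adds the surrounding quotes; escape protects &, < and >.
    return f"<voice name={quoteattr(voice_name)}>{escape(text)}</voice>"


def build_ssml(lines: Iterable[Tuple[str, str]]) -> str:
    """Join (voice_name, text) pairs into one <speak> document."""
    body = " ".join(voiced(name, text) for name, text in lines)
    return f"<speak>{body}</speak>"


# Voice names below are assumptions chosen for illustration.
ssml = build_ssml([
    ("Joanna", "Who goes there?"),
    ("Matthew", "Only a traveler & his dog."),
])
print(ssml)
```

A skill would then return a string like this as the SSML speech of its response, so changing a character's line means editing text rather than re-recording audio.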
Multi-voices on Alexa versus Google Assistant
At Google I/O last week, Google announced six new voice options for Google Assistant, generated by WaveNet, including one from singer John Legend.
Google’s news is different from what Amazon just announced.
The six new voices for Google Assistant are only for the Assistant itself, not for use in third-party skills. Google’s goal is to make the Assistant sound more vivid and human to users.
Amazon’s eight new voices, on the other hand, are for skill builders, and are meant to make skills sound more vivid and natural.
Google already offers developers a choice of voices: those building Google Actions (voice skills) can select from four different voices. However, they can choose only one voice for the entire user experience of an Assistant app.
Amazon’s new feature lets developers use several different voices within a single Alexa skill.
In this way, Amazon gives developers more voice options for their skills.
And with the help of Amazon’s text-to-speech technology, developers can build their voice applications in a more flexible way, instead of recording voices for skills and re-recording every time for skill updates.