The global market for voice and speech recognition software will increase from US$1.1 billion in 2017 to US$6.9 billion by 2025, with key demand rising from healthcare, automotive, voice commerce and customer service applications, according to market intelligence firm Tractica.
Additional use cases will include smart home controls, security and authentication, voice search, and consumer robot controls, Tractica says in the recently released “Voice and Speech Recognition” report.
Voice and speech recognition technology has undergone a transformation in recent years, thanks to the successful application of machine learning and deep learning-based natural language processing (NLP) systems. “Increasingly, companies are building viable market use cases in which the human voice will control highly sophisticated, automated processes and operations,” says Tractica.
“Speech recognition technology has been around for quite a while, but it was the emergence of smartphones and cloud computing that was the real game-changer for this market,” says principal analyst Mark Beccue.
“Artificial intelligence algorithms have improved voice and speech recognition accuracy rates significantly in the span of a few short years, and these new capabilities are enabling a wider range of applications for spoken interfaces across multiple industry sectors,” adds Beccue.
Amazon Develops Offline Voice Recognition
Amazon, a major voice assistant player in the smart home space, recently announced plans to enable voice recognition on edge devices rather than carrying out all voice processing in the cloud.
Last week, Amazon’s Alexa Auto team announced the release of its new Alexa Auto Software Development Kit (SDK), enabling developers to bring Alexa functionality to in-vehicle infotainment systems.
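For readers curious what such an integration looks like, here is a deliberately simplified, hypothetical sketch of an infotainment app registering handlers with an assistant SDK. Every name in it is invented for illustration; it is not the actual Alexa Auto SDK API:

```python
# Hypothetical sketch of wiring a voice assistant into an in-vehicle
# infotainment (IVI) app. All names are invented for illustration;
# this is NOT the actual Alexa Auto SDK API.

class AssistantEngine:
    """Stands in for an SDK engine that streams audio to the cloud
    and dispatches the directives the cloud sends back."""

    def __init__(self, endpoint: str):
        self.endpoint = endpoint
        self.handlers = {}

    def register_handler(self, domain: str, handler) -> None:
        # The IVI app registers a callback per capability domain,
        # e.g. navigation or media playback.
        self.handlers[domain] = handler

    def on_directive(self, domain: str, payload: dict) -> None:
        # Simulates the SDK receiving a directive from the cloud
        # and routing it to the platform-specific handler.
        handler = self.handlers.get(domain)
        if handler is not None:
            handler(payload)


def handle_navigation(payload: dict) -> None:
    print(f"Head unit: routing to {payload['destination']}")


engine = AssistantEngine(endpoint="https://avs.example.invalid")
engine.register_handler("navigation", handle_navigation)
engine.on_directive("navigation", {"destination": "downtown parking"})
```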
The initial release of the SDK assumes that automotive systems will have access to the cloud, where the machine-learning models that power Alexa currently reside.
“But in the future, we would like Alexa-enabled vehicles — and other mobile devices — to have recourse to some core functions even when they’re offline. That will mean drastically reducing the size of the underlying machine-learning models, so they can fit in local memory,” Amazon says in a blog post.
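Amazon has not detailed how it shrinks those models, but post-training quantization is one widely used technique for this kind of size reduction. The sketch below, with invented dimensions, shows how storing float32 weights as int8 cuts a model’s memory footprint roughly fourfold:

```python
# Minimal sketch of post-training quantization: float32 weights are
# stored as int8 plus a per-tensor scale factor, cutting memory ~4x.
# Illustrative only; Amazon has not described its compression approach.

import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float32 weights to int8 with a per-tensor scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

weights = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int8(weights)

print(f"float32: {weights.nbytes / 1e6:.1f} MB")  # ~4.2 MB
print(f"int8:    {q.nbytes / 1e6:.1f} MB")        # ~1.0 MB
print(f"max reconstruction error: "
      f"{np.abs(dequantize(q, scale) - weights).max():.4f}")
```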
Amazon has reportedly developed navigation, temperature control and music playback algorithms that can run locally, on-device, which means users will be able to access these functions via voice command even when they do not have Internet access.
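In outline, such a hybrid design matches a small set of intents on-device and falls back to the cloud for everything else. The sketch below illustrates the routing; the intent list and matching logic are invented for illustration:

```python
# Illustrative sketch of hybrid on-device/cloud intent routing.
# A handful of intents are handled locally; the rest need a connection.

LOCAL_INTENTS = {
    "set temperature": "climate_control",
    "play music": "media_player",
    "navigate to": "navigation",
}

def route_utterance(utterance: str, online: bool) -> str:
    text = utterance.lower()
    # Check the small on-device intent set first.
    for phrase, service in LOCAL_INTENTS.items():
        if phrase in text:
            return f"handled on-device by {service}"
    # Anything else requires the full cloud NLP stack.
    if online:
        return "sent to cloud for full NLP"
    return "unavailable offline"

print(route_utterance("Navigate to the airport", online=False))
print(route_utterance("What's the weather today?", online=False))
```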
Amazon said it will present the findings of its edge voice processing work at this year’s Interspeech speech technology conference in Hyderabad, India.
Overcoming the Accent Barrier
Another challenge still to be tackled is the accent barrier in voice recognition. A recent study commissioned by the Washington Post found that Google Home and Amazon Echo speakers were 30% less likely to understand non-American accents than American ones.
An Amazon spokesperson told the Washington Post that Alexa’s voice recognition is continually improving over time, as more users speak to it with various accents. Google in a statement pledged to “continue to improve speech recognition for the Google Assistant as we expand our datasets.”
Voice recognition systems are likely to improve as more people begin to use them regularly. Some companies are creating technologies that aim to address the accent problem. One of them is Massachusetts-based Nuance. Its machine-learning model switches automatically among various dialect models depending on users’ accents.
Nuance claims the model performs 22.5 percent better for English speakers with a Hispanic accent, 16.5 percent better for southern U.S. dialects, and 17.4 percent better for Southeast Asian speakers of English.
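In outline, such a system first identifies the speaker’s accent and then routes the audio to a dialect-specific recognition model. The sketch below illustrates that routing pattern; the classifier stub and model registry are placeholders, not Nuance’s implementation:

```python
# Rough sketch of dialect switching: classify the speaker's accent,
# then decode with the matching recognition model. Placeholders only.

from typing import Callable, Dict

DIALECT_MODELS: Dict[str, Callable[[bytes], str]] = {
    "southern_us": lambda audio: "<decoded with southern-US model>",
    "hispanic_english": lambda audio: "<decoded with Hispanic-accent model>",
    "southeast_asian_english": lambda audio: "<decoded with SE-Asian model>",
    "general_american": lambda audio: "<decoded with general model>",
}

def classify_accent(audio: bytes) -> str:
    # Placeholder: a real system would run an accent classifier over
    # acoustic features extracted from the audio.
    return "southern_us"

def transcribe(audio: bytes) -> str:
    dialect = classify_accent(audio)
    model = DIALECT_MODELS.get(dialect, DIALECT_MODELS["general_american"])
    return model(audio)

print(transcribe(b"fake-audio-bytes"))  # decoded with the southern-US model
```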