The future of corporate voice technology

Most of us take a fairly relaxed view when it comes to picturing where our technology is headed. Europe as a whole closely follows the technological trends set in the United States.

Recent research shows that more and more consumers are using smart speakers to make their lives easier, and that Amazon's voice-activated devices now account for about 70% of the US market, with US households owning nearly 100 million devices. By 2023, the number of digital assistants is expected to reach roughly 8 billion, more than the planet's current population. This exponential growth shows that people are becoming increasingly familiar with these solutions.

This is even more striking when you recall that, not long ago, the only experience many consumers had with voice technology in the professional world was disappointing calls to customer support, with systems that seemed almost designed to keep you from ever reaching an agent. Since then, these programs have made an incredible leap forward. They are now much closer to a hero of my childhood: the ship's computer, the digital assistant of the “Star Trek” series, which recognized speech flawlessly, with no comprehension problems and no need to repeat yourself. In short, the technology can finally live up to its promise, and natural language processing offers almost unlimited gains in time and effort.

The genesis dates back to the 70s

How did this technology come about? Speech recognition and digital-assistant solutions really took off in 1971. Their origins lie even earlier, but in my opinion the true starting point is the Harpy system created at Carnegie Mellon University. Capable of processing more than 1,000 words and a handful of sentences, it was the first genuinely working version of this technology.

In 1986, IBM released Tangora, a solution based on hidden Markov models. Using statistics, it could predict the most likely phonemes to come next in a stretch of speech, which marked the beginning of major progress in the field: more than 20,000 words could now be recognized.
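To give a sense of the statistical machinery involved, here is a minimal, illustrative sketch of Viterbi decoding over a toy hidden Markov model in Python. The phoneme states, observation labels and probabilities are invented for the example and bear no relation to IBM's actual models.

```python
# Illustrative only: a toy hidden Markov model where hidden states are phonemes
# and observations are coarse acoustic labels. All probabilities are made up.
phonemes = ["k", "ae", "t"]                        # hidden states
start_p  = {"k": 0.6, "ae": 0.2, "t": 0.2}         # initial probabilities
trans_p  = {                                        # P(next phoneme | current phoneme)
    "k":  {"k": 0.1, "ae": 0.8, "t": 0.1},
    "ae": {"k": 0.1, "ae": 0.2, "t": 0.7},
    "t":  {"k": 0.3, "ae": 0.3, "t": 0.4},
}
emit_p   = {                                        # P(acoustic label | phoneme)
    "k":  {"burst": 0.7, "vowel": 0.1, "stop": 0.2},
    "ae": {"burst": 0.1, "vowel": 0.8, "stop": 0.1},
    "t":  {"burst": 0.2, "vowel": 0.1, "stop": 0.7},
}

def viterbi(observations):
    """Return the most likely phoneme sequence for a list of acoustic labels."""
    # best[t][s] = (probability of the best path ending in state s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in phonemes}]
    for obs in observations[1:]:
        column = {}
        for s in phonemes:
            prob, prev = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in phonemes
            )
            column[s] = (prob, prev)
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(best[-1], key=lambda s: best[-1][s][0])
    path = [state]
    for t in range(len(best) - 1, 0, -1):
        state = best[t][state][1]
        path.append(state)
    return list(reversed(path))

print(viterbi(["burst", "vowel", "stop"]))  # -> ['k', 'ae', 't']
```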

NaturallySpeaking 1.0, the first commercial product for continuous dictation, was released in 1997 by Dragon Systems. Ten years later, the PAL (Personalized Assistant that Learns) program grew out of a DARPA-led military research initiative, and artificial intelligence came to the fore. Siri Inc., the SRI International spin-off born of this work, was later acquired by Apple.

In 2008, Google introduced a voice search app for mobile phones, while Apple introduced cloud-based voice recognition before releasing Siri in 2011. Finally, in 2014, Amazon launched the Echo, built on Alexa, its now-famous digital voice assistant. Despite this late entry into the market, keep in mind the figure quoted above: it is a clear indicator of Amazon's role in democratizing voice technology.

The world of tomorrow

But what opportunities do these advanced capabilities offer in the professional world? I think we are still only scratching the surface. A number of simple applications translate readily to the work environment: remote voice control, and online digital assistants or bots that handle simple tasks, such as routing callers to the right department in large companies. For mobile workers, the possibilities are even broader. Imagine being able to ask your mobile terminal in a warehouse when a given part was last inspected, to list the last ten faults, or even to walk you through a specific maintenance or repair procedure. All of this would be possible, alongside more everyday requests, such as details about the next site on your route, or simply where you left your keys!
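As a purely hypothetical illustration of the kind of simple bot mentioned above, the sketch below routes an already-transcribed voice query to one of a few keyword-based intents. The intent names, keywords, and example queries are all invented for this illustration.

```python
# Hypothetical example: routing a transcribed voice query to a simple "intent".
# The intents and keywords below are invented purely for illustration.
INTENTS = {
    "last_inspection": ["last checked", "last inspected", "inspection date"],
    "recent_faults":   ["last ten faults", "recent faults", "fault history"],
    "route_to_dept":   ["billing", "human resources", "it support"],
}

def route(query: str) -> str:
    """Return the name of the first intent whose keywords appear in the query."""
    text = query.lower()
    for intent, keywords in INTENTS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return "fallback_to_human"  # hand off when nothing matches

print(route("When was part A-113 last checked?"))  # -> last_inspection
print(route("Put me through to billing, please."))  # -> route_to_dept
```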

But, like any idea still in development, these potential new applications have issues that need to be addressed. Underground or out of Wi-Fi range, the mobile device itself must have enough computing power to analyze the request, as well as enough memory to hold the natural language library. Solutions will also need to learn to cope with differences in accent and pronunciation. Working offline has its benefits, however: local processing provides additional security as well as greater speed, since the information no longer needs to make a round trip to the cloud.
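For a concrete, if simplified, picture of what such offline processing can look like, here is a minimal sketch using the open-source Vosk speech recognition library, which runs entirely on the device once a language model has been downloaded. The model directory and audio file name are placeholders.

```python
# Illustrative sketch of fully offline recognition with the open-source Vosk
# library (pip install vosk). The model directory and audio file are placeholders;
# a language model must be stored on the device beforehand.
import json
import wave

from vosk import Model, KaldiRecognizer

model = Model("model")                       # path to a locally stored Vosk model
wf = wave.open("warehouse_query.wav", "rb")  # 16-bit mono PCM audio
recognizer = KaldiRecognizer(model, wf.getframerate())

# Feed audio in chunks; everything stays on the device, no cloud round trip.
while True:
    data = wf.readframes(4000)
    if len(data) == 0:
        break
    if recognizer.AcceptWaveform(data):
        print(json.loads(recognizer.Result())["text"])

print(json.loads(recognizer.FinalResult())["text"])
```

Because nothing leaves the device, both the audio and the transcript stay local, which is precisely the security and latency benefit described above.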