The Power of Voice: Understanding vs. Recognition

It seems our greatest science fiction dreams are coming true these days, with new advances in how we communicate with our devices. The move from type and touch to spoken word and conversation is pushing development teams to find new ways of presenting the same services. Asking for sports scores or bus schedules, playing your favorite song, and even summoning a personal driver have all begun to migrate from the screens in front of us to invisible, voice-controlled, intuitive conversations with our hardware.
Technology experts have been hacking at code to bring advanced communication capabilities to our devices since as early as the 1970s. Over the last few years, tremendous advances in AI, processing power and connectivity have begun to bridge the gap between humans and machines. Just the other day, after years of research, Microsoft announced that it had made a “major breakthrough in speech recognition, creating a technology that recognizes the words in a conversation as well as a person does.” This is an exciting step toward human connection with our devices, and it opens the door for developers to experiment with ways to reach understanding. We are only steps away from asking our devices even more questions and receiving more in-depth, meaningful responses tailored to how we ask them.
Voice assistants such as Apple’s Siri, Amazon’s Alexa and Google’s OK Google have already bridged remarkable communication gaps. People with impairments in vision or touch can now use their voice to reach the same functionality. Whether you are driving and need to call out to Siri for directions, or at home in the kitchen where Alexa can pull up that difficult recipe and read you the next steps, these assistants are there to help. They have won a place in our hearts, and that acceptance leaves room to grow as advances in understanding are reached.
“The next frontier is to move from recognition to understanding,” says Geoffrey Zweig, manager of the Speech & Dialog Research Group at Microsoft.
Google announced ‘Home’ earlier this quarter to compete with Alexa as the go-to in-home assistant. Google Home could put a dent in Alexa’s lead, as it is integrated directly with the entire suite of Google apps and, of course, the power of Google Search. Alexa has taken an open approach, allowing developers to build integrations with the device known as “skills.” Companies that have developed skills for Alexa to date include Spotify, Fitbit, Capital One, Uber and Domino’s Pizza.
Businesses are also benefiting from advances in voice recognition and understanding. Skype Translator lets distributed teams with language barriers communicate by voice or chat with real-time translation. Dragon speech recognition software provides accurate, on-the-go dictation so thoughts can be shared easily among team members.
Now that we know ‘understanding’ is the missing puzzle piece, we can put our heads back down and continue hacking toward the future. While I wait for C-3PO and R2-D2 to arrive, I’ll be exploring some of their distant cousins that have already landed on planet Earth. Check out the links to SDKs and other articles on this topic listed below!
Interested in learning new ways to integrate voice technology? Talk to us about your project or check out these open-source developer kits: