One day back in October 2011 I arrived at work just after 0430 in the morning. We had a queue of people waiting for the iPhone 4S. Most weren’t aware of the new features in the iPhone 4S but were there for the experience. As a quick history lesson the iPhone 4S introduced a dual core processor, 8 MP camera and a digital voice assistant, Siri, to the Apple smartphone.
Siri was marketed as unlike other voice control systems: you spoke to it normally, it understood accents and dialects, and you didn’t need to train it to understand you.
Despite Apple’s massive marketing budget, Siri wasn’t the marketing coup we might have expected, but it did spur the competition into doing something similar. Samsung’s 2012 flagship Android device, the Galaxy S III, shipped with S Voice, to cite the obvious example.
Let’s look at the technology behind this new generation of digital voice assistants. Modern digital voice assistants are cloud-based because of the complexities of understanding the human voice. As powerful as our smartphones and tablets are, they lack the power to interpret speech in a timely fashion without extensive voice training, powerful processors and plenty of memory. Instead, when we activate the voice system, the handset records our speech and sends it to a server somewhere on the Internet. The server interprets what we’ve just said and sends instructions back to the device.
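To make the round trip concrete, here is a minimal sketch of the record–upload–respond cycle described above. The "server" is a local stub standing in for a real speech-recognition service; the function names, payload format and the recognised command are all illustrative assumptions, not any vendor's actual API.

```python
import json

def recognize_speech(audio_bytes: bytes) -> dict:
    """Stub for the cloud service. In reality this would run large
    acoustic and language models over the uploaded audio and return
    a transcript plus an instruction for the device to execute."""
    # Pretend the audio decoded to this command (hypothetical payload).
    return {
        "transcript": "set an alarm for 7 am",
        "action": {"type": "set_alarm", "time": "07:00"},
    }

def handle_voice_command(audio_bytes: bytes) -> dict:
    # 1. The handset records speech (audio_bytes) ...
    # 2. ... uploads it to the server (here, a plain function call
    #    stands in for the HTTPS request) ...
    response = recognize_speech(audio_bytes)
    # 3. ... and acts on the instructions sent back.
    return response["action"]

action = handle_voice_command(b"\x00\x01fake-pcm-data")
print(json.dumps(action))
```

The point of the split is visible even in this toy version: the device only records and transmits, while all the heavy interpretation happens on the server side.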
Our devices need to be online in order to use the function, but this is hardly a handicap for most.
The advantages of using a cloud-based digital voice assistant are straightforward. First, any improvements made to the service can be rolled out to all devices that use it, usually without requiring a software update. Linked with this is how some systems use crowd-sourcing to improve the technology; as they listen to and interpret more accents, they take note of any corrections we make.
Second, it avoids the need for high-powered devices to run the digital assistant. We’re already seeing this with Android Wear, where a comparatively low-powered smartwatch doesn’t understand what you ask it but instead relays the request to and from the Internet via a smartphone.
Google are using their voice recognition services in conjunction with Google Now, the mind-reading, pre-emptive system that guesses what, when and where we need information. Google’s voice recognition system understands commands ranging from asking for directions to starting emails and messages, calling people, or answering general knowledge questions.
Now it seems that everybody has a digital voice assistant embedded in their products, but despite the technology improving year on year, it’s rare that we see people using it in public. We’re not yet at the Star Trek stage of our relationship with computers, partially because the technology is not yet perfect. Sometimes it gets it wrong, and it’s going to get it wrong when you’re being watched! Even innovative features such as the Moto X’s touchless control haven’t stopped that awkward feeling of talking to your device in public.
Do you use a digital voice assistant? If so, what’s your favourite use of it? Hit us up in the comments below.