10th December 2017
I attended an interesting WARC mini-conference on voice tech and machine learning. The standout part was a joint session from Mindshare and JWT, highlighting their research in this area.
Here are some observations from this session regarding voice tech, beginning with some background before looking specifically at the commercial challenges and opportunities for brands. (Please also see my recent article, Why brands should be bothered about voice bots).
Adoption of voice will be fast and simple
A key dynamic is the change in the way we are accessing the web, moving from ‘typing’ to ‘speaking’. Those who are not yet connected will not even bother with typing – they will go straight to voice; as the WSJ observed in this recent piece – The End of Typing: The Next Billion Users Will Rely On Video and Voice.
Adoption of these technologies will be fast and simple. This is the first technology that doesn’t need to be learned; rather it will adapt to us. As speaking is already ‘learned’, voice interaction will be easy and intuitive. The JWT/Mindshare neuroscience research asserts that this form of interaction is less demanding, and that the clarity of a single, direct answer is highly engaging:
..for heavy users, the key benefits of interacting with devices through voice are ease, convenience, and speed. The cognitive load of both inputting and receiving information through voice is drastically reduced – and this lighter load is core to the liberation from the screen.
Voice is increasingly accurate
The voice assistant market has recently been energised, with Google bringing out a price-competitive mini smart speaker to compete with Amazon’s Echo Dot, and Chinese search company Baidu investing heavily in voice.
The kernel of this technology’s current success is its accuracy. Siri’s launch in 2011 received mixed reviews and initial take-up was slow. By contrast, its comprehension accuracy now stands at around 95% (on a par with human listeners), whilst Baidu’s technology can deliver up to 98%.
Because voice interaction, unlike ‘isolated’ screen activity, reflects basic human experience, it may also encourage human-to-human interaction in and around its use.
Jeremy Pounder, Futures Director at Mindshare, observed that voice “gives an early indication that speaking to a brand delivers a deeper emotional connection than interacting with it through type or touch. When people asked a question involving a brand name, their brain activity showed a significantly stronger emotional response compared to people typing that same brand question.”
More than Jeeves
But the rise of voice (backed up by machine learning) is not just about enjoyable interaction and quick answers. These ‘digital butlers’ are about anticipating needs and making suggestions. Although skills and apps are important right now, this technology will soon become more about conversation, with assistants proactively surfacing things of interest.
There can also be health and well-being benefits. One perspective is that we won’t be hunched endlessly over our screens, thus reducing incidences of ‘tech-neck’ and there will also be benefits in automotive environments where voice interaction will be much safer than screen use, which can distract drivers.
Another example of the health benefits of voice comes from Vietnam, where the insurance company AIA has created a smartphone app called Open Aiya (based on the Vietnamese word for ‘Help’). It uses Siri’s always-on software to provide a voice-activated panic system that alerts up to five loved ones and the emergency services. Saying “Hey Siri, open Aiya” activates the app.
In the automotive sector, the Honda Hana uses a system that makes traffic suggestions and understands (basic) human emotions. The Hana claims to “learn from you always”, and is one example of a proactive assistant within a controllable environment.
The rise of these machine-focused, quasi-human relationships (epitomised so vividly by Scarlett Johansson in Her) means that we are becoming connected to machines in new and more compelling ways. 73% of personal assistant users said they would use the technology ‘all the time’ if understood properly (Neuro Insight Study 2017).
As we continue to crave intimacy in an increasingly disconnected world, machines will move into spaces once occupied by people. A compelling example of this is ‘Hikari’, a female hologram and personal assistant – targeted at single young men in Japan.
In the same vein, Xiaoice, a Microsoft assistant with over 40 million users in China and Japan, converses with users on Tencent’s WeChat platform and fills an emotional void: 25% of users have told Xiaoice “I love you”.
So what does this all mean for brands?
Just as brands created personas across previous touchpoints, so they must do with voice. They need to find their voice, and create an emotional, human connection – bearing in mind the words to be used, the nature of voice employed and an appropriate personality.
In terms of delivery channels for brands, there are two possible routes: through a personal assistant intermediary (Alexa, Home), or via a brand’s own assets or channels, whether these be products, web properties or retail spaces.
It is also essential that content is optimised for voice interaction. It is now even more important to appear at the top of the list of search results. After all, if you want some information quickly, how many answers will you really need?
According to research carried out by digital agency 360i, voice optimisation increased click-through rates by 30%. To do this a brand needs to deliver in the following areas:
- Answer the question
- Place the answer high up in the page
- Answer directly, using the language of “is”, “what” and “how”, and be objective and informative (brands often try to answer with “our approach is…”, which doesn’t work)
- If possible, make the search term part of the page title
Additionally, we will see the rise of (hyper) local, Q&A content, and the rise of more specific longtail keywords, as people tend to verbally ask more specific questions.
Paid search will continue to be useful, but brands can also optimise their mark-up for voice search by giving it more structure from which the voice assistant can work. Voice will especially make a difference in ‘micro-moments’, when people need specific help at a key moment in time: perhaps when dealing with a stain, tackling a cooking challenge or re-ordering a product.
Good use of voice reduces friction, augments the customer experience and can even make the interaction itself a pleasure.
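To make the idea of structured mark-up a little more concrete, here is a minimal sketch in Python. It assumes a brand chooses the schema.org FAQPage vocabulary, expressed as JSON-LD; the session did not name a specific format, and nothing here guarantees that any particular assistant will read it. The function names and the stain-removal question and answer are made up for illustration.

```python
import json


def build_faq_jsonld(pairs):
    """Return a schema.org FAQPage JSON-LD string for (question, answer) pairs.

    Assumption: FAQPage is one plausible vocabulary for question-led content;
    the article itself does not prescribe a format.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


def as_script_tag(jsonld):
    """Wrap the JSON-LD in the script tag used to embed it in an HTML page."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


if __name__ == "__main__":
    # Hypothetical micro-moment content: a direct question with a direct answer.
    pairs = [
        (
            "How do I remove a red wine stain?",
            "Blot the stain with a clean cloth, then rinse with cold water.",
        )
    ]
    print(as_script_tag(build_faq_jsonld(pairs)))
```

The point of the sketch is the shape of the data: each question is paired with one short, direct answer, which mirrors the advice above to answer the question plainly and early.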
Four things brands must consider in the area of voice technology
- Optimise content for voice interaction. People don’t want their assistant to read out the first full page of search results. An assistant will only curate a handful of top hits.
- Identify micro-moments. A brand’s voice strategy should kick in only at key moments in the customer journey. At each of these, an interaction with voice must provide real value.
- Learn the rules of voice engagement. Conversations have structure, cues that help humans navigate an interaction. Google has taken much of its thinking about conversational UX from Paul Grice’s work on the Cooperative Principle of conversational structure.
- Develop the brand’s voice to forge an emotional bond. Roughly three quarters of smartphone users surveyed believed brands should have unique voices and personalities for their apps and skills.
Tech has been on a long-term path to becoming more casual and voice is a less cognitively demanding and more emotionally-driven way to take in and deliver information. For brands, the transition to voice can be advantageous, and there are clear opportunities for those that can find a persuasive voice and make their brands sound as compelling as possible.