Portions of this article originally appeared on VentureBeat
Five years ago, computer scientist Andrew Ng, then Chief Scientist at Baidu, predicted that by 2020 “at least 50% of all searches are going to be either through images or speech.” Importantly, this prediction was for China specifically, despite commonly being misrepresented in the marketing press as a global figure. The industry’s infatuation with voice search didn’t stop with Ng’s prediction, though: in 2016, Google shared that 20% of searches in the Google mobile app are done by voice. Later that year, Gartner predicted that 30% of searches would be done without a screen by 2020.
So, where do all these predictions and opaque stats leave us? Unfortunately, with an imperfect picture of precisely how dominant (or not) voice search is today. What we do have, however, are strong signals that the shift is happening, albeit at a potentially slower pace than some may have hoped for. A July 2019 survey from Voicebot found that “nearly 60% of U.S. adults say they have used voice search and 47% expect to increase usage this year,” and that 33% of online adults use voice search monthly. Additionally, a broader global study by Microsoft Bing Ads found that “The majority of survey respondents reported using voice search through a digital assistant (72%), and over half of respondents have used voice skills and actions with their smart voice search through a smart home speaker.” These numbers show traction and provide evidence that consumers are embracing the medium as a new way to query the digital world.
With this context in mind, let’s pivot toward how search operates via voice assistants today. Where do voice assistants turn when they need an answer? The AIs take one of four avenues:
- Conversationally Optimized Web Content
- Voice Experiences such as Alexa Skills or Google Actions
- Knowledge Graph Resources and Database Partnerships
- Personality Teams
Each of these avenues represents a unique answer type and creates implications for how brands and businesses should be optimizing their digital ecosystems for visibility in a voice-first world.
Conversationally Optimized Web Content (aka Voice SEO)
Voice-optimized web content can come from a variety of sources. Typically, these answers are drawn from web pages that feature conversationally written content, often using natural language modifiers with keywords, such as questions paired with concise answers. If you want an answer to surface through a device like a smart speaker, earning a ‘featured snippet’ gives your content the best chance. Generally, content should rank in the top 10 of the search engine results page (SERP) on Google and Bing to be selected as an answer by voice assistants.
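One widely used way to signal this kind of question-and-answer content to search engines, not prescribed by this article but common in practice, is schema.org FAQPage structured data. The sketch below generates that JSON-LD markup from Q&A pairs; the helper name and sample content are illustrative.

```python
import json

def faq_jsonld(qa_pairs):
    """Build schema.org FAQPage JSON-LD from (question, answer) pairs.

    Structured data like this helps search engines identify the concise,
    conversational Q&A content that featured snippets are drawn from.
    """
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }, indent=2)

# Example: mark up one conversational Q&A pair for embedding in a page.
print(faq_jsonld([
    ("How late is the store open?", "We are open until 9 p.m. on weekdays."),
]))
```

The output would be placed in a `<script type="application/ld+json">` tag on the page that carries the matching visible content.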
While there is plenty of inexact science being postulated about how to win voice SEO, we’ve found it is best to start with a landscape analysis to understand what you are up against. For some industries and question types, dominant sites such as Wikipedia may be hard to topple, but in many cases we have seen third-party sites with limited authority surfaced over the very brands that have the right to win with their own content. If this describes your situation, you should be jumping at the opportunity to reclaim your brand presentation.
Other use cases to consider when discussing optimized web content are ecommerce applications and local search. For example, when using an Alexa or Google Assistant device to search for products, you will likely be driven to Amazon and Google Shopping Partners to hear about products. Having up-to-date product pages and product detail content there is critical for any brand’s voice presence. Another type of content you may receive from voice assistants when searching for products or information is local listings. Here again, brands should ensure their presence is prepared for voice visibility: a BrightLocal study found that 76% of smart speaker users perform local searches at least weekly, yet according to Uberall, only 3.8% of businesses have the correct information available for these voice searches.
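Getting into the 3.8% is largely a matter of auditing name, address, and phone data across every directory a voice assistant might read from. A minimal sketch of such an audit, with hypothetical record shapes and directory names, might look like this:

```python
def find_listing_mismatches(reference, listings):
    """Compare business info (name/address/phone) across directory listings.

    `reference` is the canonical record; `listings` maps a directory name
    (e.g. "google", "yelp") to the record it currently shows. Returns the
    fields that disagree, so they can be corrected before a voice
    assistant reads them aloud to a consumer.
    """
    def normalize(value):
        # Crude normalization: lowercase, keep only letters and digits,
        # so "555-0100" and "(555) 0100" compare equal.
        return "".join(ch for ch in value.lower() if ch.isalnum())

    mismatches = {}
    for directory, record in listings.items():
        bad_fields = [
            field for field, value in reference.items()
            if normalize(record.get(field, "")) != normalize(value)
        ]
        if bad_fields:
            mismatches[directory] = bad_fields
    return mismatches

reference = {"name": "Acme Coffee", "phone": "555-0100"}
listings = {"yelp": {"name": "Acme Coffee", "phone": "555-0199"}}
print(find_listing_mismatches(reference, listings))  # → {'yelp': ['phone']}
```

A real audit would pull these records from each platform’s API rather than hand-built dictionaries, but the comparison logic is the same.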
Ensuring your brand content is voice ready isn’t just a matter of rewriting content to be voice-friendly. Winning voice SEO is an ecosystem challenge that requires thoughtful enhancements to best position your brand as the answer across a number of consumer inquiry types that may arise.
Voice Experiences (e.g. Skills and Actions)
If you are building a voice application today, it would be wise to incorporate content and answers to questions you could expect your consumers to ask. Although experiences are not surfaced through native voice search inquiries nearly as often as optimized web content, Alexa and Google Assistant will still offer them as recommendations for some queries. Both companies have also rolled out features that make it easier for developers and brands to have their applications discovered in this way (CanFulfillIntent and Implicit Invocations, respectively).
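On the Alexa side, CanFulfillIntent works by sending a skill a probe request asking whether it could answer a query it wasn’t explicitly invoked for. A minimal handler sketch follows; the request/response field names are summarized from Alexa’s CanFulfillIntentRequest interface, and `SUPPORTED_INTENTS` is a hypothetical allow-list for this example skill.

```python
# Hypothetical set of intents this skill knows it can answer.
SUPPORTED_INTENTS = {"StoreHoursIntent", "ProductInfoIntent"}

def handle_can_fulfill(event):
    """Respond to Alexa's name-free discovery probe.

    Returns a response envelope telling Alexa whether this skill can
    fulfill the probed intent, or None if the event is a normal request
    that should be handled elsewhere.
    """
    request = event["request"]
    if request["type"] != "CanFulfillIntentRequest":
        return None  # Not a discovery probe.

    intent_name = request["intent"]["name"]
    can_fulfill = "YES" if intent_name in SUPPORTED_INTENTS else "NO"
    return {
        "version": "1.0",
        "response": {
            "canFulfillIntent": {"canFulfill": can_fulfill},
        },
    }
```

A production skill would also report per-slot `canUnderstand`/`canFulfill` signals, but even this coarse yes/no answer is what makes a skill eligible to be recommended for a matching voice search.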
Despite the clear thought and emphasis going toward the design of name-free discovery and invocation of experiences, Voicebot’s recent study asked voice assistants 4,000 brand and product questions and found that “The voice app discovery rate for questions is a mere 0.5%.” However, it’s unclear how well the apps that were passed over were designed for voice search discovery (i.e., whether they had implemented the developer tools mentioned above) and what their usage was like, a factor we suspect carries weight when the assistants decide what to recommend to a user.
Regardless of whether your experience surfaces every time a consumer searches on a voice-first device, if you are promoting your presence on voice and driving usage of your application(s), it’s important to have answers easily accessible. A consumer’s interest in answers doesn’t stop once they enter your application; in fact, it may very well increase. Perhaps more importantly, once a user enters your voice experience, the amount and level of data you can receive about their interactions increases significantly compared to the relative black box of the native platforms. Even if you have a small number of users at first, the data you capture around their queries and intents will be qualitatively valuable and should inform your larger voice SEO strategy.
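Capturing that query data can start very simply. The sketch below, with hypothetical class and intent names, tallies which intents users hit and which utterances the experience failed to answer; the unanswered ones are the most direct signal for what content to create next.

```python
from collections import Counter

class QueryLog:
    """Tally the intents and utterances users bring into a voice app.

    Even at low volume, the most frequent unanswered queries indicate
    which questions your content (in-app and on the web) should answer.
    """
    def __init__(self):
        self.intent_counts = Counter()
        self.unhandled_utterances = Counter()

    def record(self, intent_name, utterance, handled=True):
        self.intent_counts[intent_name] += 1
        if not handled:
            self.unhandled_utterances[utterance.lower()] += 1

    def top_gaps(self, n=5):
        # Questions the experience couldn't answer — candidates for new
        # content, new intents, or broader voice SEO work.
        return self.unhandled_utterances.most_common(n)

log = QueryLog()
log.record("StoreHoursIntent", "when do you open")
log.record("FallbackIntent", "do you ship to canada", handled=False)
print(log.top_gaps())  # → [('do you ship to canada', 1)]
```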
Knowledge Graph and Database Partnerships
Another area organizations need to be mindful of is the ever-changing landscape of knowledge graphs and data partnerships employed by the big technology companies for their voice assistant AIs. Two notable resources the assistants turn to regularly are Wikipedia and Yelp. The assistants see these as good resources because they are highly trafficked sites where consumers look for answers. For companies with information on these sites, it is key to make sure everything is represented properly, as assistants may choose to turn there regardless of accuracy or user experience. Regarding Wikipedia, Voicebot states, “voice assistants are using Wikipedia today because they are optimizing not to fail instead of optimizing for success when questions are posed about brands.” In the case of Yelp, Voicebot says, “the consistency of the structured data and general reliability of these information sources makes them more appealing than a company or other third-party websites that have far more limited information or inconsistent data representation.”
Aside from shoring up your information on these portals, it is important to be cognizant of other behind-the-scenes partnerships that serve answers to voice device users. These partnerships tend to focus on filling knowledge gaps for the AIs in areas with agreed-upon factual answers or that depend on reliable real-time data. Examples include Amazon Alexa’s use of Wolfram Alpha to answer difficult computational questions and Samsung Bixby’s partnership with theScore for sports scores and news. Lastly, local search results are pulled by Google Assistant through Google’s own platform, or on Alexa with help from Yext.
Personality Teams
The final piece of the search landscape on voice comes from the assistants themselves… kind of. Each major voice assistant on the market today has what is called a ‘Personality Team’: an internal group of employees dedicated to anthropomorphizing the AIs by defining their tone, attitude, and characteristics. Along with this work, they help write and select the replies the assistant will provide users. Typically these replies are reserved for direct engagement with the assistant, when users ask things like “Alexa, how are you?” or “Who will win the Super Bowl?”
Where this gets interesting is when users start asking for information over which biased parties are vying for authority, and the AI must become a mediator for content distribution. We are already starting to see the effects of this dilemma in some areas, such as politics.
What to Watch For Next
Beyond developments in the core four areas of answer sources discussed above, there are several larger shifts occurring in the voice landscape which we expect to have significant influence on search practices.
‘Native’ Voice Search Growth
As voice continues to accelerate as a preferred input mechanism and consumers increasingly use it for search, we anticipate that more websites, mobile apps, and other digital experiences will integrate a ‘native’ voice search component. Google itself has been doing this with its Android apps, evolving speech-to-text inputs to utilize Google Assistant and its voice search functionality. As voice search becomes an expectation, consumers will want brands to offer it, in familiar experience formats, across all touchpoints.
Impact of Voice in the Car
We have written about the importance of the car before, and our position on its impact for voice technology has not waned. In terms of search specifically, we see the car as a catalyst for brands and companies to prioritize their local search presence and geo-enabled functionality. All brands know they need to meet their consumers where they are; now, they’ll need to be there for them while on the road.
Emerging Voice Assistants
Once an ecosystem is formed, it doesn’t stay stagnant for long. Although Amazon and Google have a foothold in the market today, other players are emerging quickly. Samsung’s Bixby, specifically, is starting to catch up in creating a sustainable platform for users and developers, and Facebook has announced plans to create a proprietary voice assistant. Much like the assistants of today, the assistants of tomorrow will come with their own set of rules and specific ways of providing answers, which brands and organizations will need to learn how to fuel.
It is early days for voice search and the discipline is still far from defined. However, this initial era of voice assistants has provided us a framework for how to think about voice-first search and how to help content become an answer for users. Although the process and sources may evolve, the thinking and best practices are likely to remain. Brands and organizations considering voice search optimization would be wise to undertake efforts toward content and ecosystem updates to help position them as an authority in their category. If we’ve learned anything from traditional search it is that once someone has been deemed a “winner” by the search engines, they are hard to shake. Voice presents an opportunity to reshape your presence, but the window won’t be open for long.