The author’s views are entirely his or her own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.
A decade ago, you could define SEO to a layperson by establishing the relationship between “search” and “text.” Fast-forward to present day, and a sizable chunk of web traffic and online purchases now come from searches initiated by voice prompt. Because users ask for content differently when they use Siri or Alexa — compared to when they type a search query into a browser — optimizing content to capture more of that traffic is going to work a bit differently.
Voice search is different from browser search
You have to make a distinction early on between voice searches that simply transcribe a voice prompt into a search bar and return a list of results, and search actions that trigger a specific command from a digital assistant-style platform. Most content isn’t going to be able to accommodate optimizations for both the Google search bar and an Alexa voice command at the same time, and some content can’t be engaged by voice-enabled devices at all: a screen-free home smart speaker, for example, can’t display an article or play a video. Instead, if you want to reach audiences while they interact with voice-enabled devices, you can think of voice-optimized content as another arrow in your quiver.
Not all content needs to be voice friendly
Creating content specifically geared to be findable and consumable via voice search is going to be more important for some users than others. As screen-free devices and voice-enabled search become more ubiquitous, some sites and pages would likely benefit from becoming more Alexa-friendly. For example, location-based businesses have huge opportunities to increase their foot traffic by optimizing their online presence to be discoverable via voice search. There are more users to capture every day who are likely to ask Siri or Alexa to “find a pizza shop nearby,” compared to those who might navigate to Yelp or Google Maps and perform a text search for “pizza delivery.”
That said, voice searchability isn’t necessarily what you should build your entire SEO strategy around, even for those users likely to benefit the most from high voice search rankings. That’s because voice isn’t exactly replacing text search — it’s supplementing it.
For example, Siri will update a user on the score of a game, but won’t narrate the action blow-by-blow. If you want a page to rank because you want to serve ads to users interested in sports commentary, then trying to optimize all of your content to accommodate voice may not be the most effective way to drive engagement.
However, if you want to boost foot traffic for a retail sandwich shop, then you can absolutely optimize the business listing to be easier to find when users ask for “lunch spots near me” via voice command while driving, and tailor your approach with that goal in mind.
Smart devices and voice search see usage grow, but not yet dominate
Voice search is arriving quickly but has not yet hit critical mass, creating some low-hanging fruit for early adopters with specific content goals.
In July 2019, Adobe released a study suggesting that around 48% of consumers are using voice search for general web searches. The study did not differentiate between digital assistants on smartphones or smart speakers, but the takeaways are similar.
In Adobe’s study, 85% of those respondents used voice controls on their smartphones, and the top use case for voice commands was to get directions, with 52% of navigational searches performed via voice. Consistent with Adobe’s findings, Microsoft also released a study in 2019 reporting that 72% of smartphone owners used digital assistants, with 65% of all road navigation searches being done by voice prompt.
A 2018 voice search survey conducted by BrightLocal broke out some common use cases by device:
58% of U.S. consumers had done a voice search for a local business on a smartphone
74% of those voice search users use voice to search for local businesses at least weekly
76% of voice search users search on smart speakers for local businesses at least once a week, with the majority doing so daily
Smart speaker adoption in US homes grew by 22% between 2018 and 2019, reaching an estimated 45% of homes with at least one smart speaker. Research released by OC&C Strategists projected that smart speakers would grow voice shopping into a $40 billion market by 2022 in the US and UK alone.
But mass adoption of voice tech is still lagging, despite inroads made during the COVID-19 pandemic. While the 2020 Smart Audio Report by NPR and Edison Research found that consumption of news and entertainment using these devices increased among a third of smart speaker owners in early 2020, a two-thirds majority of non-owners were “not at all likely” to purchase a voice-enabled speaker in the next six months, and nearly half of non-owners who use voice commands felt the same. People who own smart speakers still perform lots of traditional text searches, according to Microsoft’s 2019 study, and not everyone who has access to voice command tech likes to use it for every basic function.
Part of the delay in mass adoption may be attributed to unresolved trust and privacy questions that come with being asked to fill our homes with microphones. A majority of smart speaker owners (52%) and a majority of smartphone voice users (57%) are bothered that their smart speaker/smartphone is “always listening.” However, a silver lining is that roughly the same numbers of users for each respective device trust the companies that make the smart speaker/smartphone to keep their information secure.
Market share of digital assistants across search
There are four major smart assistants processing the majority of voice search requests at the time of publication, each with their own search algorithms, but with some overlap and data sources in common.
Understanding the market share for each assistant can help you prioritize your optimization strategy around your top growth objectives. Each of these digital assistants is tied to different hardware brands with a slightly different appeal and user base, so you can likely focus your analytics tracking efforts on just one or two platforms depending on the audience you’re targeting.
The Microsoft 2019 Voice Report asked respondents to list which digital assistants they had used before, which provides a broad idea of how much voice search traffic we can expect to come from each of these engines. Siri and Google Assistant tied for first place, each used by 36% of respondents. Amazon Alexa followed at 25%, and Microsoft Cortana trailed at 19%.
An interesting thing to note here is that the engine powering Cortana leans largely on a partnership with Amazon Alexa. Cortana provides voice command functionality to laptops and personal computers, such as “Cortana, read my new emails”, while Alexa sees more smart-speaker requests like “Turn on the lights” or “Play NPR.”
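Because respondents could name more than one assistant, the survey figures above add up to more than 100%, so they read better as usage rates than as slices of a single pie. As a rough sketch, the numbers can be rescaled into each assistant's share of all mentions. This normalization is an illustrative assumption, not a figure from the Microsoft report.

```python
# Usage rates quoted above from the Microsoft 2019 Voice Report
# (percent of respondents who had used each assistant).
usage = {"Siri": 36, "Google Assistant": 36, "Alexa": 25, "Cortana": 19}


def share_of_mentions(rates):
    """Rescale multi-select survey rates so they sum to roughly 100%.

    Assumption: every mention counts equally. This is not a published metric.
    """
    total = sum(rates.values())
    return {name: round(100 * rate / total, 1) for name, rate in rates.items()}


total_rate = sum(usage.values())  # exceeds 100 because answers overlap
shares = share_of_mentions(usage)
print(total_rate, shares)
```

The rescaled shares are only useful for rough prioritization. They say nothing about how often each assistant is used or for what kind of request.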
Optimizing for voice search vs. voice actions
Voice commands actually fall into two categories — voice search and voice actions — and each looks for different criteria to determine which response will be returned first for any given voice request. It’s really important to define which one you’re talking about when assessing an SEO plan for voice search, because they process content very differently.
A voice search essentially just replaces a keyboard input with a spoken search phrase to return results in a browser, such as using the “OK Google” command in a smartphone browser. This may impact how you tailor your keyword phrases, based on the user’s tendency to phrase queries more conversationally when interacting with a voice AI.
Voice actions, on the other hand, are specific voice commands or questions from the user that trigger certain apps or automations, such as placing an order for takeout via smart speaker or checking the weather from your car. Screen-free devices like home smart speakers and some car assistants use voice actions. These commands don’t return a ranked page of results, but often a single spoken result, with a prompt for further action. If you ask an Echo Dot device for the weather, it will describe the weather out loud based on data pulled from a predetermined source. It can’t return a list of popular weather forecast sites, because there is no screen to display a Search Engine Results Page (SERP). This is an important distinction.
Smart assistants often pull data from secondary sites to return these vocal snippet results, like pinging WolframAlpha for mathematical conversions or Yelp for local business listings. One such use case would be a voice search for “order a pizza.” The AI would route the query to Yelp or Google Maps, and verbally return one result such as “I found a pizzeria nearby with five stars on Yelp. Would you like to call Joe’s Pizza to place an order or look up driving directions?” This is sometimes known as “position zero,” when a search engine returns an abstract or snippet from within the content itself to answer a direct question without necessarily sending the user to the page.
Achieving position zero depends on the device
Ranking position zero for a voice action prompt depends on where those results are being pulled from. Improving the voice search ranking for driving directions to a specific physical storefront, for example, is often a matter of improving that business’s visibility on listing sites like Google Maps and Yelp, which you may already be doing as part of your SEO plan anyway.
The data source depends on the platform running the voice search. Google and Android devices utilize Google Local Pack, while Siri crawls Yelp to return results when prompted for “the best” in any specific category, otherwise prioritizing the closest results. Since Alexa pulls local results from Bing, Yelp, and Yext, having filled-out profiles and robust listings on those platforms will help a business rank highly in Alexa search results.
Each assistant also pulls NAP data (the name, address, and phone number in a business’s online listing) for location-based results. The sources differ slightly by assistant and sometimes overlap:
Siri pulls local recommendations from the NAP profiles on Yelp, Bing, Apple Maps, and TripAdvisor
Android devices and Google Assistant pull NAP profiles from Google My Business
Alexa pulls NAP profiles from Yelp, Bing, and Yext
Cortana, powered by Alexa, pulls from Yelp and Bing
Someone hoping to optimize their business page for voice search will want to max out their NAP profiles across all platforms by making sure that their listings at business.google.com, bingmapsportal.com, and mapsconnect.apple.com are completely filled out. This is also where a reputation management product like Moz Local can help businesses looking to improve their rankings.
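The source lists above can be captured in a small lookup table to work out which listings matter for the assistants you care about. This is a minimal sketch: the mapping is copied from this article, the function name `listings_to_complete` and its parameters are hypothetical, and the sources themselves may change over time.

```python
# Listing sources per assistant, as described in this article.
# Assumption: these mappings may be out of date; verify before relying on them.
NAP_SOURCES = {
    "Siri": {"Yelp", "Bing", "Apple Maps", "TripAdvisor"},
    "Google Assistant": {"Google My Business"},
    "Alexa": {"Yelp", "Bing", "Yext"},
    "Cortana": {"Yelp", "Bing"},
}


def listings_to_complete(target_assistants, completed=()):
    """Return the listing platforms that still need a full NAP profile.

    target_assistants: assistants your audience is likely to use.
    completed: platforms where the profile is already filled out.
    """
    needed = set()
    for assistant in target_assistants:
        needed |= NAP_SOURCES[assistant]
    return sorted(needed - set(completed))


# Example: targeting Siri and Alexa users with only Yelp filled out so far.
print(listings_to_complete(["Siri", "Alexa"], completed=["Yelp"]))
```

Because Yelp and Bing appear under three of the four assistants, a lookup like this makes it easy to see that those two profiles cover the most ground for the least effort.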
Should you go after the voice snippet feature?
Again, many of the strategies you’d use to achieve first position on a text-based web search still apply to optimizing voice search. To improve voice performance specifically and appear in SERP features and voice snippets, on-page content should be structured so it’s easy to extract, basically reverse engineering the featured snippet you want to produce. But the question is, will it actually help you to rank well in that kind of search? That depends on your goal.
If the page you’re optimizing is built to sell more pizza to local customers, then yes, a featured snippet that pulls your NAP data from Google My Business and provides the pizzeria’s phone number to a hungry local parked nearby is a very good thing. But if the page in question is intended to serve sponsored content about diabetes management to drive clicks to an affiliate link for glucose monitoring strips, then you don’t necessarily want to build a page that helps Siri define Type II diabetes aloud to an eighth grader completing their homework.
Structuring the content headings with a question, followed by a concise answer in the paragraph below, makes it more likely that Siri will recite content from a given page when asked a similarly worded question by the user. The first answers a digital assistant gives when responding to a voice search query are typically the same type of snippets that show up in SERP features such as “People Also Ask” and Knowledge Graph results from Google.
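One way to audit a page for this pattern is to check that each heading reads as a question and that the paragraph under it is short. The sketch below is a rough heuristic only: the list of question words and the 50-word cutoff are assumptions for illustration, not limits published by Apple, Google, or Amazon.

```python
# Hypothetical heuristic values, chosen for illustration only.
QUESTION_WORDS = (
    "who", "what", "when", "where", "why", "how",
    "is", "are", "can", "do", "does", "should",
)
MAX_ANSWER_WORDS = 50


def is_question_heading(heading):
    """True if the heading starts with a question word and ends with '?'."""
    text = heading.strip()
    words = text.lower().split()
    return bool(words) and words[0] in QUESTION_WORDS and text.endswith("?")


def snippet_ready(heading, paragraph, max_words=MAX_ANSWER_WORDS):
    """True if a question heading is followed by a short, non-empty answer."""
    word_count = len(paragraph.split())
    return is_question_heading(heading) and 0 < word_count <= max_words


print(snippet_ready(
    "Is sugar really bad for children with ADHD?",
    "A short, direct answer goes here, followed by detail further down the page.",
))
```

A check like this only flags structure. Whether an assistant actually reads the passage aloud still depends on how the page ranks for the query.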
In other words, Siri is unlikely to return your website to answer the voice prompt “What is the chemical composition of sugar?”, but you could rank highly with a featured snippet to answer a search like “Is sugar really bad for children with ADHD?”
The most valuable content for those seeking on-page visitors is the kind that addresses questions that are hard to answer with a single spoken response.
Rand Fishkin made his predictions on the role of the vocal snippet in search results as voice search was ramping up in 2016, and provided some advice on how you can plan your content around it in this Whiteboard Friday. According to Fishkin, it depends on whether you’re in the “safe” or “dangerous” zone for the content you’re trying to rank for, based on how easily a voice response can address the user’s query without sending them to your page.
“I think Google and Apple and Amazon and Alexa and all of these engines that participate in this will be continuing to disintermediate simplistic data and answer publishers,” Fishkin wrote.
He advises users to question the types of information they’re publishing, adding that if X percent of queries that result in traffic can be answered in fewer than Y words, or with “a quick image or a quick graphic, a quick number,” then the engine is going to do it itself.
“They don’t need you, and very frankly they’re faster than you are,” Fishkin summarized. “They can answer that more quickly, more directly than you can. So I think it pays to consider: Are you in the safe or dangerous portion of this strategic framework with the current content that you publish and with the content plans that you have out in the future?”
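Fishkin leaves X and Y unspecified, but the test can be sketched as simple arithmetic. In the example below, the 30-word limit, the 50% threshold, the sample queries, and the per-query word counts are all hypothetical placeholders you would replace with your own estimates.

```python
def short_answer_share(answer_word_counts, max_words):
    """Fraction of traffic-driving queries answerable in fewer than max_words words."""
    if not answer_word_counts:
        return 0.0
    short = sum(1 for count in answer_word_counts if count < max_words)
    return short / len(answer_word_counts)


def zone(answer_word_counts, max_words=30, threshold=0.5):
    """Label a content portfolio 'dangerous' or 'safe'.

    max_words stands in for Fishkin's Y and threshold for his X.
    Both defaults are assumptions, not values he gives.
    """
    share = short_answer_share(answer_word_counts, max_words)
    return "dangerous" if share >= threshold else "safe"


# Hypothetical estimates of the words needed to answer each query aloud.
estimates = {
    "what is the chemical composition of sugar": 8,
    "is sugar really bad for children with ADHD": 120,
    "how many grams of sugar are in a teaspoon": 5,
    "how to manage blood sugar during exercise": 200,
}
print(zone(list(estimates.values())))
```

The hard part in practice is estimating the word counts. The arithmetic only makes the trade-off explicit once you have them.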
Voice-enabled devices are gradually becoming more embedded in consumers’ daily lives, but that doesn’t mean we should prioritize our content as though voice is bearing down on the traditional search engine results page, threatening to replace text altogether in the role of SEO. Even if smart assistants and voice-enabled devices continue to become more popular year over year, they still fill a relatively niche role in most consumers’ technical gadget ecosystem at this time. That could change as the voice AIs become more sophisticated and talking to our gadgets starts to feel more normal, but the industry is still grappling with some serious growing pains.
Voice search and voice action technology still has some really exciting applications looming on the horizon, and marketers are already finding clever ways to insert their brands into the hands-free experience. Optimizing content for voice search is just one piece of that puzzle.
Give us your hottest takes and wildest predictions on where voice search is headed in 2021 in the comments!