“A keyboard. How quaint.”
- Lt. Commander Montgomery Scott, Star Trek IV
In a new world of voice experiences, one of the biggest challenges is not so much the technology as it is convincing generations of keyboard users to keep using voice after they've tried it. After all, we’ve all been embarrassed by voice-to-text errors when we send text messages. That experience has made us a bit mistrustful of whether voice apps are up to the job. However, the new generation of voice AI is powered by machine learning and the quality of voice interactions is markedly better. Now, it becomes the job of the VUI designer to help increase the adoption of voice solutions through education, guidance, and a sharp eye on metrics.
Another challenge designers might face is timing. This is particularly true for products that haven’t previously required or provided voice experiences. If a user opens your app and is immediately asked to provide microphone access, they may be thrown off. While allowing access may worry some users, it’s a necessary step for a successful experience.
When it comes to actions, voice interfaces are often billed as “intuitive.” In theory, yes. But voice apps are not Jarvis or C-3PO, intelligent assistants who can understand any command; they’re more limited than that. And yet the more human an interaction feels (and conversation is among the most human), the more people’s expectations inflate. How do designers let users know all that is possible without overwhelming them? And how do they prepare for everything a user might say? How do companies like Amazon and Google make discovering and enabling their voice apps easier?
When it comes to rewards, what kind of incentives will bring users back? Is the experience immersive, with high production values? What kind of voice products make the most sense on an (often) audio-only platform? And what kind of investments can users make on a voice-only platform? What can “lock users in,” in a positive sense?
The more people know about the capabilities of VUI, the more useful they’ll find it. Many people tiptoe into the world of voice by trying several apps and then settling on just a couple that they figure out on their own. That’s why it’s up to designers and writers to help users find and remember the full potential of voice interfaces and to guide them through a discovery process. The guidance voice users require goes beyond a typical six-to-ten-week drip email campaign. It’s continuing education that highlights the features and capabilities of the VUI, reinforces what has already been introduced, and keeps users apprised of new functionalities as they come online.
A single source of education isn’t nearly enough. It has to come from everywhere. Some ways to continually educate users include:
Offer helpful tips: One way of calling out this new experience is through the use of tips. A tip lets you both educate the user and point out exactly where the microphone button lives. At SoundHound Inc., senior product designer Erik Bue builds these tips into the interface. “We provide tips with every session,” he said. “By providing different tips and hints each time someone visits the app, we help people discover new things that voice can help them with. Simple use cases like setting alarms and timers are pretty well known, but some of the more rich types of experiences aren’t as obvious to people and should be pointed out.”
Tips can appear within an app, as a voice message, in email, in text messages, or any number of places. Keep in mind that a tip should be actionable—something the user can try right away.
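As a rough illustration of showing a different tip each session, here is a minimal sketch. The tip wording, the `TIPS` list, and the `tip_for_session` function are all hypothetical and not taken from any real app; the sketch also assumes the app already tracks how many sessions a user has had.

```python
from typing import List

# Hypothetical tip catalog; each tip is actionable, something the user can try right away.
TIPS: List[str] = [
    "Try saying: set a timer for ten minutes.",
    "Try saying: show me restaurants nearby.",
    "Try saying: which of these have free Wi-Fi?",
]


def tip_for_session(session_count: int, tips: List[str] = TIPS) -> str:
    """Return a different tip on each visit by cycling through the catalog."""
    return tips[session_count % len(tips)]


if __name__ == "__main__":
    for visit in range(4):
        print(visit, tip_for_session(visit))
```

Cycling by session count is only one option; a real product might instead prioritize tips for features the user has not tried yet.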
Another way to help users understand what to do during the listening experience is to provide specific examples of things they can say. Use simple and generic examples that could apply to a wide range of users. These example queries could be something like, “Black shoes for men” or “laundry detergent.” If possible, you could also utilize previous search terms from the user’s history to create a quicker and more personal connection. So, if the user has searched for “whitening toothpaste,” you might use this as an example. We’ve found that showing one example at a time is most effective so the user isn’t distracted by multiple items.
Additionally, if a user still hasn’t said anything, resulting in a silent query, you can transition to an educational screen with more instructions or other examples of things they can do. This will allow them to pause from the listening experience and take a few moments to familiarize themselves. Remember to provide a call-to-action that will allow them to quickly get back into the listening screen when they’re ready.
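The two ideas above (one example at a time, personalized when possible, and a fallback for silent queries) could be sketched roughly as follows. The function names, the screen labels "education" and "results", and the shape of the search history are assumptions made for illustration only.

```python
from typing import List, Optional

# Generic examples from the text, used when the user has no search history.
GENERIC_EXAMPLES: List[str] = ["black shoes for men", "laundry detergent"]


def pick_example(history: List[str], generic: List[str] = GENERIC_EXAMPLES) -> str:
    """Return exactly one example: the most recent search if any, else a generic one."""
    return history[-1] if history else generic[0]


def next_screen(transcript: Optional[str]) -> str:
    """Send a silent query (no speech captured) to the educational screen."""
    if transcript is None or not transcript.strip():
        return "education"
    return "results"


if __name__ == "__main__":
    print(pick_example(["whitening toothpaste"]))
    print(next_screen(""))
```

The educational screen itself would still need a call-to-action that returns the user to the listening screen; that part is left out of the sketch.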
Onboarding walkthrough: If you want to be bolder in introducing this new voice feature, you may want to use a walkthrough-style experience in which you block the UI with a modal when the user opens the app. You can also wait until the user taps the microphone button or clicks on a tooltip. By showing it when the app is opened, you guarantee that all users will at least have the chance to see it, but you risk annoying them by interrupting their normal routine. When the user dismisses the walkthrough, you can use a tooltip to show exactly where to go to try what they just learned.
On the flip side, if you wait to show the modal until the user taps the CTA, you avoid blocking the UI, but you may not reach users who don’t notice the button or don’t feel comfortable trying something new. Depending on the amount of content you want to provide, this walkthrough flow can contain multiple pages (but try not to overwhelm the user).
Teach a little at a time: VUI and UX designer Bryan Sebesta teaches voice experience design at Utah Valley University, and he recommends educating users in small chunks so they don’t get overwhelmed and give up. “Don’t release new features all at once, or at least don’t advertise them all at once. We’re limited by our memory, which can only take in so much at once. Introduce things one at a time, and then remind users of them often, so that the features and abilities make it into long-term memory.”
“The content itself is the educational piece,” Bue continued. “Within certain interactions, if I ask ‘show me restaurants nearby,’ we also provide suggested follow-ups to that question: ‘You can ask which restaurants have free Wi-Fi.’ These little hints help you understand how to ask questions better the next time.”
Cooperate with the user and their memory:
A top priority for voice design should be to cooperate with the user. For our purposes, this means: be brief, don’t overwhelm people, and work with the constraints of people’s working memory. When you first open a skill, give a few suggestions. As the user continues to use the action or skill, you can taper, meaning you provide fewer cues (or provide different cues, suggesting different features). Other strategies, like providing overviews and keeping any list of options to fewer than four, can also be helpful.
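Tapering can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a prescribed algorithm: the `suggestions_for` function, the visit counter, and the default cap of three cues (chosen to stay under four options) are all hypothetical.

```python
from typing import List


def suggestions_for(visits: int, options: List[str], max_cues: int = 3) -> List[str]:
    """Offer fewer cues as visits grow, rotating the start so different features surface."""
    if not options:
        return []
    # Taper: start at max_cues on the first visit and shrink to a single cue over time.
    count = min(len(options), max(1, max_cues - visits))
    # Rotate the starting point so returning users hear different features.
    start = visits % len(options)
    return [options[(start + i) % len(options)] for i in range(count)]


if __name__ == "__main__":
    features = ["play music", "set a timer", "check the weather", "add to list", "call home"]
    for visit in (0, 1, 7):
        print(visit, suggestions_for(visit, features))
```

With five features, the first visit hears three cues, the second hears two, and later visits hear a single, changing cue.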
Prompt users to ask or try other things:
Knowing what to say without overwhelming the user is an art, not a science. The goal is always to cooperate, and that includes not underwhelming the user, either. With voice, this often means giving a clear overview of how much information we’re about to present and how it’s structured, being concise, leading with what’s most relevant, and knowing when to break up a lot of data into several turns. For example, an airline might present flight options by listing four departure times. Only after the user selects the time that works best for them might we go on to give the remaining details for that flight.
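One way to picture the overview-then-turns idea is the sketch below. The `flight_turns` function, the prompt wording, and the default of three items per turn are assumptions for illustration; a real voice app would tune the chunk size and phrasing through testing.

```python
from typing import List


def flight_turns(times: List[str], per_turn: int = 3) -> List[str]:
    """Split departure times into short spoken turns, opening with an overview."""
    if not times:
        return ["No flights found."]
    turns: List[str] = []
    for start in range(0, len(times), per_turn):
        spoken = ", ".join(times[start:start + per_turn])
        if start == 0:
            # The first turn tells the user how much information is coming.
            turns.append(f"Flights found: {len(times)}. Departure times: {spoken}.")
        else:
            turns.append(f"More departure times: {spoken}.")
    return turns


if __name__ == "__main__":
    for turn in flight_turns(["7:00", "9:30", "12:00", "15:15", "18:45"]):
        print(turn)
```

Details such as price or layovers would be held back until the user picks a time, keeping each turn short.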
Heidi Culbertson, founder and CEO of Marvee, a company dedicated to voice design for older adults, cannot stress enough the value of collecting user data and metrics. In fact, it’s paramount to providing the best user experience and increasing adoption and retention. “Voice AI is only going to improve based on the data that we can feed into it,” she explained. “And a lot of that comes from user data.”
Pandora’s Ananya Sharan, a product manager for the music streaming service’s voice mode, expanded on the idea. “The more a user engages with a VUI, the greater the volume of data you can get. When we learn more about what your listening habits tend to be, thanks to machine learning and AI, these can all be automated to deliver the right result to you.”
How should you use customer data? Conduct ongoing A/B tests to find out what your users prefer. Then, build a programmatic and systematic roadmap to increase user engagement. Through behavior, users will tell you what’s working for them. “So many mistakes can be avoided by talking out loud with other people early on,” Sebesta said. “This helps you catch the rhythm of conversation at the first stage, and keeps you from ‘writing for screens,’ which is what most of us are trained to do. It also keeps you from saying too much at once. Talk out loud several times before committing anything to type.”
Measuring success isn’t purely determined by volume when it comes to the addition or creation of a VUI. Some of the metrics you’ll need to monitor include:
When it comes right down to it, what users want from their voice assistant is value. Make something that makes using a VUI worth it. That doesn’t mean what you make has to be complex or involved. The ability to ask “when is my next appointment?” and hear a short response can be exactly what a user needs. Voice apps do not need to have a massive range of use cases and abilities. Often, a wide range only makes each feature harder to remember.
Sharan has other ways to measure value. Her customers could easily search for the music they want through the Pandora app, so she must deliver a compelling reason to use voice instead. “What is the incentive for the listener to use voice instead of going and tapping it on the app?” she asks. “Making it effortless and easy drives adoption. And keeping it easy means we have to get it right every time.” Not only that, but voice has to provide something that cannot be accomplished as easily if you were typing and swiping. “If I were to go into the app and search for ‘something relaxing,’ I could go into the browse modules, look at the chill stations, look at nature sounds, and choose something. But I’m doing all the work. With voice, I don’t have to do all that. I can just say ‘Oh, I like this song,’ and it’s automatically added to my playlist, and I’m done.”
In the next chapter, learn how to enable users to choose their own experience by adding some personalization. As we build rapport with our VUI, it should remember our preferences. Let the learning continue.