Common pitfalls to avoid when building a custom voice assistant
Nov 04, 2021

Providing fast, accurate, and engaging voice experiences is increasingly becoming a priority for brands across industries. According to a study by Adobe, 91% of companies are already making significant investments in voice AI, and 94% plan to increase their investment in the coming year. Brands that currently don’t have a voice assistant strategy should seriously consider the benefits of voice experiences to meet consumer demand and to stay ahead of the competition.

With more companies investing in voice AI technology, brands are looking for differentiation. Just having an accurate, knowledgeable voice assistant may not be enough to create brand loyalty and encourage returning users. Avoiding common pitfalls during the development phase could save years of post-implementation adjustments and increase positive user experiences from the very first interaction. 

While building a custom voice assistant doesn’t happen overnight, it’s worthwhile to take the time necessary to ensure that the voice assistant your users encounter is the best version possible. Even before you begin, voice assistant design and development can be improved through data and user testing. 

Here are six common issues to address in the initial design and development stages of your voice assistant:

  • Branded experience 
  • Target audience
  • Connectivity options
  • Personality
  • Plan for iterations
  • Voice AI partner

Invest in your custom wake word

A wake word is your user’s first interaction with your voice assistant—creating first impressions, positive reactions, or frustrations. How the initial interaction goes could determine the future of the user relationship. To start off on a strong footing, you’ll want to invest time and money in creating a custom wake word with all the essential elements. 

A custom wake word for your brand should follow several best practices:

  • Make it easy to pronounce
  • Use pleasing sounds
  • Avoid rhymes or unwanted associations
  • Test it on various types of speech
  • Keep it to 3-4 syllables

Once all these pieces have been put together, it’s time to collect data to train the model. Voice samples from a variety of regions, ethnicities, genders, ages, and cultures should be used to avoid biases. Data from noisy environments should also be used to ensure there are no false positives or negatives when there is background noise. 
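As a rough illustration of how false positives and negatives might be tracked across speaker groups and noise conditions, here is a minimal Python sketch. The `Trial` record, the group labels, and the `error_rates` helper are hypothetical names chosen for this example; they are not part of any particular wake word toolkit.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trial:
    group: str              # hypothetical label, e.g. an accent or a noise condition
    wake_word_spoken: bool  # ground truth: did the speaker say the wake word?
    detected: bool          # detector output: did the wake word model fire?


def error_rates(trials):
    """Return {group: (false_reject_rate, false_accept_rate)}."""
    counts = defaultdict(lambda: {"pos": 0, "missed": 0, "neg": 0, "false_fire": 0})
    for t in trials:
        c = counts[t.group]
        if t.wake_word_spoken:
            c["pos"] += 1
            if not t.detected:
                c["missed"] += 1      # false negative: wake word was ignored
        else:
            c["neg"] += 1
            if t.detected:
                c["false_fire"] += 1  # false positive: assistant woke up by mistake
    rates = {}
    for group, c in counts.items():
        frr = c["missed"] / c["pos"] if c["pos"] else 0.0
        far = c["false_fire"] / c["neg"] if c["neg"] else 0.0
        rates[group] = (frr, far)
    return rates


if __name__ == "__main__":
    sample = [
        Trial("quiet room", True, True),
        Trial("quiet room", False, False),
        Trial("busy cafe", True, False),
        Trial("busy cafe", False, True),
    ]
    for group, (frr, far) in error_rates(sample).items():
        print(f"{group}: false rejects {frr:.0%}, false accepts {far:.0%}")
```

Comparing these rates group by group, rather than as a single overall number, is one way to spot a bias toward particular voices or environments before launch.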

Without proper time and testing on a custom wake word, you might have to start from the beginning. Facebook recently received attention for announcing that it would abandon its wake word, “Hey Facebook,” because of the confusion it caused users. Further user testing and education may have benefited the company.

Know your voice assistant’s target audience 

It’s essential that your voice assistant understands your target audience, whether it’s their language or their accent. When developing your voice AI strategy, you’ll want to look at where your target audience is, which languages they speak, and which accents they have. If you’re evaluating voice assistant technology platforms, you’ll want to make sure the voice AI provider has multilingual and accented language capabilities that match your users. 

Multilingual voice assistants are vital for reaching a larger target audience and creating an exceptional user experience. Customers feel more comfortable speaking in their native language, and being able to give voice commands in their native language will create a deeper bond and more meaningful interactions. 

In addition, broaden your customer base by implementing a voice assistant that can understand accents. According to Translate Day, there are an estimated 160 dialects of the English language in the world. Teaching your voice assistant to understand accents is essential to reaching your target market, and it should start at the very beginning of the process. Adding languages and accents later means starting over with the right training data, which costs considerable time and money and wastes the resources spent developing the first version.
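One way to make this evaluation concrete is to compare the locales your audience uses against the locales a provider supports. The sketch below is only an illustration: the locale tags, the audience percentages, and the `coverage_gaps` helper are assumptions made up for this example, not data from any real provider.

```python
def coverage_gaps(audience_percent, supported_locales):
    """Split the audience into covered and uncovered locales.

    audience_percent: {locale tag: whole-number percent of users}, hypothetical data
    supported_locales: locale tags a provider claims to support
    Returns (gaps, covered_percent).
    """
    supported = {loc.lower() for loc in supported_locales}
    gaps = {}
    covered = 0
    for locale, percent in audience_percent.items():
        if locale.lower() in supported:
            covered += percent
        else:
            gaps[locale] = percent  # users who would not be understood well
    return gaps, covered


if __name__ == "__main__":
    audience = {"en-US": 50, "en-IN": 30, "es-MX": 20}  # made-up shares
    provider = ["en-US", "es-MX", "fr-FR"]              # made-up support list
    gaps, covered = coverage_gaps(audience, provider)
    print(f"Covered: {covered}% of users; gaps: {gaps}")
```

Running a check like this early, with real audience figures, shows which languages and accents need training data from the start.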

Although building a multilingual voice assistant that also understands accents takes time, it is an investment in your broader voice AI goals.

Choose connectivity options

As voice AI technology matures, brands face more choices about how much cloud connectivity they want or need for their unique use cases. The choice between a full cloud connection, no cloud connection (an embedded solution), or a hybrid model that offers both embedded and cloud connectivity depends on individual product needs. Brands now have the freedom to choose the level of connectivity that will best meet their users’ needs and expectations.

Cloud connectivity gives users access to a wealth of information on the internet. Users can stream music, ask for the weather, book a flight, order take-out, play a movie, and more. Cloud-only connectivity works best for products that don’t require any embedded features, such as a mobile device, app, or smart speaker.

Embedded voice assistants open a world of possibilities with their enhanced security, lower processing requirements, and reduced costs. Everything is stored locally on the device with no access to the cloud, so privacy is increased, and voice control can still be used in areas without an internet connection. Embedded voice assistants work well for devices in healthcare, manufacturing, quick-service restaurants (QSRs), hospitality, and the hearables and wearables markets.

Hybrid voice solutions offer the best of both worlds: cloud connectivity and embedded functionality. Users will have access to the internet to search and stream, along with the convenience of hands-free embedded commands, such as rolling windows up and down, preheating the oven, or closing the blinds. Voice-enabled cars and IoT devices are among those most suited for hybrid voice assistants.
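The difference between the three connectivity options can be sketched as a simple routing rule. This is a minimal illustration under stated assumptions: the intent names in `LOCAL_INTENTS`, the `route` function, and its return values are hypothetical and do not describe any specific platform.

```python
# Hypothetical commands that an embedded model could handle on the device.
LOCAL_INTENTS = {"window_up", "window_down", "preheat_oven", "close_blinds"}


def route(intent, online, mode="hybrid"):
    """Decide where a request is handled: 'embedded', 'cloud', or 'unavailable'."""
    if mode == "embedded":
        # No cloud at all: only on-device commands work, with or without internet.
        return "embedded" if intent in LOCAL_INTENTS else "unavailable"
    if mode == "cloud":
        # Everything depends on the connection.
        return "cloud" if online else "unavailable"
    if mode != "hybrid":
        raise ValueError(f"unknown mode: {mode}")
    # Hybrid: device commands stay local, everything else goes to the cloud.
    if intent in LOCAL_INTENTS:
        return "embedded"
    return "cloud" if online else "unavailable"


if __name__ == "__main__":
    for intent in ("close_blinds", "stream_music"):
        for online in (True, False):
            print(intent, "online" if online else "offline", "->", route(intent, online))
```

In this sketch, a hybrid assistant keeps device controls working offline while search and streaming requests wait for a connection, which mirrors the trade-off described above.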

When starting on your voice-first strategy, it’s essential to evaluate your target audience, their needs, and their expectations to choose the voice AI connectivity option that will work best for them and for your voice assistant. 

Infuse personality into your voice AI 

In a growing voice AI market, high accuracy and speed are not enough to earn brand loyalty and user satisfaction. Brands need to differentiate their voice assistant from their competitors. One way to stand out from the crowd is to make your voice assistant likable and relatable.

Adding human-like qualities, such as conversational speech, good grammar, and appropriate intelligence, gives users more natural interactions with voice assistants. Nothing throws a conversation off more than a voice assistant that responds with incorrect grammar, which leads the user to think that something is seriously wrong with it or that it’s too robotic.

Another tactic is to add personality through voice, tone, word choice, and humor. A voice assistant’s personality can create a deep bond with users, who have been known to form a type of friendship with their voice assistant, further nurturing brand loyalty. When creating a voice assistant, brands will want to evaluate its gender, tone, pitch, vocabulary, and level of humor through user testing to understand what will resonate most with their audience.

Improve your voice assistant with data and testing

A voice assistant isn’t a single, one-and-done program but an ever-evolving project that should be improved upon through data and user testing. At every important stage of building your custom voice assistant, user testing should be done to ensure that your voice assistant meets your audience’s needs and expectations. 

Once the voice AI is complete, user data will show what is working and what needs improvement, so future iterations can be planned. A voice user interface is a program that needs a long-term strategy. Technology will continue to evolve, and data will show which parts of your voice assistant need updating. Elements such as a custom wake word, multiple languages, accented languages, and personality are also challenges that may take many attempts before they truly delight the user.
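As one possible way to turn user data into priorities for the next iteration, the sketch below ranks request types by how often they fail. The log format (pairs of intent name and success flag), the intent names, and the `rank_by_failure_rate` helper are assumptions for illustration only.

```python
from collections import Counter


def rank_by_failure_rate(interactions, min_count=1):
    """Rank intents from most to least failure-prone.

    interactions: iterable of (intent, succeeded) pairs, a hypothetical log format
    min_count: ignore intents seen fewer times than this
    """
    totals, failures = Counter(), Counter()
    for intent, succeeded in interactions:
        totals[intent] += 1
        if not succeeded:
            failures[intent] += 1
    ranked = [
        (intent, failures[intent] / count)
        for intent, count in totals.items()
        if count >= min_count
    ]
    # Highest failure rate first, so the worst experiences surface at the top.
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    log = [
        ("weather", True), ("weather", True),
        ("play_music", False), ("play_music", True),
        ("set_timer", False),
    ]
    for intent, rate in rank_by_failure_rate(log):
        print(f"{intent}: {rate:.0%} of requests failed")
```

A ranking like this is only a starting point; user testing is still needed to learn why a given request fails.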

Select the right voice AI partner

Choosing the right voice AI partner is another important step when building a custom voice assistant. You’ll want to carefully evaluate what the voice AI platform can offer and make sure it aligns with your and your users’ goals for the voice assistant. Brands will especially want to make sure the platform offers the capabilities we’ve discussed—a custom wake word, multilingual and accented language, connectivity options, personality, and control over data. 

Partnering with the wrong provider could result in poor user experiences, frustration, and an inability to access the user data that shows what needs to be improved. The long-term costs of not owning your voice assistant should also be weighed when making this important decision.

By having a branded experience, targeting your audience, choosing connectivity options, infusing personality, planning for iterations, and selecting the right voice AI partner, you’ll be able to avoid the common pitfalls when building a custom voice assistant. Whether you’re embarking on your voice-first strategy or looking to improve your custom voice assistant, make sure you consider these tips to create exceptional user experiences.

At SoundHound Inc., we have all the tools and expertise needed to create custom voice assistants and a consistent brand voice. Explore our independent voice AI platform at Houndify.com and register for a free account. Want to learn more? Talk to us about how we can help bring your voice strategy to life.

Kristen is a content writer with a passion for storytelling and marketing. When she’s not writing, she’s hiking, reading, and spending time with her nieces and nephew.
