Keyvan Mohajer, CEO SoundHound Inc.
Feb 14, 2020

How to Talk to Your Voice-Enabled Car and Everything Else

In a recent Forbes article, Martine Paris interviewed SoundHound Co-Founder and CEO, Keyvan Mohajer, about his beginnings and the current and future state of voice AI. A Star Trek fan and serial entrepreneur, Keyvan always saw the world around him as something to be explored.

“I was a science fiction fan and realized there were a number of cool concepts that hadn’t been developed yet,” he told Paris. “There was teleportation where you could beam to any location, there was the Holodeck which could turn any room into any environment, and there was the replicator that could make anything like food or devices. But what stood out most was voice AI. I knew 20 years down the road this would become our reality.”

That’s when he partnered with classmates James Hom and Majid Emami to create the SoundHound music discovery app. Today, SoundHound Inc. is the leading innovator of conversational technologies, allowing people to interact with the things around them in the same way we interact with each other: by speaking naturally.

During the interview, Paris asked Keyvan about the state of voice AI and why he feels the Houndify independent Voice AI platform is smarter than the rest. 

Here is the transcript of their conversation:

Q: How are you able to compete against industry titans like Google, Apple and Amazon?

A: Google, Apple and Amazon have a certain vision of the world. They want their assistants everywhere and they want people to say their name, “Hey Google,” “Hey Alexa,” “Hey Siri.”

But imagine 20 to 30 years in the future, when 10 billion people are living among 20 billion robots. Some are doctors, some are lawyers, some are teachers. Should they all be called Alexa?

That’s not what brands want. Brands want customers to say their name. “Hey Mercedes.” “Hey Honda.” Our platform allows for that kind of personalization.

We’re on a mission to bring voice AI to all things – cars, kitchen appliances, smart speakers, hotel rooms, wearables, cell phones, computers – and power some of the most popular brands in the world including Citroen, Deutsche Telekom, Samsung | Harman, HERE Technologies, Honda, Hyundai, Kia, Mercedes-Benz, Motorola, Pandora, and Peugeot.

Q: Would you say that your AI is smarter than Google, Alexa, and Siri?

A: Yes, our technology is superior. We use speech-to-meaning (not speech-to-text-to-meaning), which makes our IoT conversations faster and more contextual. We also use deep meaning understanding, which is capable of processing complex sentences of arbitrary length, with compound criteria and multiple exclusions. This is different from standard NLU (natural language understanding), which uses hard-coded “entity detection” and can only understand simple queries like “Show me sushi restaurants in San Francisco.”

People have low expectations of AI’s ability to understand complex questions, so they converse with assistants using short, simple, keyword-based queries, but it shouldn’t be that way. Computers are better at computing than humans. With our technology, users can talk to their cars like they’re people and ask multiple questions across different domains of understanding. For example, “Hey Mercedes, show me five-star sushi restaurants in San Francisco open after 9pm, but don’t include those without wifi, and please let me know if it’s raining.”
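To make "compound criteria and multiple exclusions" concrete, here is a minimal Python sketch of how the restaurant part of that request could be represented once it has been understood. It is purely illustrative and is not the Houndify API or SoundHound's method: the `Restaurant` class, the `QUERY` structure, the `search` function, and the sample data are all hypothetical, and the sketch assumes every restaurant closes before midnight.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Restaurant:
    # Hypothetical record; the field names are assumptions for illustration.
    name: str
    cuisine: str
    city: str
    stars: int
    closes_at: time  # assumed to be before midnight
    has_wifi: bool


# Hypothetical structured form of: "five-star sushi restaurants in
# San Francisco open after 9pm, but don't include those without wifi".
QUERY = {
    "include": {
        "cuisine": "sushi",
        "city": "San Francisco",
        "min_stars": 5,
        "open_after": time(21, 0),
    },
    "exclude": {"has_wifi": False},
}


def matches(r: Restaurant, query: dict) -> bool:
    """Return True if r meets every inclusion and no exclusion."""
    inc, exc = query["include"], query["exclude"]
    if r.cuisine != inc["cuisine"] or r.city != inc["city"]:
        return False
    if r.stars < inc["min_stars"]:
        return False
    if r.closes_at <= inc["open_after"]:
        return False
    # Exclusions: reject if any attribute equals an excluded value.
    return not any(getattr(r, field) == value for field, value in exc.items())


def search(restaurants: list, query: dict) -> list:
    """Filter the candidates down to those matching the whole query."""
    return [r for r in restaurants if matches(r, query)]


# Made-up sample data, one row per way of failing the query.
SAMPLE = [
    Restaurant("Umi House", "sushi", "San Francisco", 5, time(23, 0), True),
    Restaurant("Nori Bar", "sushi", "San Francisco", 5, time(22, 30), False),
    Restaurant("Tako Stop", "sushi", "San Francisco", 4, time(23, 0), True),
    Restaurant("Late Noodle", "ramen", "San Francisco", 5, time(23, 30), True),
    Restaurant("Early Roll", "sushi", "San Francisco", 5, time(20, 0), True),
]

if __name__ == "__main__":
    for r in search(SAMPLE, QUERY):
        print(r.name)
```

The point of the sketch is only that a single sentence carries several inclusion criteria plus an exclusion, all of which have to be applied together. The weather question in the same request would be a second, separate query in a different domain.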

[Here is an example of the type of conversation which SoundHound’s AI was capable of in 2015. As the video progresses, the conversation becomes increasingly complex. The video went viral on YouTube in 2015 and currently has over 2 million views]

Q: Can your AI sense mood and emotions?

A: We’re working on it. In order to talk to devices the way we talk to each other there needs to be both the intelligence component and the emotional component.

Q: How much of the movie Her will become our reality?

A: In the near future, there will be a lot of smart devices and they’ll be part of our daily lives. We’ll talk to our alarm clocks, our coffee machines, then our cars. At work we’ll talk to our computers and devices, and then we’ll go home and talk to our TVs. AI will be everywhere, with an emotional element, and in time, people will come to accept it as its own being.

When I first saw the movie Her, my team was in the midst of programming our platform to talk back, and we were debating whether our AI should refer to itself as “I” when answering a request like “Show me some restaurants.” We wondered whether the response should be “Here is what I found” or “Here are some restaurants.”

There was a deep philosophical split across the industry. Google avoided saying “I” while Apple was making Siri sound like a person.

I found the message of the film to be very powerful and it convinced me that this thing exists and deserves to refer to itself as “I” – today that’s the norm.

Successfully implementing a voice-first strategy is not without its challenges. To learn how top manufacturers and teams in a variety of industries are building greater brand recognition, creating exceptional user experiences, and proving real business value with voice AI, read our recent guide: Overcoming the Top 3 Challenges of Voice AI Adoption.
