Intelligent Transcription in the Call Center
Oct 20, 2022

Why Real-Time Voice AI Transcription Will Boost Contact Center Performance

Consumers are impatient. They’ve grown accustomed to instant access to a variety of services from fast food delivery to online medical visits and instant online movies. 

They are just as impatient with contact center support. Ninety percent of consumers rate an “immediate” response as important or very important when they have a customer service question, where “immediate” means 10 minutes or less (Source: HubSpot Research).


Many contact center software providers are aware of this urgency and have already begun adopting AI, voice AI, and real-time transcription services. These providers include legacy contact center platforms, contact center as a service (CCaaS), conversational AI platforms, and agent assist applications.

Contact Center Agents Hampered with Limited Capabilities

A key focus is making agents more productive, and real-time transcription is central to that effort. For many years, companies used offline transcription, which typically occurred after an agent disconnected from a caller and functioned more as a recap and learning tool.

In recent years, contact center leaders have adopted versions of “real-time” transcription but, in reality, these offerings are limited in key capabilities. Most notably, they do not process speech in real time. Instead, they use a multi-step, sequential process that handles speech-to-text first and natural language processing second.

These steps lose valuable time during real-time conversations and can’t deliver the understanding and meaning that an agent requires fast enough. 

SoundHound does ASR & NLU in one step

Intelligent transcription at the “speed of speech” 

Instead of the typical two-step process used by many transcription vendors, SoundHound’s Intelligent Transcription performs both tasks in a single step (see diagram): recognizing speech through automatic speech recognition (ASR) and deriving meaning through natural language understanding (NLU). This delivers faster, more accurate results and also lets SoundHound voice assistants go beyond sounds and words to better understand the meaning and intent of human speech.

Greater understanding is passed on to the call center agent who receives the real-time transcript formatted to include proper capitalization, punctuation, numbers, currency, dates, and more. Smart formatting and diarization combine to improve transcription usability with greater comprehension and accurate speaker identification—even during interruptions or when multiple speakers are in the same conversation.
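To make the idea of a diarized, agent-facing transcript concrete, here is a minimal Python sketch. It is not SoundHound's implementation or API: the `Segment` type, its fields, and `render_transcript` are hypothetical names invented for illustration, and the text is assumed to arrive already smart-formatted.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str  # hypothetical speaker label produced by diarization
    start: float  # seconds from the start of the call
    text: str     # text assumed to be already smart-formatted


def render_transcript(segments):
    """Order segments by time and merge consecutive turns from the same speaker."""
    lines = []
    last_speaker = None
    for seg in sorted(segments, key=lambda s: s.start):
        if seg.speaker == last_speaker:
            # Same speaker is still talking: extend the current line.
            lines[-1] += " " + seg.text
        else:
            lines.append(f"[{seg.speaker}] {seg.text}")
            last_speaker = seg.speaker
    return lines


if __name__ == "__main__":
    call = [
        Segment("Agent", 0.0, "Thanks for calling."),
        Segment("Caller", 2.1, "Hi, my bill shows $45.00."),
        Segment("Caller", 4.0, "It should be $40.00."),
        Segment("Agent", 6.5, "Let me check."),
    ]
    for line in render_transcript(call):
        print(line)
```

Keeping the speaker label on every line is one simple way to preserve who said what when turns overlap or alternate quickly.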


Intelligent transcription identifies topics and entities

SoundHound Intelligent Transcription’s fast, accurate recognition and deeper understanding of conversations can also be combined with predictive analytics applications to suggest responses and next best actions across a broad range of topics.

Topics are categories of content that relate to common issues or concerns a customer might contact a company about. In a contact center agent assist application, if a customer tells the agent, “I’d like to ship this item back,” the SoundHound Intelligent Transcription service will understand the statement and automatically tag it as a request for a return/exchange. It will then retrieve the related content and present it immediately to the agent, who gets specific, relevant responses without delay.

Contact center agent assist topics might include: 

Complaint topics:

  • Complaint
  • Service outage
  • Service not working
  • Service slow
  • Cancel service

Service topics:

  • Service appointment (install, repair, and others)
  • Add service
  • Renew service

Billing topics:

  • Billing error
  • Bill payment
  • Bill deferment
  • Refund

Return topics:

  • Returns (of products)
  • Unavailable (product)
  • Defective (product not working)
  • Damage
  • Repair (product)

Other examples of available intelligence capabilities: 

Caller says: “I’m on an annual plan and would like to change to monthly.”

System identifies relevant topic: Change of service.

Caller says: “I want to stop this service.”

System identifies relevant topic: Cancellation. 

Caller says: “Call me back later tomorrow, on 555….”

System identifies relevant entities: Call back date and phone number. 
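The utterance-to-topic mapping in these examples can be illustrated with a deliberately simple Python sketch. This is a keyword-rule toy, not SoundHound's NLU: the `TOPIC_RULES` table, its patterns, and the `tag_topics` function are assumptions made up for illustration, and a production system would rely on a trained language understanding model rather than regular expressions.

```python
import re

# Hypothetical keyword rules, invented for this example only.
TOPIC_RULES = {
    "Return/exchange": [r"\bship\b.*\bback\b", r"\breturn\b", r"\bexchange\b"],
    "Change of service": [r"\bchange to\b", r"\bswitch\b.*\bplan\b"],
    "Cancellation": [r"\bcancel\b", r"\bstop\b.*\bservice\b"],
    "Billing error": [r"\bbill\b.*\b(wrong|error|incorrect)\b"],
}


def tag_topics(utterance):
    """Return every topic whose patterns match the caller's utterance."""
    text = utterance.lower()
    return [
        topic
        for topic, patterns in TOPIC_RULES.items()
        if any(re.search(pattern, text) for pattern in patterns)
    ]


if __name__ == "__main__":
    for said in (
        "I'd like to ship this item back",
        "I'm on an annual plan and would like to change to monthly.",
        "I want to stop this service.",
    ):
        print(said, "->", tag_topics(said))
```

An agent assist application could use the returned topic names as lookup keys for suggested responses, which is the retrieval step the article describes.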


The SoundHound Intelligent Transcription service also helps establish meaning by offering a suite of features that can identify common entities such as Social Security numbers, phone numbers, dates, times, and currencies.

Identifying entities is particularly useful for follow-ups and line transfers. Imagine a customer is transferred from one agent to another, or a call drops and the customer calls again. Because entities were tagged and formatted properly (dates in 00/00/0000 format, Social Security numbers as xxx-xx-xxxx), the new agent can quickly find specific information in the previous conversation and accommodate the customer.
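As a rough illustration of why consistent formatting makes entities easy to find later, the Python sketch below normalizes two entity types and then searches a transcript for them. It is not SoundHound's implementation: the function names are hypothetical, and reading 00/00/0000 as month/day/year is an assumption.

```python
import re
from datetime import date


def format_ssn(raw):
    """Normalize nine spoken or typed digits to the xxx-xx-xxxx layout."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 9:
        raise ValueError("expected exactly 9 digits")
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def format_date(value):
    """Render a date as MM/DD/YYYY (assumed reading of 00/00/0000)."""
    return value.strftime("%m/%d/%Y")


def find_entities(transcript):
    """Locate already-formatted entities so a new agent can jump to them."""
    patterns = {
        "ssn": r"\b\d{3}-\d{2}-\d{4}\b",
        "date": r"\b\d{2}/\d{2}/\d{4}\b",
    }
    return {name: re.findall(p, transcript) for name, p in patterns.items()}


if __name__ == "__main__":
    note = (
        f"Caller asked for a call back on {format_date(date(2022, 10, 21))}. "
        f"Verified identity with {format_ssn('123 45 6789')}."
    )
    print(find_entities(note))
```

Because every entity is written in one fixed layout, a plain pattern search is enough to recover it from an earlier conversation.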

Limitations of Big Tech in the Call Center

While numerous software vendors have chosen a big tech vendor for their voice AI solution, many have been hampered by long deployment times and limited or no support. Voice AI often requires a more customized strategy and technical implementation, and call center software vendors are often left on their own to figure it out.

Independent voice AI vendors often bring years of experience and expertise that help software vendors get up to speed faster.

For instance, an independent vendor can deliver more turnkey transcription solutions by providing templates and blueprints for repeatable use cases, such as appointment management, which includes setup, confirmation, and rescheduling. With a pure transcription offering, by contrast, the software vendor takes on the entire NLU challenge of assigning meaning and identifying topics and entities. With an independent voice AI vendor such as SoundHound performing this work, software vendors can get to market more quickly with new use cases.

At SoundHound, we have all the tools and expertise needed to create custom voice assistants and a consistent brand voice. Explore SoundHound’s independent voice AI platform, speak with an expert, or request a demo below.

Speak to an Expert.

Arvind Rangarajan is a product marketer with a passion for creating and conveying great, targeted product messaging. When he is not working, he loves to take landscape photos, hike, and play with his dog.
