How AI-powered speech enhancement is transforming global contact centers

In its most basic definition, communication is about the exchange of information. The degree to which the intended message aligns with the received message is a matter of clarity, and in a noisy world, clarity is critical.
That’s especially true in the contact center. Customers who call in search of support want to have clear, concise conversations that lead to timely resolutions of their original queries. From the customer perspective, it can be frustrating when an excess of background noise or a lack of agent confidence prolongs the support process, especially if the lack of clarity results in the customer being forwarded on to another agent, having to repeat themselves or being put on hold. When these moments of friction happen, the caller’s impression of the entire customer experience (CX) is likely to take a hit, and that can jeopardize hard-earned customer loyalty. A TELUS Digital survey revealed that 55% of respondents think “Nothing excuses a bad customer experience,” and 54% would rather get stuck in a slow-moving traffic jam than endure a bad customer experience.
Customers and agents both want the same thing: clarity. That desire for clarity is giving rise to AI-powered speech enhancement. To support our global clients and agents, TELUS Digital has partnered with Tomato.ai, a company that delivers cutting-edge AI-powered speech enhancement solutions for contact centers and enterprises. In the words of Tomato.ai’s CEO, Ofer Ronen, their product “enhances spoken communication in real time by softening accents and improving clarity, without altering the speaker’s voice identity.”
Read on to discover how speech enhancement technology works, the benefits that it can yield and some common applications in customer experience.
How speech enhancement AI works
Speech enhancement AI leverages cutting-edge speech-to-speech models to transform audio in real time. These models directly modify the acoustic features of speech, preserving the speaker’s voice while improving clarity and reducing accent-related friction.
The AI works by first encoding the speaker’s voice into a high-dimensional representation that captures both linguistic content and vocal characteristics. The AI modifies only pronunciation-related features before decoding the speech back into audio. This approach allows the solution to address mispronunciations without altering the speaker’s identity or emotional tone.
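To make the encode, modify and decode steps described above more concrete, here is a minimal Python sketch. It is purely illustrative and is not Tomato.ai's model: the `SpeechRepresentation` class, the three-way feature split and the `target` pronunciation vector are all assumptions invented for this example, and a real system would use learned neural encoders and decoders rather than list slicing.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class SpeechRepresentation:
    """Hypothetical latent representation of one audio frame."""
    content: List[float]        # what is being said
    pronunciation: List[float]  # how the sounds are articulated
    speaker: List[float]        # voice identity and emotional tone


def encode(frame: List[float]) -> SpeechRepresentation:
    # Stand-in encoder: split the frame into three equal feature groups.
    # Assumes len(frame) is a multiple of 3.
    n = len(frame) // 3
    return SpeechRepresentation(frame[:n], frame[n:2 * n], frame[2 * n:3 * n])


def adjust_pronunciation(rep: SpeechRepresentation,
                         target: List[float],
                         strength: float = 0.5) -> SpeechRepresentation:
    # Move only the pronunciation features toward a target pattern.
    # Content and speaker features are carried over unchanged.
    adjusted = [p + strength * (t - p)
                for p, t in zip(rep.pronunciation, target)]
    return replace(rep, pronunciation=adjusted)


def decode(rep: SpeechRepresentation) -> List[float]:
    # Stand-in decoder: reassemble the feature groups into one frame.
    return rep.content + rep.pronunciation + rep.speaker


def enhance(frame: List[float],
            target: List[float],
            strength: float = 0.5) -> List[float]:
    return decode(adjust_pronunciation(encode(frame), target, strength))
```

The point of the sketch is the separation of concerns: because the speaker features pass through untouched, the output keeps the same voice identity while the pronunciation features shift.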
Unlike systems that rely on continual learning, the solution operates in real time without needing to adapt or “learn” from new input, ensuring predictable performance and low-latency responsiveness — critical characteristics for contact center environments. Additionally, the enhancement pipeline integrates intelligent noise suppression to ensure clean, professional-grade audio even in challenging acoustic conditions.
Tomato.ai, in particular, takes a cloud-first approach that enables the speech enhancement software to run on a range of devices with minimal compute requirements. “At the core is a proprietary, cloud-hosted speech-to-speech AI model built specifically for low-latency, real-time communication,” explained Ronen. Unlike other speech AI applications that require several seconds to process verbal input, Tomato.ai has optimized their audio streaming pipeline to minimize latency.
The process begins the moment an agent starts speaking, with the system capturing audio input and making improvements instantaneously. Ronen explained how the process works: “When a call center agent speaks, their audio is routed through Tomato.ai’s cloud-based AI model, which instantly processes the voice stream to adjust pronunciation patterns and reduce heavy accents. This improved voice stream is then sent on to the customer, making the speaker sound clearer, more understandable and easier to connect with — while still sounding like themselves.”
That agents still sound like themselves is an important point of emphasis. Customers still want to speak with human agents, not machines. The aforementioned TELUS Digital survey found that if respondents could only get customer service in one way, 46% would want to speak to a real human — almost double the next most common response (which was "messaging/texting with a real human," selected by 27% of respondents).
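The real-time flow described earlier in this section, where agent audio is routed through a model and passed on to the customer as it arrives, can be sketched as a simple streaming loop. This is a hedged illustration only: `stream_enhanced` and `placeholder_model` are hypothetical names, the placeholder merely scales samples, and an actual deployment would call a cloud-hosted speech-to-speech model over a network connection.

```python
import time
from typing import Callable, Iterable, Iterator, List, Tuple

Chunk = List[float]  # a short slice of audio samples


def stream_enhanced(
    chunks: Iterable[Chunk],
    model: Callable[[Chunk], Chunk],
) -> Iterator[Tuple[Chunk, float]]:
    """Enhance audio chunk by chunk, yielding each result immediately.

    Yields (enhanced_chunk, processing_latency_ms) so that latency can be
    monitored per chunk rather than per call.
    """
    for chunk in chunks:
        start = time.perf_counter()
        enhanced = model(chunk)  # the model holds no per-call learning state
        latency_ms = (time.perf_counter() - start) * 1000.0
        yield enhanced, latency_ms


def placeholder_model(chunk: Chunk) -> Chunk:
    # Hypothetical stand-in for the real model: halves the amplitude.
    return [0.5 * sample for sample in chunk]
```

Processing small chunks as they arrive, instead of waiting for a full utterance, is what keeps the delay between the agent speaking and the customer hearing short. Treating the model as a fixed function also mirrors the point above that the solution does not adapt or learn from new input during a call.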
Benefits of speech enhancement
Whether your contact centers are offshore, nearshore, onshore or on the moon, speech enhancement can bring a higher level of clarity to your customer care. And as a result of that clarity, there are a number of additional advantages relevant to every aspect of the customer experience.
Customer benefits of AI speech enhancement
To understand how AI speech enhancement benefits the customer, let’s work through an example of a customer calling about the delivery status of an item they previously ordered online.
First, let’s say that the caller is connected with an agent in a contact center that is not operated by TELUS Digital and does not leverage modern customer experience technology like speech enhancement AI. From the outset, the customer can tell that the agent is in a busy contact center. The customer can hear the background noise, the agent seems distracted by it and it’s reducing the quality of the exchange. To make matters worse, a poor connection seems to be reducing the intelligibility of the agent’s voice. These factors combine to create a situation where the customer repeatedly has to ask the agent to repeat themselves. As this goes on, the customer becomes increasingly frustrated and the agent feels more and more helpless. In the end, the agent apologizes and transfers the customer to another agent.
Now, consider a brand that leverages effective speech enhancement technology in its contact centers. From the moment the agent and customer are connected, background noise is eliminated from the equation. AI enhances each word the agent utters so that the customer can clearly comprehend everything. In fact, the only time the agent has to repeat themselves is when confirming the order number, and that they did indeed mean “O” and not “0” in the alphanumeric code: a harmless, and human, example of two people seeking clarity in communication.
In this example, the customer can clearly understand the agent and, as a result, is likely to give a higher customer satisfaction (CSAT) score. The caller gets the answer they were seeking more quickly (lower handle time) and without being transferred to another agent (first-call resolution).
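For readers who track these outcomes, the three metrics mentioned above can be computed from call records along the following lines. This is a minimal sketch under stated assumptions: the `CallRecord` fields are hypothetical, CSAT is taken as the share of ratings of 4 or 5 on a five-point scale (a common convention, though definitions vary by organization), and first-call resolution is taken as the share of calls resolved without a transfer or callback.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CallRecord:
    """Hypothetical record of one completed support call."""
    handle_time_seconds: float
    resolved_on_first_call: bool
    csat_rating: int  # post-call survey rating from 1 (worst) to 5 (best)


def summarize(calls: List[CallRecord]) -> Dict[str, float]:
    if not calls:
        raise ValueError("at least one call record is required")
    total = len(calls)
    return {
        # Mean time an agent spends handling a call.
        "average_handle_time_seconds":
            sum(c.handle_time_seconds for c in calls) / total,
        # Share of calls resolved without a transfer or callback.
        "first_call_resolution_rate":
            sum(c.resolved_on_first_call for c in calls) / total,
        # Share of callers who rated the interaction 4 or 5.
        "csat_score":
            sum(c.csat_rating >= 4 for c in calls) / total,
    }
```

Comparing these figures for calls handled with and without speech enhancement is one straightforward way to check whether the expected improvements show up in practice.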
Agent benefits of AI speech enhancement
Speech enhancement technology can create a virtuous cycle for customer service agents. In some cases, the fact that they are able to deliver their intended message clearly and without background noise means they’re less likely to frustrate customers, which leads to increased agent confidence. In other cases, the technology gives them more confidence in their ability to deliver support, meaning that customers are less likely to direct frustration toward them in the first place, further boosting their confidence. Either way, with more confidence, agents are put in a position where they can get more satisfaction from their work.
With increased satisfaction and efficiency, agents are more likely to hit their performance targets and continue their careers with their current company, lengthening their tenure and opening up further growth opportunities. It’s also worth noting that a simultaneous increase in agent confidence and customer satisfaction can open up cross-sell and upsell opportunities. A customer is far more likely to listen to an offer if their core needs are being met, and an agent is far more likely to deliver a compelling pitch if they are assured in their communication. This can translate into higher conversion rates and average order value.
Importantly, this cycle retains the agent’s unique personhood and way of speaking, aligning closely with TELUS Digital’s Humanity-in-the-Loop principles for purposeful innovation. Robin Jakobsen, director of product strategy, CXM at TELUS Digital, explained, “When done right, AI speech enhancement empowers agents to communicate more clearly while preserving the authenticity of their voices.” This means that agents can maintain their unique communication style and personality, which are crucial elements in building genuine connections with customers.
Business benefits of AI speech enhancement
Speech enhancement technology enables global brands to expand their contact center operations into previously overlooked locations, dramatically increasing the size of the available talent pool. This establishes the foundation for organizations to provide genuine human support across multiple time zones, better serving customers around the world. For companies currently outsourcing or considering it, speech enhancement opens up a wider array of cost-effective locations. As Jakobsen put it, “This empowers businesses to effectively and efficiently harness a global workforce, fostering seamless follow-the-sun operations.”
Regardless of location, brands can also benefit from a number of operational efficiencies. These include lower agent attrition rates and cost savings associated with improved first-call resolution and reduced handle times. These improvements not only save money but also strengthen brand reputation.
Finally, speech enhancement can lead to more accurate AI-generated call transcriptions, which become critical training resources for brands that use AI copilots to provide agents with both real-time guidance and post-call feedback. “What makes speech enhancement particularly powerful is how it amplifies the effectiveness of other CX technologies. Whether it's improving the accuracy of AI agent assist for interaction summaries and more precise response suggestions, or ensuring our CCaaS integrations deliver optimal voice quality — clear communication is the foundation that makes everything else work that much more effectively,” said Jakobsen.
Partner up to deliver better, clearer support
Customers who call for support want answers to their questions, and they want to form an understanding with minimal effort. Brands, and the customer service agents who represent them, want their customers to feel satisfied with their services and their support experience. This calls for effectiveness and efficiency, which are both unlocked by clarity.
There are certainly elements of the modern contact center dynamic that can diminish that clarity. But these challenges can be overcome — and indeed are being overcome — by AI-powered speech enhancement. “We understand the challenges that brands face communicating in a global market. As part of an integrated approach to customer experience management, speech enhancement technology helps bridge communication gaps, enabling agents to connect with customers more effectively and achieve better business outcomes,” explained Jakobsen.
If you’re ready to achieve exceptional contact center performance with AI-powered communication enhancement, contact our sales team today.