The collaboration, announced on March 25, 2026, brings ElevenLabs’ premium Text‑to‑Speech (TTS) and Speech‑to‑Text (STT) services into IBM’s watsonx Orchestrate, the company’s agentic AI orchestration platform. The integration lets developers and enterprise teams embed high‑fidelity, human‑like voice into AI agents that run across large‑scale business workflows.
Voice interactions have become a pivotal channel for both customer‑facing and internal AI applications, yet many deployments still rely on static prompts or robotic‑sounding speech. By plugging ElevenLabs’ neural TTS engine into watsonx Orchestrate, organizations can now generate speech in 70 languages, complete with regional accents and a library of over 10,000 synthetic voices. The same integration also adds real‑time speech recognition, allowing agents to understand spoken input and respond naturally.
This shift from text‑centric to voice‑centric AI is more than a convenience upgrade. It addresses a core pain point, namely long hold times and rigid call flows, by giving enterprises the tools to create conversational experiences that feel genuinely human while remaining under strict security controls.
Security and Compliance Built In
Enterprise deployments demand rigorous data protection. The combined solution inherits IBM’s enterprise‑grade safeguards, including PCI compliance for secure payment processing, a Zero Retention Mode that aligns with HIPAA‑required data handling, and configurable data residency options. These features aim to keep voice data private and compliant, a necessity for sectors such as finance, healthcare, and government.
“AI agents are becoming central to everyday work, and voice is where AI either earns trust or loses it,” said Mati Staniszewski, Co‑founder of ElevenLabs. “Together with IBM, we’re helping organizations replace robotic interactions with AI agents that people actually want to talk to, built with the security and compliance controls that enterprises require.”
How watsonx Orchestrate Leverages the New Capabilities
watsonx Orchestrate already provides a unified environment for building, deploying, and governing AI agents that automate complex workflows. With ElevenLabs’ speech services now native to the platform, developers can attach voice generation and recognition steps directly to orchestration pipelines. This eliminates the need for separate third‑party APIs and simplifies scaling, as the platform can manage high‑volume, concurrent voice interactions across global user bases.
“We’re bringing a voice to AI Agents in the enterprise. As clients increasingly deploy agentic AI that interacts with their customers and employees, they want these experiences to feel intuitive, responsive and accessible,” explained Nick Holda, Vice President of AI Technology Partnerships at IBM. “IBM’s open ecosystem approach offers clients the flexibility to choose the models and tools that fit their business, and our integration of ElevenLabs into watsonx Orchestrate is a powerful example of that – enabling enterprises to deploy AI agents that sound natural, scale globally, and address security, reliability and governance.”
Market Implications
The partnership positions both companies to capture a growing slice of the enterprise voice‑AI market, where demand for multilingual, secure, and scalable solutions is accelerating. By uniting ElevenLabs’ cutting‑edge speech synthesis with IBM’s orchestration and governance framework, the offering differentiates itself from pure‑play TTS providers that lack enterprise compliance features.
For developers, the integration means a single SDK and unified API surface for both speech generation and recognition, reducing integration overhead. For business leaders, voice‑enabled AI agents that can handle sensitive data across multiple jurisdictions open new use cases, from multilingual call‑center automation to voice‑driven internal support bots.
Outlook
ElevenLabs and IBM have indicated that the partnership will continue beyond the initial launch, with plans to deepen integration and expand voice capabilities further. As enterprises push AI agents deeper into customer and employee experiences, the ability to deliver natural, secure, and globally accessible voice interactions could become a decisive factor in technology selection.
- Statements regarding ElevenLabs’ and IBM’s future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.