Speech Self Service: Now Mainstream, But Still Tricky

Vice President of Technology

Speech self service is becoming mainstream. The value of these systems, if implemented well, is dramatic. Acceptance by end customers is rising. In fact, more than 3,000 speech self-service implementations are deployed today.

These systems, though, still have very visible problems. Weak speech recognition leaves customers cold. Jokes about frustrating IVR systems trapping callers abound, and we have all seen commercials and comedy skits about these automatons. While these jokes are evidence that speech is mainstream, they hit a true sore spot. In short, most of these implementations are terrible.

Even if you're already using speech self service, there's a lot to consider. The next generation (either fourth or fifth, depending on how you count) is actively being developed, with "adaptive" or "dialog-driven" systems coming out this year. Multimodal applications – combining voice and graphical user interfaces and allowing each to play to its strengths – are emerging fast, especially for handheld devices. A standards battle is quietly raging: a 3.0 version of the VoiceXML standard is being driven by a broad community, Nuance has made a unilateral move to push a competing standard called xHMI, and Microsoft's SALT is down but not out.

The payoff is compelling and the promise of speech is alluring, but it's still expensive and hard to field even mediocre systems. So will speech recognition realize this promise? Or are our expectations simply too high?

The Value And Risk Of Speech Self Service In The Call Center
There are several key benefits of using speech. The hands-free factor is huge – as anyone who drives knows. Speech is more natural than touchtone, and it usually shortens call times and increases customer satisfaction. Speech allows you to "flatten" complex menus – saving time and increasing call completion rates. And there are some things you just can't do with touchtones – like "change of address" applications.

At the same time, however, it can be costly and time consuming to deploy speech self-service technology correctly. Speech self-service is unlike any other technology deployed in the call center, and even "packaged apps" require special attention to ensure that they're usable and consistently perform well. And even the "poster child" implementations suffer from usability and performance issues from time to time.

What To Watch Out For
It's clear that speech self-service implementations are still tricky; there are some serious "gotchas" involved. Here are a few common pitfalls and some tips:

  • Performance: Caller satisfaction depends very strongly on the response time experienced, known as Customer Perceived Latency (CPL). Many projects, for example, have assumed they could use an existing Web service for an IVR—only to find that it was simply not fast enough, and to be forced to write a more traditional middle tier duplicating the Web service functionality. Componentizing and using multi-tier architectures adds delays; Web Services in particular are not easy to make fast. Performance and load testing and tuning are especially important for any large project.

  • VUI vs. GUI Usability: VUIs and GUIs are very different, and usability is a huge factor. Projects that try to mimic an existing Web page in speech invariably fail. Multimodal (both voice and graphical) UIs are a breed unto themselves and are also challenging to do well. There is a body of good knowledge and practices about VUI design, as well as a number of great specialists and consultants. Use them. Usability for VUIs and Multimodal UIs is a specialized art.

  • Expectations: Unrealistic expectations are too common for speech implementations, springing in part from "HAL". Widespread use of "personas" and the prevalence of truly amazing demos can make this worse – callers encouraged to "say anything" often really do! Make sure your management understands that speech still has significant limitations before you undertake your next speech project.

  • The PLDs (Pesky Little Details): Every implementation has a host of "little" problems. Some example issues found during a recent performance test on a Speech/VXML environment:
    -Middleware errors retrieving valid member id when using DTMF input versus speech input
    -30 seconds dead air after greeting prompt
    -Call balancing issues
    -Dead air after call connection
    -Web server initialization delays after periods of quiescence
    -CTI server – unacceptable memory utilization and page faults
    Expect to deal with the PLDs; leave time to iron them out, and test early and often.

  • Tuning: Most speech applications require extensive tuning, which can increase automation rates significantly in the months following cutover. Tuning uses the experiences and data from real calls to adjust call flow, prompts, speech grammars, thresholds, and other parameters. It can be very expensive, and may need to be repeated as the application or the callers' behavior evolves. Plan to do tuning and set expectations that the initial deployment is not as good as it gets.
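
To make the tuning step concrete, here is a minimal sketch of one small part of it: choosing a recognizer confidence threshold from logged calls. It is an illustration under stated assumptions, not a description of any vendor's tuning tool. The `LoggedUtterance` record, the 0-to-1 confidence scale, the sample data, and the 10% false-accept limit are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LoggedUtterance:
    confidence: float  # recognizer confidence, assumed to lie in [0, 1]
    correct: bool      # True if a human transcriber agreed with the result


def sweep_thresholds(utterances, thresholds, max_false_accept_rate):
    """Pick the threshold with the highest correct-accept rate whose
    false-accept rate stays within the limit.

    Returns (threshold, correct_accept_rate, false_accept_rate), or None
    if no candidate threshold satisfies the limit.
    """
    total = len(utterances)
    best = None
    for t in thresholds:
        accepted = [u for u in utterances if u.confidence >= t]
        correct_accepts = sum(1 for u in accepted if u.correct)
        false_accepts = len(accepted) - correct_accepts
        ca_rate = correct_accepts / total
        fa_rate = false_accepts / total
        if fa_rate <= max_false_accept_rate and (best is None or ca_rate > best[1]):
            best = (t, ca_rate, fa_rate)
    return best


# Hypothetical log of ten utterances from live calls.
calls = [
    LoggedUtterance(0.95, True), LoggedUtterance(0.90, True),
    LoggedUtterance(0.80, True), LoggedUtterance(0.75, False),
    LoggedUtterance(0.60, True), LoggedUtterance(0.55, False),
    LoggedUtterance(0.40, False), LoggedUtterance(0.30, True),
    LoggedUtterance(0.20, False), LoggedUtterance(0.10, False),
]

best = sweep_thresholds(calls, [0.3, 0.5, 0.7, 0.9], max_false_accept_rate=0.1)
print(best)
```

With this sample data, the lower thresholds accept too many wrong results and the highest one rejects callers who were understood correctly. As callers' behavior evolves, the same sweep on fresh logs may pick a different value, which is one reason tuning may need to be repeated.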

When VoIP And Speech Recognition Collide
Voice over IP brings a whole new level to speech implementations. VoIP enables much more flexible and lower cost configurations. It also makes options like hosting more practical. But the combination of VoIP and Speech Recognition can introduce some subtle interactions.

For example, some low-bit-rate codecs used in VoIP, such as G.723.1 and G.729, are optimized for human-perceived speech quality, not for speech recognition. Interactions between the Voice Activity Detection (VAD) used in gateways and the VAD algorithms that speech recognizers use can cause issues. Most of these interactions are resolvable through carefully setting configuration parameters. But the time and expertise needed to get these parameters right can be impediments to smooth speech rollouts.
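
As a rough illustration of why VAD settings matter, the sketch below implements a toy energy-based VAD with a "hangover" period. It is an assumption for illustration only and does not reproduce the algorithm of any real gateway, codec, or recognizer. The function names, thresholds, frame size, and amplitudes are all hypothetical.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of audio samples."""
    return sum(s * s for s in frame) / len(frame)


def energy_vad(frames, threshold, hangover_frames):
    """Toy VAD: return one boolean per frame, True = forwarded as speech.

    A frame is speech if its energy reaches the threshold. After speech,
    the next `hangover_frames` quiet frames are still forwarded so that
    soft trailing sounds are not cut off.
    """
    decisions = []
    hang = 0
    for frame in frames:
        if frame_energy(frame) >= threshold:
            hang = hangover_frames
            decisions.append(True)
        elif hang > 0:
            hang -= 1
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions


# Hypothetical utterance: silence, soft onset, two loud frames,
# soft tail, then silence. Each frame holds 160 identical samples.
amplitudes = [0.01, 0.05, 0.5, 0.5, 0.05, 0.01, 0.01]
frames = [[a] * 160 for a in amplitudes]

# An aggressive setting (high threshold, no hangover) drops the soft
# onset and tail before the recognizer ever sees them.
aggressive = energy_vad(frames, threshold=0.01, hangover_frames=0)

# A gentler setting keeps the onset, the tail, and one trailing frame.
gentle = energy_vad(frames, threshold=0.001, hangover_frames=1)

print(aggressive)
print(gentle)
```

If the gateway behaves like the aggressive setting, the recognizer's own endpoint detection receives audio with the soft onset and tail already removed. Adjusting parameters of this kind on both sides is the sort of configuration work described above.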

It's A Matter Of When, Not If
These problems may seem daunting, but they are surmountable – with big benefits. The key is to be aware of potential issues and prepare for them. With the right training, processes, and tools, speech self-service projects can be smooth and successful.

Importantly, deploying speech self-service technology is well within most call centers' grasp today. The current generation can serve you well – if implemented carefully and properly, with the right expectations set. The next generation is exciting – both more effective and more accessible – and you can expect continued improvement year after year. So I recommend looking hard at adopting speech self-service – just be careful not to leave your customers in the cold.

About Jeff Fried:
As Chief Technical Officer, Jeff works with customers, analysts and industry consortia to determine ways of better serving customer needs in the face of new business and technology trends. Jeff joined Empirix from contact center application firm Unveil Technologies. Prior to Unveil, he was co-founder and CTO of Teloquent, a supplier of advanced contact center software that has since been acquired by Syntellect.

About Empirix:
Empirix has been solving testing and monitoring challenges for contact centers, helpdesks and communication networks since 1992. Our solutions help our clients reduce risk, accelerate technology time-to-market, maximize customer retention and reduce operating costs. Our flagship technologies, Hammer and Hammer Cloud Platform, are used by companies to enhance user experience. By simulating real-world communications, we test the end-to-end interoperability of network services and applications, validating every function of your voice and data network. Hammer Cloud is a single cloud-based platform designed to accelerate release cycles by supporting the cross-functional testing needs of development, QA and operations teams. It provides self-service testing and test case design for voice applications and contact center technologies, and facilitates the adoption of DevOps practices with a shared script library and script porting across the software development lifecycle.

Published: Tuesday, February 21, 2006
