Speech self-service is becoming mainstream. The value of these systems, if implemented well, is dramatic. Acceptance by end customers is rising. In fact, more than 3,000 speech self-service implementations are in place today.
These systems, though, still have very visible problems. Weak speech recognition leaves customers cold. Jokes about frustrating IVR systems trapping callers abound, and we have all seen commercials and comedy skits about these automatons. While these jokes are evidence that speech is mainstream, they hit a true sore spot. In short, most of these implementations are terrible.
Even if you're already using speech self-service, there's a lot to consider. The next generation (either fourth or fifth depending on how you count) is actively being developed, with "adaptive" or "dialog-driven" systems coming out this year. Multimodal applications – combining voice and graphical user interfaces and allowing each to play to its strengths – are emerging fast, especially for handheld devices. A standards battle is quietly raging – with a 3.0 version of the VoiceXML standard driven by a broad community on one side, a unilateral move by Nuance to push a competing standard called xHMI on the other, and Microsoft's SALT down but not out.
The payoff is compelling and the promise of speech is alluring, but it's still expensive and hard to field even mediocre systems. So will speech recognition realize this promise? Or are our expectations simply too high?
The Value And Risk Of Speech Self-Service In The Call Center
There are several key benefits of using speech. The hands-free factor is huge – as anyone who drives knows. Speech is truly more natural than touchtone and usually shortens call times and increases customer satisfaction. Speech allows you to "flatten" complex menus – saving time and increasing call completion rates. And there are some things you just can't do with touchtones – like "change of address" applications.
At the same time, however, it can be costly and time-consuming to deploy speech self-service technology correctly. Speech self-service is unlike any other technology deployed in the call center, and even "packaged apps" require special attention to ensure that they're usable and consistently perform well. Even the "poster child" implementations suffer from usability and performance issues from time to time.
What To Watch Out For
It's clear that speech self-service implementations are still tricky; there are some serious "gotchas" involved. Here are a few common pitfalls and some tips:
Performance: Customer satisfaction depends very strongly on the response time experienced, known as Customer Perceived Latency (CPL). Many projects, for example, have assumed they could use an existing Web service for an IVR, only to find that it was simply not fast enough and to be forced to write a more traditional middle tier duplicating the Web service functionality. Componentizing and using multi-tier architectures adds delays; Web services in particular are not easy to make fast. Performance and load testing and tuning are especially important for any large project.
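To make the latency point concrete, here is a minimal sketch of the kind of check a performance test might run. It assumes that the delay a caller perceives on one turn is roughly the sum of the delays in each tier, and it compares a high percentile of that total against a budget. The tier names, the sample measurements, and the 2,000 ms budget are all hypothetical, chosen only for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def end_to_end_latency(turn):
    """Total delay (ms) for one caller turn, assumed to be the sum of the per-tier delays."""
    return sum(turn.values())


def check_latency_budget(turns, budget_ms, pct=95):
    """Return (observed percentile in ms, whether it fits within the budget)."""
    totals = [end_to_end_latency(turn) for turn in turns]
    observed = percentile(totals, pct)
    return observed, observed <= budget_ms


# Hypothetical per-tier measurements (ms) for four caller turns.
turns = [
    {"recognizer": 300, "app_server": 150, "web_service": 400},
    {"recognizer": 320, "app_server": 140, "web_service": 900},
    {"recognizer": 310, "app_server": 160, "web_service": 2500},
    {"recognizer": 290, "app_server": 150, "web_service": 450},
]

# Hypothetical budget: 2,000 ms at the 95th percentile.
observed, within_budget = check_latency_budget(turns, budget_ms=2000)
print(observed, within_budget)
```

Checking a high percentile rather than the average is a deliberate choice in this sketch: a single slow back-end call, like the third turn above, is exactly what a caller hears as dead air, and an average would hide it.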
VUI vs. GUI Usability: Voice user interfaces (VUIs) and graphical user interfaces (GUIs) are very different, and usability is a huge factor. Projects that try to mimic an existing Web page in speech invariably fail. Multimodal (both voice and graphical) UIs are a breed unto themselves and are also challenging to do well. There is a good body of knowledge and practice about VUI design, as well as a number of great specialists and consultants. Use them. Usability for VUIs and multimodal UIs is a specialized art.
Unrealistic Expectations: Unrealistic expectations are too common for speech implementations, springing in part from "HAL". Widespread use of "personas" and the prevalence of truly amazing demos can make this worse – callers encouraged to "say anything" often really do! Make sure your management understands that speech still has significant limitations before you undertake your next speech project.
The PLDs (Pesky Little Details): Every implementation has a host of "little" problems. Some example issues found during a recent performance test on a Speech/VXML environment:
- Middleware errors retrieving a valid member ID when using DTMF input versus speech input
- 30 seconds of dead air after the greeting prompt
- Call balancing issues
- Dead air after call connection
- Web server initialization delays after periods of quiescence
- CTI server with unacceptable memory utilization and page faults
Expect to deal with the PLDs; leave time to iron them out, and test early and often.
Tuning: Most speech applications require extensive tuning, which can increase automation rates significantly in the months following cutover. Tuning uses the experiences and data from real calls to adjust call flow, prompts, speech grammars, thresholds, and other parameters. It can be very expensive, and may need to be repeated as the application or the callers' behavior evolves. Plan to do tuning and set expectations that the initial deployment is not as good as it gets.
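As one illustration of what tuning a threshold from real call data can look like, the sketch below sweeps a recognizer confidence threshold over logged utterances and picks the value with the best net benefit. It is a simplified assumption of how such a pass might work, not a description of any vendor's tuning tool: the logged data, the candidate thresholds, and the cost assigned to a false accept are all hypothetical.

```python
def score_threshold(utterances, threshold, false_accept_cost=3.0):
    """Net benefit of accepting every utterance whose confidence is at or above `threshold`.

    Each correct accept counts +1; each wrong accept counts -false_accept_cost
    (a hypothetical weight reflecting that acting on a misrecognition is worse
    than re-prompting the caller). Rejected utterances count 0.
    """
    score = 0.0
    for confidence, was_correct in utterances:
        if confidence >= threshold:
            score += 1.0 if was_correct else -false_accept_cost
    return score


def best_threshold(utterances, candidates, false_accept_cost=3.0):
    """Return the candidate threshold with the highest net benefit."""
    return max(
        candidates,
        key=lambda t: score_threshold(utterances, t, false_accept_cost),
    )


# Hypothetical log: (recognizer confidence, whether a transcriber judged the result correct).
logged = [
    (0.92, True),
    (0.85, True),
    (0.71, True),
    (0.64, False),
    (0.58, True),
    (0.40, False),
    (0.35, False),
]
candidates = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

chosen = best_threshold(logged, candidates)
print(chosen)
```

The same log can be rescored whenever the grammar or the caller population changes, which is one reason tuning tends to be a recurring activity rather than a one-time step.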
When VoIP And Speech Recognition Collide
Voice over IP (VoIP) adds a whole new dimension to speech implementations. VoIP enables much more flexible and lower-cost configurations. It also makes options like hosting more practical. But the combination of VoIP and speech recognition can introduce some subtle interactions.
For example, some low-bit-rate codecs used in VoIP, such as G.723.1 and G.729, are optimized for human-perceived speech quality, not for speech recognition. Interactions between the Voice Activity Detection (VAD) used in gateways and the VAD algorithms that speech recognition engines use can also cause issues. Most of these interactions are resolvable by carefully setting configuration parameters. But the time and expertise needed to get these parameters right can be impediments to smooth speech rollouts.
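One plausible form of the VAD interaction is onset clipping: a gateway that suppresses frames it judges to be silence can remove the soft start of a word before the recognizer ever sees it. The toy sketch below uses a simple energy threshold to show the effect. Real gateway and recognizer VADs are far more sophisticated, and the frame values and both thresholds here are hypothetical.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of audio samples."""
    return sum(sample * sample for sample in frame) / len(frame)


def gateway_suppress(frames, threshold):
    """Toy silence suppression: frames below the energy threshold are replaced with zeros."""
    return [
        frame if frame_energy(frame) >= threshold else [0] * len(frame)
        for frame in frames
    ]


def first_speech_frame(frames, threshold):
    """Index of the first frame a toy recognizer-side VAD would treat as speech, or None."""
    for index, frame in enumerate(frames):
        if frame_energy(frame) >= threshold:
            return index
    return None


# Hypothetical utterance: silence, a soft onset, louder speech, then silence.
frames = [
    [0, 0, 0, 0],
    [10, -10, 10, -10],      # soft onset, energy 100
    [20, -20, 20, -20],      # still soft, energy 400
    [100, -100, 100, -100],  # loud speech, energy 10000
    [100, -100, 100, -100],
    [0, 0, 0, 0],
]

RECOGNIZER_THRESHOLD = 50  # hypothetical: sensitive enough to catch the soft onset
GATEWAY_THRESHOLD = 1000   # hypothetical: aggressive silence suppression

raw_start = first_speech_frame(frames, RECOGNIZER_THRESHOLD)
clipped_start = first_speech_frame(
    gateway_suppress(frames, GATEWAY_THRESHOLD), RECOGNIZER_THRESHOLD
)
print(raw_start, clipped_start)
```

In this sketch the recognizer would have detected speech from frame 1 on the raw audio, but after the gateway's suppression it first sees speech at frame 3, so the beginning of the word is lost. Aligning the two thresholds, or disabling suppression at the gateway, is the kind of configuration choice the paragraph above refers to.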
It's A Matter Of When, Not If
These problems may seem daunting, but they are surmountable – with big benefits. The key is to be aware of potential issues and prepare for them. With the right training, processes, and tools, speech self-service projects can be smooth and successful.
Importantly, deploying speech self-service technology is well within most call centers' grasp today. The current generation can serve you well – if implemented carefully and properly, with the right expectations set. The next generation is exciting – both more effective and more accessible – and you can expect continued improvement year after year. So I recommend looking hard at adopting speech self-service – just be careful not to leave your customers in the cold.
About Jeff Fried:
As Chief Technical Officer, Jeff works with customers, analysts and industry consortia to determine ways of better serving customer needs in the face of new business and technology trends. Jeff joined Empirix from contact center application firm Unveil Technologies. Prior to Unveil, he was co-founder and CTO of Teloquent, a supplier of advanced contact center software that has since been acquired by Syntellect.
Empirix has been solving testing and monitoring challenges for contact centers, helpdesks and communication networks since 1992. Our solutions help our clients reduce risk, accelerate technology time-to-market, maximize customer retention and reduce operating costs. Our flagship technologies, Hammer and Hammer Cloud Platform, are used by companies to enhance user experience. By simulating real-world communications, we test the end-to-end interoperability of network services and applications, validating every function of your voice and data network. Hammer Cloud is a single cloud-based platform designed to accelerate release cycles by supporting the cross-functional testing needs of development, QA and operations teams. It provides self-service testing and test case design for voice applications and contact center technologies, and it facilitates the adoption of DevOps practices with a shared script library and script porting across the software development lifecycle.
Published: Tuesday, February 21, 2006