Article : Speech Comes Of Age
Till now speech recognition has been viewed as an experimental technology, limited in application and difficult to implement. Though it promises a 'close to human' interaction at massively reduced costs, the let-down factor has been high, with problems of comprehension leading to consumer disenchantment. Those days are gone, says Jim Hennigan, Managing Director of speech provider Eckoh. He assures us that the technology is now mature enough to support mass-market, high-volume usage; that dialogue design has overcome the comprehensibility challenge; and that, enhanced with self-learning capabilities, automated speech can now provide personalised services that remember and anticipate individual consumer behaviour. If that's true, the speech market may be about to explode, giving companies a powerful new tool with which to reduce operating cost while boosting service quality and customer choice.
The commercial history of speech automation has been chequered. Though pioneered by technologists working for the American Department of Defence during the Cold War years, allegedly to support international phone tapping activities, it wasn't until the late 1980s that the search for practical, commercial applications began. Unfortunately for speech recognition's reputation, the first commercial problem companies chose to address was the one that the technology, at that stage of its development, was least equipped for. Telcos on both sides of the Atlantic looked to automation to ease the labour-intensive burden of directory enquiry services. The relatively unsophisticated speech technologies of the time failed to deal with the almost infinite variety of personal and business names contained in the telephone directories of America and Europe's major conurbations.
The Seeds Of Disillusionment
This failure sowed the seeds of disillusionment, funding dwindled and progress slowed. Developments that did make their way into the public arena failed to impress. The crude nature of early recogniser technology led to the development of 'speaker dependent' services. In simple terms this meant each caller to any service had to complete a 15-minute exercise to 'train' the technology to accommodate his or her individual voice and accent. Success rates were low, with only around 30% of callers completing the training. One major UK bank – having invested substantial six-figure sums in the development of such a system – abandoned it on the eve of launch, recognising that its customers' patience would be quickly exhausted. Although today's technologies can accommodate vocabularies of over a million words with accuracy rates in the high nineties, businesses with long memories have been understandably cautious in their approach.
That caution is no longer justified. In a wide range of industries automated speech is proving that it can deliver cost savings of up to 95%, while increasing customer choice and access - delivering more services, with greater consistency, via more channels on a 24-hour no-queue basis than could ever be achieved using agents.
The Technological Turnaround
The technological turnaround in speech automation occurred in the early 1990s when 'speaker independent' recognisers were developed. The need to 'train the system' to individual voices was removed. This, combined with the advent of cheap and plentiful computer processing power, meant large vocabularies and the diversity of spoken language could, at last, be catered for.
Today, according to Datamonitor, IVR (interactive voice response) systems, which use voice or touch tone to allow customers to select options, are used in 27,000 call centres around the globe to manage around 55 million calls a day. However, naïve, or downright sloppy, adoption of IVR has led to widespread consumer frustration. Though the technology functions perfectly, poorly designed systems that fail to take into account the way people want to interact drive customers to abandon calls.
The truth is, it doesn't need to be this way.
The Advent Of Common Sense
Speech automation has suffered because people allowed their enthusiasm for what technology can do to override their understanding of what customers want. The companies that are succeeding with speech automation today take a commonsense approach: starting with a human situation and seeing how technology can improve it, rather than taking a technical capability and shoehorning people to fit. They're making use of the latest psychological understanding of the way people choose to communicate to design services that mirror natural human decision-making patterns and negotiate the intrinsic but subtle differences between what people say and what they mean.
Conversation specialists, script writers and psychologists are involved in the development of solutions that complement rather than suppress natural human speech patterns and decision making behaviours. They begin by assessing the process the company is trying to automate and look at how successfully it is working at present. If the process is currently managed by a live-agent service, they'll listen to its calls and create a detailed map of the way callers behave; what questions do they typically ask, how and in what order? Even more importantly, they assess the motivation for the call – what does the caller really want to know; what's the difference between what they say and what they actually mean? For example, we've recently observed that when travellers call TrainTracker™, the train information service for National Rail Enquiries, to ask the time of the last train from Euston to Hemel Hempstead they really mean 'what's the last train I can catch before I'm stranded for the night', not 'what's the last train in this 24-hour period'. A train at 15 minutes past midnight will serve as well as one at 15 minutes to.
Having developed working scenarios, dialogue designers will test them in focus groups and pilots before roll out. And the wise will remember, too, that once a solution is deployed the learning doesn't stop. Careful monitoring is essential and, once again, the focus must be on the caller, rather than the technology. A word recognition success rate of 98% is worthless if 80% of callers fail to complete their transactions. The reasons for failure may have nothing to do with the technology. In a recent deployment for a financial services company we found a high proportion of callers to our pilot opting out of the call within the first 15 seconds or so. The reason, we discovered, was that the first information the system requested was their customer reference number. Because it was unclear which of the several numbers on their paperwork that was, they simply gave up. Success rates increased dramatically when we added a simple but accurate prompt: 'that's the six digit number in the top right hand corner'.
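The distinction between recognition accuracy and caller success can be made concrete with a small calculation. The following is a minimal sketch only: the `CallRecord` structure, its field names and the sample figures are assumptions made for illustration, not data or code from any real deployment. It reports the share of callers who completed their transaction and the share who gave up within the first 15 seconds.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    """One logged call (hypothetical structure, for illustration only)."""
    duration_seconds: float  # how long the caller stayed on the line
    completed: bool          # whether the caller finished the transaction


def completion_rate(calls):
    """Fraction of calls in which the caller completed the transaction."""
    if not calls:
        return 0.0
    return sum(1 for c in calls if c.completed) / len(calls)


def early_abandon_rate(calls, threshold_seconds=15.0):
    """Fraction of calls abandoned, uncompleted, within the threshold."""
    if not calls:
        return 0.0
    early = sum(
        1 for c in calls
        if not c.completed and c.duration_seconds <= threshold_seconds
    )
    return early / len(calls)


# Invented sample data: two callers give up almost immediately.
calls = [
    CallRecord(12.0, False),
    CallRecord(95.0, True),
    CallRecord(9.5, False),
    CallRecord(140.0, True),
    CallRecord(60.0, False),
]

print(f"completed:       {completion_rate(calls):.0%}")
print(f"abandoned early: {early_abandon_rate(calls):.0%}")
```

A cluster of early abandonments like this points at the opening prompt rather than at the recogniser, which is the pattern described in the financial services pilot.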
The Introduction Of Intelligence
If the humanisation of technology is the secret to making speech automation work, then recent implementations, which have seen the introduction of intelligence and adaptive personalisation, are a giant leap forward. By combining speech recognition technologies with CLI (Caller Line Identification) systems it's possible for automated solutions to recognise callers and offer them personalised information. In the stock quotation services that have begun to proliferate in the financial services industry, for example, it's possible to recognise callers, remember their stock portfolio and proactively offer information on their losses and gains. Equally, in travel related services, details of previous journeys can be remembered and recalled.
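In outline, personalisation of this kind amounts to using the caller's line identity as a lookup key. The sketch below is purely illustrative: the `greet_caller` function, the profile layout and the phone numbers are hypothetical, and a real service would draw on a customer database and a telephony platform rather than an in-memory dictionary.

```python
def greet_caller(cli_number, known_callers):
    """Choose an opening prompt based on the caller's line identity.

    known_callers maps a phone number to a profile dictionary with
    'name' and 'portfolio' keys (a hypothetical layout).
    """
    profile = known_callers.get(cli_number)
    if profile is None:
        # Unknown or withheld number: fall back to the generic prompt.
        return "Welcome. Which share price would you like?"
    stocks = ", ".join(profile["portfolio"])
    return (
        f"Welcome back, {profile['name']}. "
        f"Would you like the latest prices for {stocks}?"
    )


# Invented example data.
known_callers = {
    "02079460000": {"name": "Ms Patel", "portfolio": ["ABC", "XYZ"]},
}

print(greet_caller("02079460000", known_callers))
print(greet_caller("01632960999", known_callers))
```

The fallback branch matters as much as the personalised one, since callers may ring from an unfamiliar phone or withhold their number.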
It's this degree of personalisation, coupled with the ability to cope efficiently with complexity and ambiguity, that is heralding speech automation's arrival as a viable mass market solution.
The Volume Issue
There is one hurdle which only technology can overcome: the issue of volume.
If the promise of automation is uninterrupted and immediate access to services for callers then it has to be able to cope with large and highly variable call volumes. The answer is theoretically simple but commercially tough: the technology on which the service depends must be robust enough, and the number of phone lines linked to it big enough, to accommodate the maximum number of callers ever likely to contact the service at a single time. Where call volumes are volatile, such as in the utilities or travel industries, the cost of buying and maintaining such an infrastructure is, quite simply, prohibitive.
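To give a feel for the sizing problem, here is a rough sketch using the Erlang B formula, a standard telephony traffic model. The article does not name a sizing method, so this is a swapped-in technique, and the call volumes, the 60-second average call length and the 1% blocking target are all assumed figures. It estimates how many lines are needed for a quiet hour and for a peak ten times busier.

```python
def offered_load_erlangs(calls_per_hour, avg_call_seconds):
    """Offered traffic in erlangs: arrival rate multiplied by holding time."""
    return calls_per_hour * avg_call_seconds / 3600.0


def erlang_b(load_erlangs, lines):
    """Probability that a new call finds every line busy (Erlang B).

    Uses the standard recurrence B(n) = A*B(n-1) / (n + A*B(n-1)), B(0) = 1.
    """
    blocking = 1.0
    for n in range(1, lines + 1):
        blocking = load_erlangs * blocking / (n + load_erlangs * blocking)
    return blocking


def lines_needed(load_erlangs, max_blocking):
    """Smallest number of lines that keeps blocking at or below the target."""
    if max_blocking <= 0:
        raise ValueError("max_blocking must be greater than zero")
    lines = 0
    blocking = 1.0
    while blocking > max_blocking:
        lines += 1
        blocking = load_erlangs * blocking / (lines + load_erlangs * blocking)
    return lines


# Assumed figures for illustration only.
quiet = offered_load_erlangs(calls_per_hour=3_600, avg_call_seconds=60)
peak = offered_load_erlangs(calls_per_hour=36_000, avg_call_seconds=60)

print("lines for a quiet hour:", lines_needed(quiet, max_blocking=0.01))
print("lines for a peak hour: ", lines_needed(peak, max_blocking=0.01))
```

Whatever the exact figures, the platform has to be sized for the peak, which means most of its lines sit idle in a normal hour.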
Newly emerging hosted services, in which the telephony and technology infrastructure is centrally owned and managed by a third party, provide a cost effective alternative. Hosting provides other useful benefits too. Because speech automation is still a nascent technology and today's vendor environment is characterised by high levels of consolidation and still-to-be-agreed standards, the investment decision for any company looking to adopt speech recognition can be a risky one. Hosting presents an attractive low-risk option.
So, if the technology is now ripe for the mass market and providers are emerging that can offer that technology in combination with customer-centric solution design, then surely the moment for speech automation has finally arrived? Only one question remains: do consumers have the appetite for it? My experience tells me yes. We're living in an age where individuals value the ability to 'serve themselves'. It began when we started helping ourselves at the petrol pump and accelerated when the Internet encouraged us all to search online for information and goods. Propelled by practical and pragmatic automated speech solutions, self-service is about to become ubiquitous.
About Jim Hennigan:
Jim, 43, has always worked with and around computers, beginning at Unilever in 1980. Jim joined Texaco Oil UK in 1981, where he worked his way up to become Network Control Manager on both a national and international scale. In 1986, Jim moved to Marks and Spencer and became Senior Communications Analyst, implementing and managing communications strategy to meet specific business requirements.
Eckoh is the UK’s largest provider of hosted speech recognition services, with experience in successfully deploying self-service solutions. These allow our clients to efficiently manage their contact centres by replacing the more repetitive calls with an effective and intuitive automated service. Our carrier-grade platform has the scalability to handle over 650,000 calls an hour and up to 8,000 simultaneously. Our platform has the capacity to manage even the most dramatic and unexpected call peaks.
Published: Monday, March 27, 2006