Article : Talking Text To Speech

You're making a call for information on cinema listings. The voice that answers sounds like an automated Jonathan Ross, right down to the trademark lisp. You're intrigued. Surely a machine couldn't lisp? Or could it?

The simple answer is yes. In fact, it could answer in any accent, language, style or manner you can think of. Casual or formal, precise or imperfect, the automated voice on the other end of the phone is capable of mimicking a wide range of human pronunciations and styles of speech.

Technology has brought us a long way from the Dalek-style delivery typical of early efforts at speech synthesis. While most systems won't fool you for long that you're listening to a real person, they are intelligible and they're getting more and more natural.


Ian Colville
Product Manager,
Aculab

With the help of text-to-speech (TTS) technology, which can read any text out loud, it is now possible to access textual information over the telephone. Texts might range from simple messages such as cinema listings or local event information to huge databases.

Text-to-speech technology is now moving ahead fast, with human-sounding voices replacing the early robotic delivery of information. We now have a usable and useful technology at our fingertips that's ideal for systems handling large amounts of rapidly changing information.

Without doubt, there has been a huge improvement in the quality of synthetic speech over the last few years. In fact, it is generally expected that within 20 years TTS will be an accepted part of our daily routine, with voice-enabled homes, cars, domestic appliances and computers interacting with us in increasingly natural ways, using artificial intelligence, automatic speech recognition and TTS technologies. Some of this is with us now: we already have voice-controlled PCs, where speech recognition is used to control the desktop instead of the conventional mouse.

But we will have to wait before the type of artificial intelligence that allows for failsafe interaction between man and machine becomes a reality, with completely interactive homes and appliances. For business and commercial purposes, advances in speaker verification and natural language understanding mean that new applications can be realised. For example, in the fields of banking, insurance and share dealing, these new technologies enable over-the-phone transactions to be carried out with appropriate levels of identification and security.

But all of these stop short of us being able to have a meaningful conversation with the machine. Automatic speech recognition and artificial intelligence have not yet reached that level. The area where TTS can be used effectively now, however, is the telephone. We can play information over the phone such as local information, train timetables, stock prices, weather reports, etc. If we need input from the caller, we can get them to push the buttons on their phones, or use connected-word automatic speech recognition (ASR).

Instead of just listening to pre-recorded information, callers can select what they want to hear. They can navigate e-mail accounts, select and listen to messages, book and order tickets and listen to web-content over the phone rather than reading it off a rolling screen. TTS makes economic sense too. Any dynamically changing information source can benefit from TTS. There's no need for information to be re-recorded, to hire professional speakers or to pay studio costs. Just update the input text file and the job is done.
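To illustrate the point about dynamically changing information, here is a minimal sketch of how structured data might be turned into the text a TTS engine reads out. The function name, the cinema name and the listings are all hypothetical, and the step that hands the text to an actual TTS engine is deliberately left out, since the article does not describe any particular API.

```python
from typing import List, Tuple


def listings_to_prompt(cinema: str, listings: List[Tuple[str, List[str]]]) -> str:
    """Turn structured listings into one sentence of text for a TTS engine."""
    if not listings:
        return f"There are no showings at {cinema} today."
    # One clause per film, e.g. "Film A at 6 pm and 9 pm"
    parts = [f"{title} at {' and '.join(times)}" for title, times in listings]
    return f"Today at {cinema}: " + "; ".join(parts) + "."


# Hypothetical data; in practice this would come from a file or database
# that is updated whenever the listings change.
todays_listings = [("Film A", ["6 pm", "9 pm"]), ("Film B", ["4 pm"])]
prompt_text = listings_to_prompt("the Regal", todays_listings)
print(prompt_text)
```

When the listings change, only the data changes; the same function produces the new text and nothing needs to be re-recorded.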

Most call centres currently operate through the use of human agents rather than virtual agents. Economic factors, however, are providing the impetus for more automated solutions. For a start, TTS technology allows call centres to save money whilst improving their service to customers. It is ideal in situations where information to be read out is either frequently updated or too extensive to record or re-record.

Costs can also be saved when updating information that customers get access to over the telephone, as there's no need to employ a professional speaker for recorded prompts or regularly take up the valuable time of a member of staff to record and re-record the updated information. Information can be readily updated in real time, presenting the customer with the most up-to-date data all of the time. This is ideal for stock quotes, web pages, product information, customer data, spares inventories, goods delivery times and time-limited special offers.

It's also suitable for menus for restaurants that can be booked online, weather reports, traffic reports, news bulletins, sports results, cinema showings and betting odds. The list of applications is almost limitless. And bear in mind too that customers often prefer TTS technology as it allows them to access information without feeling under pressure from sales people.

From the call centre perspective, TTS technology offers access to more customers. Because it gives customers an additional channel for contacting them, companies make themselves more readily accessible, so they have the opportunity to sell more and generate more revenue. The technology also enables call centres to increase productivity and efficiency for a relatively low increase in costs. For example, companies could use the technology to respond to (typically) 30% of callers, and those calls will cost 90% less to handle.
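The effect of those two figures on overall handling cost can be worked through in a few lines. This is a rough sketch only: the 30% and 90% figures are the ones quoted above, while the per-call agent cost of 4.00 is a made-up number used purely to make the arithmetic concrete.

```python
def blended_cost_per_call(agent_cost: float,
                          automated_share: float = 0.30,
                          automated_saving: float = 0.90) -> float:
    """Average cost per call when a share of calls is handled by TTS."""
    # An automated call costs 90% less than an agent-handled call.
    automated_cost = agent_cost * (1 - automated_saving)
    return (1 - automated_share) * agent_cost + automated_share * automated_cost


agent_cost = 4.00  # hypothetical cost of an agent-handled call
blended = blended_cost_per_call(agent_cost)
overall_saving = 1 - blended / agent_cost
print(f"Blended cost per call: {blended:.2f}")
print(f"Overall saving: {overall_saving:.0%}")
```

Under these assumptions the average cost per call falls from 4.00 to 2.92, a saving of 27% on total call handling, whatever the actual agent cost is.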

That doesn't mean that companies are likely to get rid of agents, however. Instead, they're more likely to use existing staff to provide higher levels of service, removing the mundane tasks and improving morale and, especially, staff retention, a high priority for call centre managers.

So, TTS sounds like a great application for call centres. But what should you look out for when considering today's product offerings? There are basically two core issues to bear in mind:

Firstly, is the speech quality good? Quality is subjective, but to be honest, if people can't understand what's being said, the TTS is absolutely no use at all. This is especially true for more prolonged passages of text. So, it is consistency of output that is important, without placing undue stress on the listener to comprehend what's being said. Obviously, there are natural limitations of speech over the telephone, which affect real human speech as well as synthesised speech, and good quality TTS needs to be tuned to cope with the narrow telephony bandwidth and noisy listening conditions.

Secondly, channel density. If lots of people can't access the information at the same time, the application's worthless. Equally, for those seeking to develop and sell applications using speech technology like TTS, channel density is also important in terms of margins and ultimately value for money to their customers.

Take into account the licensing fees charged by the majority of the speech recognition and TTS companies for using their technology, plus the hardware costs of the systems or server platforms required, and there are more parts to this equation than just the TTS software. Don't forget the telephony and speech processing requirements on the hardware side too.

At Aculab, we have recently released new TTS software that features five major languages, including our new addition, Latin American Spanish, as well as the world's first variants in voice styles. Providing several pre-configured versions for each voice, Aculab's TTS technology allows software developers to choose from six or eight stylistic variants. These include formal, casual, and Aculab's world-first "international" English language voice, which has been carefully designed to give clearer and less noticeably regional output, making it wholly acceptable and intelligible to all English-speaking customers. In fact, Aculab's TTS resources played a vital role in the development of the world's first interactive dialogue system that can interpret and pronounce Maori.

Developed by Sydney-based VeCommerce, a partner of both Aculab and Nuance, the VeCab system has been created for New Zealand's leading taxi company, Auckland Co-op Taxis, which has 75 call centre staff handling approximately 250,000 calls per month.

The VeCab system allows customers to place bookings over the phone, without the need for an operator to enter the details into the taxi despatch system. When customers call the VeCab system, their voice is recorded and simultaneously sent to the speech recogniser for processing. The system uses a series of prompts to obtain information, such as the name of the caller and the destination. If the computer doesn't understand something, it automatically prompts the caller to repeat that part of the message. TTS technology from Aculab is then used to confirm requirements, such as the pick-up location, back to the caller.
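The prompt, re-prompt and confirm flow described above can be sketched as a small loop. This is not VeCommerce's or Aculab's code: the slot names, the prompt wording, the limit of three attempts and the hand-off to an operator are all assumptions, and the recogniser and TTS engine are stood in for by plain Python callables.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical slots and prompt wording.
SLOTS: List[Tuple[str, str]] = [
    ("name", "What name is the booking under?"),
    ("pickup", "Where should the taxi pick you up?"),
    ("destination", "Where are you going?"),
]


def collect_slot(prompt: str,
                 recognise: Callable[[], Optional[str]],
                 speak: Callable[[str], None],
                 max_attempts: int = 3) -> Optional[str]:
    """Ask for one item, re-prompting if the recogniser returns None."""
    for attempt in range(max_attempts):
        speak(prompt if attempt == 0 else "Sorry, please repeat that. " + prompt)
        heard = recognise()
        if heard is not None:
            return heard
    return None


def take_booking(recognise: Callable[[], Optional[str]],
                 speak: Callable[[str], None]) -> Optional[Dict[str, str]]:
    """Fill every slot, then read the booking back to the caller."""
    booking: Dict[str, str] = {}
    for slot, prompt in SLOTS:
        heard = collect_slot(prompt, recognise, speak)
        if heard is None:
            # Assumed fallback: hand over to a human operator.
            speak("Let me put you through to an operator.")
            return None
        booking[slot] = heard
    speak("Booking for {name}: pick-up at {pickup}, "
          "going to {destination}.".format(**booking))
    return booking


# Simulated call: None stands for an utterance the recogniser did not understand.
replies = iter(["Aroha", None, "Queen Street", "Auckland Airport"])
spoken: List[str] = []
booking = take_booking(lambda: next(replies), spoken.append)
print(booking)
```

In the simulated call the second answer is not understood, so the caller is asked for the pick-up location again before the booking is read back.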

Because the Maori language uses many pronunciations not found in English, New Zealand place names can be a problem for speech technology systems. VeCommerce were able to use Aculab's LexMan dictionary manager, which allows developers to create, update and extend multiple TTS lexica for custom pronunciations, providing a unique phonetic vocabulary for the Maori pronunciations and making them available to the application. TTS really comes into its own here, as recording every street, road and suburb name in New Zealand would have been uneconomical. The result is the world's first application combining natural language speech recognition and TTS that can interpret and pronounce Maori and other unique names correctly. For Auckland Co-op Taxis, the system will reduce call handling costs, increase the ability to handle busy periods more effectively, reduce the need for complex rostering and enable the company to expand its booking capabilities without increasing labour costs.
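The idea of a custom pronunciation lexicon can be illustrated with a simple lookup applied to the text before it is synthesised. This sketch does not use or reproduce LexMan's interface, which the article does not describe; the function name is hypothetical, and the respelling shown is a rough illustration rather than a real lexicon entry.

```python
import re
from typing import Dict


def apply_lexicon(text: str, lexicon: Dict[str, str]) -> str:
    """Replace whole words found in the lexicon with their custom pronunciation."""
    if not lexicon:
        return text
    # Longest entries first, so a longer name wins over a shorter one it contains.
    keys = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in lexicon.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


# Illustrative entry only, not taken from any real lexicon.
custom_lexicon = {"Whangarei": "fah-ngah-ray"}
confirmation = apply_lexicon("Pick-up from Whangarei Road.", custom_lexicon)
print(confirmation)
```

Words that are not in the lexicon pass through unchanged and are left to the engine's default pronunciation rules.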

Less than five years ago, TTS conversion was little more than an amusement for most people. Now the systems are intelligible and they are becoming more and more natural. Although you can't expect to hear your local cinema listings read out to you by an exact replica of Jonathan Ross, the system is capable of lisping... it's becoming more and more human.

Not everyone welcomes new technology and it's important to provide callers with the option to speak to a real person. Tomorrow's best call centre solutions will be those which can neatly combine the best of all media, and TTS plays a vital role in that solution.


About the Author
Ian Colville is a product manager at Aculab and has contributed to technical documentation, including product literature and several published articles, mainly regarding the use or development of speech technologies. He has broad industry knowledge gained during a number of years employed in a variety of management roles by a major telecommunications manufacturer. Ian's industry experience spans marketing, sales and customer service and project management.

About the Company
Through a customer-focused approach to development, Aculab has produced a computer telephony (CT) product portfolio that satisfies the speech resource and global digital connectivity requirements of developers and system integrators. CT applications utilising Aculab's components can handle real-time telephony, through an extensive range of resources and signaling systems.

Published: Monday, December 2, 2002
