Nuance - ContactCenterWorld.com Blog
There has been a lot of noise around customer engagement lately. New channels to use for customer service, an increase in self-service, decline in traditional channel usage, a robot take-over - but what does all of that actually mean?
With new channels comes more choice. Customers have more options to reach out to a brand than ever before. Gone are the days when you could only get an answer by walking into a store or picking up the phone. Nowadays, we can yell at our fridge to get customer support, or we can use our mobile phone while on the way to work (car parked, of course).
Why is there an elephant in my contact center?
In the end, the presence of more channels means more consumers trying to connect with businesses. At first thought, that’s a good thing, as it allows companies to sell more. The problem is that contact centers cannot scale indefinitely. There is a point where hiring new agents is not an option anymore. Yes, technology can be improved to route incoming messages more efficiently, agents can be trained to handle concurrent conversations, and they can be enabled with all sorts of information. But there will come a point when customer satisfaction drops because consumers have to wait too long for an answer.
Automation seems to be the holy grail. Let’s use a bot to handle all the incoming conversations. Thanks to AI they are smart enough, right? Not quite. Current technology is capable of handling certain tasks. Bots and the more sophisticated virtual assistants can help enterprises with their workload. But even if they are fed with hundreds of chat logs and have been trained by professionals, they can only handle so much. Businesses will always have some topics that have to be handled by an actual human. Hint: that’s the elephant!
Balancing customer engagement
A McKinsey survey stated that 94% of customer care executives believed they would need to hire new agents or train current ones with new skills. This is not far from the truth. Nuance’s best practice for customer engagement strikes the following balance between technology, automation and humans:
- Enterprises need technology that handles interactions as efficiently as possible, from intelligently routing incoming messages, to providing an interface that simplifies the everyday tasks of agents and their supervisors, to delivering meaningful and actionable insights that help optimize the experience.
- Agents transform in two different directions. They either train the virtual assistant until it has enough confidence to handle the majority of repetitive tasks, or they become customer advisors for complex inquiries. That means contact centers now need more highly skilled, better-paid and more competent people.
- Using today’s AI technology, businesses can use virtual assistants for a variety of use cases that will all help augment the customer engagement processes, for example collecting all needed information upfront, pointing users to a knowledge base article or landing page, helping compare specific products or services, and much more.
It’s time to get rid of the elephant
This paradigm shift requires enterprises to rethink their approach to training and managing the agent workforce. New skills must be developed, new processes put in place and interfaces redesigned to ensure a consistently successful contact center. No need to panic, though. This is not an impossible task. There are experts out there who know exactly what contact center managers are going through and who have tested several options for transforming your contact center. Let’s tackle that elephant together and create an experience your customers will love.
Publish Date: March 15, 2018 5:00 AM
Last month in Florida, I was absorbing some great discussions with our Executive Client Council (ECC), a group of C-level executives from the nation’s leading health systems. I was closely following a dialog rich in insight when I was struck by two simultaneous thoughts: (1) we should do this more often, and (2) why don’t more marketers do this more often?
“Listen to your customers” is an unquestioned maxim for any marketer. Most make at least some attempt to do it. Some do it well, and others not so well.
So that leaves my second thought: why don’t more marketers do this more often? If you ask the question, the common reply will be “we can’t find the time”. If you push beyond that surface answer, you will also find an unease about what you will hear and uncertainty about whether you want to hear it.
If it’s about time, think of the time and resources spent developing products and services that weren’t quite right or were just plain wrong. Then think about the time and money spent digging out of the resulting hole. Can you afford not to take the time? If it’s about being afraid of what you’ll hear because it might disrupt some major product development, then be prepared for the results you get in return — knowing that what you don’t know can hurt you.
We meet with our ECC regularly and it’s fantastic every time. We’ve learned a lot from this group of executives, and adjusted direction on product and technology development based on their coaching. We are committed to making this a more frequent dialog. Our intensive focus on clinical virtual assistant development is one direct outcome from listening to our customers.
Successful businesses are designed around what customers need. So, at the end of the day go beyond just thinking that you’re listening to your customers no matter how well you think you’re doing it. Instead, honestly ask why you aren’t spending more time with them.
Publish Date: March 9, 2018 5:00 AM
New voice and language solutions continue to impact productivity at every level – from improving workflows for document-intensive industries, to simplifying daily tasks.
The automation of these tasks – whether at home or at work – relies on a set of intelligent systems, all of which use complex algorithms driven by machine learning to take a human ability, like language, touch or a simple gesture, and transform it into an action.
In fact, this was the premise that Dragon Speech Recognition was built upon over twenty years ago: to take the everyday task of typing and transform it into a simpler process driven by voice.
The first iteration of Dragon Speech Recognition was the “smartest of smart” for its time, taking incoming streams of sound and interpreting them as dictation. What used to take hours, namely typing a document, turned into literally minutes, all simply by speaking into a computer.
Advances in Deep Learning technology have turned the complex algorithms our engineers scratched out on their whiteboards back then into reality. Today’s intelligent speech recognition engines not only interpret dictation but also understand its context, distinguishing between homophones (for example: to, two, and too). They recognize the difference between dictation and a command, like “open Microsoft Word.” And the technology learns and adapts the more it’s used, picking up the subtle nuances of spoken words. It can even distinguish, and filter out, background noise.
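The context trick can be illustrated with a toy bigram language model — a deliberately simplified stand-in for the neural language models real recognition engines use. Given the previous word, pick the homophone that most often follows it in training text. Everything below, the tiny corpus included, is invented purely for illustration.

```python
from collections import Counter

# Tiny training corpus; real recognizers learn from vast text collections.
corpus = (
    "i want to go home . she has two cats . that is too expensive . "
    "we need to leave . he ate too much . the two of us went to town ."
).split()

# Count (previous word, next word) pairs.
bigrams = Counter(zip(corpus, corpus[1:]))

def pick_homophone(prev_word, candidates):
    """Choose the candidate word most often observed after prev_word."""
    return max(candidates, key=lambda w: bigrams[(prev_word, w)])

print(pick_homophone("want", ["to", "two", "too"]))  # -> to
print(pick_homophone("has", ["to", "two", "too"]))   # -> two
print(pick_homophone("is", ["to", "two", "too"]))    # -> too
```

The same idea scales up: with enough context on both sides of a word, the statistically most plausible spelling wins, which is how “to, two, and too” get sorted out in practice.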
The power of all of this built-in functionality has propelled documentation productivity further than we could have ever imagined twenty years back.
Hundreds of police departments, whose officers spend 3 to 4 hours each day typing incident reports, are improving their documentation processes tenfold. Law firms, whose clients are becoming ever more tech-savvy, are embracing voice-powered documentation solutions to shift labor-intensive tasks, from searching documents for information and automating e-discovery to writing case files and briefs, into seamless workflows. And financial services firms are using the accuracy that speech recognition offers to mitigate risk and improve compliance in the face of expanding rules and regulations.
For the document-intensive industries that we work with daily, seeing the transformative impact our voice and language solutions are having is just as exciting today as it was twenty years ago when we first started automating talking into text.
Publish Date: March 5, 2018 5:00 AM
Oh! I know! I know! – Proactive voice interfaces: Part III of “What’s left to tackle in voice technology”
Everyone knows that kid in class that knew everything. Constantly raising their hand when the teacher asked a question, silently insisting, “I know! I know the answer! Call on me!” If you don’t know this kid - maybe you were this kid. Well, here he is again, in your adult life. The smart speaker.
The next thing we’re going to see in smart speakers is proactive engagement in conversations. Alexa, Google Home, and all the variants will have an option to be proactive and helpful in conversations. They’ll start listening to what people are talking about and get involved when it makes sense. Imagine a typical Saturday morning conversation in the kitchen:
Mom: “What time is your lacrosse practice today, Ted?”
Ted: “Uhhh, I don’t know… It’s going to rain all day so it’ll probably be cancelled.”
Alexa: “Ted’s practice is on Google calendar at 1pm today at East Field. The forecast does show heavy rain and thunderstorm around 12:30pm.”
Mom: “Thanks, Alexa.”
Alexa: “No problem.”
This is an example of a smart speaker being helpful when it jumps into a conversation. Smart speakers will use affirmations like “Thanks, Alexa” to decide whether it’s interrupting too much or if it should offer more info.
“The Clippy Balance - A Cautionary Tale”
Clippy was the user interface agent that came bundled with Microsoft Office starting in 1997. It remains, arguably, the best known and most hated user interface agent in computer history. It would pop up and try to predict what you wanted to do, and most of the time miss the mark. When our smart speakers become proactive and start conversations or interject themselves into our conversations, they’re going to have to strike a balance - let’s call it “The Clippy Balance”. Smart speakers will need to listen for affirmations that they’re being helpful and also listen for negative feedback from users like “Shut up, Alexa!”
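As a thought experiment, that feedback loop could be sketched as a running “proactivity” score that rises with affirmations and falls (harder) with complaints. The phrases, weights and threshold below are entirely invented for illustration; no actual smart speaker is claimed to work this way.

```python
# Hypothetical feedback phrases; a real system would classify utterances
# with a trained model rather than keyword matching.
POSITIVE = {"thanks", "thank you", "helpful"}
NEGATIVE = {"shut up", "stop", "be quiet"}

class ProactivityModel:
    def __init__(self, score=0.5):
        self.score = score  # how welcome our interjections seem, 0..1

    def observe(self, utterance):
        text = utterance.lower()
        if any(p in text for p in POSITIVE):
            self.score = min(1.0, self.score + 0.05)
        elif any(n in text for n in NEGATIVE):
            # Penalize negative feedback more strongly than we reward praise.
            self.score = max(0.0, self.score - 0.20)

    def should_interject(self, relevance):
        # Only jump in when what we'd say seems relevant enough to
        # outweigh how unwelcome past interjections have been.
        return relevance > 1.0 - self.score

m = ProactivityModel()
m.observe("Thanks, Alexa")
print(round(m.score, 2))  # -> 0.55
m.observe("Shut up Alexa!")
print(round(m.score, 2))  # -> 0.35
```

The asymmetric penalty is the “Clippy Balance” in miniature: a few annoyed users should silence the assistant far faster than a few grateful ones embolden it.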
Publish Date: March 2, 2018 5:00 AM
The tech world seems agog with the idea of building everyone’s new virtual best friend. After generations of science fiction work depicting a future when machines with pleasant, reassuring voices easily answer any question and blithely fulfill any request, technology has finally reached the point where this fantasy could soon become reality. Some might argue that soon is right now.
Planners often talk about how our lives move between three primary environments: home, work (or school) and on-the-go. This makes sense for most of us and is useful to consider during product ideation and marketing development. It helps creators imagine how their solutions will solve problems unique to each environment.
Automobiles fall under “on-the-go” for hundreds of millions of people around the world. Technological evolution in these vehicles over just the last five to ten years has been truly remarkable. Just one decade ago, even the most advanced vehicles lacked the “intelligence” we commonly see in the market today. They were already mechanical and electronic marvels that could perform many impressive functions, but the application of artificial intelligence (AI), machine learning, dynamic driver personalization, and external data exchange capabilities was still conceptual. Since then, however, advanced driver assistance systems, the Internet, and new human-machine interfaces (HMI) have proliferated in vehicles at all segment levels. The “connected car” period of recent years is quickly morphing into the “smart car” era.
AI makes cars and vehicles assistants smart
The key element to making cars “smart” is an AI platform that thoughtfully integrates the car’s HMI with vehicle sensors, a panoply of virtual assistants and cloud content, and adapts to the environment and users’ individual preferences and habits. Smart cars must possess an automotive assistant that can seamlessly link and make sense of a variety of inputs and data, from both on-board and off-board. Its value will be judged on how elegantly it communicates with people using speech and how well it understands natural language, while using data from disparate “expert” sources to deliver the right information or action at the right time. And, because it’s optimized for the automotive environment, it knows things such as the fuel or battery charge level and how far it is to the nearest places to top off; or what the most popular local restaurants will be along your route when it’s lunchtime in a couple hours.
Connectivity and interoperability increase effectiveness
To be incredibly effective, the automotive assistant must access the most appropriate systems for any given situation. It needs to be interoperable with a wide range of them both within and beyond the car. Specialized bots, virtual assistants and connected devices are increasing rapidly as part of the Internet of Things (IoT) revolution. An automotive assistant that can interface with them will drive incredible value for auto manufacturers, dealers and consumers alike.
Interoperability is a logical end-state that the full IoT ecosystem will eventually need to embrace to meet users’ needs and be successful. Assistants and bots will benefit from communicating together because consumers will ultimately decide they don’t want to be forced to choose — they want to have all options and to enjoy a frictionless experience. One way to think about the relationship between the automotive assistant and others in the virtual realm could be as that of a general contractor and sub-contractors on a construction project. The general contractor may possess the skills to, for instance, design an electrical system or install plumbing, but their primary role is to manage the overall project to ensure it is completed as efficiently and effectively as possible. To accomplish this, the general contractor will leverage their relationships with many specialized contractors who can be brought in at the right time to perform specific tasks expertly and quickly. Similarly, the automotive assistant, while highly capable itself, delivers the best experience for users by intelligently orchestrating all pieces of the smart car ecosystem.
Learning systems optimize user experience
The automotive assistant greatly improves user experiences using two other very important capabilities of advanced AI reasoning: “personalization” and “contextualization.” Personalization concerns learning users’ particular habits and preferences and using this knowledge to make informed recommendations that better support them as individuals. Contextualization concerns the conditions and circumstances that surround the user at a given moment — inside and outside the car — because aspects of both might affect the decision for or against a certain option. For instance, your automotive assistant might learn that you tend to stop for fast food on evenings when you leave your office, have a client meeting on your calendar and don’t have enough time to visit home first. Eventually, when these conditions occur, it may proactively present a few quick-service restaurant options along the route to your meeting, using previous stops at this type of place over time to narrow the recommendations to ones you are most likely to prefer. It effectively solves your problem: “What and where can I grab something I’d enjoy eating, on my way, and in the time I have?”
Taken together, interoperability and advances in AI reasoning enable automotive assistants to support the needs of people on-the-go in their vehicles better than any other virtual assistant possibly can. They will provide access to the most relevant and timely information — from the car, the environment and the cloud — to help users make better-informed and faster decisions, thereby enhancing their productivity, comfort and safety.
Publish Date: February 16, 2018 5:00 AM
Punxsutawney, a small town in Pennsylvania, draws thousands of visitors every year on February 2nd when Phil the Groundhog predicts the weather. On February 2nd, 2018 it will be the 132nd time. According to a legend, if Punxsutawney Phil sees his shadow, there will be six more weeks of winter weather. If he does not see his shadow, there will be an early spring. This legend originates from the German-speaking areas where a badger used to forecast the weather.
But no matter which animal is used, those predictions are rarely accurate. In fact, Phil is only right 39% of the time. In order to predict the weather, or anything else for that matter, you need data. And the only data animals have is how much longer they want to sleep. That’s why actual weather forecasts utilize data from the past to predict the upcoming changes for the wind direction, likelihood of rain, etc.
This is very similar for brands who want to utilize predictive targeting for their customer engagement. Without historic data there is nothing on which a prediction can be based. Before thinking about adding a prediction engine to customer service, brands have to take a close look at the data they have available. This can be recordings from calls that have been transcribed, chat scripts, customer journey data, etc. The more the better. This allows the prediction to be more accurate over time.
If there is not much historical data available, brands can use current information from their customer engagement tools. For example, implementing a virtual assistant and a live chat in several digital channels allows the brand to gather new data and insights. These can then be leveraged to improve the prediction over time.
The best way to create a great customer engagement experience is to continually gather customer data. Every bit of information that can be received during conversations can be utilized for valuable and meaningful insights, which then result in a better optimization process that then allows the brand to predict customer behavior better and better.
This information loop can be augmented by humans who help analyze the data, adjust it to put into the right context and, in addition, help the prediction engine and the underlying machine learning algorithm to learn what to look for. The combination of both automation and humans drives higher accuracy and a better experience for the user.
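That combination of automation and human correction can be sketched in a few lines: a toy intent predictor learns from labeled chat logs, and an agent’s correction feeds straight back into the model. The class, intents and training utterances here are all hypothetical; a real prediction engine would use far richer features and models than this word-count sketch.

```python
from collections import Counter, defaultdict

class IntentPredictor:
    """Toy naive-Bayes-style intent predictor over chat-log text."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # intent -> word frequencies
        self.intent_counts = Counter()           # intent -> examples seen

    def train(self, text, intent):
        self.intent_counts[intent] += 1
        self.word_counts[intent].update(text.lower().split())

    def predict(self, text):
        words = text.lower().split()
        def score(intent):
            total = sum(self.word_counts[intent].values()) + 1
            s = self.intent_counts[intent]
            for w in words:
                # Add-one smoothing so unseen words don't zero the score.
                s *= (self.word_counts[intent][w] + 1) / total
            return s
        return max(self.intent_counts, key=score)

    def correct(self, text, intent):
        # Human-in-the-loop: an agent's correction becomes new training
        # data, so the next prediction on similar text improves.
        self.train(text, intent)

p = IntentPredictor()
p.train("i want to upgrade my plan", "upgrade")
p.train("add roaming to my plan", "upgrade")
p.train("my last bill looks wrong", "billing")
p.train("question about my invoice", "billing")

print(p.predict("can i upgrade before my trip"))  # -> upgrade
```

The `correct` method is the interesting part: it is the smallest possible version of the loop described above, where human judgment keeps tuning what the machine looks for.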
This said, I’ll still be hoping that Phil doesn’t see his shadow.
Publish Date: February 2, 2018 5:00 AM
Working in Marketing requires us to read about what’s going on in the market. This can be challenging at times, sifting through the massive volumes of content, especially when you deal with technology that is a hot trend right now, like AI. But every now and then you stumble over great pieces like an article by VentureBeat’s Blair Hanley Frank, who states, “People don’t go and buy two quarts of AI. They buy a product to solve a problem […]”
We couldn’t agree more. The problem with delivering a great customer experience has always been the same, AI or no AI: understanding how technology addresses customer needs while solving a business problem. Unfortunately, technology can be blinding. It promises so much but, if used incorrectly, can bring a lot of headaches. One of them is that your customers may not like it and thus won’t use it, which will most likely result in their seeking alternatives from you or, worse, from your competitors.
Instead of running down the rabbit hole, let’s take a step back and think about the actual business problem. What do your customers want? Most likely they want a fast answer if they have a problem. They also want efficient customer service. No matter if they want to buy something new, add a new feature to their plan or ask a question about the latest bill, they want it done in the easiest way possible.
Taking the right steps
The first step to addressing these needs is to ensure that connecting with you is easy; making your customer search too long for a phone number or an email address won’t help.
Step two is making sure that existing data about your customers is fully utilized. For example, if you know that your customer called you about the same question three weeks ago, don’t make them repeat the question. Instead delight them by using that knowledge to streamline their engagement. And if you have to transfer them, for example from the IVR to an agent, also transfer the context. Strong integration with your CRM system (or any other system that you use to store customer data) is a must across all your automated or human-assisted interaction channels.
Enter the psychic
Now comes the fun part, the one that you’ve probably heard before: using artificial intelligence to improve the customer’s experience. One of the most common scenarios is predicting a customer’s intent. It’s like having a personal assistant that tells you exactly what you need in the moment it is needed. Let’s say you receive a notification telling you about a roaming upgrade to your phone plan (because the system realized that you are going to Europe next month). You call the number that is displayed in the outbound notification, and the IVR greets you with:
“Hi Chris, are you calling about upgrading your plan for your Europe trip next month?”
“Yes, exactly!”
“Great! How long are you staying in Europe?”
“About three weeks.”
“We can add the roaming option for you and automatically remove it once you’re back. Do you want to add the option with a start date of February 4th and end date February 25th?”
“Yeah, that would be great.”
“You’re all set, Chris. Enjoy the trip.”
Several things will change as soon as this technology is implemented. First, your customer will view your customer service as both fast and efficient. No need for them to remember to call you - you will proactively reach out ahead of time. Kudos, for sure. In addition, it will help streamline your contact center operations as callers won’t need to take time working through IVR menus or being transferred to other departments. Or better still, they may not even need to call at all. Both of which mean less cost for you. Finally, your own CRM system will become smarter by learning what does and doesn’t work with customers, driving even further speed and efficiency improvements in the future.
Does this all sound like something from the far future? It’s not as far away as you think. The technology exists today to put these solutions into action. Let us show you how we can use AI to improve customer service, streamline your contact center, and create more efficient digital channels. And, of course, become the psychic your customers will love.
Publish Date: January 19, 2018 5:00 AM
A recent Finnish university study on voice biometrics has been making headlines – and most of those news stories have been inaccurately summarizing the results with concerns as in our title above, leading many to believe that cyber crooks can compromise even the best speech recognition systems.
Before commenting on the article and the study, I feel it is important to highlight that Nuance’s voice biometrics solutions have secured over five billion transactions to date, and not once has an impersonation attack been reported. We have conducted several voice impersonation attacks with famous voice impersonators in the US and the UK, and none proved successful.
So why are the news stories missing the mark? What’s the real story? Let’s start with the study’s conclusion:
“The results indicate a slight increase in the equal error rate (EER). Speaker-by-speaker analysis suggests that the impersonations scores increase towards some of the target speakers, but that effect is not consistent.”
So how could the researchers write that “Voice impersonators can fool speaker recognition systems”? To understand that, you need to dig deeper into the study. Here are the actual data points:
So what does this data mean? Let’s start with some definitions.
- Text Independent: This is passive voice biometrics, where a voiceprint is created by listening in on a normal conversation and compared to a voiceprint on file.
- Same Text: This is active voice biometrics, where the user is given a specific phrase to repeat (often “My voice is my password”). Once enrolled, the user is asked to speak the phrase, and this new voiceprint is compared to the voiceprint on file.
- False Accept Rate: The percentage of times a system incorrectly matches an individual to another individual’s existing biometric. Example: a fraudster calls claiming to be a customer and is authenticated.
- False Reject Rate: The percentage of times the system does not match an individual to his or her own existing biometric template. Example: a customer claiming to be themselves is rejected.
- Equal Error Rate (EER): The point on the curve where the false accept rate and false reject rate are equal. In general, the lower the equal error rate, the higher the accuracy of the biometric system. Note, however, that most operational systems are not set to operate at the equal error rate, so the measure’s true usefulness is limited to comparing biometric system performance.
- GMM-UBM, i-vector Cosine and i-vector PLDA: Three different algorithmic approaches to voice biometrics. Notice that the latest technology, deep neural networks, is not tested.
Now that we have that, the data showcases the following:
- In one instance (text-independent GMM-UBM) the EER is decreased with impersonation – meaning that the imposters were less successful at generating a false accept than a random individual not attempting any voice mimicry.
- In another instance (same text i-vector PLDA) the EER is virtually identical between the impersonation testing and random attacks. In other words, imposters have the same performance via mimicry as a random individual not attempting to modify their voice.
- In four instances, there is an increase in the EER, but given the small sample size (60 voices) the results are not statistically significant. In other words, a test performed with a larger sample might show the opposite result.
Finally, and maybe most importantly, the researchers did not perform the tests with Nuance voice biometric technology. This is evident from the very high EER rates reported by the study as a “baseline” result, ranging from 4.26% to 10.83%. No tests were conducted on deep-neural-network-based voice biometric algorithms, the technology used by Nuance and deployed by scores of enterprises worldwide.
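For readers curious how an EER figure is actually produced: it can be computed from raw trial scores by sweeping an acceptance threshold until the false accept and false reject rates meet. The score values below are made up for illustration; real evaluations involve thousands of trials.

```python
def far_frr(genuine, impostor, threshold):
    """FAR: fraction of impostors accepted; FRR: fraction of genuine users rejected."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Sweep candidate thresholds; return the rate where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine + impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2)
    return best[1]

genuine = [0.91, 0.85, 0.78, 0.88, 0.70, 0.95]   # true-speaker match scores
impostor = [0.30, 0.55, 0.72, 0.40, 0.25, 0.60]  # impostor match scores

print(f"EER = {equal_error_rate(genuine, impostor):.2%}")  # -> EER = 16.67%
```

The better the system separates the two score distributions, the lower the crossing point, which is why a low EER is shorthand for an accurate biometric engine.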
In conclusion, although this topic merits additional research, Nuance will continue to focus on addressing actual fraud attack vectors, including brute-force attacks, voice imposters and recording attacks. We will keep improving the voiceprint itself, and we will keep developing mitigation strategies for future attack vectors, such as synthetic speech attacks, that we believe fraudsters will eventually use.
Contact us if you would like to learn more about the great strides Nuance has made in Voice Biometrics.
Publish Date: January 9, 2018 5:00 AM
1. Your voice will be your password
2017 was a record year for hacks of personal customer details. These breaches give fraudsters access to our identities including the answers to those annoying security questions. One thing the fraudsters can’t do much with? Voice data. And that is why banks and telcos are increasingly replacing security questions with biometrics.
With a few words of speech, voice biometrics can confirm you are who you say you are at accuracy and security levels better than PINs, passwords and security questions. And it knows how to distinguish recordings from real, live speech – rendering the data useless to fraudsters in the case of a breach.
2. You will use a virtual assistant (VA) for customer service, and it will work.
Conversational AI breakthroughs have led to a new generation of VAs specific to your bank, your telco and your pizza ordering, all providing personalized, concierge-like service. In 2018, this generation of VAs will be made even more effective, through technology called HAVA (Human Assisted Virtual Assistant). HAVA adds a human-in-the-loop capability, first to help answer new questions the VA may not know, but more importantly to provide a learning loop that updates the VA’s “brain” in real time.
3. You will add a brand as your messaging “Friend” – and you will mean it.
In 2017, Facebook Messenger, Line, Kik and more added capabilities for their users to “friend” organizations and companies, and late in the year, Apple announced Apple Business Chat, which will do the same for Apple Messages. In 2018 you will start engaging brands the same way you talk to friends – in your messaging app, through SMS and even inside your banking and telco apps. And AI will allow each brand’s VA engine to respond to you in a personalized way, referencing past engagements you have had across other channels.
4. Prediction will let brands anticipate your needs
Customer service creates a ton of data. In 2018 this data will be harnessed more than ever to fuel new AI engines. Predictive customer service will let brands anticipate what you need or may do, before you even know, by analyzing and detecting the patterns of billions of customer engagements over time.
5. The “800” number will enter early retirement
Digital customer engagement combined with mobile devices, tablets and data lines will lead to fewer calls. A lot fewer. In 2018 you will engage with a virtual assistant and, if it can’t resolve an issue, you will be seamlessly texting with a live contact center agent. If the issue is really complicated and can’t be resolved through messaging, you still won’t call the 800 number. In 2018, that step will be integrated through advanced technologies like WebRTC and IVR-to-digital, allowing the contact center agent to connect with you by voice or video within the app, on your laptop, even through your TV screen or smart speaker.
Publish Date: January 5, 2018 5:00 AM
Over and Out – Moving beyond the walkie-talkie voice interface: Part II of “What’s left to tackle in voice technology”
“Over” was short for “over to you,” indicating that it’s your turn to talk on a shortwave radio or walkie-talkie (or any half-duplex comm tech, for you nerds out there). Smart speakers are super cool and a step forward in voice – but they’re still half-duplex, clunky, unnatural voice interfaces. We’ll all look back one day and remember how quaint today’s smart speakers were – like we remember Morse code, tape players, and VCRs. Try turn-taking in a face-to-face conversation or conference call sometime and you’ll get a feel for what smart speakers, and all voice interfaces for that matter, are missing out on. There’s a whole field of study around the protocols and rules of human conversation, called “pragmatics,” that examines how humans interact one-to-one, one-to-many and many-to-many.
For example – I’ll say, “Alexa, play ‘Fool in the Rain’ by Led Zeppelin on Spotify,” and wait the requisite 3 seconds of silence so Alexa knows I’m done talking (might be easier to just say “over”). Then Alexa says, “I’m sorry, I can’t find ‘Fool in the Rain’ by Led Zeppelin on Spotify.” I’ll remember I cancelled Spotify and try to correct myself by speaking over Alexa “No, play it on Amazon Music.” It’s natural to do this – a person wouldn’t miss a beat having the same conversation.
In addition to the half-duplex limitation, Alexa also can’t understand multiple speakers. Even the best user interfaces today employ turn-taking to manage conversations and don’t work at all with more than two speakers in a conversation. For example, if my children interrupt Alexa while she’s playing ‘Fool in the Rain’ and ask her to play “Space Unicorn“, a song that can make you insane after hearing it for the 400th time, I typically respond by shouting “Laa Laa Laa Laa!!” to confuse Alexa and keep her playing good music.
Managing the turn-taking in a conversation with multiple speakers is no simple task. It requires that you listen while you talk and also respond to visual cues (in a face-to-face conversation). For example, Japanese speakers often produce backchannel expressions such as un or sō while their partner is speaking. They also tend to mark the end of their own utterances with sentence-final particles, and produce vertical head movements near the end of their partner’s utterances. See Turn-taking - Wikipedia for a long description of the complexity. The listen-and-talk problem gets exponentially worse when you add more speakers. A bot will need to know whether it’s having a friendly conversation and should wait until the person is done talking, or whether it’s arguing and should cut into the rant. For more detail on that complexity read this article.
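To see how crude today’s turn-taking really is, here is a toy sketch (not Nuance’s implementation, and all thresholds are invented) of the “wait for silence” endpointing that half-duplex voice interfaces rely on: frame energies below a floor count as silence, and after enough consecutive silent frames the device declares your turn “over”.

```python
# Toy endpointing sketch: a half-duplex interface decides the speaker is
# done only after a fixed run of silent audio frames. Thresholds below are
# illustrative assumptions, not real product parameters.

SILENCE_THRESHOLD = 0.1      # assumed energy floor for "silence"
FRAMES_PER_SECOND = 10
END_OF_TURN_SECONDS = 3.0    # the "requisite 3 seconds" of silence

def end_of_turn_index(frame_energies):
    """Return the frame index where the turn is declared over, or None."""
    needed = int(END_OF_TURN_SECONDS * FRAMES_PER_SECOND)
    silent_run = 0
    for i, energy in enumerate(frame_energies):
        silent_run = silent_run + 1 if energy < SILENCE_THRESHOLD else 0
        if silent_run >= needed:
            return i
    return None  # still waiting -- the speaker may just be pausing

# Speech, a 2-second pause mid-thought, more speech, then real silence:
speech = [0.8] * 10 + [0.0] * 20 + [0.9] * 10 + [0.0] * 30
print(end_of_turn_index(speech))  # → 69
```

The 2-second pause doesn’t trigger the endpoint, but neither does anything a human would actually use – overlap, backchannels, prosody – which is exactly the gap this post describes.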
Recognizing these shortcomings is the first step in overcoming them. Nuance R&D is working on these problems and others to transform the way people interface with technology.
Stay tuned for parts 3 and 4 as we catalog the technical problems when telling your customer to “talk to the IVR like you would a human”.
- Part 1 – Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo: Part 1 of What’s left to tackle in voice technology
- Part 3 - Sentiment and Emotion in Voice - “Your customer seems angry - ummm - now what?”
- Part 4 - “I heard you talking about the train schedule - can I help?” - voice interfaces as proactive participants in conversations (without being rude)
Publish Date: January 3, 2018 5:00 AM
Today, many (if not most) companies face a wide range of challenges in managing IT infrastructure, including systems, applications and hardware. This is especially true when it comes to their fleets of printers, scanners, and multifunction printers (MFPs), as well as any related software or workflow solutions.
To be more specific, these challenges can include:
- Inconsistent user interfaces: We have all probably experienced this firsthand – the case where we are forced to adapt to a poorly designed, inconsistent or non-intuitive user interface (UI). It can be a significant issue: Bad UI experiences can lead to low adoption rates, more reliance on training services, decreased productivity and even a disengaged workforce.
- Different technologies, inefficient workflows: Large companies today tend to have many different brands of printers and MFPs, each with their own operations, interfaces and ways to use them. This alone can make it difficult for users to perform important tasks, such as document capture and workflow, where an employee might copy an important document and automatically send it to a central repository. As a result, the entire organization misses a significant opportunity to increase productivity, efficiency and users’ morale.
- Security and compliance risks: Printers actually pose a significant security risk, and now, companies of all sizes are under new pressures to comply with stringent data privacy regulations, such as GDPR. At best, it’s a difficult, time-consuming process. At worst, security breaches can lead to fines, lawsuits and long-term damages to the company’s reputation.
And if companies are still relying on outdated technology – especially older printers and MFPs – they may have a much harder time managing and securing their entire IT environment. As a result, they may be subjecting themselves to even more security risks and potential compliance issues.
A whole new (user) experience
The good news is that there are extremely effective ways to overcome all of these challenges, and in doing so, provide better user experiences, workflows, and security.
For example, external terminals, such as the new Nuance® Edge™ for Copitrak terminal, already provide a much better UI and enhance the speed, functionality and quality of related processes, such as scanning.
It’s an important advantage, especially when you consider terminals like this unify the overall experience, which could be different – and confusing – from device to device. For example, according to recent research from Salesforce.com, 83 percent of users report that “a seamless experience across all devices” is extremely important to them.
By giving employees a unified – and much better – UI, external terminals no longer “force” users to adapt to the different screens and steps they’re sure to find in a mixed-MFP environment. This alone helps employees work much faster, smarter and more effectively.
A better, more intuitive UI can also help employees with critical work tasks such as scanning. For example, today’s external terminals can provide powerful tools to fine-tune resolution, DPI, contrast, brightness, auto-color correction and more. These features and functionality help minimize time lost in steps like image post-processing, saving time and freeing employees to become much more productive.
And when it comes to security, external terminals such as the Nuance Edge come installed with the latest Windows 10 operating system and other security tools. This helps any organization administer the latest network security policies to improve overall security and compliance efforts.
Publish Date: November 29, 2017 5:00 AM
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo: Part 1 of What’s left to tackle in voice technology
Me: “Alexa - what’s the temperature going to be today?”
Alexa: “Right now the temperature is 56 degrees with cloudy skies. Today you can expect clouds and showers with a high of 60 degrees and a low of 44 degrees.”
Me: “What about tomorrow?”
Alexa: [blank stare]
Me: “Ugh - Alexa - what will the temperature be tomorrow?”
Voice as a computer interface has come a long way, but it’s still clunky and nothing like talking to another person. Our amazement with how far the technology has come since voice recognition in IVRs came on the scene in the 1980s can make us forget the remaining problems we have to tackle to get to human-level interactions. In this blog series, I’m going to take each remaining hurdle and talk about where we are today, where we’re going and how Nuance is leading the way.
Part 1: Automatically generating dialog for conversations is a complex problem to solve.
“Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.” Believe it or not, this is a grammatically correct sentence and illustrates why automating natural language processing and conversation is hard. If you’re wondering what the Buffalo sentence means you can click the link and read about it (helpful tip - take an Advil). The tl;dr (too long; didn’t read) version is that the word “buffalo” can be a proper noun, noun, or a verb, so the sentence translates to something about how buffalo from Buffalo bully (aka buffalo) buffalo, etc…
This is obviously an extreme example, but it just goes to show that there is plenty of meaning and “nuance” hidden in the words people choose that computers haven’t been “taught” to understand yet.
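A tiny, hand-rolled illustration (not a real parser or tagset – the lexicon below is an invented toy) shows why this matters for automation: because every lowercase “buffalo” can be tagged as either a noun or a verb, the number of candidate readings multiplies before a parser has even started.

```python
# Toy lexical-ambiguity sketch for the Buffalo sentence. The tags and
# lexicon are illustrative assumptions, not a real NLP tagset.
from itertools import product

LEXICON = {
    "buffalo": {"NOUN", "VERB"},   # the animal / "to bully"
    "Buffalo": {"PROPER_NOUN"},    # the city in New York
}

sentence = "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo".split()

# One tag choice per word; every combination is a distinct candidate reading.
tag_options = [LEXICON[word] for word in sentence]
readings = list(product(*tag_options))

print(len(readings))  # → 32 (2^5: five ambiguous lowercase words)
```

Thirty-two candidate tag sequences for one eight-word sentence – and choosing the single grammatical one requires exactly the kind of world knowledge and context that computers haven’t been “taught” yet.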
Here’s an example that may resonate more with English speakers:
SHE never told him that she loved him. (but someone else did)
She NEVER told him that she loved him. (zero times in their entire relationship)
She never TOLD him that she loved him. (she showed it but never said it out loud)
She never told HIM that she loved him. (but told everybody else)
She never told him that SHE loved him. (but that someone else did)
She never told him that she LOVED him. (only that she liked him and thought he was funny)
She never told him that she loved HIM. (she said she loved someone else)
As a live, English-speaking human, you would catch the subtle changes in meaning just by placing inflection on different words. However, artificial intelligence would have to be taught that kind of nuance.
Another great illustration of the complexity of language can be seen in a video of physicist Richard Feynman, apparently being condescending to his interviewer: Richard Feynman Magnets - YouTube. The interviewer is simply asking Dr. Feynman to explain magnetism to him, and Dr. Feynman refuses and dismisses the question, saying that the interviewer won’t understand. The net of the video is that Dr. Feynman can’t explain magnetism in a meaningful way without a shared frame of reference – and he and the interviewer don’t share one. The interviewer doesn’t have the degrees that Dr. Feynman has, so he equates it to explaining to an alien why his wife is in the hospital with a broken leg. Well, she slipped and fell. Why did she slip and fall? Well, she was walking on ice. Why is ice slippery? …etc., on down into deeper and deeper levels of complexity – for seven minutes – and never answers the magnetism question. (One viewer posted, “This is why no one talks to you at parties.”)
This complexity is at the core of the problem we need to solve for computers to “learn” how to converse with humans. Nuance is making great advances in automating conversation. Currently the state of the art in this area is still Simple Question Answering (essentially Enterprise Search front-ended with Natural Language Understanding). See Paul Tepper’s Post on advances in automating conversation. Nuance is working internally and with research partners on encoding the general knowledge that computers need in order to decipher the buffalo sentence and to have a frame of reference to converse with humans.
So, just in case you didn’t have a frame of reference when reading this blog post, go back and read the Wikipedia entry on the buffalo sentence and watch the Dr. Feynman video. Then you’ll understand the monstrous task we have in bringing voice technology up to human-level interactions.
Next time: Part 2: Sentiment and Emotion in Voice – “Your customer seems angry – umm – now what?”
Publish Date: November 21, 2017 5:00 AM
It seems intuitive that an IVR that features a user-friendly, speech-enabled menu would deliver improved performance and customer experience over antiquated touch-tone systems. Well now we have the research to prove it.
In the past year Nuance hired a third-party research firm to evaluate the IVR customer experience offered by 50 leading companies in the Fortune 250 to see how well their IVRs perform. The results are both surprising and unsurprising. Spoiler alert: the unsurprising part is how well speech-enabled IVRs worked compared to touch-tone.
The research criteria
Using a rating scale of 1–5, third-party researchers evaluated each of the 50 companies across six key criteria to assess the state of their IVR and their ability to help customers resolve issues quickly and painlessly. The six criteria were:
- Ease of use – how easy was the experience?
- Speed – how quickly was the call resolved?
- Speech recognition – if speech enabled, how well did the IVR understand the caller?
- Conversational dialogue – does the IVR engage in personal, back-and-forth dialogue?
- Caller intent – how well did the IVR determine why someone is calling?
- Audio quality – were the menus clear and easy to hear?
Researchers compiled all the results across these six criteria and generated an average score for each IVR.
Across all 50 IVRs evaluated the average score was 2.3, indicating that there is much room for improvement in how IVRs support callers.
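The roll-up described above is simple to state precisely: each IVR gets a 1–5 rating on the six criteria, and those ratings are averaged into one score. A minimal sketch, with invented sample ratings (only the method mirrors the study):

```python
# Sketch of the study's scoring method: average six 1-5 criterion ratings
# into a single IVR score. The sample ratings below are hypothetical.

CRITERIA = ["ease_of_use", "speed", "speech_recognition",
            "conversational_dialogue", "caller_intent", "audio_quality"]

def ivr_score(ratings):
    """Average an IVR's 1-5 ratings across the six criteria."""
    missing = set(CRITERIA) - set(ratings)
    if missing:
        raise ValueError(f"unrated criteria: {missing}")
    return sum(ratings[c] for c in CRITERIA) / len(CRITERIA)

sample = {"ease_of_use": 3, "speed": 2, "speech_recognition": 4,
          "conversational_dialogue": 1, "caller_intent": 2, "audio_quality": 4}
print(round(ivr_score(sample), 2))  # → 2.67
```

An average of 2.3 out of 5 across all 50 companies means the typical IVR is rated below the midpoint on most of these criteria.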
The unsurprising results
No surprise to us is that the IVRs in the top five performing industries below scored a whopping 35% higher than the bottom five. Is your company in one of these industries?
What makes the leaders stand out? No surprise that the reason for their improved IVR performance was that they invested in speech-enabling their IVRs at a much higher rate. 67% of the top performing companies adopted speech-enabled IVRs.
Speech-enabled IVRs — whether standalone or combined with Dual-tone Multi-frequency signaling (DTMF) — provide higher quality experiences than DTMF systems alone. As shown in the graphic below, companies with speech IVRs had significantly higher average scores:
The surprising results
The data above was no surprise at all. But what was surprising? Two things stand out:
First, 53% of the companies still employ an old-fashioned touch-tone IVR. Yes, DTMF and “Press 1 for Service” still live on in the majority of companies we called. That’s great news for anyone who loves the ’80s but not so great for everyone looking to move into the future.
Second, industries that are heavily reliant on their IVR and contact center fell into the lower performing category. Companies that sell Insurance (Financial Services), Healthcare, and Health Insurance all scored significantly below the average. Given how important the phone is for engaging customers in these industries, it is curious to see them perform below average.
Is your company in the bottom tier? Still rely on DTMF? Then please read on!
Say “Yes” to speech
With the rise of voice-activated smart assistants in our phones, cars, and homes, the power of voice is growing with no sign of slowing down. So why greet your customers with technology from 1988? Your IVR is one of your most important channels, and it makes sense to start the move to speech today. Today’s modern, conversational IVRs use powerful speech recognition and natural language so callers can engage the IVR and simply say whatever they’d like – in their own words – and be directed to the right resource. Imagine your customers’ delight when they can stop pushing buttons and start using their own words.
Check out the full research infographic to review the results in more depth, and then contact Nuance to see how we can help you be a top performer.
Publish Date: November 14, 2017 5:00 AM
It was a month like we’d never seen before. As we watched Hurricanes Harvey, Irma, Jose and Maria impact the US and Caribbean, and a massive earthquake hit Mexico City, a series of questions may have run through our heads. How can we help those people? How do they rebuild? How can we better prepare? As a society, we’ve talked a lot in recent years about upgrading our infrastructure, and that goes beyond roads, bridges and power grids. It’s likely owing to my profession, but I believe modernizing the communications that can be leveraged during disasters like these can literally save lives in threatened communities.
The timing seems right – what used to be unsophisticated outbound technologies like “robo calls” are now going through a renaissance as more advanced vendors orchestrate multiple proactive engagement channels like text messaging, push notifications, email and automated voice, coordinating with IVR and digital through an omni-channel fabric and improving ease-of-use through cloud platforms. Using outbound notifications before, during and after an emergency like a tornado or flood should be seen as the first line of defense for local governments. Often, the first thing that happens as regions gird themselves for a disaster is a massive increase in inbound calls to customer service lines. Citizens demand timely answers about what they should do, and call centers can quickly become overwhelmed as wait times grow.
By combining voice, text and other channels in an integrated fashion, residents get the information they need through the channels they prefer, extending the reach of critical messages like incident preparedness, evacuation routes and shelter locations.
We all know this type of communication is important – and may become increasingly vital – so, what should we look for in an outbound communications platform?
- Security and compliance: Security must be the number one priority. Be sure your vendor is protecting sensitive data and staying in regulatory compliance with PCI Level 1, ISO 27001 certification and HIPAA-compliant data centers.
- Self-service messaging: You’ll need an intuitive user interface so you can easily record a message, use text-to-speech or choose from pre-recorded messages.
- Omni-channel contact strategy: It’s critical that your solution works across channels and supports voice, email, text and smartphone push notification. The ability to transfer to a representative or call center directly from a voice message is a bonus.
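The omni-channel strategy in that last point can be sketched in a few lines: try each channel a resident has opted into, in order of preference, until one delivery succeeds. This is a hypothetical sketch – the channel names and the send stub are illustrative, not a real Nuance API.

```python
# Hypothetical omni-channel fan-out: attempt delivery on each opted-in
# channel in priority order and stop at the first success. The send()
# stub stands in for real channel gateways (SMS, push, email, voice).

CHANNEL_PRIORITY = ["push", "sms", "email", "voice"]

def send(channel, contact, message):
    """Stub delivery; a real platform would call the channel's gateway."""
    return channel in contact["opted_in"]   # pretend opted-in sends succeed

def notify(contact, message):
    """Return the first channel that delivered, or None if all failed."""
    for channel in CHANNEL_PRIORITY:
        if channel in contact["opted_in"] and send(channel, contact, message):
            return channel
    return None

resident = {"opted_in": {"sms", "voice"}}
print(notify(resident, "Evacuation route 9 is open; shelter at the gym"))
# → sms (the first preferred channel this resident opted into)
```

A real platform would layer delivery receipts, retries and escalation (e.g., fall through to an automated voice call if a text goes unacknowledged) on top of this basic fan-out.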
We don’t know when or where “the next one” is coming, but we do have concrete steps we can take to limit loss of life and property. Now is the time to take that first step.
Publish Date: November 7, 2017 5:00 AM
Halloween is a time of frights and scares. Zombies, goblins, witches, and monsters are let loose on the public to scare and haunt them. And a good scare is tons of fun this time of year and makes us scream in delight. Good scares get our adrenaline going. But on the flipside, bad experiences that cause us to scream with anger get our blood boiling. That’s never good, but unfortunately it happens every day when consumers receive frustrating and ‘scarily bad’ customer service experiences.
Read on, if you dare, for three of the scariest customer service experiences we believe are guaranteed to make any customer scream.
“Push 1 for mortgage enquiries…”
There it is. Popping out of your phone like a monster bursting from behind a wall. The outdated IVR menu. (Cue the Jamie Lee Curtis “Halloween” scream!) In a world of cool, new voice-enabled applications and assistants, the old-fashioned IVR terrifies your customers.
Nobody wants to wade through endless mazes of touch-tone options and push buttons like it’s 1978. Customers will scream in frustration. Surprisingly, many enterprises are still using this old-fashioned technology today. Our research into 50 of the Fortune 250 IVRs shows that a scarily high 53% of the companies still employ an old-fashioned touch-tone IVR. Hard to believe and yet so easy to fix.
Today’s modern conversational IVRs delight callers with powerful speech recognition and natural language so callers can simply say whatever they’d like – in their own words – to get directed to the right resource. Satisfaction goes up, and frustrating screams go down.
What was your second-grade teacher’s dog’s name?
Nothing sets me off quite like the random challenge questions to prove I am who I say I am. Most of the time they ask me something I answered five years ago and promptly forgot, or worse, something that is not hard to find out, like my mother’s maiden name. Of course, now that I am on the phone with an agent, I am the one who looks stupid. “I don’t know… Spot?”
Fortunately for everyone, PINs, passwords and challenge questions are a thing of the past. Call centers, IVRs and virtual assistants all over the world are adding secure biometrics to ensure the person is who they claim to be. With secure voice biometrics, customers can simply state a pass phrase they don’t have to remember, or even be recognized just from a normal conversation. In addition, new biometric modalities enable people to use their face, fingerprint, iris and even unique behaviors to prove their identity – all without having to memorize anything! “I just remembered. Rover!”
“What’s happening with my flight/package/credit card?”
Too many times a customer must proactively call a company to enquire about an issue they are having. And nothing causes greater frustration and a maddening scream like a customer service agent acknowledging, “Oh yes. I see your flight is delayed.” Huh? So they knew about it? Well, then why didn’t they let the customer know in advance and prevent the phone call?
It doesn’t need to be this way. We live in a world of powerful push notifications through multiple channels, where sending a text or email costs a fraction of a penny. Why don’t more companies get on board with proactive outbound communications? Many do, but only for limited scenarios like overdue bills or appointment reminders. They fail to connect the whole customer experience due to siloed service channels.
A proactive outbound platform connected to the inbound IVR platform ensures customers are notified in advance of issues like flight delays or suspicious charges on their credit card. A well-timed text or email ensures the right outcome and also increases customer satisfaction by preventing them from calling your contact center, which reduces operational expenses. Today’s consumers want to be notified proactively; they opt in for communications that help reduce their effort. New technology allows organizations to both notify consumers and engage in a two-way conversational text dialogue using smart, natural language understanding.
Beams, not screams
Being scared and having a good scream is fun – in the right situation. Calling service channels should not elicit a response best reserved for a Friday night horror flick. With the right investments and planning, organizations can offer their customers a service experience that leaves them beaming - not screaming.
Publish Date: October 31, 2017 5:00 AM