Sunil Kappal - ContactCenterWorld.com Blog
According to the market research firm MarketsandMarkets, the speech analytics industry will grow to USD 1.60 billion by 2020, at a compound annual growth rate (CAGR) of 22% from 2015 to 2020. Today's omnichannel world consists of voice, email, chat, social channels, and surveys, and each channel has its own importance.
Therefore, no customer-centric organization can afford to ignore the information that can be gleaned from these customer interactions.
This article discusses a cutting-edge use of speech analytics output: coupling it with Monte Carlo simulation, a computerized mathematical technique that allows organizations to account for risk. For the purposes of this article I will focus on the healthcare industry, which has reported (The Economist, May 31, 2014) a staggering $275 billion swindle.
To use Monte Carlo simulation in conjunction with the speech analytics output, we will use a “stochastic model” for the simulation, that is, a model that involves probability or randomness.
Application of Monte Carlo Simulation to identify probability of fraud by Service Providers
The expected output of this simulation is the likelihood of fraudulent activity, based on the key customer interactions that indicate potential “Fraud Outcomes”.
Identifying the fraudulent interactions
Speech analytics allows its users to query media files to identify emerging topics. The above scenarios can be created within any speech analytics application. The user can also use provider-related metadata (additional information about a particular customer interaction) to understand how the interactions matching the above scenarios are distributed for a particular provider.
How does it work? Creating the Model
Scenario 1 + Scenario 2 + Scenario 3 + Scenario 4 + Scenario 5 = Fraud?
Let’s say we have over a million customer interactions covering a combination of 5 scenarios (refer to the scenario grid), and we pick interactions at random to decide whether each one has a high likelihood of being fraudulent. No two scenarios will have precisely the same number of fraud manifestations. However, if we have an idea of the range of occurrences for each scenario, then we can create a Monte Carlo simulation to better understand the probability of a fraud scenario.
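The idea can be sketched in a few lines of Python. This is only a minimal illustration, not the Excel model used for this article: the per-scenario ranges, the review threshold of 100, and the choice of a uniform distribution within each range are all assumptions made for the example.

```python
import random

# Hypothetical (min, max) counts of fraud-indicator interactions per scenario
# for one provider. These numbers are illustrative only.
SCENARIO_RANGES = {
    "scenario_1": (0, 40),
    "scenario_2": (5, 25),
    "scenario_3": (0, 15),
    "scenario_4": (10, 60),
    "scenario_5": (0, 30),
}


def simulate_totals(ranges, n_trials=1000, seed=42):
    """Draw each scenario count uniformly from its range and sum them, once per trial."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        total = sum(rng.randint(low, high) for low, high in ranges.values())
        totals.append(total)
    return totals


def probability_above(totals, threshold):
    """Share of trials whose total indicator count exceeds the threshold."""
    return sum(1 for t in totals if t > threshold) / len(totals)


totals = simulate_totals(SCENARIO_RANGES)
# Assumed review threshold: 100 fraud-indicator interactions in total.
print(f"P(total > 100) ~ {probability_above(totals, 100):.3f}")
```

Each trial is one equally likely draw, so the share of trials above the threshold serves as an estimate of the probability. Running the same function with each provider's own ranges would give figures that can be compared across providers.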
The image below shows the simulation that I created in Excel. It illustrates how the model was built using 1000 fraud simulations, where each simulation is equally likely to happen.
The above simulation was run for multiple providers based on their fraud scenario % (it is advisable to pick the outliers, keeping the fraud indicator scenario % in mind). Once the simulations for the top providers were created, I was able to showcase the providers who are prone to getting into fraud-related discussions with their customers.
Note: The above frequency graphs are based on the Monte Carlo simulation, which gives a probabilistic perspective on the Fraud Indicator conversations that might lead to an actual fraud incident. The outputs are based on 1000 simulations, where each simulation is equally likely to happen.
By looking at the above results, one can easily isolate those providers or scenarios that can result in a potential fraud incident before it happens, and mitigate a potential risk to the consumer, the brand, and the overall reputation of any healthcare service provider.
Author: Sunil Kappal - Senior Analytics Manager
Publish Date: January 25, 2017 5:31 PM
In my past article "Detecting Healthcare Fraud Using Speech and Data Analytics" I described at length the two most robust and common techniques used to identify and predict fraud. This article, however, focuses purely on the most common data mining methodology, CRISP-DM (Cross-Industry Standard Process for Data Mining), which can help organizations mine their data using some common techniques to identify fraud.
Before we proceed, we should understand what data mining is.
“The process of discovering meaningful new relationships, patterns and trends by sifting through data using pattern recognition technologies as well as statistical and mathematical techniques.” – Gartner Group
Let’s look at some very common and basic fraud detection techniques, chosen according to the business motivation of either:
- Predicting or classifying a fraud
- Grouping or finding affinities/associations
Techniques used to predict or classify a fraud
- Regression algorithms: These algorithms predict a numeric outcome. The most commonly used algorithms are:
  - Neural Networks
  - General Linear Modeling
- Classification algorithms: These algorithms predict symbolic outcomes. The most commonly used algorithms are:
  - Logistic Regression
Techniques used to group and associate fraudulent transactions/events
- Grouping algorithms: K-means and Factor Analysis are among the most common and popular clustering and grouping algorithms used to detect fraud.
- Association algorithms: The Apriori algorithm is one of the most popular association algorithms used in the healthcare industry for frequent item set mining and association rule learning over a transaction database.
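To make frequent item set mining a little more concrete, here is a minimal Python sketch of the first two passes of an Apriori-style search: count single items, discard the infrequent ones, then count only the pairs built from the survivors. The claims, the procedure codes "A" to "D", and the minimum support of 0.4 are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical claims: each set holds the procedure codes billed together.
claims = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "D"},
]


def frequent_pairs(transactions, min_support):
    """Return {pair: support} for pairs whose support meets min_support."""
    n = len(transactions)
    # Pass 1: keep single items that are frequent on their own.
    item_counts = Counter(item for t in transactions for item in t)
    frequent_items = {i for i, c in item_counts.items() if c / n >= min_support}
    # Pass 2: count only pairs built from frequent items (the Apriori pruning step).
    pair_counts = Counter()
    for t in transactions:
        kept = sorted(t & frequent_items)
        pair_counts.update(combinations(kept, 2))
    return {p: c / n for p, c in pair_counts.items() if c / n >= min_support}


pairs = frequent_pairs(claims, min_support=0.4)
for pair, support in sorted(pairs.items()):
    print(pair, round(support, 2))
```

A full Apriori implementation would continue to triples and larger sets and then derive association rules. The sketch stops at pairs to keep the pruning idea visible.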
Use Case – 1: Regression
- Predict the expected claim value and compare it with the actual value of the submitted claim.
- Cases falling outside the expected range should undergo rigorous scrutiny.
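The two steps above can be sketched as follows, assuming a single predictor (units of service) and a simple least-squares line. The historical figures and the cut-off of three standard deviations are hypothetical; a real model would use many more claim attributes.

```python
from statistics import mean, pstdev

# Hypothetical history: (units of service, paid claim amount in USD).
history = [(1, 110), (2, 205), (3, 290), (4, 410), (5, 495), (6, 600)]


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - slope * x_bar, slope


intercept, slope = fit_line(history)
# The spread of the historical residuals defines the "expected range".
residuals = [y - (intercept + slope * x) for x, y in history]
tolerance = 3 * pstdev(residuals)  # assumed cut-off of three standard deviations


def needs_review(units, billed):
    """True when the billed amount falls outside the expected range."""
    expected = intercept + slope * units
    return abs(billed - expected) > tolerance


print(needs_review(4, 400))  # close to the fitted line
print(needs_review(4, 900))  # far above the expected value
```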
Use Case – 2: Decision Trees
- Create characteristics or provider profiles with fraudulent behaviors and indicators.
- Extract cases meeting the historical characteristics of fraud
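As a rough illustration, the sketch below builds a decision stump, the one-level special case of a decision tree, by searching for the single feature and threshold that best separate historical fraud from non-fraud profiles (lowest weighted Gini impurity). The two features and all profile values are made up for the example. A production model would grow a deeper tree on real profile data.

```python
# Hypothetical provider profiles:
# (claims per patient, share of high-cost codes, confirmed fraud?)
profiles = [
    (2.1, 0.10, False),
    (2.4, 0.15, False),
    (3.0, 0.20, False),
    (2.8, 0.55, True),
    (6.5, 0.30, True),
    (7.2, 0.60, True),
    (6.9, 0.25, True),
    (2.2, 0.12, False),
]


def gini(labels):
    """Gini impurity of a list of booleans."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, n_features):
    """Find the (score, feature, threshold) with the lowest weighted Gini impurity."""
    best = None
    for f in range(n_features):
        for threshold in sorted({row[f] for row in rows}):
            left = [row[-1] for row in rows if row[f] <= threshold]
            right = [row[-1] for row in rows if row[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


score, feature, threshold = best_split(profiles, n_features=2)
# Decide which side of the split is dominated by historical fraud cases.
right_labels = [row[-1] for row in profiles if row[feature] > threshold]
fraud_side_is_right = sum(right_labels) * 2 > len(right_labels)


def matches_fraud_profile(case):
    """True when a new case lands on the side of the split dominated by fraud."""
    return (case[feature] > threshold) == fraud_side_is_right


print(feature, threshold)
print(matches_fraud_profile((3.1, 0.45)))
```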
Use Case – 3: Clustering and Association
- Group transactions by provider or transaction type
- Find grouped events by using association algorithms
- Perform outlier identification analysis for further scrutiny
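The grouping and outlier steps can be sketched in Python as below. Note that this is not K-means: it simply groups transactions by provider and applies a median absolute deviation (MAD) test to the per-provider averages. The transactions and the cut-off of 3.5 MAD units are assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical transactions: (provider id, billed amount in USD).
transactions = [
    ("P1", 120), ("P1", 135), ("P2", 110), ("P2", 140),
    ("P3", 125), ("P3", 130), ("P4", 480), ("P4", 520),
    ("P5", 115), ("P5", 150),
]


def average_by_provider(rows):
    """Group transactions by provider and average the billed amounts."""
    groups = defaultdict(list)
    for provider, amount in rows:
        groups[provider].append(amount)
    return {provider: mean(amounts) for provider, amounts in groups.items()}


def mad_outliers(values_by_key, cutoff=3.5):
    """Flag keys whose value is far from the median, measured in MAD units."""
    values = list(values_by_key.values())
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    return [k for k, v in values_by_key.items() if abs(v - centre) / mad > cutoff]


averages = average_by_provider(transactions)
print(mad_outliers(averages))
```

The median-based test is used here because a single extreme provider shifts the median far less than it would shift a mean and standard deviation.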
Benefits of Fraud Detection Using CRISP-DM
- Provides a systematic set of tools and methods to detect and prevent fraud
- Helps to maximize the investigative efforts in auditing fraudulent transactions
- Results in higher recoupments
- Continually updates the model to identify new emerging abuse patterns
Publish Date: February 17, 2016 7:08 PM