Predictive analytics in insurance claims

Published: October 13, 2018

Introduction to predictive analytics

Insurance is one of the oldest industries to use statistics and modeling. Actuarial science evolved from the acclaimed work of John Graunt and Edmund Halley; in 1693, Halley estimated age at death on the basis of mortality statistics from Breslau.

Predictive analytics in insurance uses historical data to identify patterns and trends and to predict unknown future outcomes. Competitive pressure and technology trends have pushed insurers to apply predictive modeling to various processes for more profitable and efficient operations.

Although predictive analytics can be applied across the entire insurance value chain, we will focus on claims, as 80% of premium revenue is spent on claims.

It is still debated whether predictive analytics will become an automated commodity or will always need to be combined with human decision making. Either way, it is widely accepted that predictive analytics will change the way insurers conduct their business. The use cases below, drawn from several insurance lines, highlight how predictive analytics helps in business decisions.

Predictive analytics in insurance

The insurance industry is rich in data. Data is processed into useful information to identify patterns and answer some fundamental questions about the business. The authors of "Analytics at Work" frame these questions succinctly along two dimensions, namely time frame and innovation.

Time frame: Are we looking at the past, present or future?

Innovation: Are we working with known information or gaining new insight?

Once patterns are identified, they are interpreted as a function of variables. The function is analyzed over the data set. This is essentially modeling the behavior and making inferences. Extrapolating the behavior over a larger set, or simulating the behavior under certain criteria, leads to predicted outcomes. Predicted outcomes give organizations better insight and foresight for making business decisions.

Example insurance claim scenario and analytics

Here's an example from the insurance industry that highlights how claims is 'the moment of truth.'

Many auto accidents happen on late Friday nights.

The drivers of these cars are of a certain age group. Specific areas of the city are more prone to such accidents. Mostly, these accidents happen on cloudy or rainy nights when visibility is low. Certain makes and models of cars sustain more damage than others in such accidents.

These observations come from relating variables such as the day of the incident, the time, and the driver's age group, and associating them with external information such as location, behavior patterns, weather, and vehicle type.

The role of predictive analytics is to establish associations between the variables, understand the pattern, model the pattern as a function of these variables, simulate the pattern on a larger data set to observe the emerging inferences, and use these inferences in decision making. Up to this point, the inferences are drawn from structured data.
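As a rough illustration of the first step, associating variables in structured claims data, the Python sketch below counts how often combinations of attributes occur among late Friday night claims. The records, field names, and the definition of "late night" are all hypothetical and chosen only to mirror the scenario above; a real analysis would run over the insurer's own claims data.

```python
from collections import Counter

# Hypothetical claim records; field names and values are illustrative only.
claims = [
    {"day": "Fri", "hour": 23, "age_group": "18-25", "weather": "rain"},
    {"day": "Fri", "hour": 22, "age_group": "18-25", "weather": "cloudy"},
    {"day": "Tue", "hour": 9, "age_group": "36-50", "weather": "clear"},
    {"day": "Fri", "hour": 23, "age_group": "26-35", "weather": "rain"},
    {"day": "Sat", "hour": 14, "age_group": "36-50", "weather": "clear"},
    {"day": "Fri", "hour": 21, "age_group": "18-25", "weather": "rain"},
]


def is_late_night(hour):
    """Assumed definition of 'late night': 21:00 up to, but not including, 04:00."""
    return hour >= 21 or hour < 4


def segment_counts(records, *fields):
    """Count how many records share each combination of the given fields."""
    return Counter(tuple(r[f] for f in fields) for r in records)


# Claims that occurred late on a Friday night.
late_friday = [c for c in claims if c["day"] == "Fri" and is_late_night(c["hour"])]
late_friday_share = len(late_friday) / len(claims)

# Which (age group, weather) combination is most common among those claims?
by_age_weather = segment_counts(late_friday, "age_group", "weather")
top_segment, top_count = by_age_weather.most_common(1)[0]

print(f"Late Friday share of all claims: {late_friday_share:.0%}")
print(f"Most common segment: {top_segment} ({top_count} claims)")
```

Counting co-occurrences like this only surfaces candidate associations; modeling them as a function of the variables and testing them on a larger data set are separate steps.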

There is also a whole range of unstructured data that provides vital information. What if the driver was returning from a party, was already being treated for vision problems, and was not supposed to drive when visibility is low? What is the nature of claims payouts in these accidents? What is the expected payout and processing time for similar claims? What does this information tell the actuarial team that would help policy underwriting? These are the associations that help insurers connect information intelligently and make business decisions with insight and foresight.

Key performance indicators (KPI)

For insurers embarking on an analytical journey, a specific business problem is a good target. It is important to look at the existing data to identify potential business problems that impact the KPIs. The KPIs of insurance claims:

  • Reduce claims cycle time
  • Increase customer satisfaction
  • Combat fraud
  • Optimize claims recovery
  • Reduce claim handling costs

Reduce claims cycle time

A cursory look at one insurer's data revealed that the claims cycle time was abnormally high in one of the geographies. This data was compared across insurance lines to identify patterns. Even though all claim types had a significantly higher cycle time in that geography, a closer dissection of the data revealed that the cycle time was almost twice as long for a particular type of claim. This led to a few interesting questions.

  1. Is this pattern consistent across all claims or is there any specific characteristic of a claim that increases the cycle time?
  2. What are the sub-activities within the claim processing that consume a longer cycle time for this particular type of claim?

By investigating the claim characteristics and sub-activities within a claim, a pattern was observed. Even though claims of a certain type consumed more days for closure, not all the claims of that type had a longer cycle time. This revealed that there are certain characteristics about the claims that led to longer cycle time and thereby increased costs and decreased customer satisfaction.

Collision claims can have multiple features, such as collision only, bodily injury, glass only, and other damage. We observed that whenever a bodily injury had occurred in a claim, the cycle time increased. The cycle time also increased in proportion to the number of bodily injuries in a claim. This led to the understanding that addressing the root cause of the cycle time increase in bodily injury claims can significantly decrease the average cycle time. In addition, there are best practices in other geographies that can be emulated.
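The relationship between bodily injuries and cycle time can be checked with a few lines of Python. The sketch below is a minimal example on made-up numbers: it averages cycle time by injury count and fits a least-squares slope, which estimates the extra days per additional injury. The data and function names are assumptions for illustration, not figures from the insurer described above.

```python
from collections import defaultdict

# Hypothetical (number of bodily injuries, cycle time in days) pairs.
claims = [(0, 10), (0, 14), (1, 30), (1, 34), (2, 52), (2, 56), (3, 74)]


def mean_by_key(pairs):
    """Average the second element of each pair, grouped by the first."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(groups.items())}


def least_squares_slope(pairs):
    """Slope of the ordinary least-squares line y = a + b * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    s_xx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return s_xy / s_xx


mean_cycle_time = mean_by_key(claims)
days_per_injury = least_squares_slope(claims)

for injuries, days in mean_cycle_time.items():
    print(f"{injuries} injuries: {days:.1f} days on average")
print(f"Estimated extra days per injury: {days_per_injury:.1f}")
```

A steadily rising group average together with a clearly positive slope is what "cycle time increased in proportion to the number of bodily injuries" would look like in the data.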

The analysis showed that the following questions have to be explored in order to take corrective and preventive actions.

  1. How many adjusters are available to assess that particular type of claim?
  2. Are the adjusters utilized efficiently? What is the skill set of the adjusters? Are there any gaps that can be addressed by training or knowledge base?
  3. What is the idle time among adjusters? Are there any activities that leave adjusters waiting for information?

Increase customer satisfaction

A byproduct of efficient claims handling is increased customer satisfaction, in addition to reduced claim handling costs. With current technologies and social networks, bad publicity spreads much faster and hits the top line of insurers: their customer retention and revenue.

Where a strong correlation exists between the claim payout and the cycle time, the primary inference is that large or complex claims take a long time. In this sample, we found that even though there was a strong correlation between claims payout and cycle time across geographies, there was weak or no correlation within a geography across claims of the same type. This observation confirmed that other characteristics of a claim affect the cycle time independently of the claims payout.
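The contrast between the pooled and the within-geography correlation can be reproduced on a small synthetic data set. In the Python sketch below, the geography labels and all payout and cycle time values are invented so that the effect is easy to see; the Pearson correlation is computed by hand to keep the example self-contained.

```python
from math import sqrt

# Hypothetical claims of one type: geography -> list of (payout, cycle time in days).
claims_by_geo = {
    "Geo A": [(1000, 10), (2000, 12), (3000, 12), (4000, 10)],
    "Geo B": [(6000, 30), (7000, 32), (8000, 32), (9000, 30)],
}


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    s_xx = sum((x - mean_x) ** 2 for x, _ in pairs)
    s_yy = sum((y - mean_y) ** 2 for _, y in pairs)
    return s_xy / sqrt(s_xx * s_yy)


# Correlation when all geographies are pooled together.
pooled = [pair for pairs in claims_by_geo.values() for pair in pairs]
pooled_r = pearson(pooled)

# Correlation inside each geography separately.
within_r = {geo: pearson(pairs) for geo, pairs in claims_by_geo.items()}

print(f"Pooled correlation: {pooled_r:.2f}")
for geo, r in within_r.items():
    print(f"{geo} correlation: {r:.2f}")
```

In this toy data the pooled correlation is high only because one geography has both larger payouts and longer cycle times; inside each geography, payout tells us nothing about cycle time, which is the pattern described above.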

This raised a few other questions outside of claims too. Do customer retention and repeat business depend on the cycle time? How long a cycle time can an insurer afford and still be competitive? When should the cycle time be reduced, and when can the existing cycle time be tolerated?

Combat fraud

Anyone with an eye for investigation can find many hidden clues within data. Claimants, physicians, adjusters, and body shop personnel are all individuals whose data reveals vital characteristics about their behavior and about the other individuals or entities they associate with.

Analyzing the data across claims could reveal multiple identity patterns, a very significant characteristic of a fraudster. Multiple identities could result from typos made by data entry personnel during claims intake, or they could indicate potential fraud. Separating entities from individuals and analyzing the data helps narrow the search down to the individuals who repeatedly claim under various policies.
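One simple way to surface possible multiple identities is approximate name matching. The Python sketch below uses the standard library's `difflib.SequenceMatcher` to flag pairs of claimant records that share a date of birth and have very similar names. The records, the field names, and the 0.85 threshold are assumptions made for illustration; a flagged pair could be a typo or a potential fraud and still needs human review.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical claimant records captured at claims intake.
claimants = [
    {"claim_id": "C1", "name": "Jonathan Smith", "dob": "1980-04-12"},
    {"claim_id": "C2", "name": "Jonathon Smith", "dob": "1980-04-12"},
    {"claim_id": "C3", "name": "Maria Lopez", "dob": "1975-09-30"},
    {"claim_id": "C4", "name": "Jon Smith", "dob": "1980-04-12"},
]


def name_similarity(a, b):
    """Similarity ratio in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def possible_duplicates(records, threshold=0.85):
    """Return claim-id pairs with the same date of birth and near-identical names."""
    pairs = []
    for a, b in combinations(records, 2):
        same_dob = a["dob"] == b["dob"]
        if same_dob and name_similarity(a["name"], b["name"]) >= threshold:
            pairs.append((a["claim_id"], b["claim_id"]))
    return pairs


flagged = possible_duplicates(claimants)
print(flagged)
```

The threshold trades recall for precision: lowering it would also catch shortened names such as "Jon Smith", at the cost of more false matches to review.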

This gets even more interesting when insurers turn to social networks and leverage the vast amounts of data about their policyholders, vendors, and associates, and about their associations and interactions. Many insurers have leveraged structured data to a certain extent, and there are many opportunities to exploit the unstructured data around us.

Optimize claims recovery

Insurers have the dual role of keeping their promises to pay and ensuring that unwarranted payments are recovered appropriately. In auto insurance (both personal and commercial), there are opportunities for salvage, subrogation, and reinsurance to recover the payments partially or fully. The data on claims history, vendors, and claim costs holds important information that helps insurers identify which vendors are cost-effective for which kinds of services.

How can the vendors be assessed on their performance? What impact does the vendor service have on customer satisfaction?

It is important to have this information early in the claims life-cycle to reduce overall costs.

Reduce claim handling costs

Controlling the Loss Adjustment Expenses (LAE) has been a major area of focus across geographies and insurance lines. A snapshot of the LAE ratio of the top P&C insurers in the US is given below:

[Figure: LAE ratio of the top P&C insurers in the US]

This is a very good example: although insurers have decreased their combined ratios over the years, they have become less profitable. The reason is a decrease in interest rates, which has reduced the returns on their investment portfolios. This is also why insurers were more profitable in the last decade in spite of higher combined ratios. The analysis can be done within the insurer's own data, identifying patterns and activities that need to be optimized, or it can be extended to include external parameters that affect the outcome.

Finally, bringing this all together with a simulation model and a "what if analysis" would provide valuable insights into the future as well. With this analysis, insurers can predict future outcomes in terms of loss ratio, combined ratio, and profits based on claims volume, types of claims, adjuster efficiency and similar parameters.
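A "what if analysis" of this kind can start from a very small model. The Python sketch below is one possible simplification, not a description of any particular product: it assumes the combined ratio is the sum of the loss ratio, the LAE ratio, and the expense ratio, all measured against earned premium, and that profit is the underwriting result plus investment income. Every field name and number is hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Scenario:
    """Hypothetical inputs for one what-if scenario."""
    earned_premium: float
    claim_count: int
    avg_payout: float
    lae_per_claim: float  # proxy for adjuster efficiency
    underwriting_expenses: float
    invested_assets: float
    investment_yield: float


def evaluate(s):
    """Compute ratios and profit under the simplified definitions above."""
    losses = s.claim_count * s.avg_payout
    lae = s.claim_count * s.lae_per_claim
    loss_ratio = losses / s.earned_premium
    lae_ratio = lae / s.earned_premium
    expense_ratio = s.underwriting_expenses / s.earned_premium
    underwriting_result = s.earned_premium - losses - lae - s.underwriting_expenses
    investment_income = s.invested_assets * s.investment_yield
    return {
        "loss_ratio": loss_ratio,
        "lae_ratio": lae_ratio,
        "combined_ratio": loss_ratio + lae_ratio + expense_ratio,
        "profit": underwriting_result + investment_income,
    }


base = Scenario(
    earned_premium=1_000_000, claim_count=100, avg_payout=7_000,
    lae_per_claim=1_000, underwriting_expenses=200_000,
    invested_assets=2_000_000, investment_yield=0.05,
)
# What if adjusters become more efficient while interest rates fall?
what_if = replace(base, lae_per_claim=800, investment_yield=0.02)

base_result = evaluate(base)
what_if_result = evaluate(what_if)
print(f"Base:    combined ratio {base_result['combined_ratio']:.2f}, "
      f"profit {base_result['profit']:,.0f}")
print(f"What-if: combined ratio {what_if_result['combined_ratio']:.2f}, "
      f"profit {what_if_result['profit']:,.0f}")
```

In this toy scenario the combined ratio improves while profit falls, because the lower investment yield outweighs the underwriting gain. A fuller simulation would vary claims volume, claim types, and adjuster efficiency over ranges rather than single values.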

How can Virtusa help?

Virtusa brings together a depth of technology coupled with the breadth of industry knowledge to partner with clients in looking into the future. The methodology, tools, and techniques that are part of Virtusa's predictive analytics landscape are detailed in this section.


Virtusa has a V-PREDICT model that is used to identify, plan, and implement the predictive analytics solution in any business area of the insurer. Any predictive analytics journey is only as good as the data underlying the analysis.

V-PREDICT comes with a set of tools for data extraction and cleansing that prepare the data for further steps in the analysis.

A typical architecture covering the full range of possibilities with analytics and big data is represented below. Though it is completely feasible to implement it end-to-end, Virtusa takes a pragmatic approach and uses only the required elements of this architecture. For example, for an insurer whose structured data has to be analyzed before moving on to social and big data, Virtusa would bring in only the business analytics and data visualization components.


Predictive analytics in insurance is a clear differentiator for insurers to be competitive and expand their market share. Virtusa's unique blend of technology and business has manifested in the insurance claims solution. The differentiator is in letting the insurers retain the autonomy of their business processes and technology choices and still gain insight into the future.
