Applying Machine Learning to Actuarial and Pricing Workflows - Actuarial Edition

By Akur8

The application of machine learning to insurance pricing is accelerating with advances in technology and greater access to machine learning platforms. Actuarial judgement, applied to machine learning output, can produce better insurance pricing results. As the use of machine learning in actuarial analysis grows, we recommend these guidelines to maximize the benefit of incorporating these techniques into insurance pricing applications.

Please note: This piece is written by Akur8 and published by the Actuaries Institute Australia. The articles aim to stimulate discussion on important, emerging issues related to ICA2023. Opinions expressed in this publication are the opinions of the articles' author and do not represent those of either the Actuaries Institute Australia, the International Actuarial Association or the 2023 International Congress of Actuaries Organising Committee, or its members, directors, officers, employees, agents, or that of the employers of the authors.

Limited Data and the Intended Purpose of Machine Learning

Resistance to using modeling or machine learning techniques is often blamed on a lack of sufficient data. This resistance is sometimes based on assumptions about machine learning algorithms and the intended purpose of their output. Actuarial Standard of Practice No. 23 Data Quality 4.4.1[1] states that an actuary should disclose “any limitations on the use of the actuarial work product due to uncertainty about the quality of the data [...]” This, combined with the understanding that “Generalized Linear Models (GLMs) assign full credibility to the data”[2], means that model output may have limitations, as available modeling datasets may not be fully credible.

However, penalized regression techniques do not carry this assumption of full credibility. The link between penalized regression and standard credibility models is well established [3], which means that credibility-based machine learning techniques can limit the noise from thin data and can be used on datasets that were previously out of scope due to their small size. Additionally, the purpose of a machine learning analysis can be expanded from creating final implementation recommendations to providing teams with the best data-driven starting point for further analysis. This adjustment allows machine learning techniques to be applied to everyday pricing work. The increased ease of applying these credibility-based techniques allows pricing teams to adopt the tools without significant investment, enabling machine learning even when a pricing team has a small dataset and revealing opportunities to expand the analysis as more data becomes available.
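To illustrate the credibility connection, the sketch below compares an unpenalized least-squares fit (which, like a GLM, gives the data full credibility) with a ridge-penalized fit that shrinks the factor estimate toward zero on a thin dataset. The data, the binary rating characteristic, and the penalty strength are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Thin dataset: 30 policies, one hypothetical binary rating characteristic.
n = 30
x = rng.integers(0, 2, size=n).astype(float)
y = 1.0 + 0.3 * x + rng.normal(0, 1.0, size=n)  # noisy observed loss cost

X = np.column_stack([np.ones(n), x])

# Unpenalized least squares (analogous to a GLM: full credibility to the data).
beta_glm = np.linalg.lstsq(X, y, rcond=None)[0]

# Ridge (L2) penalty on the factor coefficient only - this shrinks it toward
# zero, mirroring a credibility weighting between the data and the complement.
lam = 10.0
P = np.diag([0.0, lam])  # no penalty on the intercept
beta_ridge = np.linalg.solve(X.T @ X + P, X.T @ y)

# The penalized estimate of the factor effect is pulled toward zero.
print("Unpenalized factor estimate:", round(float(beta_glm[1]), 3))
print("Penalized factor estimate:  ", round(float(beta_ridge[1]), 3))
```

Because the intercept is unpenalized, the penalized slope equals the unpenalized slope scaled down by the penalty, which is exactly the form of a classical credibility weighting between the observed effect and zero.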

For small datasets, the purpose of applying machine learning algorithms may be to identify the starting point for continued pricing analysis. For example, machine learning can quickly determine the characteristics, directionality, and approximate magnitude of the changes required to correct for poor experience in a rating plan. The analyst can then investigate the most significant characteristics identified by the model to determine if the most appropriate intervention is a pricing, marketing, underwriting, or other change. The algorithm will provide a shortcut by quickly identifying potentially hidden problem areas without hours of dashboard analysis. A business can save valuable resources by relying on machine learning to identify the best data-driven starting point for analysis.

As lines of business grow, analysis of medium-sized datasets can also evolve. The same process that identified the best data-driven starting point for analysis on a small dataset will produce actuarially supportable changes on a medium-sized dataset. This multivariate analysis can be automated to create a robust rating plan monitoring structure. By starting when a program is small, this evolution will come naturally as actuaries notice their results stabilizing over time as the insurer’s experience grows.

Finally, with large datasets, machine learning will fully support the traditional creation of a new rating plan from scratch. This is typically the stage that people think of when considering the application of machine learning techniques, but it is not the starting point! Businesses should not wait for this large amount of data to begin using machine learning for actuarial analysis. Once an insurer does reach this critical mass of data, the next natural step is to perform the analysis on large data before dividing it into medium or small datasets for additional insights at a more granular level.

Actuarial Standard of Practice No. 56: Modeling refers to the intended purpose of an analysis many times. By expanding the intended purpose from creating a final pricing algorithm to producing insights and identifying problem areas, actuaries can use machine learning on increasingly small datasets while following the guidance of ASOP 56.

Using Machine Learning for an Efficient Frontier

As an extension of the exploration of the intended purpose of machine learning models, an actuary should consider expanding their expectations of what a machine learning model can produce. Likely, an actuary beginning to work with machine learning algorithms would think that the output of a machine learning process is always a single best estimate - but this is not necessarily true.

Machine learning techniques can incorporate different levels of constraints to achieve different outcomes. When business decisions need to be applied to a model, it may be best for the machine learning technique to produce an efficient frontier of outputs. For example, a machine learning algorithm can build the best pricing model for any number of variables (10, 11, 12, 13, …). Each variable carries a cost to obtain its value for each risk - an additional customer quoting question or a vended data solution. The trade-off between complexity and predictive power is therefore not purely a machine learning problem; it can be a business decision for actuaries, as each variable may incur a quoting, regulatory, or other cost.
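One simple way to construct such a frontier is to find the best-performing model at each model size and let the business choose among them. The sketch below does this with greedy forward selection on synthetic data (the dataset, variable counts, and the selection method are illustrative assumptions; a penalized regression path would serve the same role):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic pricing dataset: 500 risks, 6 candidate rating variables,
# only 3 of which actually drive the loss cost (hypothetical setup).
n, p = 500, 6
X = rng.normal(size=(n, p))
true_beta = np.array([0.8, 0.5, 0.3, 0.0, 0.0, 0.0])
y = X @ true_beta + rng.normal(0, 0.5, size=n)

def fit_rss(cols):
    """Least-squares fit on a subset of columns; return residual sum of squares."""
    Xs = X[:, cols]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    r = y - Xs @ beta
    return float(r @ r)

# Greedy forward selection: for each model size k, record the best set found.
frontier = []          # list of (num_variables, rss)
selected = []
remaining = list(range(p))
for k in range(1, p + 1):
    best = min(remaining, key=lambda j: fit_rss(selected + [j]))
    selected.append(best)
    remaining.remove(best)
    frontier.append((k, fit_rss(selected)))

for k, rss in frontier:
    print(f"{k} variables -> RSS {rss:.1f}")
```

Error never increases along the frontier as variables are added; the business then picks the point where the extra predictive power justifies the cost of the extra variable.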

[Figure: Efficient frontier of models]

Business decisions are rarely black and white. Usually, there is a range of reasonable actions that can be implemented to address a problem, and the “best” solution is a matter of opinion. For example, a reserving analysis will result in a range of reasonable reserves - not a single point estimate of reserves that must be adhered to. Similarly, the efficient frontier approach provides teams with the opportunity to identify a “range of reasonable models.” Within this range, actuarial and business judgment can be applied to determine the best course of action. A single model output does not allow for this judgment to be easily applied.

The Changing Actuarial Experience

Moving from a manual process to a machine learning process will change the working experience for an actuary. Often, this results in a perceived - but not actual - loss of control due to automation. The separation of purely data-driven decisions and business decisions is key when adopting a machine learning approach.

For example, the manual process of deciding where to cap a characteristic such as policyholder age due to low credibility is often a judgmental decision. However, the optimal placement of this cap is a purely data-driven decision that could be automated to remove human bias from the process. The data-driven result enables further refinement: the user can still change the selection for business reasons if necessary. This is not a lack of control - rather, it is a more informed starting point from which to exercise control.
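As a minimal sketch of such an automated cap, the code below places the cap at the highest age whose tail still meets an assumed full-credibility exposure standard (the exposure curve and the 1,082 threshold, borrowed from the classical limited-fluctuation standard for claim counts, are purely illustrative):

```python
import numpy as np

# Hypothetical exposure counts by policyholder age 18..90: thin in the tail.
ages = np.arange(18, 91)
exposure = np.round(2000 * np.exp(-(ages - 45) ** 2 / 400)).astype(int)

# Data-driven cap: the highest age at which the remaining exposure (that age
# and above) still meets an assumed full-credibility standard of 1,082.
THRESHOLD = 1082
tail_exposure = exposure[::-1].cumsum()[::-1]   # exposure at each age and above
cap_age = int(ages[tail_exposure >= THRESHOLD].max())

print("Proposed age cap:", cap_age)
# Ages above the cap would be grouped into the capped level before modeling;
# the actuary can then move the cap for business reasons if necessary.
```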

The majority of the day-to-day actuarial experience will shift from the manual creation of data analysis to the interpretation of automated analysis. This is not to say that actuarial teams will require less technical knowledge - to fulfill the requirements of ASOP 56 3.1 by ensuring that the “Model [is] meeting the intended purpose”, an actuary will still need significant technical skill. Automating the manual analysis simply frees an actuarial team to apply its technical skills in more meaningful and beneficial ways.

The Necessity of Transparent Algorithms for Actuarial Analysis

This paper discusses the necessity of interpreting automated decisions and creating an efficient frontier of actions to evaluate, but has so far neglected to mention their prerequisite, which is transparent machine learning. A transparent model like a Generalized Linear Model (GLM) or Generalized Additive Model (GAM) provides an easy-to-understand calculation of its estimate, and its formula for calculating a prediction can be easily edited if judgment needs to be applied. A black box model like a Gradient Boosting Machine (GBM) is not transparent, and it is not easy to understand the calculation of its prediction. Changes can be made to the output of the model, but not effectively to its inner workings. Black box models can be useful in situations like deciding which ad to show a particular user or suggesting which TV show they should watch next. The use of black box algorithms is significantly more difficult to justify when the algorithm and its output must be audited for actuarial soundness.

When selecting which of two models is more actuarially sound, a user needs to understand the models within the context of current business considerations and conditions. Selecting between similar models is quite difficult with black box models, as it may not be immediately clear what is happening within the black box to cause the models to produce different scores for similar risks. When using black box models, an actuary must also consider the possibility of hidden behaviors that are unintuitive at best and dangerous at worst. Within a black box model, it is nearly impossible to prove that unintuitive behavior is not happening. An actuary would need to perform sufficient testing to satisfy ASOP 56 3.2.c by “[understanding the] limitations of data or information, time constraints, or other practical considerations that could materially impact the model’s ability to meet its intended purpose” and additionally satisfy the considerations of ASOP 56 3.6, which requires “evaluation and mitigation of model risk”, to ensure that the likelihood or effect of unintuitive behavior is immaterial or at an acceptable level of risk. This is a time-consuming and necessarily imperfect process that can be avoided with transparent algorithms.

Due to the less understandable nature of black box methodologies, the likelihood of identifying an unintuitive behavior late in the work cycle increases. Additionally, black box methodologies can be difficult to explain to internal and external stakeholders, where questions on the calculation of an insured’s score may not have an intuitive or helpful answer. Transparent machine learning algorithms allow an actuary to easily assess the actuarial soundness of a model and accelerate the internal and external approval processes.


Equity, Fairness, Bias, and Transparent Algorithms

Machine learning cannot be discussed without also acknowledging the topic of algorithmic bias. How to consistently avoid disparate impact while setting equitable and competitive rates is a current discussion and research topic [4] in many insurance and regulatory environments, and this paper will not propose a solution. Transparent machine learning technology will not miraculously solve this problem, but it will allow rates and models to be comprehensively analyzed and their reasoning fully understood. For each rating variable, the exact impact is understood and easily measured. “Why did this happen?” is an easy question to answer with transparent models.
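As a toy illustration of this transparency, the sketch below computes a prediction from a hypothetical log-link rating model, where every variable's multiplicative contribution can be read off directly (all variable names and coefficients are invented for illustration, not taken from any real model):

```python
import numpy as np

# Hypothetical fitted coefficients from a log-link frequency model.
coefs = {
    "base": np.log(0.08),        # base frequency of 8%
    "age_18_25": np.log(1.40),   # +40% relativity for young drivers
    "urban": np.log(1.15),       # +15% for urban territory
    "prior_claim": np.log(1.25), # +25% with a prior claim
}

def predicted_frequency(risk):
    """Transparent prediction: the base rate times each applicable relativity."""
    log_rate = coefs["base"] + sum(coefs[v] for v in risk)
    return float(np.exp(log_rate))

risk = ["age_18_25", "urban"]
freq = predicted_frequency(risk)
print(f"Predicted frequency: {freq:.4f}")

# "Why did this happen?" - each term's contribution is explicit:
for v in risk:
    print(f"  {v}: x{np.exp(coefs[v]):.2f}")
```

Each prediction decomposes into a base rate and named relativities, so a regulator's or stakeholder's question about any individual rate has a direct answer; a boosted ensemble offers no comparable decomposition.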

However, black box machine learning techniques make the already difficult problem of equity and bias even harder to solve by obfuscating the calculation and justification of rates. There are increasingly sophisticated ways to examine the outputs of a black box, but in the end, a modeler needs to trust that they’ve explored everything. What happens when an unanticipated edge case appears in practice? How can we be sure that the black box is unbiased both in the aggregate and at all individual levels? With transparent models like penalized Generalized Additive Models (GAMs), questions on model behavior are easy to answer. With Gradient Boosted Machines (GBMs), on the other hand, the explanation of a rate calculation is opaque.

While it is uncertain how the industry will move forward to address issues of equity and disparate impact, transparent machine learning clearly has an advantage over black box methodologies as transparent algorithms can be explained and validated. Transparent machine learning algorithms will still be as biased as the data going into them - but this bias is not hidden behind black box computations. The effect of all variables will be visible to the actuary and regulator and can therefore be properly addressed. Transparent machine learning algorithms enable insurance companies and regulators to understand and address possible biases in insurance models.


Machine learning has found its place in the insurance space, and actuaries can benefit greatly from its adoption in pricing processes. Expanding the use and purpose of machine learning analysis will open exciting new opportunities. Adopting an easy-to-use platform for machine learning is essential to allow actuaries to use machine learning to its fullest potential. An analyst will not frequently use machine learning techniques if they need to set up a codebase from scratch for every application and maintain the codebase across multiple platform updates. Machine learning algorithms should be as accessible to actuaries as spreadsheets.

The proper application of machine learning will meaningfully change a team’s work experience, but it will not dilute the importance of industry knowledge and actuarial judgement. Machine learning algorithms will quickly provide data-driven insights so users can apply their knowledge and judgement to the results and help drive company strategy. Machine learning is no longer just a forward-looking buzzword - it is now ready to be broadly applied to pricing analysis.


[1] Actuarial Standard of Practice No. 23: Data Quality, Section 4.4.1

[2] “Generalized Linear Models for Insurance Rating”, Second Edition, Mark Goldburd, FCAS, MAAA, Anand Khare, FCAS, FIA, CPCU, Dan Tevet, FCAS, Dmitriy Guller, FCAS, Section 2.10.1

[3] “Credibility and Penalized Regression”, Mattia Casotto, Marco Banterle, Guillaume Beraud-Sudreau

[4] CAS Approach to Race and Insurance Pricing:

About Akur8

Akur8 is revolutionizing insurance pricing with Transparent Machine Learning, boosting insurers’ pricing capabilities with unprecedented speed and accuracy across the pricing process without compromising on auditability or control.

Our modular pricing platform automates technical and commercial premium modeling. It empowers insurers to compute adjusted and accurate rates in line with their business strategy while materially impacting their business and maintaining absolute control of the models created, as required by state regulators. With Akur8, time spent modeling is reduced by 10x, the models’ predictive power is increased by 10% and loss ratio improvement potential is boosted by 2-4%.

Akur8 already serves 50+ customers across 20+ countries, including AXA, Generali, Munich Re, Tokio Marine North America Services (TMNAS); specialty insurer Canopius and MGA Bass Underwriters; consulting partners Xceedance and Perr & Knight; and insurtechs Manypets and wefox. Over 700 actuaries use Akur8 daily to build their pricing models across all lines of business. Akur8’s strategic partnerships include Milliman, Guidewire, Duck Creek and Sapiens.   

Explore more at