Australian mortality and COVID-19 experience in 2021

Jennifer Lang, Zhan Wang, Richard Lyon, Dr Han Li, Karen Cutter, Angelo Andrew, Mengyi Xhu


This paper from the COVID-19 Mortality Working Group looks at a number of ways COVID-19 mortality has been measured around the world and in Australia, by examining various datasets and discussing different statistical techniques observed. We then take a closer look at the mortality experience of Australia during 2020 and 2021, including experience by cause of death (including COVID-19), by age group, and during different COVID-19 waves. Finally, we take a brief look at the impact of COVID-19 on long-term illness, based on studies from around the world.

Capital Levels in Super and Members' Best Interests

Melinda Howes, Michael Dermody, Tim Gorst


What is the right level of capital for super funds? For not-for-profit funds this question is difficult, as trustees must balance returning capital to members against holding sufficient capital to invest in the future.

Our holistic capital management approach helps funds understand how to approach this question now and how to use the framework to manage the fund going forward. It introduces rigour to business cases and helps articulate how the fund is supporting members' best interests.

Pathways to Homelessness

Hugh Miller and Laura Dixie


Homelessness is a significant and growing problem in Australia. About 1% of the Australian population accessed homelessness support services in 2019-20. Further, people who are homeless or at risk of homelessness are much more likely to have other vulnerabilities, such as low incomes, mental health issues or experience of family and domestic violence. However, how homelessness services fit into the broader picture of support for vulnerable people remains largely unexplored.

Pathways to Homelessness is a research project we carried out on behalf of the Homelessness Strategy team within the NSW Department of Communities and Justice. To help develop new support and early intervention programs aimed at preventing homelessness, they wanted to better understand:

  • Key risk factors for homelessness and precursor services accessed, in order to support early identification of groups at higher risk
  • The elevated government service use and associated costs following homelessness to inform investment in initiatives.

To provide this understanding we used linked cross-sectoral government data to examine pathways to homelessness. This data is well suited to the task. For many people, homelessness follows an extended period of financial hardship, with many associated factors such as mental illness. This often means they have made heavy use of government support services over the preceding period. This group is potentially well suited to targeting for prevention programs, and linked government service-use data provides a very informative lens.

The linked dataset created for this project is one of the most comprehensive datasets related to homelessness in Australia, covering over 625,000 people across 19 NSW and Commonwealth services. The study population is formed using a case and comparison design. The dataset is large enough to be able to meaningfully analyse homelessness risk across the entire NSW population. This is in contrast to existing research which tends to consider small cohorts who all have experiences of homelessness, so there is no ability to provide comparisons to a baseline.

Using this data we carried out a series of analyses:

  • Descriptive statistics to understand the key characteristics
  • Predictive modelling to identify groups at high risk of homelessness
  • Two-way pathway analysis to compare potential intervention points and estimate the elevated costs across government services for people experiencing homelessness
  • Additional analysis on vulnerable cohorts.
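The predictive-modelling step above could be sketched, purely illustratively, as a regression of homelessness risk on prior service-use flags. The features, coefficients and data below are invented for illustration and are not the project's actual model or results.

```python
# Illustrative sketch (not the project's actual model): logistic
# regression of homelessness risk on precursor service-use flags,
# mimicking a case/comparison design. All data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 5000
mental_health = rng.integers(0, 2, n)   # prior mental health service use
income_support = rng.integers(0, 2, n)  # long-term income support flag
justice = rng.integers(0, 2, n)         # prior justice system contact
# Synthetic outcome: risk elevated by each precursor service.
logit = -3 + 1.2 * mental_health + 0.8 * income_support + 1.0 * justice
homeless = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([mental_health, income_support, justice])
model = LogisticRegression().fit(X, homeless)
# Positive fitted coefficients indicate elevated risk for each precursor,
# which is the kind of interpretable output policy makers can act on.
```

The fitted coefficients give an interpretable ranking of precursor services, which is one way the "key risk factors" output described above could be communicated.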

This presentation will cover:

  • The value of linked government data and the rich picture it can provide.
  • The design of analysis to support policy development. Risk modelling of homelessness translates fairly naturally from an insurance premium-setting context. Providing comparisons of intervention points has less history in the actuarial context but is of interest in claims management.
  • The analysis undertaken and key results, including interpretable results for policy makers.

Insurance Underwriting in an Open Data Era - Opportunities, Challenges and Uncertainties

Chris Dolman, Kimberlee Weatherall, Zofia Bednarz


Exchange of information is a critical part of insurance pricing and underwriting. Traditionally, this is in the form of mandatory question sets, which the prospective insured person must answer to a suitable level of reliability before obtaining a quote for cover. In Australia, the Insurance Contracts Act sets out some rules around this, and other analogous systems exist in various other countries around the globe. The traditional manner of data collection had inherent practical limits. Questions had to be easily understood by laypeople, readily answerable by them, and not so extensive as to be off-putting. With the advent of open data regimes around the globe, many of these traditional limitations may be reduced or removed. By a mere press of a button, consumers may be able to share extensive and unprecedented data with an insurer, in order to automatically and accurately answer detailed questions that they might not necessarily understand or be able to answer if asked directly. In this paper, we analyse whether open data regimes can be used in this manner to replace existing underwriting questions or to create new ones. We then examine the impact that this change may have on various cohorts of customers, particularly considering the potential impact on those without access to data, who may be more likely than average to be otherwise vulnerable or disadvantaged. We suggest thematic areas to consider for further guidance or reform, based on our analysis.

Customer Churn Prediction using Natural Language Processing (NLP)

Afaz Uddin Ahmed


Predicting customer churn is an important consideration for any business, including financial services businesses, because the costs of acquiring new customers far outweigh the costs of retaining existing ones. Our daily interactions with Siri, Alexa, Hey Google and Bixby, which are Natural Language Processing (NLP) based automation systems, are currently treated as just another cool feature in our everyday lives. Imagine using this cool feature to solve a fundamental problem for a business – preventing customer churn. Different customers exhibit different behaviours and preferences and cancel their subscriptions for a variety of reasons. Most existing models predict customer churn using demographic and transactional data, which may not fully reflect customers' intentions. In this paper, a customer churn prediction model is developed using NLP by extracting features and patterns from unstructured data held against customer policies. These unstructured datasets are typically text, calls and notes; thanks to advances in NLP technology, they can now be transformed into key information from which we can infer intention to churn. Existing commercial NLP models that predict customer intention based on text are still in their infancy, and researchers are still investigating how to improve them. With the fast emergence of new NLP features, many earlier models have become outdated. A case study is presented in this paper to predict customer churn, and the reason for the intention to churn, using call data, which may not be possible using structured data alone. The model proposed in this paper uses recently available NLP tools and features. It uses keyword matching to mine expressions of interest and profiles of people corresponding to customer criteria, and takes advantage of available pre-trained NLP models to perform sentiment analysis.
A set of reference sentiments is manually generated and compared with the customer conversation to produce a similarity index. This index is then compared against a threshold by a classifier model to identify the reasons for churn in any conversation. The performance shows that NLP has the potential to provide a detailed understanding of customers' churn behaviour, including why a customer chose to churn.
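The similarity-index idea can be sketched in a toy form. The reference sentences, threshold and similarity measure (TF-IDF cosine similarity rather than a pre-trained model) below are my own illustrative choices, not the paper's implementation.

```python
# Toy sketch of the similarity-index approach: compare a customer
# conversation against manually written reference sentiments and flag
# the churn reasons whose similarity exceeds a chosen threshold.
# Reference texts and threshold are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

REFERENCE_SENTIMENTS = {
    "price": "the premium is too expensive and I found a cheaper quote",
    "service": "I waited too long on the phone and nobody helped me",
}

def churn_reasons(conversation: str, threshold: float = 0.3) -> list:
    """Return churn reasons whose similarity index exceeds the threshold."""
    reasons = list(REFERENCE_SENTIMENTS)
    corpus = [REFERENCE_SENTIMENTS[r] for r in reasons] + [conversation]
    tfidf = TfidfVectorizer().fit_transform(corpus)
    # Similarity of the conversation (last row) to each reference sentiment.
    sims = cosine_similarity(tfidf[-1], tfidf[:-1]).ravel()
    return [r for r, s in zip(reasons, sims) if s >= threshold]

matched = churn_reasons("I'm cancelling because the premium is too expensive")
```

A production version would replace the TF-IDF vectors with embeddings from a pre-trained NLP model, as the paper describes, but the thresholded-similarity classification step is the same shape.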

Anti-discrimination Insurance Pricing: Regulations, Fairness Criteria, and Models

Xi Xin, Fei Huang


On the issue of insurance discrimination, a grey area in regulation has resulted from the growing use of big data analytics by insurance companies – direct discrimination is prohibited, but indirect discrimination using proxies or more complex and opaque algorithms can be tolerated without restrictions. This phenomenon has recently attracted the attention of insurance regulators all over the world, and stricter insurance discrimination regulations are being discussed and considered. Meanwhile, with the rapid growth of artificial intelligence (AI) in the past decade, various fairness criteria have been proposed and have flourished in the machine learning literature, mostly focused on classification decisions. In this paper, we introduce to the actuarial field the fairness criteria that are potentially applicable to insurance pricing as a regression problem, match them with different levels of potential and existing anti-discrimination regulations, and implement them in a series of existing and newly proposed anti-discrimination insurance pricing models, using both generalized linear models (GLMs) and Extreme Gradient Boosting (XGBoost). Our empirical analysis compares the outcomes of the different models via the fairness-accuracy trade-off and shows the impact on customer behavior and solidarity.
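As a minimal illustration of what a fairness criterion looks like when moved from classification to pricing as a regression problem, one common criterion, demographic parity, can be checked as a gap in mean predicted premium across a protected attribute. The data and pricing rule below are invented; this is not one of the paper's models.

```python
# Illustrative sketch (not from the paper): demographic parity adapted
# to regression -- compare the average predicted premium across two
# levels of a protected attribute. All data is synthetic.
import numpy as np

def demographic_parity_gap(premiums, group):
    """Absolute difference in mean predicted premium between two groups."""
    return abs(premiums[group == 0].mean() - premiums[group == 1].mean())

rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=1000)
# A deliberately discriminatory pricing rule: group 1 pays a 50-unit loading.
premiums = 500 + 50 * group + rng.normal(0, 10, size=1000)
gap = demographic_parity_gap(premiums, group)
# The gap recovers roughly the 50-unit loading, flagging the disparity.
```

Other criteria from the classification literature (e.g. conditional parity measures) generalise in a similar way, by constraining or auditing the distribution of predicted premiums rather than predicted classes.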

Supply Side Resistance to Lifetime Annuities

Anthony Asher


The Annuity Puzzle is that lifetime annuities are not utilized in retirement as often as might be expected. Both academic literature and industry publications provide demand side explanations: bequest motives, liquidity preferences, crowding out by social security and family insurance, unattractive investment returns, poor money’s worth particularly for those with lower life expectancies, and concerns about the solvency of the annuity provider. On closer examination, however, none of these have been found to adequately explain the puzzle. The most widely accepted view is therefore that the reluctance to buy annuities is largely a result of behavioural biases and misunderstanding.

On the other hand, supply side limitations to alternative products have barely been explored; in particular, the possibility that these too are due to behavioural biases and misunderstanding on the part of participants in the industry and regulators. Evidence will be presented that the superannuation industry (fund trustees, financial advisors and service providers) is resistant to the idea of offering lifetime annuities. This resistance can partly be explained by the interests of trustees and advisors in ensuring that lifetime annuities do not reduce their funds under management and opportunities to charge fees. These interests can perhaps explain why the demand side explanations continue to be repeated, confusing potential buyers of lifetime annuities and influencing regulators.

Even if trustees were to enthusiastically promote annuities, they do face the difficulty of overcoming the behavioural biases and misunderstandings that are prevalent.

This will require significant developments in the provision of financial advice. The tension between ensuring that advice is both appropriate and affordable is widely recognised, with the current focus in Australia being on simplification and financial technology.

The paper concludes by identifying challenges to trustees, advisors, regulators and academics.

The Great Housing Bubble

Richard Lyon


It is a little while since I presented on the topic of house prices, with my previous presentations being Fair’s Fair (2017), All I Want is a Home Somewhere (2018) and Safe as Houses (2019).

My past analysis has shown that, in the long term, median house prices follow income growth and mortgage interest rates, with an additional trend growth overlay of 1½% per annum that can largely be explained by quality improvements. While powerful short-term forces can drive prices well away from this trend at times, I found that Australia was not in a bubble in early 2019.
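A toy version of such a long-term predictor might scale a base-year median by income growth, by the change in borrowing capacity implied by mortgage rates, and by the 1½% p.a. overlay. This is my own illustrative construction, not the paper's actual model, and all figures are invented.

```python
# Toy long-term price predictor (illustrative only, not the paper's model).
def annuity_factor(rate: float, term_years: int = 30) -> float:
    """Present value of $1 p.a. of level mortgage repayments."""
    return (1 - (1 + rate) ** -term_years) / rate

def trend_price(base_price, income_ratio, rate_now, rate_then,
                years, quality_overlay=0.015):
    """Scale a base-year median price by income growth, the change in
    borrowing capacity implied by mortgage rates, and the quality overlay."""
    capacity_ratio = annuity_factor(rate_now) / annuity_factor(rate_then)
    return base_price * income_ratio * capacity_ratio * (1 + quality_overlay) ** years

# E.g. a $500k base median, 35% cumulative income growth, rates falling
# from 7% to 5%, over 10 years (all figures hypothetical):
projected = trend_price(500_000, 1.35, 0.05, 0.07, 10)
```

The capacity ratio captures why falling rates lift the trend price: a given repayment services a larger loan, so the same income supports a higher purchase price.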

Since then, things have changed, and median house prices have risen dramatically. Meanwhile, as we learn to “live with Covid”, we are doing so against a backdrop of a high recent household savings rate and low interest rates, but with concern over household debt levels and inflation. In other words, we appear to have a housing bubble and a high risk of a sharp bust rather than a gradual deflation.

In this paper, I will revisit my long-term predictor and test this bubble theory, exploring the implications.

De-Risking Automated Decisions - Guidance for AI Governance

Tiberio Caetano, Jenny Davis, Chris Dolman, Simon O’Callaghan, and Kimberlee Weatherall


Many organisations are using AI to make consequential decisions, such as deciding who gets insurance, a loan, or a job. When humans delegate decisions to AI, problems can happen because AI lacks elements often required for good decision making, such as common sense, moral reasoning, and a basic understanding of the law. Many incidents have made it clear that AI has the potential to produce unlawful, immoral or discriminatory outcomes for individuals through opaque and unaccountable decisions. This includes issues such as AI discriminating against women and minorities. What can organisations do to reap the benefits of using AI for decision-making while preventing these issues from happening? This new report provides general guidelines for organisations to reduce the risks of using AI for automated decision making. It explains some novel risks introduced by AI, provides illustrations through case studies, and suggests a range of preventative, detective, and corrective actions to reduce and manage those risks.

Approaches to Better Utilising Machine Learning Models for Efficient Modelling and Pricing Delivery

Zhijing Xu


Retail insurance pricing has long been a complex process involving many technical and practical considerations. Rapid market changes require insurers not only to have an established framework for conducting pricing reviews, but also the capability to translate data into market pricing responses accurately and efficiently. One particular challenge insurers face is how to deal with data from the most recent period: the latest experience may include valuable insights into emerging market trends yet is often underdeveloped. Another challenge is that gradient boosting machine (GBM) modelling results are not directly implementable, since rating engines can typically only accommodate generalised linear model (GLM)-like rating tables.

In this paper, we propose an intelligent pricing approach that better utilises machine learning models to improve insurers' pricing capabilities and that can be well integrated into insurers' existing pricing algorithms. The approach aims to overcome the two challenges highlighted above and to enable an efficient risk modelling and pricing delivery process by directly leveraging the machine learning modelling results. A case study using actual insurance claim data is presented to demonstrate the viability, and highlight the advantages, of the intelligent pricing approach. The accuracy and efficiency of the pricing solution are expected to substantially boost insurers' pricing capabilities under rapidly changing market conditions.
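One simple way to bridge the GBM-to-rating-table gap described above is to distil the GBM: regress its log-predictions on dummy-coded rating factors so that the exponentiated coefficients become multiplicative relativities a rating engine can hold. This is a generic sketch of that idea on synthetic data, not the specific method proposed in the paper.

```python
# Toy sketch (not the paper's method): distil a GBM into GLM-like
# multiplicative rating-table factors. All data is synthetic.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 2000
age_band = rng.integers(0, 4, size=n)   # categorical rating factor, levels 0-3
vehicle = rng.integers(0, 3, size=n)    # categorical rating factor, levels 0-2
X = np.column_stack([age_band, vehicle])
true_cost = 300 * 1.2 ** age_band * 1.1 ** vehicle
claims = true_cost * rng.lognormal(0, 0.3, size=n)

# Step 1: fit a GBM to the claims experience.
gbm = GradientBoostingRegressor(random_state=0).fit(X, claims)

# Step 2: regress log-predictions on dummy-coded rating factors, so that
# exp(coefficients) are per-level relativities a rating engine can store.
dummies = np.column_stack(
    [(age_band == k).astype(float) for k in (1, 2, 3)]
    + [(vehicle == k).astype(float) for k in (1, 2)]
)
glm = LinearRegression().fit(dummies, np.log(gbm.predict(X)))
relativities = np.exp(glm.coef_)     # per-level multiplicative factors
base_rate = np.exp(glm.intercept_)   # base-cell rate
```

On this synthetic portfolio the distilled relativities approximately recover the 1.2 (age) and 1.1 (vehicle) multipliers baked into the simulated costs, while the output remains in the GLM-like table form a rating engine expects.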

SPLICE: A Synthetic Paid Loss and Incurred Cost Experience Simulator

Benjamin Avanzi, Greg Taylor, Melantha Wang


Recent years have seen a rapid increase in the application of machine learning to insurance loss reserving.

These machine learning methods are hungry for data. While the ultimate objective of these methods is application to real data, the availability of synthetic data is important for at least two reasons: (i) real data sets, especially of granular nature and of large size, are in short supply in the actuarial literature for reasons of confidentiality; and (ii) knowledge of the data generating process (impossible with real data) assists with the validation of the strengths and weaknesses of any new methodology.

Against this background, we introduce a simulator of individual claim experience, called SPLICE (Synthetic Paid Loss and Incurred Cost Experience). On a high level, SPLICE consists of a paid loss unit (claim payments) and an incurred loss unit (case estimates). SPLICE simulates individual claims at the transactional level, i.e. key dates associated with a claim (e.g. notification and settlement dates), individual claim payments and revisions of case estimates. An individual claim’s transactions are generated in a manner that is intended to reflect transactional sequences observed in practice.
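To make the transactional structure concrete, here is a toy Python sketch of the kind of per-claim record described above: occurrence, notification and settlement dates plus a sequence of partial payments. This is not SPLICE's actual R interface, and every distributional choice below is invented for illustration.

```python
# Toy sketch of a transactional claim record (not SPLICE's actual API).
# Delay and severity distributions are invented for illustration.
import numpy as np

def simulate_claims(n_claims: int, seed: int = 0) -> list:
    rng = np.random.default_rng(seed)
    claims = []
    for i in range(n_claims):
        occurrence = rng.uniform(0, 365)                  # day within accident year
        notification = occurrence + rng.exponential(30)   # reporting delay
        settlement = notification + rng.exponential(365)  # settlement delay
        n_payments = 1 + rng.poisson(2)                   # partial payments
        # Payment dates fall between notification and settlement.
        pay_dates = np.sort(rng.uniform(notification, settlement, n_payments))
        amounts = rng.lognormal(8, 1, n_payments)         # payment sizes
        claims.append({
            "claim_id": i,
            "occurrence": occurrence,
            "notification": notification,
            "settlement": settlement,
            "payments": list(zip(pay_dates, amounts)),
        })
    return claims

portfolio = simulate_claims(100)
```

SPLICE itself additionally simulates revisions of case estimates alongside the payment stream, and gives the user control over each of these components, as described below.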

Our simulator is publicly available, open-source (on CRAN), and fills a gap in the non-life actuarial toolkit.

The simulator specifically allows for desirable (but optionally complicated) data features typically occurring in practice, such as superimposed inflation and various dependencies between claim features. For ease of use, SPLICE comes with a default version that is loosely calibrated against a specific real CTP portfolio and that has a structure suitable for most lines of business with some amendments. However, the modular structure of SPLICE ensures that the user has full control of the evolution of an individual claim (occurrence, notification, timing and magnitude of individual partial payments and revisions of case estimates, closure).

Indeed, thanks to this flexibility, SPLICE may be used to generate a collection of data sets that provide a spectrum of complexity. Such a collection may be used to present a model under test with a steadily increasing challenge.