Data Science Superpowers: Pay it Forward

By MNC

Data science is the new superpower. The insurance sector can harness it to create social good if public and private data custodians join together to share their data securely and responsibly. The key is trust and a shared goal with the best interests of the people, whose data is being utilised, at the forefront.

Please note: This piece is written by MNC and published by the Actuaries Institute Australia. The articles aim to stimulate discussion on important, emerging issues related to ICA2023. Opinions expressed in this publication are the opinions of the articles' author and do not represent those of either the Actuaries Institute Australia, the International Actuarial Association or the 2023 International Congress of Actuaries Organising Committee, or its members, directors, officers, employees, agents, or that of the employers of the authors.

01 INTRODUCTION

Public services such as health care providers and charities are custodians of data. They protect it responsibly and, for the most part, keep it away from large financial corporations. In our discussions with some of these institutions, their perception was that, in these corporations’ hands, the data would be used in ways that leave the people who provided it disadvantaged or even excluded from participating in the financial markets. These people, whose data they protect, are often the most vulnerable should the worst happen.

The insurance industry holds data on its policyholders, but they are only a subset of the entire population. Without understanding the rest of the population’s health or their full medical history, there is a risk that those deemed substandard are disadvantaged (through higher premiums or exclusions) or excluded completely because of misunderstanding rather than actual evidence of, for example, poorer expected health.

02 Example: Mental Health in Australia - Overview

Mental Health is an area where there is much to be gained from a public/private relationship built on trust.

Australia shows a similar pattern to other developed nations: a growing epidemic of mental health conditions fuelled by the impacts of the pandemic, a growing crisis in care provision, a lack of appropriate services for children and young people, and a lack of co-ordination between touch points within the mental health care sector.
Picture 1

Picture 2

In the insurance sector, mental health claims have been trending upwards because of increasing diagnosis, incidence and prevalence of mental health conditions. Artificial Intelligence (AI) can be a powerful tool for detecting when mental health is deteriorating.

In an ideal world, we could:

  • Identify people “at risk” of mental ill health as early as possible and signpost them to appropriate care
  • Direct mental health resources to the areas where they have most impact
  • Provide evidence-based research to assist funding proposals
  • Improve the overall experience of people living with mental ill health
  • Offer superior insurance products appropriate for people who experience mental ill health.

But this utopia feels a long distance away.


03 Example: Mental Health – Scoping the Problem

Scoping the problem will be the first stage.

Picture 3

As part of our work with a University of New South Wales (UNSW) research scientist, we learnt best practice techniques for (a) an environmental scan and (b) a literature review. For (a) the aim was to understand the different indicators and measurements of mental well-being across various research datasets - essentially telling us which indicators had been commonly tested and how they had been measured. This showed us that only 7% of Australian datasets had been analysed.


The most common mental health indicators tested were feelings, depression, physical health and anxiety, whilst the most common demographic indicators were age, education, ethnicity, relationship status, socio-economic status, work and gender.

Picture 4

For the literature review (b), we built a machine learning model to screen a pool of 22,000 articles down to 300, which were then reviewed to determine whether there was any overlap with MNC’s proposed mental health project topics. This diagram shows the machine learning pipeline that was developed in-house.

Picture 5
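
To give a flavour of how machine-assisted screening of this kind can work, the sketch below trains a small naive Bayes text classifier on a handful of labelled titles and uses it to rank an unlabelled pool so that only the top-scoring articles go forward for manual review. It is a minimal illustration under stated assumptions, not the in-house pipeline described above: the choice of naive Bayes, the function names and all of the example titles are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a title or abstract and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labelled):
    """Fit a multinomial naive Bayes model on (text, is_relevant) pairs."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in labelled:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def relevance_score(text, model):
    """Log-odds that an article is relevant (higher means more relevant)."""
    word_counts, doc_counts, vocab = model
    total_rel = sum(word_counts[True].values()) + len(vocab)
    total_irr = sum(word_counts[False].values()) + len(vocab)
    score = math.log(doc_counts[True] / doc_counts[False])
    for word in tokenize(text):
        if word not in vocab:
            continue  # ignore words never seen in training
        # Add-one smoothing so unseen word/class pairs never give log(0).
        p_rel = (word_counts[True][word] + 1) / total_rel
        p_irr = (word_counts[False][word] + 1) / total_irr
        score += math.log(p_rel / p_irr)
    return score


def screen(articles, model, keep):
    """Return the `keep` highest-scoring articles for manual review."""
    ranked = sorted(articles, key=lambda a: relevance_score(a, model), reverse=True)
    return ranked[:keep]


# Made-up titles for illustration only; not taken from the actual review.
LABELLED = [
    ("Depression and anxiety in Australian adolescents", True),
    ("Social media use and adolescent mental health", True),
    ("Wellbeing indicators in a national health survey", True),
    ("Crop yield forecasting under drought", False),
    ("Bridge fatigue in steel structures", False),
    ("Pricing motor insurance with telematics", False),
]
POOL = [
    "Anxiety and social media use in young adults",
    "Steel bridge inspection under fatigue loading",
    "Drought effects on wheat crop yield",
    "Mental health service use among adolescents",
]

MODEL = train(LABELLED)
SHORTLIST = screen(POOL, MODEL, keep=2)
print(SHORTLIST)
```

In practice the labelled set would be a sample of the pool that reviewers have already read, and the cut-off (here `keep=2`) would be chosen to balance reviewer time against the risk of discarding a relevant article.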

As a result of this process, it is more evident what has been historically researched and what has not, and the common indicators used to determine whether people had mental ill-health. Such analyses can help to instil confidence that the definition of project scope fits the proposed purpose and does not duplicate work that has already been completed.

One outcome from the research reviews is that there has been little research on the correlation between social media activity and the diagnosis of mental health conditions and/or utilisation of current mental health services. This is a particular concern among parents of teenagers and is thought by many to be a ticking time bomb.

In Australia, some interesting initial work has already been completed. In 2014, Data61 collaborated with the Black Dog Institute and Amazon Web Services on WeFeel, which analyses the words from millions of tweets to display a real-time view of users’ emotions. It can show how the population’s emotions fluctuate over time in response to changes in socio-economic or environmental factors.
Picture 6

Source: Data61 website

WeFeel looks for up to 600 specific words in a stream of around 27 million tweets per day and uses natural language processing (NLP) to map the words people type in their posts to a hierarchy, or ‘wheel of emotions’, which includes love, joy, surprise, anger, sadness and fear. WeFeel enables the exploration of emotional trends on a minute-by-minute time scale across locations and genders around the globe.
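
The core idea of mapping words in posts to emotions can be sketched in a few lines. The example below is purely illustrative and is not WeFeel's implementation: the twelve-word lexicon, the function names and the sample posts are all hypothetical, and only the six top-level emotions named above are used.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon (word -> primary emotion); not WeFeel's word list.
EMOTION_LEXICON = {
    "adore": "love", "cherish": "love",
    "happy": "joy", "delighted": "joy",
    "astonished": "surprise", "amazed": "surprise",
    "furious": "anger", "annoyed": "anger",
    "miserable": "sadness", "lonely": "sadness",
    "scared": "fear", "anxious": "fear",
}


def emotions_in(post):
    """Return the emotion for each lexicon word found in one post."""
    words = re.findall(r"[a-z']+", post.lower())
    return [EMOTION_LEXICON[w] for w in words if w in EMOTION_LEXICON]


def emotion_counts(posts):
    """Aggregate emotion counts over an iterable of posts."""
    counts = Counter()
    for post in posts:
        counts.update(emotions_in(post))
    return counts


# Made-up posts for illustration only.
POSTS = [
    "So happy and delighted with the news!",
    "Feeling lonely and anxious tonight",
    "I am FURIOUS about the train delays",
]
COUNTS = emotion_counts(POSTS)
print(COUNTS.most_common())
```

Grouping posts by timestamp or location before calling `emotion_counts` would give the kind of time-series and regional views described above.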

Imagine what could be achieved if this could be supplemented with other, more private, sources of data, with appropriate controls and protections. For example, any insights could ultimately direct potentially life-saving services to different geographical areas across the nation and help inform government decision making. This is data science at the forefront of mental health research. It seems that our technology is much more developed than our ability to share and handle sensitive data.

04 Example: Mental Health in Australia – Data Handling

Once we have agreement to share data, data scientists can work their magic on the datasets. Clearly there are ethical challenges, which could be overcome by appointing an independent panel to consult on decisions relating to how shared data can be used for the benefit of society as a whole. Regulators must also play their part.

In addition, we must pay attention to the integrity of the data: that the results produced are of good quality, that the data is linked appropriately, that the timing of the datasets matches and that systems are in place to store the data appropriately.

Machine learning is particularly valuable in cleaning data. The algorithm is trained to identify commonalities in the data that would lead to it being treated in a particular way. Since many financial services organisations face efficiency challenges, MNC has built a machine to clean data in a faster and more reliable way.

Picture 7

For example, a large number of life insurance claims can be input and the claim causes rationalised into ICD-10 categories quickly, allowing ease of analysis. The model caters for spelling errors, partial data entries and other anomalies. This can then be extended to taking a claim cause, such as a back injury, that could be defined either as an accidental cause or as a musculoskeletal condition, and using any additional text available within the claim record to define the most appropriate category.

Manually, it would take a long time to go through each of these claims and correctly allocate it to a claim cause, but a machine learning model can be trained to identify commonly used words to allocate claims to a particular category and complete this process in a matter of minutes. This is easily extended to general insurance to categorise claims for home insurance, car insurance, etc. into aggregated claim causes.
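
As a rough illustration of how free-text claim causes with spelling errors might be allocated to categories, the sketch below uses fuzzy string matching from Python's standard `difflib` module in place of a trained model. It is a simplified stand-in, not MNC's tool: the keyword table, the category labels (abbreviated and only loosely based on ICD-10 chapter titles), the `categorise` function and the similarity cut-off of 0.8 are all assumptions made for the example.

```python
import difflib
import re

# Hypothetical keyword -> category table. Labels are abbreviated and only
# loosely based on ICD-10 chapter titles; they are not official codes.
KEYWORDS = {
    "fracture": "Injury",
    "accident": "Injury",
    "depression": "Mental and behavioural disorders",
    "anxiety": "Mental and behavioural disorders",
    "carcinoma": "Neoplasms",
    "melanoma": "Neoplasms",
    "arthritis": "Musculoskeletal",
    "sciatica": "Musculoskeletal",
}


def categorise(claim_text, cutoff=0.8):
    """Allocate a free-text claim cause to the category with the most
    (possibly misspelt) keyword matches, or 'Unclassified' if none match."""
    words = re.findall(r"[a-z]+", claim_text.lower())
    votes = {}
    for word in words:
        # Closest known keyword, tolerating small spelling errors.
        match = difflib.get_close_matches(word, list(KEYWORDS), n=1, cutoff=cutoff)
        if match:
            category = KEYWORDS[match[0]]
            votes[category] = votes.get(category, 0) + 1
    if not votes:
        return "Unclassified"
    # On a tie, the category that was matched first in the text wins.
    return max(votes, key=votes.get)


# Made-up claim descriptions for illustration only.
print(categorise("Back injury after car acident"))   # misspelt "accident"
print(categorise("back injury, chronic sciatica"))
print(categorise("severe depresion"))                # misspelt "depression"
```

Here the additional words in the claim record ("acident" versus "sciatica") decide whether the same back injury is treated as an accidental cause or a musculoskeletal condition. A trained model would learn such word-to-category associations from previously coded claims rather than from a hand-written table.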

05 Conclusions

Data science is a superpower, and, with today’s skillsets and technology, there are many ways in which actuaries and data scientists can benefit society as a whole.

This article considers only one example of what could be achieved. The Institute of Actuaries UK has “The Insurance as a Force of Social Good Working Party” that aims to explore an aspect of what it means for actuaries to work in the public interest in the insurance sector and suggests areas that merit consideration. We can all strive to put social responsibility at the heart of our business decisions.

But as AI forges ahead, the issues of data protection, privacy and sharing struggle to keep pace. Until we can solve that, aren’t we missing opportunities to help society?

We are happy to discuss any of these models or topics with you. Please contact us via the Matt Noyce Consulting website, book a meeting directly, or come and see us at the 2023 International Congress of Actuaries, Booth 11.