Preparation and Modelling
Data Mining Approaches to Modelling Insurance Risk, Dr. I Kolyshkina and Dr. R.G. Brookes (9th Accident Compensation Seminar Oct 2002)
Presentation investigating different data mining techniques for a customer value model for health insurance. Includes discussion of decision trees and the MARS approach to stepwise continuous regression modelling.
Predictive Modelling and Data Mining for Actuaries, Dan Steinberg and Mikhail Golovnya (Insights Session)
Presentation from Salford Systems on data mining. Takes a fairly broad perspective on nonparametric models including trees, MARS, random forests, and TreeNet (boosted trees) validation.
Audio and Video
Actuarial Control Cycle in Pricing - Using Data Mining Techniques to Enhance Monitoring, David Isaacs (General Insurance Pricing Seminar, 2008)
Talk on data mining techniques in a General Insurance monitoring context.
Making better use of Scheme Data, Natalie Pocock AIAA, Katie Rogers and Aaron Cutter (Injury Schemes Seminar Nov 2013)
A paper investigating claims triage approaches using Victorian CTP data from TAC. Includes regression, clustering and some text mining techniques.
How 7 companies bring power to Hadoop big data applications, Thor Olavsrud (cio.com Jul 2015)
High-level description of (mainly US) companies that have pushed into Hadoop cloud computing over the last few years.
CIOs Have to Learn the New Math of Analytics, Kim S. Nash (cio.com Feb 2015)
Good description of data-driven algorithms and their implications for business. Ideal for managers.
What Every Manager Should Know About Machine Learning, Mike Yeomans (Harvard Business Review Jul 2015)
(Another) good description of how algorithms can be harnessed to solve new problems with computers. Ideal for managers.
The Elements of Statistical Learning, Trevor Hastie, Robert Tibshirani and Jerome Friedman (Stanford 2008)
A free textbook covering a wide variety of modelling techniques for supervised and unsupervised learning. Ideal for an interested reader comfortable with basic statistics and mathematical notation.
Courses and eLearning
Kaggle Tutorials (Kaggle, free)
More links for those wanting to get involved more directly in the programming side of data science.
Machine Learning online course (Stanford, free)
A course taught by Andrew Ng, covering a broad range of methods and algorithms. Suitable for someone interested in the theory and programming, with time to devote to video lectures and exercises.
Tools and Resources
A fun destination for creating word clouds from text