Open source · UK insurance pricing

Your GBM outperforms.
Your GLM is still live.

14 Python libraries that bridge the gap: from walk-forward cross-validation to constrained rate optimisation. Written for teams that already know what a GLM is.

14 Libraries
600+ Tests
17 Articles
8 Course modules

The missing piece is not technical skill. It is tooling that bridges the GBM on the server and the GLM in production.

Most UK pricing teams have adopted GBMs but are still taking GLM outputs to production. The GBM sits on a server outperforming the production model, but the outputs are not in a form that a rating engine, regulator, or pricing committee can work with. The model never makes it to rates.

Each library here solves one specific problem in the pricing workflow. Actuarial tests are included. Outputs use the formats pricing teams already recognise: factor tables, Lorenz curves, A/E ratios, movement-capped rate changes.

sklearn-compatible where it matters. Documented by people who have sat in the same sign-off meetings you have.

Three lines to a factor table. Five to validated splits.

Real API calls from the libraries. Not wrappers around wrappers. Each one does the specific thing a pricing team needs.

from shap_relativities import SHAPRelativities

sr = SHAPRelativities(model, X_train)
factors = sr.fit_transform(X_test)

# Returns multiplicative factor tables in GLM format
# Same structure as exp(beta) from your Emblem model
factors.head()
#   vehicle_age  relativity  ci_lower  ci_upper
#             0       1.000     0.982     1.018
#             1       0.912     0.901     0.923

print(f"Reconstruction R² = {sr.reconstruction_r2:.4f}")
# Reconstruction R² = 0.9973
Factor tables, confidence intervals, exposure weighting, reconstruction validation. Output goes straight into a pricing committee pack.
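The rebasing idea behind a factor table is worth seeing on its own. Here is a toy sketch of the exposure-weighted one-way version, in plain numpy on synthetic data (all names and numbers invented for the illustration; shap-relativities itself works from SHAP attributions, not one-way means):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic portfolio: a vehicle_age band per policy, with a known
# multiplicative effect on predicted claim frequency
n = 10_000
vehicle_age = rng.integers(0, 4, size=n)
true_factor = np.array([1.00, 0.91, 0.85, 0.80])
exposure = rng.uniform(0.5, 1.0, size=n)
pred = 0.12 * true_factor[vehicle_age] * np.exp(rng.normal(0, 0.05, n))

# Exposure-weighted one-way relativities: mean prediction per level,
# rebased against the exposure-weighted portfolio mean
portfolio_mean = np.average(pred, weights=exposure)
relativity = np.array([
    np.average(pred[vehicle_age == k], weights=exposure[vehicle_age == k])
    for k in range(4)
]) / portfolio_mean

print(np.round(relativity, 3))
```

A one-way view like this double-counts correlated factors, which is exactly why the library derives its factors from per-row attributions instead.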
from insurance_cv import InsuranceTemporalCV
from sklearn.model_selection import cross_val_score

cv = InsuranceTemporalCV(
    n_splits=5,
    ibnr_buffer_months=6
)
scores = cross_val_score(
    model, X, y,
    cv=cv,
    scoring="poisson_deviance"
)

# Walk-forward splits - no future data leaks into training folds
# IBNR buffer prevents immature periods contaminating validation
print(f"CV deviance: {scores.mean():.4f} ± {scores.std():.4f}")
Walk-forward splits with configurable IBNR buffers. Temporally correct: no future data leaks into training folds. sklearn-compatible API.
from rate_optimiser import RateOptimiser

opt = RateOptimiser(
    current_rates,
    technical_rates,
    exposure
)
result = opt.optimise(
    max_movement=0.10,
    target_lr_improvement=0.03
)

# Efficient frontier as a linear programme
# Respects ±10% movement cap per segment
print(f"LR improvement: {result.lr_delta:.1%}")
# LR improvement: 2.8% (within movement constraints)
Formulates the efficient frontier as a linear programme. Respects movement caps per segment, targets aggregate loss ratio improvement.
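What "formulated as a linear programme" means in practice can be sketched with scipy.optimize.linprog. This is an illustrative toy, not the library's internals: here the objective is exposure-weighted distance to technical rates, under ±10% movement caps and a revenue-neutrality constraint.

```python
import numpy as np
from scipy.optimize import linprog

current = np.array([100.0, 120.0, 90.0, 150.0])    # current rate per segment
technical = np.array([115.0, 110.0, 95.0, 140.0])  # indicated technical rate
w = np.array([1000.0, 800.0, 1200.0, 500.0])       # exposure per segment

n = len(current)
# Variables x = [r_1..r_n, t_1..t_n]: new rates r, plus auxiliaries t with
# t_i >= |r_i - technical_i| (absolute value split into two inequalities).
# Minimise the exposure-weighted distance to technical rates.
c = np.concatenate([np.zeros(n), w])

A_ub = np.block([[np.eye(n), -np.eye(n)],    #  r - t <=  technical
                 [-np.eye(n), -np.eye(n)]])  # -r - t <= -technical
b_ub = np.concatenate([technical, -technical])

# Revenue neutrality: total exposure-weighted premium unchanged
A_eq = np.concatenate([w, np.zeros(n)])[None, :]
b_eq = [w @ current]

# ±10% movement cap on every segment
bounds = [(0.9 * r, 1.1 * r) for r in current] + [(0, None)] * n

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
new_rates = res.x[:n]
print(np.round(new_rates, 1))
```

Swapping the objective for a loss-ratio target, as rate-optimiser does, keeps the same constraint structure.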

Built for people who know the problem from the inside

These libraries assume you understand insurance pricing. They do not explain what a GLM is.

Pricing actuaries moving from Emblem or Radar to Python

You know the techniques. These libraries give you Python equivalents that produce outputs in the same formats you already use: factor tables, A/E ratios, Lorenz curves.

Data scientists joining an insurance pricing team

You have the ML skills but lack the actuarial context. These libraries encode that context: correct cross-validation for IBNR, credibility-weighted factors, fairness tests that map to FCA requirements.

Pricing managers evaluating modern tooling

You need to know what is production-ready and what is a research prototype. Each library here has actuarial tests, a clear scope, and outputs a pricing team lead can explain to a committee.

Academic researchers working on insurance pricing methods

We implement recent literature: Manna et al. (2025) on conformal prediction, BYM2 spatial models, variance-weighted non-conformity scores. Reproducible, documented, testable.
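For readers new to the technique, the core of split conformal prediction fits in a dozen lines of numpy. This is a bare sketch on synthetic severity data; the variance-weighted non-conformity scores and insurance-specific handling live in the library.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic calibration set: observed severities and imperfect predictions
y_cal = rng.gamma(2.0, 50.0, size=2000)
pred_cal = y_cal * rng.normal(1.0, 0.15, size=2000)

# Split conformal: non-conformity score = absolute residual on calibration data
scores = np.abs(y_cal - pred_cal)
alpha = 0.10
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n)  # finite-sample correction

# Intervals for new predictions are [pred - q, pred + q]; check coverage
y_test = rng.gamma(2.0, 50.0, size=2000)
pred_test = y_test * rng.normal(1.0, 0.15, size=2000)
coverage = np.mean(np.abs(y_test - pred_test) <= q)
print(f"target 90%, empirical coverage: {coverage:.3f}")
```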

The full pricing workflow, covered

Each library solves one well-defined problem. Actuarial tests included. sklearn-compatible where it matters.

📉 Data & Features: insurance-cv
🧠 Model Fitting: credibility, bayesian-pricing
🔍 Interpretation: shap-relativities, ins-interactions, insurance-distill
Validation: ins-conformal, ins-monitoring
Compliance: ins-fairness
📈 Rates & Commercial: rate-optimiser, ins-demand, experience-rating
Training course

Modern Insurance Pricing with Python and Databricks

Eight modules written for pricing actuaries and analysts at UK personal lines insurers. Every module covers a real pricing problem, not a generic data science tutorial adapted to insurance. You work through real Databricks notebooks, on synthetic data that behaves like the real thing.

See the full course →
  • 01 Databricks for pricing teams
  • 02 GLMs in Python: the bridge from Emblem
  • 03 GBMs for insurance pricing
  • 04 SHAP relativities
  • 05 Conformal prediction intervals
  • 06 Credibility and Bayesian pricing
  • 07 Constrained rate optimisation
  • 08 End-to-end pipeline capstone

Practitioner articles on insurance pricing

Your Rating Factor Might Be Confounded
When exp(beta) from a GLM is not what you think it is. How omitted variable bias and confounding distort rating factor estimates, and how Double Machine Learning produces cleaner causal estimates.
Read article →
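The partialling-out estimator at the heart of Double Machine Learning can be sketched with sklearn alone. Synthetic data, illustrative variable names; the article covers the full treatment.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(2)
n = 4000

# A confounder (think driver age) drives both the rating factor and claims
confounder = rng.normal(size=n)
factor = 0.8 * confounder + rng.normal(size=n)           # e.g. vehicle group
claims = 0.5 * factor + 1.5 * confounder + rng.normal(size=n)
X = confounder.reshape(-1, 1)

# Naive slope of claims on factor absorbs the confounding
naive = np.polyfit(factor, claims, 1)[0]

# Partialling-out DML: residualise treatment and outcome on the confounders
# with flexible, cross-fitted learners, then regress residual on residual
rf = lambda: RandomForestRegressor(n_estimators=100, min_samples_leaf=20, random_state=0)
m_hat = cross_val_predict(rf(), X, factor, cv=5)
g_hat = cross_val_predict(rf(), X, claims, cv=5)
theta = LinearRegression().fit((factor - m_hat).reshape(-1, 1), claims - g_hat).coef_[0]

print(f"naive slope: {naive:.2f}   DML estimate: {theta:.2f}   (true effect: 0.5)")
```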
Your Pricing Model Might Be Discriminating
How to detect and correct proxy discrimination in UK insurance pricing models. Using SHAP and the insurance-fairness library to identify protected characteristic leakage under FCA Consumer Duty.
Read article →
Your Pricing Model is Drifting (and You Probably Can't Tell)
PSI and aggregate A/E are not enough. A three-layer monitoring framework - feature drift, segmented calibration, and a formal Gini test - that tells you whether to recalibrate or refit.
Read article →
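PSI itself is only a few lines of numpy, which is part of the article's point. A minimal sketch on synthetic data:

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline and a current sample."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf   # catch values outside the baseline range
    e = np.histogram(expected, edges)[0] / len(expected)
    a = np.histogram(actual, edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(3)
baseline = rng.normal(0.0, 1.0, 50_000)                  # feature at model build
psi_same = psi(baseline, rng.normal(0.0, 1.0, 50_000))
psi_drift = psi(baseline, rng.normal(0.5, 1.0, 50_000))  # 0.5 sd mean shift

# Common rule of thumb: PSI > 0.1 investigate, > 0.25 act
print(f"stable: {psi_same:.3f}   drifted: {psi_drift:.3f}")
```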
Your Demand Model Is Confounded
Naive price elasticity estimates from insurance quote data are biased - risk drives both premium and lapse. The insurance-demand library implements Double Machine Learning to fix this, covering the full conversion, retention, elasticity, and demand curve pipeline with FCA GIPP compliance built in.
Read article →
From CatBoost to Radar in 50 Lines of Python
An open-source Python library that distils GBM models into multiplicative GLM factor tables for Radar, Emblem, and other rating engines. The first open-source solution for the most common deployment problem in UK pricing.
Read article →
Your NCD Threshold Advice Is Wrong at 65%
A Python library for NCD/bonus-malus systems, experience modification factors, and schedule rating. Includes the non-obvious finding that optimal NCD claiming thresholds peak at 30% NCD, not 65%.
Read article →
Demand Modelling for Insurance Pricing
How to build a demand model for UK personal lines pricing: conversion, retention, price elasticity, and demand curves. Covers FCA GIPP requirements and the tools that make it tractable.
Read article →
How Much of Your GLM Coefficient Is Actually Causal?
GLM coefficients measure association, not causation. How Double Machine Learning isolates the causal effect of rating factors from confounding, and why this matters for FCA-compliant pricing.
Read article →
Why Your Cross-Validation is Lying to You
Standard k-fold cross-validation is wrong for insurance pricing models. How temporal leakage and IBNR contamination inflate CV scores, and how walk-forward validation fixes both problems.
Read article →
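The temporal-leakage half of that argument is easy to demonstrate with stock sklearn. Synthetic data with a strong drift term; walk-forward here is plain TimeSeriesSplit, without the IBNR buffer the article also covers.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(4)
n = 3000
t = np.arange(n) / n                        # policy rows in time order

# Claim frequency drifts over the period; drift dwarfs the rating signal
X = np.column_stack([rng.normal(size=n), t])
y = 0.3 * X[:, 0] + 3.0 * t + rng.normal(scale=0.5, size=n)

model = RandomForestRegressor(n_estimators=50, random_state=0)

# Shuffled k-fold lets every fold interpolate the drift from nearby future rows
kfold = cross_val_score(model, X, y, cv=KFold(5, shuffle=True, random_state=0)).mean()
# Walk-forward: each validation fold lies strictly after its training data
walk = cross_val_score(model, X, y, cv=TimeSeriesSplit(5)).mean()

print(f"shuffled k-fold R²: {kfold:.2f}   walk-forward R²: {walk:.2f}")
```

The shuffled score is flattering precisely because it never asks the model to extrapolate into an unseen period, which is the only thing a production model ever does.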
All articles →
Join waitlist →