Module 4: SHAP Relativities

Part of Modern Insurance Pricing with Python and Databricks.

The problem this module solves

You have a CatBoost model that outperforms your production GLM. The loss ratio lift is real. The problem is that neither your pricing committee nor Radar can work with a black-box gradient boosting model. They need a factor table: one row per (feature, level) combination, each with a relativity against a base level and a confidence interval. This is the same format as the exp(beta) table from a GLM.

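This module teaches you to get that table from the GBM using SHAP values. The approach is mathematically sound, not a heuristic: with a log link, SHAP values decompose each log-prediction additively across features, so exponentiating them yields multiplicative factors. The output is reviewable by a pricing actuary, submittable to the FCA, and importable into Radar or Emblem.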
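The step from per-row SHAP values to a factor table can be sketched in a few lines. This is a minimal illustration, not the `shap-relativities` implementation: the function name `shap_relativities` and the numbers are hypothetical, the SHAP values are assumed to be on the log scale (as they would be for a Poisson objective with a log link), and confidence intervals are left out.

```python
import math
from collections import defaultdict


def shap_relativities(levels, shap_values, exposures, base_level):
    """Exposure-weighted mean SHAP per level, exponentiated relative to the base level.

    Hypothetical helper for illustration only.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for level, phi, w in zip(levels, shap_values, exposures):
        weighted_sum[level] += w * phi
        weight_total[level] += w

    # Mean log-scale contribution of the feature at each level.
    mean_shap = {lvl: weighted_sum[lvl] / weight_total[lvl] for lvl in weighted_sum}

    # Subtracting the base level in log space fixes its relativity at 1.0.
    base = mean_shap[base_level]
    return {lvl: math.exp(m - base) for lvl, m in mean_shap.items()}


# Illustrative values only: one SHAP value per policy for the "area" feature.
areas = ["A", "A", "B", "B", "C"]
shap_area = [-0.10, -0.10, 0.05, 0.15, 0.30]
exposure = [1.0, 1.0, 0.5, 0.5, 1.0]

relativities = shap_relativities(areas, shap_area, exposure, base_level="A")
for level in sorted(relativities):
    print(level, round(relativities[level], 4))
```

The base level comes out at exactly 1.0 by construction, which matches how exp(beta) is read against the reference level in a GLM.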

What you will build

A CatBoost Poisson frequency model and Gamma severity model on synthetic UK motor data, with correct exposure handling and MLflow tracking.
SHAP-derived multiplicative relativities for categorical features (area, NCD) and smoothed curves for continuous features (driver age, vehicle group).
A factor table Excel workbook formatted for a pricing committee.
A Radar-compatible CSV export.
A Delta Lake history table for rate version control.

Prerequisites

Comfortable with GLM frequency-severity pricing. You do not need to know the econometrics, but you should know what a relativity is.
Basic Python. You will be reading and modifying code, not writing it from scratch.
Access to a Databricks workspace. Databricks Free Edition is sufficient for the exercises.

Contents

| File | Description | Estimated time |
| --- | --- | --- |
| `00-overview.md` | Module overview, objectives, prerequisites | 10 min |
| `01-why-shap-relativities.md` | The production problem and why SHAP solves it | 30 min |
| `02-setup.md` | Installation, notebook setup, dataset | 20 min |
| `03-training-the-gbm.md` | CatBoost frequency and severity training | 45 min |
| `04-extracting-relativities.md` | SHAP extraction pipeline | 45 min |
| `05-regulatory-tables.md` | Committee formatting, proxy discrimination, IBNR | 30 min |
| `06-radar-export.md` | Radar/Emblem export, version control, drift monitoring | 30 min |
| `07-exercises.md` | Five exercises with worked solutions | 45 min |

Key technical decisions

CatBoost. CatBoost handles categorical features natively (no ordinal encoding needed), has built-in SHAP support that is faster than the generic shap library, and its Poisson objective handles exposure via a proper log-offset rather than a sample weight. These are practical advantages, not preferences.

Exposure as offset, not weight. In a Poisson frequency model, exposure enters the log-linear predictor as an offset: log(exposure) is added to log(lambda), where lambda is the claim rate per unit of exposure. It is not a sample weight on the likelihood. Setting both baseline=log(exposure) and weight=exposure on the CatBoost Pool double-counts exposure and produces wrong predictions. This is covered in `03-training-the-gbm.md` and demonstrated in Exercise 1.
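A toy calculation shows what the double-counting does. The sketch below assumes the simplest possible model, a single constant claim rate, for which the weighted Poisson likelihood with a log(exposure) offset has a closed-form maximum. The function name `fitted_rate` and the data are hypothetical; a GBM is not intercept-only, but the weights enter its loss in the same way.

```python
def fitted_rate(claims, exposure, weights):
    """MLE of a constant claim rate under a Poisson model with a log(exposure) offset.

    Maximising sum_i w_i * [y_i * log(e_i * lam) - e_i * lam] over lam gives
    lam = sum(w_i * y_i) / sum(w_i * e_i).
    """
    numerator = sum(w * y for w, y in zip(weights, claims))
    denominator = sum(w * e for w, e in zip(weights, exposure))
    return numerator / denominator


# Illustrative portfolio: claim counts and exposure in policy-years.
claims = [0, 1, 0, 2]
exposure = [0.25, 0.5, 1.0, 1.0]

# Offset only: every policy has unit weight in the likelihood.
rate_offset = fitted_rate(claims, exposure, weights=[1.0] * len(claims))

# Offset plus weight=exposure: exposure now enters twice.
rate_double = fitted_rate(claims, exposure, weights=exposure)

# Predicted claim counts are rate * exposure in both cases.
predicted_offset = sum(rate_offset * e for e in exposure)
predicted_double = sum(rate_double * e for e in exposure)

print("observed claims:        ", sum(claims))
print("offset only:            ", round(predicted_offset, 4))
print("offset plus weight:     ", round(predicted_double, 4))
```

With the offset alone, the fitted rate is total claims over total exposure, so predicted claims balance to observed claims. With the extra weight, the rate becomes sum(e * y) / sum(e^2), which tilts the fit towards high-exposure policies and breaks that balance in this example.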

Polars for data manipulation. All DataFrame operations use Polars. Conversion to pandas happens only at the CatBoost Pool boundary. The shap-relativities library accepts Polars DataFrames natively.

SHAP on original features, band aggregation separately. Continuous features like driver age are passed to SHAP as continuous variables, in the form the model was trained on. Banding for the factor table is a post-hoc aggregation step on the SHAP values, not a re-specification of the model. Passing a banded feature to an explainer built on a model trained on the continuous feature produces wrong SHAP values.
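The post-hoc banding step might look like the following sketch. It assumes SHAP values have already been computed on the continuous driver age, and only then groups them into bands. The band edges, the helper names `band_label` and `banded_relativities`, and all numbers are hypothetical; the library's own smoothing for continuous features is not reproduced here.

```python
import math
from bisect import bisect_right


def band_label(age, edges):
    """Return a label such as '25-34' for the band containing age.

    Ages outside the edges are clamped into the first or last band.
    """
    i = bisect_right(edges, age) - 1
    i = max(0, min(i, len(edges) - 2))
    return f"{edges[i]}-{edges[i + 1] - 1}"


def banded_relativities(ages, shap_values, exposures, edges, base_band):
    """Aggregate per-row SHAP values into bands, then exponentiate against the base band."""
    totals = {}
    for age, phi, w in zip(ages, shap_values, exposures):
        label = band_label(age, edges)
        shap_sum, weight_sum = totals.get(label, (0.0, 0.0))
        totals[label] = (shap_sum + w * phi, weight_sum + w)

    # Exposure-weighted mean SHAP per band, still on the log scale.
    mean_shap = {band: s / w for band, (s, w) in totals.items()}
    base = mean_shap[base_band]
    return {band: math.exp(m - base) for band, m in mean_shap.items()}


edges = [17, 25, 35, 50, 100]  # hypothetical band boundaries, lower edge inclusive
ages = [18, 22, 30, 40, 45, 70]  # continuous ages as seen by the model
shap_age = [0.60, 0.40, 0.10, 0.00, 0.00, 0.20]  # illustrative SHAP values for age
exposure = [1.0, 1.0, 1.0, 0.5, 1.5, 1.0]

table = banded_relativities(ages, shap_age, exposure, edges, base_band="35-49")
for band in sorted(table):
    print(band, round(table[band], 4))
```

The model and the explainer never see the bands. Changing the band edges only changes this aggregation, so the factor table can be re-cut without retraining or recomputing SHAP values.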

The `shap-relativities` library

This module uses shap-relativities, an open-source Python library for extracting multiplicative rating relativities from GBMs. Install via:

uv pip install 'shap-relativities[catboost]==0.1.0'

Source: https://github.com/burningcost/shap-relativities

The library outputs Polars DataFrames with columns: feature, level, relativity, lower_ci, upper_ci, mean_shap, shap_std, n_obs, exposure_weight.

Part of the MVP bundle

This module is included in the £295 MVP bundle alongside Module 1 (Databricks for Pricing Teams), Module 2 (GLMs in Python), and Module 6 (Credibility and Bayesian Pricing). Individual module: £79.
