Meridian Changelog

Model version history and quarterly test results.

v6.13

16 June 2026 Current Production

Retrain on cleaned training labels. v6.13 removes 9,791 non-arm’s-length sale records (~0.11% of the 8.7M training set) where the recorded price was not the arm’s-length market value — transfers of part, related-party sales, fractional-interest transfers, and the like. These wrong labels were teaching the model that a house’s location, type and size were worth far less than they actually are, pulling every comparable-based inference near them downwards. The release applies the rule that diagnostics #779/#780/#781 closed: regional price floors below which a sale is treated as junk for training, plus a local-type outlier proxy. The South relies on the outlier-only rule (its sub-floor stock is genuine cheap flats, not junk).
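The cleaning rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production rule: the floor values, the `OUTLIER_RATIO` threshold, the region and district names, and the `Sale` fields are all hypothetical. The local-type outlier proxy is assumed here to compare each sale against the median price for its district and property type. The floor and the outlier proxy are also assumed to combine with a logical OR.

```python
from dataclasses import dataclass
from statistics import median

# Hypothetical values for illustration only; the real floors are not given here.
REGIONAL_FLOOR = {"North": 30_000, "London": 100_000}
OUTLIER_ONLY_REGIONS = {"South"}  # regions where no price floor is applied
OUTLIER_RATIO = 0.4  # assumed: below 40% of the local-type median is suspect


@dataclass(frozen=True)
class Sale:
    region: str
    district: str
    property_type: str
    price: float


def local_type_medians(sales):
    """Median price per (district, property_type) group."""
    groups = {}
    for s in sales:
        groups.setdefault((s.district, s.property_type), []).append(s.price)
    return {key: median(prices) for key, prices in groups.items()}


def is_junk_label(sale, medians):
    """True if the sale should be dropped from training under the assumed rule."""
    local_median = medians[(sale.district, sale.property_type)]
    is_outlier = sale.price < OUTLIER_RATIO * local_median
    if sale.region in OUTLIER_ONLY_REGIONS:
        return is_outlier
    below_floor = sale.price < REGIONAL_FLOOR.get(sale.region, 0)
    return below_floor or is_outlier


def clean_training_set(sales):
    medians = local_type_medians(sales)
    return [s for s in sales if not is_junk_label(s, medians)]


example_sales = [
    Sale("North", "LS1", "flat", 120_000),
    Sale("North", "LS1", "flat", 130_000),
    Sale("North", "LS1", "flat", 20_000),   # below the assumed North floor
    Sale("South", "BN1", "flat", 25_000),   # cheap but typical for its group
    Sale("South", "BN1", "flat", 28_000),
]
cleaned = clean_training_set(example_sales)
```

In this toy data the £20,000 North sale is dropped, while both cheap South flats survive because they sit close to their own local median.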

Same training window, same features (49), same FSD/band convention. Point valuations differ from v6.12.1 because the weights moved; confidence has been recalibrated on v6.13 predictions so served FSD/bands are honest.

H2-2025 out-of-sample bulk test — before/after

Segment	v6.12 PE10	v6.13 PE10	Δ
Floor <£150k	47.54%	47.91%	+0.37pp
£150–300k	68.18%	68.44%	+0.26pp
£300–500k	72.03%	72.25%	+0.22pp
£500k–1M	65.48%	65.88%	+0.40pp
£1M+	49.49%	50.35%	+0.86pp
Cat A headline	67.08%	67.40%	+0.32pp
Blended	65.44%	65.74%	+0.30pp
MdAPE	6.69%	6.62%	−0.07pp

The denoising claim was that cleaning the labels would lift accuracy across the whole book, not just the floor; the table is that claim measured rather than asserted. The 3-part acceptance gate — floor improves, core bands hold or improve, genuine cheap-North stock intact — passed on the H2-2025 census.
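For readers unfamiliar with the metrics in the table above: PE10 is read here as the share of valuations whose absolute percentage error is within 10% of the sale price, and MdAPE as the median absolute percentage error. A minimal sketch under those definitions follows. The function names are illustrative, and treating the 10% boundary as inclusive is an assumption.

```python
from statistics import median


def abs_pct_errors(predicted, actual):
    """Absolute percentage error of each valuation against its sale price."""
    return [100 * abs(p - a) / a for p, a in zip(predicted, actual)]


def pe(predicted, actual, threshold_pct):
    """Percentage of valuations within threshold_pct of the sale price."""
    errors = abs_pct_errors(predicted, actual)
    return 100 * sum(e <= threshold_pct for e in errors) / len(errors)


def mdape(predicted, actual):
    """Median absolute percentage error."""
    return median(abs_pct_errors(predicted, actual))


actual_prices = [100, 100, 100, 100]
valuations = [105, 120, 95, 200]  # errors of 5%, 20%, 5% and 100%
pe10 = pe(valuations, actual_prices, 10)
md = mdape(valuations, actual_prices)
```

Here two of the four valuations fall within 10%, so PE10 is 50%, and the median of the four errors is 12.5%.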

Band B headline tile is now 70.4% PE10 at 66% of the test set (was 70.3% / 66.2% under v6.12 — coverage stable, accuracy slightly tighter under the new calibration).

A separate ablation arm tested whether ppd_category (the Land Registry Category A/B flag) earns signal on the cleaned labels. Cat A PE10 lifted +0.39pp, but the outer tail worsened (PE20 −0.19pp, sub-£150k PE10 −1.15pp, Cat-B PE10 −2.22pp). The provisional machine verdict was “earns signal”; on review the heterogeneous picture led to dropping ppd_category for good rather than pursuing it as a future release.

v6.12.1

15 June 2026 Superseded by v6.13

Headline accuracy reported on standard market sales (Land Registry Category A; n = 131,173 of 147,785 Q2-2026 sales)

Q2-2026 out-of-sample bulk test — by benchmark basis

Basis	PE5	PE10	PE15	PE20	MdAPE	Bias	n
Category A — standard market sales (headline)	26.4%	49.3%	67.0%	79.0%	10.2%	-1.74%	131,173
Blended — full set (incl. Category B)	25.7%	48.0%	65.3%	77.2%	10.5%	-0.16%	147,785
Category B — non-standard only (repossessions, buy-to-lets, corporate transfers)	—	38.3%	—	—	—	—	16,612

What changed in 6.12.1

Headline benchmark basis. From 15 June 2026, headline accuracy is reported on standard market sales (Land Registry Category A): PE10 49.3%. On the full set, which also includes non-standard Category B sales (repossessions, buy-to-lets and corporate transfers), it is 48.0%. Category B sales transact under conditions the model is not built to price, so including them in the headline understates accuracy on the standard market sales the model targets; the blended figure is retained in the technical document.
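As a consistency check on the two bases, the blended PE10 can be approximated as the count-weighted average of the Category A and Category B figures from the table. This sketch uses only the published counts and rates. The function name is illustrative, and because the inputs are rounded to one decimal place the result is only expected to match the published 48.0% approximately.

```python
def blended_rate(segments):
    """Count-weighted average of per-segment rates.

    segments is a list of (n, rate_pct) pairs.
    """
    total = sum(n for n, _ in segments)
    return sum(n * rate for n, rate in segments) / total


cat_a = (131_173, 49.3)  # Category A: n and PE10
cat_b = (16_612, 38.3)   # Category B: n and PE10
total_n = cat_a[0] + cat_b[0]
blended_pe10 = blended_rate([cat_a, cat_b])
```

The counts sum to the full 147,785 sales, and the weighted average comes out at roughly 48.06%, in line with the blended figure once input rounding is allowed for.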

No change to valuations, bands or FSD — point valuations are byte-identical to v6.12 (same v6.11 weights). 6.12.1 changes only which sales the headline is measured against; the model engine, confidence bands and FSD are unchanged.

v6.12

14 June 2026 Superseded by v6.12.1

Tested against 147,785 Q2-2026 Land Registry transactions

Q2-2026 out-of-sample bulk test

PE5	PE10	PE15	PE20	MdAPE	Bias	Mean FSD
25.7%	48.0%	65.3%	77.2%	10.5%	-0.16%	30.8%

What changed from v6.11

Recalibration only. Point valuations are byte-identical to v6.11: on the H2-2025 test set, overall PE10 (65.4%) and MdAPE (6.7%) are unchanged because the model itself was not retrained. What changed is what the FSD and bands mean. A single calibration pipeline now produces both the calibration cells and the serving lookup; equal-count decile boundaries are stored and read at serving; evidence density (sparse vs. dense comparable evidence) is now a calibration dimension; and per-cell dispersion uses a robust (IQR-based) measure that is no longer thrown off by outlier tails.
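Two of the calibration ideas above, equal-count decile boundaries and an IQR-based dispersion, can be sketched with the Python standard library. This is an illustration only. The function names are hypothetical, the choice of the "inclusive" quantile method is an assumption, and dividing the IQR by 1.349 is a common convention for making it comparable to a standard deviation under normality, not something the changelog confirms.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles, stdev


def decile_boundaries(scores):
    """Nine cut points that split the scores into ten equal-count groups."""
    return quantiles(scores, n=10, method="inclusive")


def decile_of(score, boundaries):
    """Look up a score against stored boundaries; returns 0 (lowest) to 9."""
    return bisect_right(boundaries, score)


def robust_dispersion(errors):
    """IQR-based spread, scaled by the assumed 1.349 normal-consistency factor."""
    q1, _, q3 = quantiles(errors, n=4, method="inclusive")
    return (q3 - q1) / 1.349


# Boundaries are computed once at calibration time and reused at serving.
calibration_scores = list(range(1, 101))
stored_boundaries = decile_boundaries(calibration_scores)
counts = Counter(decile_of(s, stored_boundaries) for s in calibration_scores)

# One extreme error inflates the standard deviation but barely moves the IQR.
cell_errors = [-0.04, -0.02, -0.01, 0.0, 0.01, 0.02, 0.03, 2.5]
robust = robust_dispersion(cell_errors)
classical = stdev(cell_errors)
```

Storing the boundaries and reading them back at serving keeps the calibration and serving sides on the same cell definitions, which is the point of the single-pipeline change.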

The visible result is band re-allocation. Band B expanded from 49% to 66% of the test set at unchanged PE10 (~70%): same precision, broader coverage. Band D tightened from 9% to 0.2% of the test set, now reflecting genuinely uncertain cases rather than acting as a catch-all. Band A remains honestly empty: no cell meets the FSD ≤ 0.05 threshold on a robust dispersion basis.

Bias is now reported on a robust basis. From v6.12, bias is the median signed percentage error (Option A / EAA ESSVM convention). v6.11 figures were the mean signed percentage error. This is a measurement-definition change — it can move the number, and occasionally flip its sign, without any change to the model’s predictions.
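A small made-up example shows how the two bias definitions can disagree in sign on identical predictions. The data and function name are illustrative, and the sign convention (positive means over-valuation) is an assumption.

```python
from statistics import mean, median


def signed_pct_errors(predicted, actual):
    """Signed percentage error; positive is assumed to mean over-valuation."""
    return [100 * (p - a) / a for p, a in zip(predicted, actual)]


actual_prices = [100.0, 100.0, 100.0, 100.0, 100.0]
valuations = [98.0, 99.0, 99.0, 100.0, 130.0]  # one large over-valuation

errors = signed_pct_errors(valuations, actual_prices)
mean_bias = mean(errors)      # pulled upwards by the single +30% error
median_bias = median(errors)  # reflects the typical, slightly low valuation
```

The mean bias is positive because of one outlier, while the median bias is negative, so a switch of definition alone can flip the reported sign.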

v6.11

8 March 2026 Superseded by v6.12

Tested against 295,026 H2-2025 Land Registry transactions

H2-2025 out-of-sample bulk test

PE5	PE10	PE15	PE20	MdAPE	Bias (mean)
39.6%	65.4%	79.8%	87.6%	6.7%	+1.92%

Accuracy by FSD (Fitch) band

Band	n	% of test	PE10	MdAPE	Bias (mean)
Band A	0	0.0%	—	—	—
Band B	143,546	48.7%	70.4%	6.0%	−0.7%
Band C	123,787	42.0%	63.9%	7.0%	+1.7%
Band D	27,693	9.4%	46.4%	11.2%	+16.5%

What changed from v6.10

Added spatial comparable pricing from nearby similar properties, weighted by similarity and adjusted for house price growth. +0.55pp PE10.
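One way to picture the comparable feature is as a similarity-weighted average of nearby sale prices, each rolled forward for house price growth. The sketch below is a hypothetical illustration of that idea. The `Comparable` fields, the 0 to 1 similarity scale and the growth factor definition are assumptions, not the production feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparable:
    sale_price: float
    growth_factor: float  # assumed: price index today / index at the sale date
    similarity: float     # assumed: 0 (unrelated) to 1 (near-identical)


def comparable_estimate(comparables):
    """Similarity-weighted mean of growth-adjusted comparable prices."""
    total_weight = sum(c.similarity for c in comparables)
    if total_weight == 0:
        raise ValueError("no usable comparables")
    weighted_sum = sum(
        c.similarity * c.sale_price * c.growth_factor for c in comparables
    )
    return weighted_sum / total_weight


nearby = [
    Comparable(sale_price=200_000, growth_factor=1.10, similarity=1.0),
    Comparable(sale_price=300_000, growth_factor=1.00, similarity=0.5),
]
estimate = comparable_estimate(nearby)
```

With these inputs the closer match dominates: the adjusted prices are £220,000 and £300,000, and the weighted estimate is about £246,667.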

Also in v6.11: per-property FSD mapped to Fitch classification bands (A–D). FSD lookup table segmented by confidence decile, property type, and price band. Per-property SHAP contributions replacing global feature importance. FSD lookup recalibration against 100,938 backtest properties.

June 2026: proprietary confidence-tier overlay retired; the FSD/Fitch bands are now the sole confidence expression.

v6.7

7 March 2026

What changed

Replaced national ONS House Price Index with a hyperlocal index — postcode-district and property-type-stratified quarterly index built from Land Registry transactions. Largest single accuracy gain in model history: +2.3pp PE10.
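A stratified index of this kind can be sketched as a median price per postcode district, property type and quarter, expressed relative to a base quarter. This is a simplified illustration. Using the median, the tuple layout of a transaction and the base-100 scaling are all assumptions, and a production index would need to handle thin strata more carefully.

```python
from collections import defaultdict
from statistics import median


def build_index(transactions, base_quarter):
    """Index per (district, property_type, quarter), base quarter = 100.

    transactions is an iterable of (district, property_type, quarter, price).
    """
    prices = defaultdict(list)
    for district, ptype, quarter, price in transactions:
        prices[(district, ptype, quarter)].append(price)
    medians = {key: median(values) for key, values in prices.items()}

    index = {}
    for (district, ptype, quarter), value in medians.items():
        base = medians.get((district, ptype, base_quarter))
        if base:  # strata with no base-quarter sales are skipped
            index[(district, ptype, quarter)] = 100 * value / base
    return index


sample = [
    ("M1", "flat", "2025Q1", 100_000),
    ("M1", "flat", "2025Q1", 120_000),
    ("M1", "flat", "2025Q2", 121_000),
    ("M1", "flat", "2025Q2", 121_000),
    ("M2", "flat", "2025Q2", 90_000),  # no base-quarter sales for this stratum
]
local_index = build_index(sample, base_quarter="2025Q1")
```

In the sample, M1 flats move from a median of £110,000 to £121,000, giving an index of 110 for 2025Q2, while the M2 stratum is left out because it has no base-quarter sales.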

v6.6

6 March 2026

What changed

Added predicted bathroom count as a dense feature. National predictor providing 99.4% coverage where EPC records lack bathroom data.

v6.3

6 March 2026

What changed

Added predicted bedroom count and plot area from EPC records. +1.7pp PE10 combined.

v6.1

5 March 2026

What changed

Dynamic model metrics system. Website accuracy stats now driven by quarterly bulk test results, not training metrics.

v6

1 March 2026

Headline metrics

PE10	MdAPE
60.2%	7.8%

What changed

School proximity features (distance to nearest outstanding, good, and any school). Full spatial feature set including flood zones, crime rates, station proximity, and Census 2021 neighbourhood statistics. First version exceeding the Kirchmeyer 50% PE10 minimum.

v5

28 February 2026

Headline metrics

PE10	MdAPE
59.3%	7.93%

What changed

Census 2021 neighbourhood statistics (median household income, employment rate, qualification levels). Income became the single most important feature.

v4

28 February 2026

Headline metrics

PE10	MdAPE
57.1%	8.4%

What changed

Crime rates (total, violent, burglary), station proximity, and flood zone data. First version with full spatial feature set.

v3

1 March 2026

Headline metrics

PE10	MdAPE
59.4%	7.9%

What changed

Previous sale price with HPI temporal adjustment. Single largest feature contribution in model history at this point. PE10 jumped from 45.1% to 59.4% — previous sale price proved to be the most powerful predictor of current value.
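The temporal adjustment can be illustrated as scaling the previous sale price by the ratio of the index today to the index at the time of that sale. This is a minimal sketch with a hypothetical function name and made-up index values; the exact adjustment used by the model is not specified here.

```python
def hpi_adjusted_price(previous_price, index_at_sale, index_now):
    """Roll a past sale price forward using a house price index ratio."""
    if index_at_sale <= 0:
        raise ValueError("index_at_sale must be positive")
    return previous_price * index_now / index_at_sale


# A £200,000 sale when the index stood at 100, with the index now at 125.
adjusted = hpi_adjusted_price(200_000, index_at_sale=100.0, index_now=125.0)
```

A 25% rise in the index turns the £200,000 previous sale into a £250,000 adjusted price, which the model can then use as a feature.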

v2

February 2026

Headline metrics

PE10	MdAPE
45.1%	11.3%

What changed

First production model. EPC data (floor area, energy rating, construction age), Land Registry transaction history, and basic spatial features.

v1

February 2026

Proof of concept. Basic hedonic model. Not deployed to production.