Meridian Changelog
Model version history and quarterly test results.
v6.13
16 June 2026 Current Production
Retrain on cleaned training labels.
v6.13 removes 9,791 non-arm’s-length sale records (~0.11% of the 8.7M training set) where the recorded price was not the arm’s-length market value: transfers of part, related-party sales, fractional-interest transfers, and the like. These wrong labels were teaching the model that a house’s location, type and size were worth far less than they actually are, pulling every comparable-based inference near them downwards. The release applies the rule that diagnostics #779/#780/#781 closed: regional price floors below which a sale is treated as junk for training, plus a local-type outlier proxy. The South relies on the outlier-only rule, because its sub-floor stock is genuine cheap flats, not junk.
Same training window, same features (49), same FSD/band convention. Point valuations differ from v6.12.1 because the weights moved; confidence has been recalibrated on v6.13 predictions so served FSD/bands are honest.
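The label-cleaning rule described above can be sketched roughly as follows. This is an illustration only: the floor values, the outlier ratio, the region keys and the choice to combine the two tests with a logical OR are all assumptions, since the changelog does not publish them.

```python
# Hypothetical regional price floors in GBP (the real values are not published).
# Regions without an entry, such as the South, use the outlier rule only.
REGIONAL_FLOOR = {"North": 30_000, "Midlands": 40_000}

# Hypothetical outlier proxy: a sale priced below this fraction of the
# local same-type median is treated as a non-market label.
OUTLIER_RATIO = 0.25


def is_junk_label(region, price, local_type_median):
    """Return True if a sale record should be excluded from training."""
    floor = REGIONAL_FLOOR.get(region)  # None where no floor applies
    below_floor = floor is not None and price < floor
    local_outlier = price < OUTLIER_RATIO * local_type_median
    return below_floor or local_outlier


# A cheap but plausible Southern flat survives; a sub-floor Northern record does not.
keep_south_flat = not is_junk_label("South", 60_000, 150_000)
drop_north_record = is_junk_label("North", 20_000, 90_000)
```

Keeping the floor lookup optional per region is what lets one region fall back to the outlier test alone.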
H2-2025 out-of-sample bulk test — before/after
| Segment | v6.12 PE10 | v6.13 PE10 | Δ |
|---|---|---|---|
| Floor <£150k | 47.54% | 47.91% | +0.37pp |
| £150–300k | 68.18% | 68.44% | +0.26pp |
| £300–500k | 72.03% | 72.25% | +0.22pp |
| £500k–1M | 65.48% | 65.88% | +0.40pp |
| £1M+ | 49.49% | 50.35% | +0.86pp |
| Cat A headline | 67.08% | 67.40% | +0.32pp |
| Blended | 65.44% | 65.74% | +0.30pp |
| MdAPE | 6.69% | 6.62% | −0.07pp |
The denoising claim was that cleaning the labels would lift accuracy across the whole book, not just the floor; the table is that claim measured rather than asserted. The 3-part acceptance gate — floor improves, core bands hold or improve, genuine cheap-North stock intact — passed on the H2-2025 census.
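As a rough sketch, the three-part gate can be expressed as a single check over the segment figures in the table above. Which segments count as "core bands", the zero tolerance, and the boolean used for the cheap-North condition are assumptions made for illustration; the changelog does not say how that third condition is measured.

```python
# PE10 (%) per segment as (v6.12, v6.13), copied from the before/after table.
PE10 = {
    "floor_lt_150k": (47.54, 47.91),
    "150k_300k": (68.18, 68.44),
    "300k_500k": (72.03, 72.25),
    "500k_1m": (65.48, 65.88),
}
FLOOR_SEGMENT = "floor_lt_150k"


def acceptance_gate(pe10, cheap_north_intact, tolerance_pp=0.0):
    """Pass only if the floor improves, core bands hold, and cheap-North stock is intact."""
    floor_before, floor_after = pe10[FLOOR_SEGMENT]
    floor_improves = floor_after > floor_before
    core_holds = all(
        after >= before - tolerance_pp
        for segment, (before, after) in pe10.items()
        if segment != FLOOR_SEGMENT
    )
    return floor_improves and core_holds and cheap_north_intact


gate_passed = acceptance_gate(PE10, cheap_north_intact=True)
```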
Band B headline tile is now 70.4% PE10 at 66% of the test set (was 70.3% / 66.2% under v6.12 — coverage stable, accuracy slightly tighter under the new calibration).
A separate ablation arm tested whether ppd_category (the Land Registry Category A/B flag) earns signal on the cleaned labels. Cat A PE10 lifted +0.39pp, but the outer tail worsened (PE20 −0.19pp, sub-£150k PE10 −1.15pp, Cat-B PE10 −2.22pp). The provisional machine verdict was “earns signal”; on review the heterogeneous picture led to dropping ppd_category for good rather than pursuing it as a future release.
v6.12.1
15 June 2026 Superseded by v6.13
Headline accuracy reported on standard market sales (Land Registry Category A; n = 266,525 of 295,026 H2-2025 sales)
H2-2025 out-of-sample bulk test — by benchmark basis
| Basis | PE5 | PE10 | PE15 | PE20 | MdAPE | Bias | n |
|---|---|---|---|---|---|---|---|
| Category A — standard market sales (headline) | 41.0% | 67.4% | 81.7% | 89.4% | 6.4% | +1.03% | 266,525 |
| Blended — full set (incl. Category B) | 39.9% | 65.7% | 79.9% | 87.7% | 6.6% | +2.04% | 295,026 |
| Category B — non-standard only (repossessions, buy-to-lets, corporate transfers) | — | 50.2% | | | | | 28,501 |
What changed in 6.12.1
Headline benchmark basis. From 15 June 2026, headline accuracy is reported on standard market sales (Land Registry Category A): PE10 67.4%. On the full set, which also includes non-standard Category B sales (repossessions, buy-to-lets and corporate transfers), PE10 is 65.7%. Category B sales transact under conditions the model is not built to price, so including them understates accuracy on the standard market sales the model targets; the blended figure is retained in the technical document.
No change to valuations, bands or FSD — point valuations are byte-identical to v6.12 (same v6.11 weights). 6.12.1 changes only which sales the headline is measured against; the model engine, confidence bands and FSD are unchanged.
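To illustrate how the same predictions yield two headline figures, the sketch below computes PE10 and MdAPE on a Category A subset and on the blended set. The records are invented, the tuple layout is hypothetical, and treating the 10% boundary as inclusive is an assumption.

```python
from statistics import median


def pe(pairs, threshold=0.10):
    """Share of valuations whose absolute percentage error is within the threshold."""
    hits = sum(1 for predicted, actual in pairs
               if abs(predicted - actual) / actual <= threshold)
    return hits / len(pairs)


def mdape(pairs):
    """Median absolute percentage error."""
    return median(abs(predicted - actual) / actual for predicted, actual in pairs)


# Invented records: (ppd_category, predicted value, sale price)
records = [
    ("A", 205_000, 200_000),
    ("A", 320_000, 300_000),
    ("A", 150_000, 180_000),
    ("B", 120_000, 90_000),
]

blended = [(p, a) for _, p, a in records]
category_a = [(p, a) for cat, p, a in records if cat == "A"]

pe10_blended = pe(blended)        # 2 of 4 within 10%
pe10_category_a = pe(category_a)  # 2 of 3 within 10%
```

The predictions are identical in both calls; only the set of sales they are scored against changes.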
v6.12
14 June 2026 Superseded by v6.12.1
Tested against 295,026 H2-2025 Land Registry transactions
H2-2025 out-of-sample bulk test
| PE5 | PE10 | PE15 | PE20 | MdAPE | Bias | Mean FSD |
|---|---|---|---|---|---|---|
| 39.9% | 65.7% | 79.9% | 87.7% | 6.6% | +2.04% | 25.8% |
Accuracy by FSD (Fitch) band
| Band | n | % of test | PE10 | MdAPE | Bias |
|---|---|---|---|---|---|
| Band A | 0 | 0.0% | — | — | — |
| Band B | 195,609 | 66.3% | 70.4% | 5.9% | -0.6% |
| Band C | 98,910 | 33.5% | 56.7% | 8.4% | -1.0% |
| Band D | 507 | 0.2% | 37.9% | 14.7% | +1.8% |
What changed from v6.11
Recalibration only. Point valuations are byte-identical to v6.11 — overall PE10 (65.4%) and MdAPE (6.7%) are unchanged because the model itself was not retrained. What changed is what FSD/band mean: a single calibration pipeline now produces both the calibration cells and the serving lookup, equal-count decile boundaries are stored and read at serving, evidence density (sparse vs. dense comparable evidence) is now a calibration dimension, and per-cell dispersion uses a robust (IQR-based) measure that is no longer thrown off by outlier tails.
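One way to picture "equal-count decile boundaries are stored and read at serving" is sketched below. The function names are hypothetical and the confidence scores are synthetic; the point is only that the cut points are fitted once on calibration data and then looked up, not recomputed, when a valuation is served.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles


def fit_decile_boundaries(calibration_scores):
    """Nine cut points that split the calibration scores into ten equal-count bins."""
    return quantiles(calibration_scores, n=10, method="inclusive")


def decile_at_serving(score, boundaries):
    """Map a score to a decile (0 to 9) using the stored boundaries only."""
    return bisect_right(boundaries, score)


# Synthetic calibration scores standing in for a real confidence signal.
scores = list(range(1, 101))
boundaries = fit_decile_boundaries(scores)

# Each decile receives the same number of calibration rows.
counts = Counter(decile_at_serving(s, boundaries) for s in scores)
```

Because calibration and serving share one set of boundaries, a property cannot land in a different cell at serving time than the one its FSD was calibrated on.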
The visible result is band re-allocation. Band B expanded from 49% to 66% of the test at unchanged PE10 (~70%) — same precision, broader coverage. Band D tightened from 9% to 0.2%, now reflecting genuinely uncertain cases rather than acting as a catch-all. Band A remains honestly empty: no cell meets the FSD ≤ 0.05 threshold on a robust dispersion basis.
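The sketch below shows why a robust dispersion measure changes band allocation. It assumes FSD is estimated as the interquartile range divided by 1.349 (the IQR of a normal distribution in standard-deviation units); the changelog says only that the measure is IQR-based, so the exact estimator is an assumption. The 0.05 Band A threshold is the one quoted above; the error values are invented.

```python
from statistics import pstdev, quantiles

BAND_A_MAX_FSD = 0.05  # threshold quoted in the changelog
IQR_TO_SIGMA = 1.349   # normal-distribution IQR expressed in standard deviations


def robust_fsd(pct_errors):
    """IQR-based dispersion estimate that is insensitive to a few extreme errors."""
    q1, _, q3 = quantiles(pct_errors, n=4, method="inclusive")
    return (q3 - q1) / IQR_TO_SIGMA


# Invented calibration cell: mostly tight errors plus one extreme outlier.
cell_errors = [-0.08, -0.06, -0.04, -0.02, 0.0, 0.02, 0.04, 0.06, 0.08, 0.90]

fsd_robust = robust_fsd(cell_errors)  # driven by the middle half of the errors
fsd_naive = pstdev(cell_errors)       # inflated by the single 0.90 outlier
meets_band_a = fsd_robust <= BAND_A_MAX_FSD
```

In this invented cell the outlier no longer dominates the estimate, yet the cell still falls short of Band A, which mirrors the "honestly empty" outcome reported above.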
Bias is now reported on a robust basis. From v6.12, bias is the median signed percentage error (Option A / EAA ESSVM convention). v6.11 figures were the mean signed percentage error. This is a measurement-definition change — it can move the number, and occasionally flip its sign, without any change to the model’s predictions.
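A small invented example shows how the definition change alone can flip the sign. The sign convention used here (positive means the model valued above the sale price) is an assumption.

```python
from statistics import mean, median


def signed_pct_errors(pairs):
    """Signed percentage error per sale: (predicted - actual) / actual."""
    return [(predicted - actual) / actual for predicted, actual in pairs]


# Invented sales: three slight under-valuations and one large over-valuation.
pairs = [
    (98_000, 100_000),
    (198_000, 200_000),
    (297_000, 300_000),
    (150_000, 100_000),
]
errors = signed_pct_errors(pairs)

mean_bias = mean(errors)      # v6.11 convention: pulled positive by the outlier
median_bias = median(errors)  # v6.12 convention: reflects the typical sale
```

The predictions are the same in both lines; only the summary statistic differs.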
v6.11
8 March 2026 Superseded by v6.12
Tested against 295,026 H2-2025 Land Registry transactions
H2-2025 out-of-sample bulk test
| PE5 | PE10 | PE15 | PE20 | MdAPE | Bias (mean) |
|---|---|---|---|---|---|
| 39.6% | 65.4% | 79.8% | 87.6% | 6.7% | +1.92% |
Accuracy by FSD (Fitch) band
| Band | n | % of test | PE10 | MdAPE | Bias (mean) |
|---|---|---|---|---|---|
| Band A | 0 | 0.0% | — | — | — |
| Band B | 143,546 | 48.7% | 70.4% | 6.0% | −0.7% |
| Band C | 123,787 | 42.0% | 63.9% | 7.0% | +1.7% |
| Band D | 27,693 | 9.4% | 46.4% | 11.2% | +16.5% |
What changed from v6.10
Added spatial comparable pricing from nearby similar properties, weighted by similarity and adjusted for house price growth. +0.55pp PE10.
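A minimal sketch of similarity-weighted comparable pricing follows. The tuple layout, the similarity scale and the use of a simple weighted mean are assumptions for illustration; the changelog does not describe how similarity is scored or how comparables are selected.

```python
def comparable_estimate(comps, index_now):
    """Similarity-weighted mean of comparable sale prices, rolled forward to today.

    comps: list of (sale_price, index_at_sale, similarity), similarity in (0, 1].
    index_now: current value of the price index used for the growth adjustment.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sale_price, index_at_sale, similarity in comps:
        adjusted = sale_price * index_now / index_at_sale  # house price growth adjustment
        weighted_sum += similarity * adjusted
        weight_total += similarity
    if weight_total == 0:
        raise ValueError("no usable comparables")
    return weighted_sum / weight_total


# Invented comparables: an older, very similar sale and a recent, less similar one.
comps = [(200_000, 100.0, 1.0), (220_000, 110.0, 0.5)]
estimate = comparable_estimate(comps, index_now=110.0)
```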
Also in v6.11: per-property FSD mapped to Fitch classification bands (A–D). FSD lookup table segmented by confidence decile, property type, and price band. Per-property SHAP contributions replacing global feature importance. FSD lookup recalibration against 100,938 backtest properties.
June 2026: proprietary confidence-tier overlay retired; the FSD/Fitch bands are now the sole confidence expression.
v6.7
7 March 2026
What changed
Replaced national ONS House Price Index with a hyperlocal index — postcode-district and property-type-stratified quarterly index built from Land Registry transactions. Largest single accuracy gain in model history: +2.3pp PE10.
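The stratification idea can be sketched as a median-price index per (postcode district, property type) stratum. This is a simplification under stated assumptions: the real index method is not described in the changelog, the tuple layout is hypothetical, and a production index would need a fallback for strata with thin or missing evidence.

```python
from collections import defaultdict
from statistics import median


def build_index(transactions, base_quarter):
    """Quarterly index per stratum, with the base quarter set to 100.

    transactions: iterable of (district, property_type, quarter, price).
    Returns {(district, property_type): {quarter: index_value}}.
    """
    prices = defaultdict(lambda: defaultdict(list))
    for district, property_type, quarter, price in transactions:
        prices[(district, property_type)][quarter].append(price)

    index = {}
    for stratum, by_quarter in prices.items():
        if base_quarter not in by_quarter:
            continue  # no base-period sales; skipped in this sketch
        base = median(by_quarter[base_quarter])
        index[stratum] = {q: 100.0 * median(p) / base for q, p in by_quarter.items()}
    return index


# Invented transactions.
transactions = [
    ("M1", "flat", "2025Q1", 100_000),
    ("M1", "flat", "2025Q1", 120_000),
    ("M1", "flat", "2025Q2", 121_000),
    ("SW1", "flat", "2025Q2", 900_000),
]
index = build_index(transactions, base_quarter="2025Q1")
```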
v6.6
6 March 2026
What changed
Added predicted bathroom count as a dense feature. National predictor providing 99.4% coverage where EPC records lack bathroom data.
v6.3
6 March 2026
What changed
Added predicted bedroom count and plot area from EPC records. +1.7pp PE10 combined.
v6.1
5 March 2026
What changed
Dynamic model metrics system. Website accuracy stats now driven by quarterly bulk test results, not training metrics.
v6
1 March 2026
Headline metrics
| PE10 | MdAPE |
|---|---|
| 60.2% | 7.8% |
What changed
School proximity features (distance to nearest outstanding, good, and any school). Full spatial feature set including flood zones, crime rates, station proximity, and Census 2021 neighbourhood statistics. First version exceeding the Kirchmeyer 50% PE10 minimum.
v5
28 February 2026
Headline metrics
| PE10 | MdAPE |
|---|---|
| 59.3% | 7.93% |
What changed
Census 2021 neighbourhood statistics (median household income, employment rate, qualification levels). Income became the single most important feature.
v4
28 February 2026
Headline metrics
| PE10 | MdAPE |
|---|---|
| 57.1% | 8.4% |
What changed
Crime rates (total, violent, burglary), station proximity, and flood zone data. First version with full spatial feature set.
v3
1 March 2026
Headline metrics
| PE10 | MdAPE |
|---|---|
| 59.4% | 7.9% |
What changed
Previous sale price with HPI temporal adjustment. Single largest feature contribution in model history at this point. PE10 jumped from 45.1% to 59.4% — previous sale price proved to be the most powerful predictor of current value.
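The temporal adjustment amounts to scaling the previous price by the ratio of the index today to the index at the time of sale. The sketch below is an assumption about how such a feature might be built; in particular, returning None for a property with no recorded previous sale is a hypothetical convention, not something the changelog states.

```python
def hpi_adjusted_previous_price(previous_price, index_at_sale, index_now):
    """Roll a previous sale price forward to the current price level.

    Returns None when there is no recorded previous sale (hypothetical convention).
    """
    if previous_price is None:
        return None
    return previous_price * index_now / index_at_sale


# Invented example: sold for GBP 180,000 when the index stood at 90; the index is now 108.
feature = hpi_adjusted_previous_price(180_000, 90.0, 108.0)
```

Here the index has risen by 20%, so the previous price is scaled up by the same proportion before being used as a feature.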
v2
February 2026
Headline metrics
| PE10 | MdAPE |
|---|---|
| 45.1% | 11.3% |
What changed
First production model. EPC data (floor area, energy rating, construction age), Land Registry transaction history, and basic spatial features.
v1
February 2026
Proof of concept. Basic hedonic model. Not deployed to production.