How AVM Confidence Scoring Works

Why confidence scoring matters

Every AVM produces a point estimate — a single number that represents its best guess at a property’s market value. But that number alone is incomplete. Without knowing how confident the model is, the recipient has no basis for deciding whether to rely on it.

Consider two valuations from the same model, both estimating £350,000. One is for a standard three-bedroom terrace in a street where six similar houses have sold in the past year. The other is for a converted chapel in a hamlet with no sales for three years. The model might produce the same point estimate, but the confidence behind each is fundamentally different.

Confidence scoring makes this difference explicit. It gives the user — whether a lender, valuer, broker, or investor — a structured way to assess how much weight to place on each individual valuation. Regulators, including the PRA and EBA, expect this: an AVM output without a confidence measure is considered incomplete for lending purposes.

What drives confidence in an AVM valuation

Confidence is not a single number derived from a single factor. It reflects a combination of evidence signals that together indicate how well-positioned the model is to value a specific property. The main drivers are:

Comparable transaction volume

This is the number of similar properties that have sold recently in the vicinity. More transactions mean more evidence for the model to learn from. A property in a heavily-transacted suburb has a natural advantage over one in a rural area with sporadic sales.

Recency of evidence

A comparable sale from three months ago is more informative than one from three years ago. Markets move, and older transactions carry less weight. The model tracks how recent the available evidence is and adjusts confidence accordingly.
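One way to picture "older transactions carry less weight" is an exponential decay on the age of each comparable. The sketch below is illustrative only: the function name `recency_weight` and the 12-month half-life are assumptions made for this example, not a description of how any particular AVM weights its evidence.

```python
from datetime import date

AVERAGE_DAYS_PER_MONTH = 30.4375  # 365.25 / 12


def recency_weight(sale_date: date, valuation_date: date,
                   half_life_months: float = 12.0) -> float:
    """Return a weight in (0, 1] that halves every `half_life_months`.

    The half-life is a hypothetical parameter chosen for illustration.
    """
    age_days = (valuation_date - sale_date).days
    if age_days < 0:
        raise ValueError("sale_date must not be after valuation_date")
    age_months = age_days / AVERAGE_DAYS_PER_MONTH
    return 0.5 ** (age_months / half_life_months)


# A sale from three months ago outweighs one from three years ago.
today = date(2024, 6, 1)
recent = recency_weight(date(2024, 3, 1), today)
old = recency_weight(date(2021, 6, 1), today)
```

Under these assumptions a sale on the valuation date has weight 1.0, a sale roughly a year old has weight close to 0.5, and a three-year-old sale contributes about an eighth as much as a fresh one.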

Market homogeneity

In a street of identical 1930s semis, the model can be highly confident because the properties are similar and prices cluster tightly. In an area with a mix of period conversions, new builds, and bungalows, the comparables are less directly relevant and confidence drops.

Property typicality

Properties that are typical of their area — standard size, standard type, standard condition — are easier to value than outliers. A six-bedroom detached house in an area dominated by two-bedroom flats is harder to value accurately, even if transaction volumes are high.

Data completeness

The quality and completeness of the data available for the subject property matters. Properties with full EPC data, clear transaction history, and well-defined characteristics are valued with higher confidence than those with sparse records.

The model synthesises all of these factors into a single confidence assessment for each property. This is what makes confidence scoring property-specific rather than model-level: two properties valued by the same model on the same day will have different confidence levels if the evidence supporting them differs.

The three-tier confidence system

While the underlying confidence assessment is continuous (expressed as a forecast standard deviation, or FSD), most AVM providers, including Gadsden Valuations, translate this into discrete tiers. This is not a simplification for its own sake: it reflects how lenders and risk managers actually use the information.

A three-tier system is the industry norm, broadly corresponding to:

Tier 1: High confidence

Strong comparable evidence, property is typical of its area, low FSD. The model is well-positioned to value this property accurately.

Tier 2: Medium confidence

Adequate evidence but with some limitations: perhaps fewer comparables, a less homogeneous market, or less complete data. The valuation is usable but carries wider uncertainty.

Tier 3: Low confidence

Limited evidence, unusual property, or volatile market. The model has produced an estimate but with significant uncertainty. A supplementary or physical valuation should be considered.

The tier boundaries are not arbitrary. They are calibrated against back-test results so that each tier corresponds to a defined range of actual prediction accuracy. Tier 1 valuations really are more accurate than Tier 2, which in turn outperform Tier 3 — the labels mean something because they are empirically grounded.
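As a rough sketch of that idea, the code below maps an FSD to a tier and then checks, tier by tier, what share of back-tested estimates landed within 10% of the actual sale price. The cut-offs `TIER_1_MAX_FSD` and `TIER_2_MAX_FSD` and both function names are hypothetical placeholders; real boundaries would come from a provider’s own calibration.

```python
from collections import defaultdict

# Hypothetical FSD cut-offs, expressed as fractions of value.
# Real tier boundaries are set by calibration against back-test data.
TIER_1_MAX_FSD = 0.08
TIER_2_MAX_FSD = 0.15


def confidence_tier(fsd: float) -> int:
    """Map a forecast standard deviation to tier 1, 2 or 3."""
    if fsd < 0:
        raise ValueError("FSD cannot be negative")
    if fsd <= TIER_1_MAX_FSD:
        return 1
    if fsd <= TIER_2_MAX_FSD:
        return 2
    return 3


def share_within_10pct_by_tier(records):
    """records: iterable of (fsd, estimate, sale_price) tuples.

    Returns {tier: share of estimates within 10% of the sale price}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for fsd, estimate, sale_price in records:
        tier = confidence_tier(fsd)
        totals[tier] += 1
        if abs(estimate - sale_price) / sale_price <= 0.10:
            hits[tier] += 1
    return {tier: hits[tier] / totals[tier] for tier in sorted(totals)}
```

If the tiers are empirically grounded, the share returned for Tier 1 should be higher than for Tier 2, and Tier 2 higher than Tier 3, when this is run over a large back-test sample.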

Meridian’s confidence tier performance

The table below shows how Meridian’s three confidence tiers perform against actual transaction data. These figures are drawn from our live back-test of 295,026 transactions across England and Wales.

The separation between tiers is significant and consistent. Tier 1 captures the majority of properties — those in well-transacted areas with good data coverage — and delivers the tightest prediction accuracy. Properties that fall into lower tiers are not failures of the model; they are honest acknowledgements that the evidence is weaker and the uncertainty is wider.

For the full breakdown including property type, price band, and regional segmentation, see our accuracy page or download the accuracy report PDF.

How lenders use confidence tiers

Confidence tiers give lenders a simple, consistent framework for making accept/decline/escalate decisions on valuation inputs. Rather than interpreting raw FSD values case by case, a lender can define policy rules tied to the tier system.

Common patterns include:

T1: Accept AVM-only

For remortgages and low-LTV lending where the financial exposure to valuation error is limited. The lender accepts the AVM output without further verification.

T2: Desktop review or second AVM

The valuation is useful but warrants additional checking. The lender may commission a desktop review, request a second AVM for comparison, or accept with adjusted LTV caps.

T3: Escalate to physical valuation

The model’s uncertainty is too wide for the lending context. The lender requires a physical valuation or declines to lend on the basis of the AVM alone.

These rules can be further refined by combining confidence tier with loan-to-value ratio, property type, loan purpose, or other risk factors. The confidence tier provides the valuation dimension; the lender’s credit policy provides the rest.
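A policy of this kind can be written down as a small routing function. The following is a minimal sketch, assuming a single hypothetical LTV cap (`MAX_LTV_FOR_AVM_ONLY`) for AVM-only acceptance; the names and the 75% figure are illustrative, and a real credit policy would add property type, loan purpose and other factors.

```python
from enum import Enum


class Action(Enum):
    ACCEPT_AVM = "accept AVM-only"
    DESKTOP_REVIEW = "desktop review or second AVM"
    PHYSICAL_VALUATION = "escalate to physical valuation"


# Hypothetical cap: above this LTV, even a Tier 1 valuation gets a second look.
MAX_LTV_FOR_AVM_ONLY = 0.75


def route_valuation(tier: int, ltv: float) -> Action:
    """Combine confidence tier with loan-to-value to pick a next step."""
    if tier not in (1, 2, 3):
        raise ValueError("tier must be 1, 2 or 3")
    if tier == 3:
        return Action.PHYSICAL_VALUATION
    if tier == 1 and ltv <= MAX_LTV_FOR_AVM_ONLY:
        return Action.ACCEPT_AVM
    # Tier 2, or Tier 1 at high LTV: usable, but worth extra checking.
    return Action.DESKTOP_REVIEW
```

For example, `route_valuation(1, 0.60)` accepts the AVM output, while `route_valuation(1, 0.90)` sends the same Tier 1 valuation for review because the exposure to valuation error is higher.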

This approach is consistent with PRA and EBA guidance, which expects lenders to apply confidence-based acceptance criteria when using AVMs — not to treat all AVM outputs as equally reliable.

Confidence is not the same as quality

A common misconception is that a low-confidence valuation is a “bad” valuation. This is not the case. A Tier 3 valuation from a well-built model is still the best estimate available given the evidence — it simply carries more uncertainty.

The quality of an AVM is measured by aggregate metrics such as PE10 (the share of valuations within 10% of the sale price), MdAPE (median absolute percentage error), and bias. These tell you whether the model is well-calibrated overall. The confidence tier tells you whether a specific property happens to be in the model’s sweet spot or at the edge of its capabilities.
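To make the aggregate metrics concrete, here is a minimal sketch that computes them from pairs of estimates and actual sale prices. It takes PE10 as the share of estimates within 10% of the sale price and MdAPE as the median absolute percentage error. Treating bias as the median signed percentage error is an assumption made for this example; some providers report the mean instead. The function name `accuracy_metrics` is hypothetical.

```python
from statistics import median


def accuracy_metrics(pairs):
    """pairs: list of (estimate, sale_price) tuples from a back-test.

    Returns PE10, MdAPE and bias as fractions (0.10 means 10%).
    """
    if not pairs:
        raise ValueError("at least one (estimate, sale_price) pair is required")
    # Signed percentage error: positive means the model over-valued.
    errors = [(estimate - price) / price for estimate, price in pairs]
    abs_errors = [abs(e) for e in errors]
    return {
        "pe10": sum(e <= 0.10 for e in abs_errors) / len(abs_errors),
        "mdape": median(abs_errors),
        "bias": median(errors),  # assumption: median signed error
    }


metrics = accuracy_metrics([
    (105_000, 100_000),
    (92_000, 100_000),
    (130_000, 100_000),
    (100_000, 100_000),
])
```

These figures describe the model across a whole sample. They say nothing about any single property, which is the gap the confidence tier fills.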

An AVM that claims Tier 1 for every property is not better — it is less honest. The whole point of confidence scoring is to differentiate: to flag the valuations where caution is warranted so that users can take appropriate action.

Put another way: confidence scoring is the model being transparent about its own limitations on a case-by-case basis. That transparency is a feature, not a weakness.

Confidence you can quantify

Every Meridian valuation includes a confidence tier, FSD, and prediction interval. See how it works with a free account.
