Land Registry Data in Property Valuation

The transaction data that underpins every UK residential AVM — what it contains, where it comes from, and what it cannot tell you.

The Price Paid dataset

HM Land Registry’s Price Paid dataset is the foundation of residential property valuation in England and Wales. It records residential property sales for value that are lodged for registration, going back to January 1995. As of early 2026, the dataset contains over 28 million individual transaction records.

The data is published as open data under the Open Government Licence, updated monthly. New transactions typically appear in the dataset 4–8 weeks after completion, once the registration process is finalised. This delay is important: it means the most recent month or two of transactions are underrepresented in any analysis based on this data.

For AVMs, the Price Paid dataset serves two critical functions: it provides the training data from which the model learns the relationship between property characteristics and value, and it provides the test data against which the model’s accuracy is measured through backtesting.
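A backtest of this kind can be sketched as follows: hold back a set of recorded sale prices, ask the model for an estimate of each, and summarise the percentage errors. The function name, the choice of metrics, and the 10% tolerance are illustrative assumptions, not a description of any particular provider's method.

```python
from statistics import median

def backtest_summary(actual_prices, predicted_prices, tolerance=0.10):
    """Summarise percentage errors of predictions against held-out sale prices."""
    if len(actual_prices) != len(predicted_prices) or not actual_prices:
        raise ValueError("need two non-empty sequences of equal length")
    # Absolute percentage error for each held-out transaction.
    errors = [abs(p - a) / a for a, p in zip(actual_prices, predicted_prices)]
    return {
        "median_ape": median(errors),
        # Share of estimates that land within the tolerance of the recorded price.
        "share_within_tolerance": sum(e <= tolerance for e in errors) / len(errors),
    }

# Hypothetical prices, for illustration only.
summary = backtest_summary([200_000, 300_000, 400_000],
                           [210_000, 285_000, 500_000])
print(summary)
```

The median is used rather than the mean so that a handful of badly mispriced properties does not dominate the headline figure.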

What the data contains

Each record in the Price Paid dataset includes the following fields:

Field                 Description
Price                 The amount paid for the property, in pounds
Date of transfer      The date on which the transaction completed
Address               Full address including postcode
Property type         Detached, semi-detached, terraced, flat/maisonette, or other
Old/new build         Whether the property was newly built at the time of sale
Tenure                Freehold or leasehold
Transaction category  Standard price paid or additional price paid entry
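As a rough illustration of how these fields arrive in practice, the sketch below parses one row of the published CSV. It assumes the header-less column order given in HM Land Registry's guidance at the time of writing, which should be checked against the current documentation. The sample row and the output field names are invented for illustration.

```python
import csv
import io
from datetime import datetime

# Assumed column order; the published CSV carries no header row.
COLUMNS = [
    "transaction_id", "price", "date_of_transfer", "postcode", "property_type",
    "old_new", "duration", "paon", "saon", "street", "locality", "town_city",
    "district", "county", "ppd_category", "record_status",
]
PROPERTY_TYPES = {"D": "detached", "S": "semi-detached", "T": "terraced",
                  "F": "flat/maisonette", "O": "other"}
TENURES = {"F": "freehold", "L": "leasehold"}

# Invented sample row, not a real transaction.
SAMPLE = ('"{00000000-0000-0000-0000-000000000000}","250000","2024-03-15 00:00",'
          '"AB1 2CD","S","N","F","12","","EXAMPLE ROAD","","EXAMPLETOWN",'
          '"EXAMPLE DISTRICT","EXAMPLESHIRE","A","A"\n')

def parse_price_paid(text):
    """Yield one dict per CSV row with typed and decoded fields."""
    for row in csv.reader(io.StringIO(text)):
        record = dict(zip(COLUMNS, row))
        record["price"] = int(record["price"])
        # Keep only the date part of the "YYYY-MM-DD HH:MM" timestamp.
        record["date_of_transfer"] = datetime.strptime(
            record["date_of_transfer"][:10], "%Y-%m-%d").date()
        record["property_type"] = PROPERTY_TYPES.get(record["property_type"], "unknown")
        record["new_build"] = record.pop("old_new") == "Y"
        record["tenure"] = TENURES.get(record.pop("duration"), "unknown")
        yield record

records = list(parse_price_paid(SAMPLE))
print(records[0]["price"], records[0]["property_type"], records[0]["tenure"])
```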

Note what is absent: there is no floor area, no number of bedrooms, no property condition, no garden size, no EPC rating. The Price Paid dataset tells you what sold, when, and for how much, but very little about the physical characteristics of the property. This is why AVMs must supplement it with other data sources.

Limitations and data quality issues

The Price Paid dataset is authoritative but not perfect. Understanding its limitations is essential for interpreting AVM outputs and backtesting results.

Non-market transactions

The dataset includes some transactions that do not reflect open-market value: transfers between family members at below-market prices, right-to-buy sales at statutory discounts, shared-ownership transactions (where only a percentage of the property is purchased), and repossessions. While the “additional price paid” category captures some of these, it is not a reliable filter. AVM providers must apply their own filtering logic to exclude non-market transactions from training and testing data.
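One simple form such filtering logic could take is sketched below: keep only standard entries, then drop any sale whose price is far out of line with the median for the same area and property type. The field names, the grouping, and the ratio thresholds are assumptions chosen for illustration. Production filters are typically more elaborate.

```python
from collections import defaultdict
from statistics import median

def filter_market_sales(records, min_ratio=0.4, max_ratio=3.0):
    """Keep category A sales whose price is plausible for their area and type."""
    # Step 1: keep standard price paid entries only.
    standard = [r for r in records if r["ppd_category"] == "A"]
    # Step 2: compute the median price for each (area, property type) group.
    groups = defaultdict(list)
    for r in standard:
        groups[(r["area"], r["property_type"])].append(r["price"])
    medians = {key: median(prices) for key, prices in groups.items()}
    # Step 3: drop sales far below or above their group median.
    kept = []
    for r in standard:
        ratio = r["price"] / medians[(r["area"], r["property_type"])]
        if min_ratio <= ratio <= max_ratio:
            kept.append(r)
    return kept

# Hypothetical records: one suspiciously cheap sale and one category B entry.
sales = [
    {"area": "AB1", "property_type": "T", "price": 200_000, "ppd_category": "A"},
    {"area": "AB1", "property_type": "T", "price": 210_000, "ppd_category": "A"},
    {"area": "AB1", "property_type": "T", "price": 190_000, "ppd_category": "A"},
    {"area": "AB1", "property_type": "T", "price": 60_000, "ppd_category": "A"},
    {"area": "AB1", "property_type": "T", "price": 205_000, "ppd_category": "B"},
]
market_sales = filter_market_sales(sales)
print([r["price"] for r in market_sales])
```

A ratio test like this cannot tell a discounted family transfer from a genuinely derelict property, so thresholds trade false exclusions against contamination.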

Registration delay

Transactions appear in the dataset 4–8 weeks after completion, sometimes longer. This means the most recent data is always incomplete, creating a blind spot for the most current market conditions. AVMs must account for this latency, for example by treating the latest month or two as provisional when estimating recent price trends or transaction volumes.
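A minimal way to handle the lag is to split records into a mature window and a provisional tail relative to the release date. The sketch below assumes an 8-week lag, the upper end of the typical range quoted above. The function and field names are hypothetical.

```python
from datetime import date, timedelta

def mature_window_end(release_date, lag_weeks=8):
    """Latest completion date assumed to be substantially registered."""
    return release_date - timedelta(weeks=lag_weeks)

def split_by_maturity(records, release_date, lag_weeks=8):
    """Separate records into (mature, provisional) by completion date."""
    cutoff = mature_window_end(release_date, lag_weeks)
    mature = [r for r in records if r["date_of_transfer"] <= cutoff]
    provisional = [r for r in records if r["date_of_transfer"] > cutoff]
    return mature, provisional

# Hypothetical release date and completion dates.
release = date(2026, 3, 1)
sample = [
    {"date_of_transfer": date(2025, 12, 10)},
    {"date_of_transfer": date(2026, 2, 20)},
]
mature, provisional = split_by_maturity(sample, release)
print(mature_window_end(release), len(mature), len(provisional))
```

Provisional records need not be discarded. They can still be used as comparables while being excluded from volume counts and trend estimates.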

No property characteristics

As noted above, the dataset contains no information about the physical property beyond its type and tenure. Two semi-detached houses on the same street might differ enormously in size, condition, and specification, but the Price Paid data records only the price and type. AVMs must source property characteristics from elsewhere.

Address matching challenges

Addresses in the Land Registry do not always follow standard formatting. Flat numbering schemes, building names, and rural addresses without postcodes can make it difficult to match transactions to properties in other datasets. Robust address-matching logic is essential for any AVM that uses this data.
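A common first step is to reduce each address to a normalised key before comparing datasets. The sketch below is one possible approach. The abbreviation table is a small illustrative sample, and the argument names (saon for the flat or unit, paon for the building number or name) follow Land Registry terminology but are otherwise assumptions.

```python
import re

# Illustrative sample only; "ST" is left out because it can mean STREET or SAINT.
ABBREVIATIONS = {"RD": "ROAD", "AVE": "AVENUE", "APT": "FLAT", "APARTMENT": "FLAT"}

def normalise_address(saon, paon, street, postcode):
    """Return (address_key, postcode_key) suitable for exact-match joins."""
    joined = " ".join(part for part in (saon, paon, street) if part)
    # Upper-case and replace punctuation with spaces.
    cleaned = re.sub(r"[^\w\s]", " ", joined.upper())
    # Expand known abbreviations token by token; split() also collapses whitespace.
    tokens = [ABBREVIATIONS.get(token, token) for token in cleaned.split()]
    postcode_key = re.sub(r"\s+", "", (postcode or "").upper())
    return " ".join(tokens), postcode_key

print(normalise_address("Flat 2", "10", "High Rd.", "ab1 2cd"))
# Expected: ('FLAT 2 10 HIGH ROAD', 'AB12CD')
```

Two records that normalise to the same key within the same postcode can be joined directly. Anything left over is a candidate for fuzzy matching or manual review.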

England and Wales only

HM Land Registry covers England and Wales. Transactions in Scotland are recorded by Registers of Scotland in a separate dataset, and those in Northern Ireland by Land & Property Services. AVMs built on the Price Paid dataset are applicable to England and Wales only.

EPC data: filling the characteristics gap

Energy Performance Certificates (EPCs) are the single most important supplement to the Price Paid dataset for UK AVMs. Required for most property sales and rentals since 2008, EPCs contain detailed property characteristics that the Land Registry data lacks.

An EPC record typically includes:

Total floor area (m²): the single most predictive physical characteristic
Property type and form: more granular than Land Registry categories
Number of habitable rooms: bedroom/reception count indicators
Construction age band: e.g. 1930–1949, 1967–1975, post-2012
Wall type and insulation: cavity, solid, insulated, uninsulated
Energy efficiency rating: A–G band, numeric score
Heating system: boiler type, fuel source
Glazing type: single, double, or triple glazed

The EPC register contains over 25 million certificates, covering the majority of residential properties in England and Wales. By linking EPC records to Land Registry transactions (using address matching and UPRN linkage), AVMs can associate physical characteristics with sale prices — the essential combination for accurate valuation.
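The linkage step can be sketched as a two-stage lookup: match on UPRN where one is available, and fall back to a normalised address key otherwise. The sketch assumes each transaction has already been assigned a UPRN where possible (the Price Paid data does not carry one) and that both datasets carry a precomputed address_key. All field names are hypothetical.

```python
def link_transactions_to_epc(transactions, epc_records):
    """Attach EPC floor area to each sale: UPRN first, address key second."""
    by_uprn, by_address = {}, {}
    for epc in epc_records:
        if epc.get("uprn"):
            by_uprn[epc["uprn"]] = epc
        by_address[epc["address_key"]] = epc

    linked = []
    for sale in transactions:
        epc, method = None, "unmatched"
        if sale.get("uprn") in by_uprn:
            epc, method = by_uprn[sale["uprn"]], "uprn"
        elif sale["address_key"] in by_address:
            epc, method = by_address[sale["address_key"]], "address"
        linked.append({
            **sale,
            "floor_area_m2": epc["floor_area_m2"] if epc else None,
            "match_method": method,
        })
    return linked

# Hypothetical records for illustration.
epcs = [
    {"uprn": "100000000001", "address_key": "10 HIGH ROAD|AB12CD", "floor_area_m2": 85.0},
    {"uprn": None, "address_key": "12 HIGH ROAD|AB12CD", "floor_area_m2": 92.0},
]
sales = [
    {"uprn": "100000000001", "address_key": "10 HIGH RD|AB12CD", "price": 250_000},
    {"uprn": None, "address_key": "12 HIGH ROAD|AB12CD", "price": 265_000},
    {"uprn": None, "address_key": "14 HIGH ROAD|AB12CD", "price": 240_000},
]
linked = link_transactions_to_epc(sales, epcs)
print([(row["match_method"], row["floor_area_m2"]) for row in linked])
```

Recording the match method alongside the result makes it possible to audit how many links rest on the weaker address fallback.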

EPC data has its own limitations: certificates are valid for 10 years, so some records describe a property as it was a decade ago. Properties that have not been sold or rented since 2008 may have no EPC at all. And the quality of individual assessments varies, particularly for floor area measurements. Despite these limitations, EPC data represents the most comprehensive source of property characteristics available in England and Wales.
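Where a property has several certificates, one plausible rule is to use the certificate lodged closest to the sale date and to flag it when the gap exceeds the 10-year validity period. The sketch below illustrates that rule. The field names and the choice of "closest in time" are assumptions.

```python
from datetime import date

def select_epc(certificates, sale_date, max_age_years=10):
    """Pick the certificate lodged closest to the sale date and flag stale ones."""
    if not certificates:
        return None
    best = min(certificates, key=lambda c: abs((c["lodged"] - sale_date).days))
    gap_years = abs((sale_date - best["lodged"]).days) / 365.25
    return {**best, "gap_years": round(gap_years, 1), "stale": gap_years > max_age_years}

# Hypothetical certificates for one property.
certs = [
    {"lodged": date(2010, 6, 1), "floor_area_m2": 80.0},
    {"lodged": date(2021, 6, 1), "floor_area_m2": 95.0},
]
recent = select_epc(certs, date(2022, 6, 1))
old_only = select_epc(certs[:1], date(2024, 1, 1))
print(recent["floor_area_m2"], recent["stale"], old_only["stale"])
```

In this example the two certificates disagree on floor area, which may reflect an extension or simply a measurement difference. A stale flag lets downstream logic treat the older figure with caution.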

Other data sources used by AVMs

Beyond the Land Registry and EPC data, modern AVMs draw on additional sources to improve accuracy and coverage:

Ordnance Survey and geographic data

Location matters enormously in property valuation. Geographic data provides coordinates, postcodes, ward and parish boundaries, and UPRNs (Unique Property Reference Numbers) that enable precise address matching. It also allows calculation of proximity features: distance to stations, schools, town centres, and other amenities that influence value.
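A proximity feature of this kind reduces to a distance calculation between coordinates. The sketch below uses the haversine great-circle formula with a mean Earth radius of 6,371 km. The coordinates and function names are hypothetical, and a production system might prefer projected coordinates or travel time over straight-line distance.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def nearest_distance_km(prop, amenities):
    """Distance from a property to its closest amenity, e.g. the nearest station."""
    return min(haversine_km(prop[0], prop[1], lat, lon) for lat, lon in amenities)

# Hypothetical coordinates (latitude, longitude).
home = (51.50, -0.10)
stations = [(51.51, -0.10), (51.60, -0.20)]
print(round(nearest_distance_km(home, stations), 2))
```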

Census and demographic data

ONS census data provides neighbourhood-level information about housing stock, tenure mix, population density, and socioeconomic characteristics. While not property-specific, these area-level features help the model understand local market context.

Planning and building data

Council tax bands, planning applications, listed building status, conservation area designations, and other regulatory data provide additional property and location features. These can be particularly important for unusual properties where standard comparables are scarce.

The art of building a good AVM lies in combining these disparate sources into a coherent feature set that the model can learn from. Each source has coverage gaps, quality issues, and matching challenges. The model’s performance depends as much on the quality of data engineering as on the choice of algorithm.

How Meridian uses these data sources

Meridian — the AVM powering Gadsden Valuations — ingests and links data from all the sources described above. The result is a property database covering 147,188+ transactions, with 49 features per property derived from Land Registry, EPC, geographic, and contextual data.

The linking process uses a combination of UPRN matching, address parsing, and fuzzy matching algorithms to associate transaction records with property characteristics. Where multiple data sources disagree (for example, on floor area), the model uses the most reliable source available for each property.
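The fuzzy-matching stage can be illustrated with Python's standard-library difflib. This is a generic sketch under stated assumptions, not Meridian's implementation: the similarity measure, the 0.85 threshold, and the function name are all illustrative.

```python
from difflib import SequenceMatcher

def best_fuzzy_match(query, candidates, threshold=0.85):
    """Return the candidate most similar to query, or None if none reaches threshold."""
    best, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, query, candidate).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None

# Hypothetical normalised addresses within one postcode; the first contains a typo.
candidates = ["FLAT 2 10 HIGH RAOD", "12 HIGH ROAD"]
match = best_fuzzy_match("FLAT 2 10 HIGH ROAD", candidates)
no_match = best_fuzzy_match("1 LOW LANE", ["99 HILLTOP AVENUE"])
print(match, no_match)
```

Restricting candidates to the same postcode before scoring keeps the comparison cheap and reduces the risk of matching a similar-looking address on a different street.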

Data freshness is maintained through automated pipelines that ingest new Land Registry releases monthly and EPC updates as they become available. The model is retrained on the updated data to ensure it reflects current market conditions.

The quality of this data pipeline — matching accuracy, feature coverage, freshness — directly affects the model’s accuracy and confidence levels. Properties with complete, recent data receive higher confidence scores; those with gaps or stale records receive lower confidence, because the model honestly reflects the limitations of the available evidence.

See the data in action

Every Meridian valuation shows the comparable transactions and data sources that informed the estimate. Try it with a free account.
