
As pension schemes across the UK begin to consolidate, one of the most consequential, and least discussed, risks lies in wait: the merging of member data. Legacy IT systems, inconsistent records, and missing data make it extremely difficult to match members accurately across schemes and providers. AI-powered entity matching is a proven, scalable solution: reducing errors, compressing timelines, and providing the audit trail that trustees and regulators expect.
Businesses that act early on data quality and AI-driven matching will merge faster and with less risk. Those that don’t will find data becomes their biggest bottleneck.
The UK pension industry is consolidating fast. The Pensions Regulator has been actively encouraging defined benefit (DB) scheme trustees to consider consolidation as a route to better member outcomes, and the emergence of pension superfunds has opened new pathways for smaller schemes to achieve the scale they need. The government’s Pension Schemes Bill has brought similar momentum to the defined contribution (DC) space, looking to increase retirement incomes through minimum default scale requirements, consolidation of smaller pots and improved value for money. The direction of travel is clear: fewer, larger funds.
The business case is easy to make: larger funds bring greater investment leverage, lower administrative costs for members and, hopefully, stronger governance. What is being overlooked, however, is the technical and legal complexity of the task.
When two pension schemes merge, every member record from both sides must be brought together into a single, accurate dataset. That means identifying where the same person appears in both systems, detecting duplicates, and ensuring that no one falls through the cracks. The difficulty is that pension data is messy. Legacy administration systems, some decades old, store information in different formats, with different standards, and with decades of accumulated errors. Names are spelt inconsistently. Dates of birth contain transcription mistakes from old paper records. National Insurance numbers are missing or partially recorded. Addresses are years out of date.
So how should businesses go about reconciling these mammoth data sets?
Well, to borrow a saying from the early 2010s, “There’s an AI for that.” In fact, there is an entire family of AI and ML techniques dedicated to this exact problem. AI entity matching has been around for years and has a long track record of success. Its whole purpose is to make smart decisions about whether two imperfect records refer to the same person.
In practice, AI-powered entity matching works in stages. First, it ingests and standardises data from both organisations. Then it compares records, using a combination of techniques to assess the likelihood of a match.
Some are straightforward, recognising that “Catherine Smith” and “Katharine Smith” are very likely the same person if they share a date of birth, phone number and email address. Others are more sophisticated, using models that weigh multiple signals simultaneously. Two records might have slightly different names but an identical National Insurance number. AI can assess these signals together and produce a confidence score, rather than relying on rigid rules that break down when the data is imperfect.
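To make the idea concrete, here is a minimal sketch of standardising two records and combining several signals into one confidence score. It uses only the Python standard library, and everything in it is an assumption for illustration: the Member fields, the weights and the simple weighted average are hypothetical, and a production model would learn its weights from the data rather than hard-code them.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Member:
    # Hypothetical record layout, for illustration only.
    name: str
    dob: str  # ISO date string, e.g. "1961-04-12"
    ni_number: Optional[str] = None


def normalise_name(name: str) -> str:
    # Lower-case, drop punctuation and digits, collapse repeated whitespace.
    cleaned = re.sub(r"[^a-z ]", "", name.lower())
    return " ".join(cleaned.split())


def normalise_ni(ni: Optional[str]) -> Optional[str]:
    # Strip whitespace and upper-case; None when nothing usable is recorded.
    if not ni:
        return None
    return re.sub(r"\s+", "", ni).upper() or None


def name_similarity(a: str, b: str) -> float:
    # Fuzzy string similarity between 0.0 and 1.0.
    return SequenceMatcher(None, normalise_name(a), normalise_name(b)).ratio()


# Illustrative weights only (an assumption, not calibrated values).
WEIGHTS = {"name": 0.3, "dob": 0.3, "ni": 0.4}


def match_score(a: Member, b: Member) -> float:
    signals = {
        "name": name_similarity(a.name, b.name),
        "dob": 1.0 if a.dob == b.dob else 0.0,
    }
    ni_a, ni_b = normalise_ni(a.ni_number), normalise_ni(b.ni_number)
    if ni_a and ni_b:
        signals["ni"] = 1.0 if ni_a == ni_b else 0.0
    # Average over the signals actually present, so a missing NI number
    # is treated as "unknown" rather than as a mismatch.
    total_weight = sum(WEIGHTS[key] for key in signals)
    return sum(WEIGHTS[key] * value for key, value in signals.items()) / total_weight


if __name__ == "__main__":
    left = Member("Catherine Smith", "1961-04-12")
    right = Member("Katharine Smith", "1961-04-12")
    print(round(match_score(left, right), 3))
```

Treating a missing field as unknown, not as evidence against a match, is the design choice that matters here: rigid exact-match rules would discard this pair on the spelling difference alone.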
Crucially, this isn’t a fully automated process. The most effective implementations establish clear confidence thresholds that determine which matches are accepted automatically and which are referred to a human reviewer.
AI handles the scale; human expertise handles the judgement calls.
The result is faster, more accurate, and more auditable than the manual approaches many schemes still rely on.
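One way to picture confidence thresholds is as a three-way triage over the match score. The sketch below is an assumption throughout: the band names and the cut-off values of 0.95 and 0.70 are hypothetical, and in practice trustees and administrators would set them after testing against a labelled sample of their own data.

```python
from enum import Enum


class Decision(Enum):
    AUTO_MATCH = "auto_match"      # accepted without manual review
    HUMAN_REVIEW = "human_review"  # referred to an administrator
    NO_MATCH = "no_match"          # treated as different people


# Hypothetical thresholds, for illustration only.
AUTO_MATCH_THRESHOLD = 0.95
REVIEW_THRESHOLD = 0.70


def triage(score: float) -> Decision:
    """Route a confidence score (0.0 to 1.0) into one of three bands."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score >= AUTO_MATCH_THRESHOLD:
        return Decision.AUTO_MATCH
    if score >= REVIEW_THRESHOLD:
        return Decision.HUMAN_REVIEW
    return Decision.NO_MATCH


if __name__ == "__main__":
    for score in (0.99, 0.82, 0.40):
        print(score, triage(score).value)
```

Logging each score alongside its decision, and the reviewer's outcome for the middle band, is what would turn this into the audit trail trustees and regulators expect.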

Member outcomes. The fundamental purpose of a pension scheme is to pay the right benefits to the right people. A matching error directly undermines that purpose. AI significantly reduces error rates compared to manual approaches, particularly at scale.
Regulatory exposure. The Pensions Regulator sets clear expectations around data quality, requiring trustees to measure and report on common and scheme-specific data standards. A merger that degrades data quality will attract scrutiny. UK GDPR and the Data Protection Act 2018 add further obligations, including the need for Data Protection Impact Assessments where large-scale matching is involved.
Cost and timeline. Poor data quality is one of the most common causes of merger delay. Manual reconciliation is labour-intensive and slow. AI-driven matching can compress months of painstaking work into weeks.
Reputation. Members whose records are mishandled lose trust, not just in the scheme, but in the organisations behind it.
Cloud platforms such as Google Cloud Platform (GCP) provide the building blocks for an end-to-end AI matching pipeline without heavy upfront infrastructure investment.
Data from both schemes is brought into BigQuery, GCP’s cloud data warehouse, where records can be stored, queried, and compared at scale. Dataflow handles the transformation work needed to bring inconsistent records into a common format. If you are working across multiple cloud providers, for example merging with a company whose infrastructure runs on AWS, BigQuery Omni lets you run analytics and queries on that data where it sits, without first migrating it.
For the matching itself, Google’s Vertex AI allows you to develop bespoke entity matching models, tailored to your data and requirements. Alternatively, you can deploy open-source tools like Splink, developed by the UK Ministry of Justice and widely adopted across the UK public sector, which can run at scale on Apache Spark, for example via GCP’s Dataproc service.
Dataplex (Google’s data fabric) is particularly valuable in a merger context. It creates a virtual governance layer over data sitting across multiple systems, allowing both schemes’ records to be discovered, catalogued, and quality-checked from a single control point without physically consolidating everything first. Automated rules can flag records with missing National Insurance numbers or inconsistent dates before they enter the matching process.
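The kind of pre-matching rule described above can be sketched in plain Python. This is not Dataplex’s own rule syntax or API; the field names, flag names and the simplified National Insurance number pattern (two letters, six digits, a suffix letter from A to D) are assumptions for illustration.

```python
import re
from datetime import date
from typing import Optional

# Simplified format check: the official rules also exclude certain prefix letters.
NI_PATTERN = re.compile(r"^[A-Z]{2}\d{6}[A-D]$")


def quality_flags(record: dict, today: Optional[date] = None) -> list[str]:
    """Return data-quality flags for one member record (hypothetical field names)."""
    today = today or date.today()
    flags = []

    ni = (record.get("ni_number") or "").replace(" ", "").upper()
    if not ni:
        flags.append("missing_ni_number")
    elif not NI_PATTERN.match(ni):
        flags.append("malformed_ni_number")

    dob = record.get("date_of_birth")
    joined = record.get("date_joined")
    if dob is None:
        flags.append("missing_date_of_birth")
    elif dob > today:
        flags.append("date_of_birth_in_future")
    elif joined is not None and joined < dob:
        flags.append("joined_before_birth")

    return flags


if __name__ == "__main__":
    record = {
        "ni_number": "QQ 12 34 56",          # suffix letter missing
        "date_of_birth": date(1990, 5, 1),
        "date_joined": date(1985, 1, 1),     # earlier than the date of birth
    }
    print(quality_flags(record, today=date(2025, 1, 1)))
```

Records that come back with flags can be routed to cleansing or manual review before they reach the matching step, so known-bad data never inflates or deflates a confidence score.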
Security is embedded throughout. Cloud Data Loss Prevention (DLP) masks sensitive fields during processing. Access controls ensure only authorised individuals see member data. Cloud Audit Logs provide full traceability: every matching decision is recorded and every action is attributable. And the cloud model means you scale up during the merger and back down afterwards, with no permanent infrastructure overhead.
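As a rough illustration of what masking a sensitive field means in practice, the sketch below masks a National Insurance number for display and replaces it with a keyed hash for processing. It uses only the Python standard library and is not the Cloud DLP API; the function names and the choice of HMAC-SHA256 are assumptions for illustration.

```python
import hashlib
import hmac


def mask_ni(ni: str) -> str:
    """Hide all but the last three characters, e.g. for display to reviewers."""
    compact = ni.replace(" ", "").upper()
    if len(compact) <= 3:
        return "*" * len(compact)
    return "*" * (len(compact) - 3) + compact[-3:]


def pseudonymise(value: str, key: bytes) -> str:
    """Replace a value with a keyed hash, so equal inputs still compare as
    equal but the original cannot be read back without the key."""
    normalised = value.strip().upper().encode("utf-8")
    return hmac.new(key, normalised, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    key = b"example-key-held-in-a-secret-manager"  # hypothetical key
    print(mask_ni("QQ 12 34 56 C"))
    print(pseudonymise("QQ123456C", key) == pseudonymise("qq123456c ", key))
```

Because the keyed hash is deterministic, two schemes’ records can still be compared on the pseudonymised value without the raw identifier being exposed during processing.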
Every pensions provider I speak to wants to know how AI can speed up their operations. Most are hoping the answer is “Buy 300 Copilot licences and let the magic happen.” That has never been the way to deliver results. AI performs best when it is aimed at a specific, high-stakes problem, not spread thinly over an organisation. Fund mergers are full of exactly these problems. Entity matching is where to start.




