Welcome to the Health Data Wild West
Your digital data isn’t dusty though. Concerned?
The global healthcare scene is currently getting a massive digital glow-up. We are swapping dusty, analogue filing cabinets for a super-connected digital universe of health data. This shift is brilliant for clinical research, hospital efficiency, and training snazzy new medical AI. But there’s a catch: it might also expose our most private, sensitive information to the highly lucrative and sometimes dodgy global data brokerage market.
In the rush to harvest ‘Real-World Evidence’ (RWE), essentially the data exhaust left behind by electronic health records (EHRs) and medical devices, our fundamental right to privacy is often treated like a pesky roadblock. Instead of baking privacy into the very core of these systems, the industry often relies on technical loopholes and legal gymnastics to get what it wants.
This report takes a globe-trotting look at what happens when health data commodification collides with somewhat rubbish cryptographic safeguards. We’ll kick things off with the jaw-dropping 2015 South Korean prescription data crisis involving IMS Health (the company that later became IQVIA), which proves just how fragile ‘anonymised’ data really is. Then, we’ll dive into the actual maths of data masking to see why simply scrambling a fixed-length identifier, say that of a 10-digit UK NHS number, isn’t the silver bullet everyone hopes it is.
Finally, we’ll bring it all home to the UK’s shiny new multi-million-pound NHS Federated Data Platform (FDP). With the controversial spy-tech firm Palantir crunching the numbers and our old friends at IQVIA acting as the privacy guards, we’ll ask the million-pound question: are our medical secrets actually safe from modern hackers and AI?
The Big Business of Real-World Evidence
To understand why your medical data is under siege, you have to look at the money. Traditional randomised clinical trials are the gold standard for testing new medicines, but they are incredibly rigid, slow, and frightfully expensive. As a result, health authorities and Big Pharma are increasingly turning to Real-World Evidence (RWE) to figure out if drugs actually work in the wild.
The sheer volume of health data out there is exploding. Thanks to EHRs, the mountain of person-level medical information is predicted to grow by a massive 36% every year through 2025. This isn’t just basic stuff; we’re talking about GP notes, specialist hospital records, medication dosages, and even unstructured data like X-rays and doctors’ clinical narratives.
For multinational pharma giants and data brokers, this is absolute gold dust. They use it to aggressively target marketing campaigns, find highly specific patients for clinical trials, and train advanced AI algorithms.
But here’s the massive spanner in the works: privacy laws. Regulations like the GDPR are built on the idea of ‘data minimisation’, meaning you should only collect the absolute bare minimum of data required, and you need explicit consent to use it. Big data analytics, however, is a hungry beast; it relies on ‘data maximisation’ to spot hidden trends.
To dodge these strict rules, the data brokerage industry relies heavily on a neat little trick called “pseudonymisation” or tokenisation. By scrambling the data just enough to legally classify it as “non-personal,” they can bypass all those annoying consent rules and trade your health history like a commodity.
| The Rulebook | Where It Applies | The General Vibe | How They Treat Scrambled Data |
| --- | --- | --- | --- |
| GDPR | EU & United Kingdom | Very strict. Privacy is a fundamental human right. Keep data to a minimum. | Pseudonymised data is still “Personal Data”. If there is any risk of re-identification, the strict rules still apply. |
| HIPAA | United States | Sector-specific. Mostly cares about how health entities share data. | If you strip out 18 specific identifiers (or get an expert to say it’s safe), it’s no longer protected health info. Have at it! |
| PIPA | South Korea | Incredibly strict, consent-driven, and packs a massive punitive punch. | Data is personal if it can identify you directly or if it can be easily combined with other info to find you. Messing this up means huge fines. |

*Table 1: How the big privacy rulebooks treat “scrambled” health data.*
As Table 1 shows, what counts as “anonymous” depends entirely on where you are standing. Data brokers love to exploit these legal grey areas. But as we’re about to see, the maths behind these anonymisation tricks is often terrifyingly weak.
The 2015 South Korean Data Fiasco
If you want to see the catastrophic failure of this model in action, look no further than South Korea in 2015. It remains one of the most spectacular medical data breaches in modern history.
The Great Data Heist
Back in July 2015, following a massive string of police raids, South Korean authorities indicted two dozen people and companies. The star of the show? The president of the South Korean branch of IMS Health, a colossal global data broker. (IMS Health later merged to become IQVIA, which is a major player in today’s NHS plans… but more on that later).
The authorities uncovered the illegal harvesting and sale of a staggering 4.7 billion medical and prescription records, affecting around 44 million South Korean citizens. To put that into perspective, they essentially vacuumed up the highly sensitive medical histories of nearly 90% of the entire country.
The wildest part? This wasn’t some sophisticated cyber-attack by foreign spies. IMS Health Korea bought this treasure trove from domestic middlemen who had quietly hidden data-harvesting code inside everyday pharmacy and hospital software. The Korea Pharmaceutical Information Center (KPIC) slipped this code into PM 2000, a programme used by 10,800 pharmacies. Other software firms did the same in 7,500 hospitals. Between 2008 and 2014, these programmes silently scooped up names, birth dates, full Resident Registration Numbers (RRNs, the South Korean equivalent of a National Insurance number), diagnoses, and precise drug dosages, entirely without the patients’ consent.
Flipping Data for Profit
The economics of this breach are staggering. IMS Health Korea bought these unconsented records for absolute peanuts. They paid the hospital software vendors a paltry 330 million won (about £180,000 at the time) for 430 million records, and gave KPIC 1.93 billion won for 4.3 billion pharmacy records.
They shipped all this raw data over to servers in the United States, cleaned it up, turned it into slick commercial databases, and sold it straight back to pharmaceutical companies in South Korea. The pharma companies used these insights to aggressively market their drugs to specific clinics, netting IMS Health an estimated 7 billion won in profit.
The Encryption Illusion
When caught, IMS Health Korea’s main defence was essentially: “We didn’t do anything wrong; the data was encrypted!” They argued that because the data was scrambled, it wasn’t technically “Personal Information” under South Korean law.
The prosecutors, however, quickly tore this argument to shreds. The encryption used to hide the 13-digit Resident Registration Numbers was laughably weak. Under South Korea’s PIPA laws, data is still personal if it can be “easily combined” with other data to figure out who someone is. Because the encryption was so basic, and the encryption keys were managed so poorly, anyone with half a brain could cross-reference the scrambled IDs with the clinic locations and appointment dates to re-identify the patients.
This laid bare the ultimate flaw in the data broker business model: legally pretending highly sensitive medical records are “anonymous” commodities by slapping a cheap digital padlock on them.
The Global Legal Battlefield
After the dust settled in South Korea, the global legal landscape for health data fractured.
South Korea Gets Tough
The legal fallout for IMS Health was a messy web of civil and criminal cases. In the civil suit, patients and doctors demanded 5.4 billion won in compensation. In 2017, the court delivered a rather frustrating verdict: they fully admitted that KPIC and IMS Health had broken privacy laws and used rubbish encryption, but they refused to award damages because the data hadn’t been leaked to the public internet or used for direct identity theft.
The criminal side was just as convoluted. Despite the civil court admitting the breach happened, IMS Korea executives were actually acquitted in 2020, a decision upheld on appeal in 2021. Prosecutors, furious that millions of records were sold without consent, have dragged the case all the way to the South Korean Supreme Court.
Regardless of the final verdicts, South Korea didn’t take this lying down. They have since transformed PIPA into one of the most terrifyingly strict privacy laws on the planet. By 2026, the authorities will be able to dish out fines of up to 10% of a company’s total global revenue for major data breaches. Crucially, the law now holds the CEO personally responsible, piercing the corporate veil to make sure executives actually care about your data.
The US: Privacy vs. "Free Speech"
Across the pond in the United States, things are vastly different. When the state of Vermont tried to ban pharmacies from selling doctors’ prescribing data to data miners without consent, IMS Health (yes, them again) sued the state.
In a landmark 2011 ruling (Sorrell v. IMS Health Inc.), the US Supreme Court sided with the data brokers. They ruled that selling medical data for targeted pharmaceutical marketing is a form of “commercial free speech” protected by the First Amendment. This basically torpedoed any state-level attempts to stop non-consensual medical data harvesting in America.
Europe Cracks Down on Monopolies
In Europe, thanks to the GDPR, the battle has shifted towards anti-monopoly laws. Because companies like IQVIA hoard so much data, regulators are getting nervous. In December 2025, the Belgian Competition Authority launched an investigation into whether IQVIA was abusing its massive market dominance in the pharmaceutical data sector. The US Federal Trade Commission also stepped in to block an IQVIA acquisition in 2023, worried they had too much power over both identity data and medical prescribing data. Gathering all this health data doesn’t just crush privacy; it creates terrifying corporate monopolies.
The Maths of Re-identification (and Why Hashing is Often Rubbish)
To really understand why the UK’s current plans are so fiercely debated, we need to geek out for a minute and look at the maths. The health tech industry loves to confuse anonymisation (which is permanent and irreversible) with pseudonymisation (which is basically just wearing a fake moustache and can easily be reversed).
The Trouble with 10-Digit Numbers
A favourite trick of the NHS and researchers is to ‘hash’ your NHS number. A hashing algorithm (like SHA-256) is a one-way mathematical meat grinder; you put your NHS number in, and a fixed string of gibberish comes out.
But here’s the problem: entropy. Entropy is the measure of unpredictability. A UK NHS number is exactly 10 digits long. That means there are only 10 billion possible combinations (10¹⁰). To you and me, 10 billion sounds massive. To a modern Graphics Processing Unit (GPU), it’s an absolute doddle. A standard gaming GPU can churn through billions of hashes a second.
If a hacker steals a database of hashed NHS numbers, they don’t bother trying to “reverse” the maths. They simply precompute a lookup table (often loosely called a ‘rainbow table’): hash every single number from 0000000000 to 9999999999, and match the results against the stolen database. Boom. Millions of records completely de-anonymised in seconds. This is exactly why the 13-digit Resident Registration Numbers failed so miserably in South Korea.
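To make that concrete, here’s a minimal Python sketch of the attack. The identifiers are made up, and a real cracker would hammer through the whole space on a GPU rather than a Python for-loop, but the logic is exactly this simple:

```python
import hashlib

def hash_id(number: str) -> str:
    # Unsalted SHA-256 of a 10-digit identifier: the weak scheme described above.
    return hashlib.sha256(number.encode()).hexdigest()

# A "stolen database" of hashed identifiers (toy values, not real NHS numbers).
stolen_hashes = {hash_id("4505577104"), hash_id("9434765919")}

# Walk the identifier space and match candidates against the stolen hashes.
# We only scan a tiny slice here; a single gaming GPU covers all 10^10
# candidates in a matter of hours.
for n in range(4_505_570_000, 4_505_580_000):
    candidate = f"{n:010d}"
    if hash_id(candidate) in stolen_hashes:
        print("Re-identified:", candidate)
```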
To fix this, you have to add “salt”, a long, secret string of random characters mixed in with the NHS number before hashing it. But if a lazy admin leaks the salt, or uses the same salt across different hospital databases, the whole house of cards collapses.
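One standard way to implement the “secret salt” idea is keyed hashing with HMAC. Here’s a minimal sketch (purely illustrative; this is not the NHS’s actual scheme):

```python
import hashlib
import hmac
import secrets

# The secret "salt" (strictly speaking, a key) must be long, random, and stored
# well away from the data. If it leaks, or is reused across unrelated databases,
# the brute-force attack above works exactly as before.
SECRET_KEY = secrets.token_bytes(32)

def pseudonymise(nhs_number: str) -> str:
    # Keyed hashing (HMAC-SHA256): without SECRET_KEY, an attacker cannot
    # precompute a lookup table over the 10-digit space.
    return hmac.new(SECRET_KEY, nhs_number.encode(), hashlib.sha256).hexdigest()

print(pseudonymise("4505577104"))  # a stable pseudonym, while the key stays secret
```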
The Danger of "Quasi-Identifiers"
Even if you perfectly hash the NHS number, you aren’t safe. Hackers use “linkage attacks” to zero in on you using “quasi-identifiers”, things like your gender, your date of birth, and the first half of your postcode.
Academic studies are brilliantly clear on this: knowing just those three basic demographic details can uniquely identify between 63% and 87% of the entire population when cross-referenced with public data like voter registries. In massive national databases, determined attackers can easily re-identify 10% to 20% of supposedly “anonymous” patients.
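The mechanics of a linkage attack are embarrassingly simple. This toy Python sketch (with made-up records) shows how an attacker finds the unique quasi-identifier combinations to target:

```python
from collections import Counter

# Toy "anonymised" records: no names, just the three quasi-identifiers.
records = [
    ("F", "1959-03-07", "SW1A"),
    ("M", "1984-11-21", "EH1"),
    ("M", "1984-11-21", "EH1"),
    ("F", "2001-06-30", "LS6"),
]

# Count how many records share each (gender, DOB, postcode-prefix) combination.
groups = Counter(records)

# Any combination that appears exactly once is unique: cross-reference it with
# a public dataset (say, the open electoral register) and the person is unmasked.
unique = [qi for qi, count in groups.items() if count == 1]
print(f"{len(unique)} of {len(records)} records are uniquely identifiable")
```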
| The Hacker Persona | What They Want | How They Do It |
| --- | --- | --- |
| The Prosecutor | To find one specific person they already know is in the database. | Easy. They use known facts (like a specific hospital visit date) to fish out the target’s entire medical history. |
| The Journalist | To re-identify anyone just to prove the system is broken and write a cracking headline. | Very Easy. They look for the most unique outliers (e.g., a 99-year-old in a tiny village) to prove the privacy is a sham. |
| The Marketer | To unmask millions of people at once to build commercial profiles. | Doable. Requires heavy-duty cross-referencing with consumer credit files, but highly lucrative. |
AI Changes the Game Completely
What counts as “anonymous” is a moving target. Data that is perfectly safe today might be incredibly easy to crack tomorrow thanks to artificial intelligence.
Take an MRI scan of a head, for example. You strip out the name and date of birth, so it’s safe, right? Wrong. Modern AI facial reconstruction algorithms can take the bone structure from an anonymous MRI and generate a highly accurate 3D model of the patient’s face, matching it to their Facebook profile.
Or consider “Membership Inference Attacks”. Hackers can poke at a medical AI model and mathematically deduce if a specific person’s data was used to train it. If the AI was trained exclusively on HIV patients, and the hacker proves your data was in the training set, they’ve just outed your health status without ever breaking an encryption key.
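In its crudest form, a membership inference attack simply exploits model over-confidence. Here’s a hedged sketch assuming query access to a scikit-learn-style classifier (real attacks use calibrated “shadow models” rather than a fixed threshold, which is illustrative here):

```python
def likely_in_training_set(model, record, threshold: float = 0.95) -> bool:
    # Trained models tend to be over-confident on examples they memorised
    # during training. If the model is suspiciously sure about this record,
    # it has probably seen it before.
    confidence = max(model.predict_proba([record])[0])
    return confidence > threshold

# Hypothetical usage with any scikit-learn-style classifier:
#   if likely_in_training_set(hiv_model, patient_record):
#       print("This patient's data was probably in the training set")
```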
Enter the NHS Federated Data Platform (FDP)
With all those terrifying risks in mind, let’s look at the UK’s boldest health data project: the NHS Federated Data Platform (FDP).
Why Does the NHS Need It?
Frankly, NHS IT systems are historically a bit of a mess. Different hospitals and GPs use thousands of different legacy systems that completely refuse to talk to one another. This makes managing things like massive elective surgery waiting lists, bed capacity, and ambulance handovers an absolute nightmare.
To drag the NHS into the digital age, the Government commissioned the FDP, awarding a massive contract, worth £330 million over seven years (though implementation costs push it closer to £480 million), to a consortium led by Palantir Technologies.
How Does It Work?
Rather than building one giant, terrifying mega-database of everyone in the country (which the British public would absolutely riot over), the FDP is “federated.” Every Hospital Trust gets its own separate software instance, and they remain the absolute boss (Data Controller) of their own data.
However, these local platforms talk to each other to coordinate care, and they feed specific, scrambled data up to a National Instance (the NDIT) so NHS England can plan for the big picture. To keep things safe, the NDIT strictly separates identifiable data from anonymous data, and access is tightly restricted.
| The Shiny New Tool | What It Does | How It Handles Data |
| --- | --- | --- |
| A&E Forecasting | Predicts when A&E is going to be swamped. | Crunches heavily pseudonymised historical data. |
| Ambulance Dashboard | Stops ambulances queuing outside hospitals. | Uses live dispatch data to speed up handovers. |
| OPTICA Acute | Helps discharge patients faster. | Uses directly identifiable data (because doctors need to know exactly who they are treating!). |
The Palantir Drama: Spies, Trust, and Dodgy Emails
Despite the brilliant operational benefits, the FDP has caused an absolute uproar. And it’s almost entirely because of Palantir.
Spies and Surveillance
Palantir isn’t your average tech company. Backed early on by the CIA’s venture capital arm, they made their name building software for the military, the NSA, and US immigration enforcement (ICE). Putting a company with deep ties to global surveillance in charge of the NHS’s central nervous system has understandably made privacy groups like Foxglove extremely twitchy.
The NHS.net Email Fiasco
The rollout of the FDP hasn’t exactly been a masterclass in building public trust. Early contracts were published with hundreds of pages blacked out, and only saw the light of day after legal threats.
Then came the email scandal. It turned out that Palantir engineers working on the FDP were handed active @nhs.net email accounts. This seemingly innocent IT move granted private contractors access to the internal NHSmail directory, exposing the contact details, job roles, and mobile numbers of up to 1.5 million health service staff! They even got access to internal Microsoft Teams groups. While Palantir shrugged it off as “normal practice for government suppliers,” medical professionals were furious that they hadn’t consented to sharing their details with a military-adjacent tech firm.
When you combine this with the NHS’s blunt “National Data Opt-Out” where patients have to choose between sharing their data with everyone or sharing it with no one, it’s a recipe for a massive public trust collapse.
Eye-Watering Costs and the US CLOUD Act
Palantir’s global footprint is just as controversial as its origins. On the UK’s Digital Marketplace, a single Palantir Foundry licence costs a mind-bending £3,000,000. Throw in £66,000 for implementation and £21,500 per server core annually for support, and it is a wildly expensive bit of kit. In the US, they are swimming in federal cash, boasting a $90 million deal with Health and Human Services (HHS) and massive military contracts.
But other countries aren’t so keen. The Swiss military firmly rejected Palantir, noting it was too expensive and that relying on their engineers could compromise Switzerland’s independence during a crisis. In Germany, while some regional police use it, the federal government recently banned its national police force from adopting the system. And during the pandemic, the Greek government signed a controversial contract that included a sneaky “improvement clause”, which privacy groups warn could allow Palantir to quietly improve its own software by learning from Greece’s data.
But the biggest headache? Data sovereignty. Because Palantir is a US company, they are subject to the US CLOUD Act. This means the US government can legally demand access to data held by US cloud providers, even if that data belongs to UK citizens and is sitting on UK servers. No amount of local contract clauses can completely override that federal US law.
IQVIA: The Poacher Turned Gamekeeper
Knowing that Palantir was a PR nightmare waiting to happen, the NHS came up with a clever architectural trick: the “Separation of Control”. They hired an entirely separate company to build a Privacy Enhancing Technology (PET) layer. Palantir provides the analytics engine, but they don’t get to touch the privacy controls.
The IQVIA PET Middleware
This multi-million-pound PET contract was awarded to IQVIA. The PET sits like a digital bouncer between the raw NHS data and Palantir’s FDP. Before any data goes into Palantir’s system, it must pass through IQVIA’s platform, which automatically applies the strict privacy rules set by the local NHS bosses. Palantir is legally banned from commercialising the data or using it to train their own AI, and they only hold the data temporarily.
Some Very Clever Tech
IQVIA’s PET is far more sophisticated than the rubbish encryption used in South Korea. It uses:
- Dynamic De-identification: It strips out identifiers and salts the pseudonyms heavily. Crucially, the key to reverse it is held only by the NHS, not by Palantir or IQVIA.
- Differential Privacy: It injects calibrated mathematical “noise” into the data. This means you can get accurate big-picture statistics, but it’s mathematically impossible to prove if any specific individual’s data is in the set.
- Targeted Generalisation: It blurs the lines. Instead of your exact date of birth, it might just say you are “30-35 years old”, ruining the chances for hackers relying on linkage attacks. (Both of the last two ideas are sketched in code below.)
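To see the flavour of those last two techniques in code, here’s a minimal Python sketch. This is my own illustration of the general ideas, not IQVIA’s implementation, and the epsilon value and age band are arbitrary:

```python
import numpy as np

def dp_count(true_count: int, epsilon: float = 0.5) -> float:
    # Laplace mechanism: a count changes by at most 1 when one person is added
    # or removed (sensitivity 1), so Laplace(1/epsilon) noise is enough to make
    # any individual's presence statistically deniable.
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

def generalise_age(age: int, band: int = 5) -> str:
    # Targeted generalisation: swap an exact value for a coarse band,
    # starving linkage attacks of precise quasi-identifiers.
    low = (age // band) * band
    return f"{low}-{low + band}"

print(dp_count(12_450))    # a still-useful aggregate, e.g. ~12447.8
print(generalise_age(33))  # "30-35"
```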
The Ultimate Paradox
While the tech is incredibly impressive, the irony is impossible to ignore. IQVIA, the company entrusted to act as the ultimate privacy shield for the NHS, is the exact same multinational data broker that was at the absolute centre of the 2015 South Korean data-harvesting disaster!
While IQVIA is bound by incredibly strict contracts within the NHS, the fact that one of the world’s largest commercial consumers of health data is also the world’s premier vendor of privacy technology perfectly sums up the bizarre, deeply conflicted reality of the modern health data economy.
The Bottom Line
Let’s be clear: we absolutely need data-driven healthcare. If we want to clear NHS waiting lists, predict A&E surges, and invent new medicines, our medical records need to talk to each other.
But the 2015 crisis in South Korea, and the wariness Palantir attracts around the globe, are a stark warning. Simply scrambling data and hoping for the best is a recipe for disaster; the illusion of anonymity is easily shattered by modern computers.
The UK’s approach with the NHS Federated Data Platform, deliberately using Palantir and IQVIA as checks and balances on each other, is a massive step up in security. The differential privacy and dynamic masking technologies are light years ahead of old-school hashing.
However, as AI continues to evolve and public data becomes more widely available, true anonymity will always be perishable. The success of the NHS FDP won’t just depend on clever algorithms; it requires iron-clad legal boundaries, absolute transparency with a highly sceptical public, and a fundamental refusal to treat our medical privacy as an obstacle to be engineered away.