Nineteen endoscopists, each with more than 2,000 colonoscopies behind them, got measurably worse at their core job after a few months of working alongside AI. Their adenoma detection rate on unassisted procedures fell from 28.4% to 22.4%, a six-point absolute drop and roughly 21% in relative terms, documented across 1,443 colonoscopies in a multicentre study in The Lancet Gastroenterology & Hepatology. These were not trainees. They were experts whose hands forgot what the software started doing for them.
That study is the first hard measurement of clinical AI deskilling. It will not be the last, and pharmacy is closer to the edge than gastroenterology.
The problem most health systems are getting wrong
Deskilling gets filed under “clinician sentiment.” Something to acknowledge in a town hall, soothe with a training module, and move past. The Wolters Kluwer Future Ready Healthcare survey, published June 2, 2026, found that 74% of clinicians name loss of critical thinking and decision-making skills as one of the greatest risks of AI adoption. The standard reading of that number is that clinicians are anxious and need reassurance.
That reading is wrong. The clinicians are not anxious. They are correct, and now there is data proving it.
Here is what makes the gap dangerous. In that same survey, 73% of doctors and 70% of nurses reported using AI at least weekly, up from 38% and 46% a year earlier. Daily heavy use roughly tripled. Adoption is no longer the question. Yet only 27% of clinicians know how their organization addresses AI governance, just 35% know whether guidelines exist for checking AI accuracy, and only 22% say their employer has any policy defining who is responsible when the clinician and the AI product disagree.
Read those two facts together. Use is near-universal. Accountability is a rounding error. Health systems are deploying tools that demonstrably erode skill, at scale, with no policy describing what the clinician is still on the hook for. That is not a sentiment problem. That is an unmanaged operational liability.
The scale is the part that should worry operators most. Fierce Healthcare reported in 2026 that 75% of US health systems use or plan to use an AI platform, with clinical note-taking at 68% adoption and AI-driven clinical documentation improvement at 43%. Half of surveyed systems run three or more AI applications. This is not a pilot anymore, and pilots were never the danger. The danger is the second and third year of production use, well beyond the few months the colonoscopy study measured, when the tool has been quietly carrying the cognitive load long enough that the underlying skill has thinned without anyone noticing.
The insight: deskilling lands in the pharmacy first
Three conditions produce skill atrophy: high task volume, heavy repetition, and an AI that quietly does the cognitive work the human used to do. Colonoscopy has all three, which is why the effect showed up there first. Pharmacy verification has them in higher concentration.
A hospital pharmacist clears hundreds of orders a shift. The work is repetitive by design. AI now pre-screens interactions, flags dosing outliers, and ranks alerts before a human ever looks. When the model is good, the pharmacist’s job collapses into confirming what the software already decided. That is the exact mechanism that pulled six points off those endoscopists’ detection rate.
Picture the failure mode concretely. A renal dosing adjustment the AI is not trained to catch slides through on a busy night. The pharmacist who has spent eighteen months approving the model’s reads no longer runs the calculation reflexively, because the software always ran it first. The error reaches the patient. In the post-incident review, the tool gets blamed, but the tool did exactly what it was built to do. The skill that should have caught it had been eroding on every uneventful shift for a year and a half. That is not negligence. It is the predictable output of a workflow nobody designed to keep the human sharp.
Telepharmacy concentrates the risk further. Centralized and remote verification models exist to push order volume through fewer pharmacists across more sites, which is the whole economic case for the model. Layer AI pre-screening on top and you have built the densest possible version of the three deskilling conditions: maximum volume, maximum repetition, maximum reliance on the software to triage what the remote pharmacist sees. A pharmacist covering eight rural hospitals at 2 a.m. through an AI-ranked queue has almost no path back to the unassisted skill, because the unassisted version of that job was never physically possible. That is not an argument against remote coverage. It is an argument that remote operations need the deskilling controls first, not last.
The design of the AI decides how fast the erosion happens. A randomized controlled trial in JMIR Medical Informatics tested pharmacists on a mock verification task and found that black-box AI produced shorter dwell times and overreliance: pharmacists approved misfilled medications because the model implied they were fine. Uncertainty-aware AI, the kind that shows its confidence and flags what it does not know, acted as a safeguard and caught bad recommendations, at the cost of some unnecessary double-checks. Same task, opposite safety profiles, and the only variable was whether the tool admitted doubt.
Now connect that to the buying market. The loudest vendors are selling autonomy: “hands-off” verification, “lights-out” workflows, the smallest possible human footprint. They are selling the precise conditions that manufacture deskilling, and they are pricing it as savings. Most AI ROI models I see count the labor hours removed and never price the competence lost. You cannot find “verification accuracy on the day the AI is down” on a single one of those spreadsheets. It belongs there, because that is the day the deskilling bill comes due.
More than half of the systems that managed to quantify AI ROI reported at least 2x returns, per the 2026 Fierce Healthcare reporting. I do not doubt the number. I doubt the completeness of the ledger behind it. A 2x return that books every removed labor hour as pure savings, while carrying zero liability for skill erosion, downtime exposure, or the cost of retraining a workforce that has forgotten how to operate manually, is not a return. It is a deferred cost wearing a return’s clothing. The honest version of that model has a line for resilience, and resilience is exactly what deskilling spends.
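To make the ledger point concrete, here is a minimal sketch of an ROI multiple with and without a resilience line. Everything in it is an assumption for illustration: the function name `roi_multiple`, the definition of the multiple as net savings divided by AI cost, and the dollar figures, which are invented placeholders and not drawn from the Fierce Healthcare reporting or any other source.

```python
def roi_multiple(labor_savings: float, ai_cost: float,
                 resilience_costs: float = 0.0) -> float:
    """Return net savings divided by AI cost.

    Hypothetical definition: resilience_costs is the sum of whatever the
    organization prices for skill erosion, downtime exposure, and retraining.
    """
    if ai_cost <= 0:
        raise ValueError("ai_cost must be positive")
    return (labor_savings - resilience_costs) / ai_cost


# Placeholder inputs, invented purely for illustration.
LABOR_SAVINGS = 2_000_000
AI_COST = 1_000_000
RESILIENCE_LINES = {
    "skill_erosion": 250_000,
    "downtime_exposure": 150_000,
    "manual_retraining": 200_000,
}

headline = roi_multiple(LABOR_SAVINGS, AI_COST)
honest = roi_multiple(LABOR_SAVINGS, AI_COST,
                      resilience_costs=sum(RESILIENCE_LINES.values()))
print(f"headline multiple: {headline:.2f}x, with resilience line: {honest:.2f}x")
```

The arithmetic is trivial on purpose. The hard part is agreeing on what goes into the resilience lines, and a model that leaves them at zero is making a claim, not an omission.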
“The colonoscopy study is not a warning about the future. It is a measurement of the present, and your verification queue is next.”
What to actually do about it
The fix is not slowing adoption. It is governing it, with controls aimed specifically at preserving skill. The Coalition for Health AI released eight governance playbooks on May 27, 2026, built with input from more than 150 clinicians, covering organizational AI policy, risk and impact assessments, lifecycle management, and third-party oversight. They are open-source and they are a strong floor. They are also silent on deskilling as a named risk. Treat that as the gap you close yourself.
Use this decision matrix to classify any clinical AI tool before it touches a live workflow.
| AI design / use pattern | Deskilling risk | Governance control to require |
|---|---|---|
| Black-box auto-approval, no confidence shown | High | Mandatory uncertainty display; periodic unassisted audits |
| AI pre-ranks, human confirms every case | High | Random AI-off shifts; track unassisted accuracy as a metric |
| AI flags, human independently reviews first | Moderate | Sequence review so clinician forms a judgment before seeing AI |
| AI as second check after human decision | Low | Standard; preserves the primary skill |
| Decision-support only, no automated action | Low | Document override authority and clinician accountability |
The pattern in the right column is one rule: the human has to keep doing the cognitive work, not inherit the AI’s answer. Sequence matters more than almost anything else. A pharmacist who reads the order and forms a judgment before the AI’s recommendation appears stays sharp. One who sees the AI’s verdict first is being trained, every shift, to defer.
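For teams that want the matrix in an intake checklist or procurement script, it can be encoded as a plain lookup. The sketch below is one possible encoding, not a standard: the names `Pattern`, `MATRIX`, `classify`, and `requires_control_before_go_live` are hypothetical, and the rule that High and Moderate tools must have their control in place before go-live is an assumption layered on top of the table.

```python
from enum import Enum


class Pattern(Enum):
    """The five design / use patterns from the decision matrix."""
    BLACK_BOX_AUTO_APPROVAL = "Black-box auto-approval, no confidence shown"
    AI_PRERANKS_HUMAN_CONFIRMS = "AI pre-ranks, human confirms every case"
    AI_FLAGS_HUMAN_REVIEWS_FIRST = "AI flags, human independently reviews first"
    AI_SECOND_CHECK = "AI as second check after human decision"
    DECISION_SUPPORT_ONLY = "Decision-support only, no automated action"


# (deskilling risk, governance control to require), copied from the table.
MATRIX = {
    Pattern.BLACK_BOX_AUTO_APPROVAL: (
        "High", "Mandatory uncertainty display; periodic unassisted audits"),
    Pattern.AI_PRERANKS_HUMAN_CONFIRMS: (
        "High", "Random AI-off shifts; track unassisted accuracy as a metric"),
    Pattern.AI_FLAGS_HUMAN_REVIEWS_FIRST: (
        "Moderate", "Sequence review so clinician forms a judgment before seeing AI"),
    Pattern.AI_SECOND_CHECK: (
        "Low", "Standard; preserves the primary skill"),
    Pattern.DECISION_SUPPORT_ONLY: (
        "Low", "Document override authority and clinician accountability"),
}


def classify(pattern: Pattern) -> dict:
    """Look up the risk level and required control for a tool's pattern."""
    risk, control = MATRIX[pattern]
    return {"pattern": pattern.value, "risk": risk, "required_control": control}


def requires_control_before_go_live(pattern: Pattern) -> bool:
    """Assumed gate: High and Moderate tools wait until the control exists."""
    return MATRIX[pattern][0] in {"High", "Moderate"}


result = classify(Pattern.AI_PRERANKS_HUMAN_CONFIRMS)
print(result["risk"], "->", result["required_control"])
```

Classifying the tool is still a human judgment. The lookup only guarantees that once a pattern is named, the same control is demanded every time.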
The AI-off audit is the control most systems do not have and the cheapest one to build. It is not complicated. On a set cadence (quarterly is defensible), route a small randomized sample of cases through a clinician with the AI suppressed, then compare their unassisted accuracy against both the AI-assisted baseline and the prior audit. A stable number means the skill is holding. A declining one is your early-warning system firing before a patient finds the gap for you. The endoscopists in The Lancet had no such audit, which is precisely why their six-point drop went unnoticed until researchers went looking. The instrument that would have caught it costs a few hours of pharmacist time per quarter.
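The audit loop described above can be sketched in a few lines. This is an illustration under stated assumptions, not a validated protocol: the function names, the `AuditResult` fields, and the two-point `tolerance` for calling a decline are all hypothetical, and how a case gets scored as correct is left to whatever review process the organization already trusts.

```python
import random
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass
class AuditResult:
    unassisted_accuracy: float
    vs_assisted_baseline: float      # unassisted minus AI-assisted accuracy
    vs_prior_audit: Optional[float]  # unassisted minus last audit, if any
    declining: bool


def sample_cases(case_ids: Iterable[str], sample_size: int,
                 seed: Optional[int] = None) -> list:
    """Draw a random sample of cases to route through review with AI suppressed."""
    pool = list(case_ids)
    rng = random.Random(seed)
    return rng.sample(pool, min(sample_size, len(pool)))


def evaluate_audit(unassisted_correct: Sequence[bool],
                   assisted_baseline: float,
                   prior_audit: Optional[float] = None,
                   tolerance: float = 0.02) -> AuditResult:
    """Compare unassisted accuracy with the assisted baseline and prior audit.

    tolerance is a hypothetical threshold: a drop larger than this versus
    the prior audit is flagged as declining.
    """
    if not unassisted_correct:
        raise ValueError("audit needs at least one scored case")
    acc = sum(unassisted_correct) / len(unassisted_correct)
    vs_prior = None if prior_audit is None else acc - prior_audit
    declining = vs_prior is not None and vs_prior < -tolerance
    return AuditResult(acc, acc - assisted_baseline, vs_prior, declining)


# Illustrative run with made-up outcomes: 18 of 20 sampled cases correct.
audited = sample_cases((f"case-{i}" for i in range(500)), sample_size=20, seed=7)
outcome = evaluate_audit([True] * 18 + [False] * 2,
                         assisted_baseline=0.97, prior_audit=0.95)
print(len(audited), outcome.declining)
```

A sample this small is noisy, so a single flagged quarter is better read as a prompt to look closer than as proof of decline. The trend across audits carries the signal.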
Executive takeaway
- Run an unassisted-skill baseline this quarter. For your highest-volume AI-assisted workflow, pull a sample of clinicians and measure their accuracy with the AI turned off. You cannot govern deskilling you have never measured, and the Lancet endoscopists looked fine until someone checked.
- Reject black-box auto-approval in your next AI contract. Require vendors to show model confidence and to support a clinician-first review sequence. If a tool only works by replacing the human’s judgment rather than checking it, that is a deskilling engine, not a safety tool.
- Write the policy only 22% of clinicians say their employer has. Define, in writing, who is accountable when the clinician and the AI disagree, and mandate periodic AI-off audits for any high-volume clinical workflow. Borrow the CHAI playbook structure and add the deskilling control it omits.
The endoscopists did not choose to lose their edge. The system let it happen because no one was measuring the right thing. Your clinicians are telling you the same risk is real in your building. Believe them, then go check the queue.