Founder of Detection Drama, covering AI-detection tools and academic-integrity policy
Published June 19, 2026
Schools keep buying AI detectors to win a war the largest study of student AI use just called unwinnable. That’s my read of the new Science paper the academic-integrity world is citing this month — and the researcher behind it says the quiet part out loud.
His verdict on the detector-versus-humanizer fight: “a never-ending battle.” If your classroom policy rests on a detection score, that line should stop you cold.
Here’s why it matters. Turnitin’s AI indicator runs inside thousands of institutions, and in February the company reported that 15% of essay submissions were now more than 80% AI-written, up from 3% when it launched detection in 2023. The instinct is to detect harder. The biggest dataset we have says that instinct is the wrong one.
Igor Chirikov, a senior researcher at UC Berkeley, surveyed more than 95,000 students at 20 research universities and published the results May 21 in Science. About two-thirds use generative AI; nearly 40% use it monthly or more. At least 9% of AI users admitted using it to cheat.
That cheating number is real, and I won’t pretend otherwise. It climbs with use — 26% of daily AI users said they had cheated, against 7% of monthly users. Anyone calling student AI misuse a moral panic hasn’t read the data.
What the data does not support is the detector. Chirikov’s own words: AI detection is “a cat-and-mouse game, because at the same time, there are AI humanizers… So it will be a never-ending battle.”
He is describing the exact arms race that schools are funding, and he is the one who counted the students.
Banning AI fares no better. Chirikov calls a blanket ban “not a productive solution,” because students keep using it regardless, a conclusion other universities reached the hard way. The study’s recommendation is unglamorous and expensive: redesign assessment so it can’t be faked, through oral exams, in-class writing, and work that shows reasoning rather than a polished output.
And the cost of getting this wrong isn’t only a wasted software budget. Detectors flag non-native English writers and neurodivergent students at higher rates, turning a shaky score into a disciplinary case. Every false positive is a real student defending work they actually wrote.
Chirikov put the real stake plainly: “If every student gets an excellent grade, it becomes harder to trust that credential.” That is the problem worth solving. A detector that loses to a $12 humanizer was never going to solve it.
