Anthropic got caught spying on its own users and called it an experiment

A developer reverse-engineering Anthropic’s Claude Code tool stumbled upon something not in any release notes, changelog, or terms of service: hidden logic, buried since April 2026, that silently fingerprinted users connecting through Chinese-linked proxies, encoding the results as invisible Unicode character swaps in every system prompt it sent. Anthropic, the company that has built its entire brand around being the “safety-first, trust-us-we’re-the-responsible-ones” AI lab, got caught running what amounts to a covert surveillance mechanism on its own users. They eventually rolled it back and called it an “experiment.” I have been in enough IT operations rooms to know that when someone calls an undisclosed surveillance mechanism an “experiment,” they really mean: we got caught.

There’s a networking trick I used to pull back in my days doing IT support and network troubleshooting. When I needed to understand what was actually happening on a network, not what the documentation said was happening (who was sending the most data, say, or eating the most bandwidth), I would run a packet capture. Just sit quietly on the wire, collect everything, and let the data tell the truth. The whole point was that nobody on the network knew I was doing it.

It is a legitimate technique when you are the network administrator doing your job. It is a very different thing when you are a software vendor doing it to your paying customers without telling them. Anthropic just learned that difference the hard way. Or rather, they got taught it by a developer with a hex editor and too much free time.

The story broke on June 30, 2026, when a Reddit user identified as LegitMichel777 posted detailed findings on r/ClaudeAI, claiming to have reverse-engineered Claude Code while attempting to restore a disabled remote control feature in version 2.1.196. During that process, he discovered obfuscated code that had been silently present since version 2.1.91, released on April 2, 2026, with no mention in the release notes, according to Cyber Security News.

What did the code actually do? Here is where it gets technically elegant in a way that should make every developer deeply uncomfortable. The client altered two things within the “Today’s date is…” line that Claude Code injects into every system prompt. The date separator flipped from the standard dash (2026-06-30) to a slash (2026/06/30) if the user’s system timezone was set to Asia/Shanghai or Asia/Urumqi. The apostrophe in “Today’s” switched between four visually identical Unicode characters depending on whether the proxy hostname matched known domain lists or AI-lab keyword lists. Together, the two changes encoded a three-bit fingerprint (one bit from the separator, two from the apostrophe), embedded in what appeared to be a timestamp and riding inside every affected system prompt.
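To make the mechanics concrete, here is a minimal Python sketch of how a three-bit fingerprint can hide inside a date line. It is an illustration under stated assumptions, not the code recovered from Claude Code: the four apostrophe code points, the bit order, the exact wording of the date line, and the function names are all my own hypothetical choices.

```python
import re

# ASSUMPTION: four apostrophe look-alikes, picked for illustration only.
# The list index is the two-bit value the character carries.
APOSTROPHES = [
    "\u0027",  # APOSTROPHE (the ordinary ASCII one)
    "\u2019",  # RIGHT SINGLE QUOTATION MARK
    "\u02BC",  # MODIFIER LETTER APOSTROPHE
    "\uA78C",  # LATIN SMALL LETTER SALTILLO
]

# ASSUMPTION: the date line looks like "Today's date is YYYY-MM-DD."
LINE_RE = re.compile(r"Today(.)s date is \d{4}([-/])\d{2}[-/]\d{2}")


def encode_date_line(date_iso, tz_flag, domain_flag, keyword_flag):
    """Hide three booleans in a date line that looks ordinary."""
    # Bit 1: the separator becomes a slash when the timezone check fires.
    sep = "/" if tz_flag else "-"
    # Bits 2 and 3: choose one of four apostrophes from two hostname checks.
    apostrophe = APOSTROPHES[(int(domain_flag) << 1) | int(keyword_flag)]
    return f"Today{apostrophe}s date is {date_iso.replace('-', sep)}."


def decode_date_line(line):
    """Recover the three flags, or None if the line does not fit the scheme."""
    match = LINE_RE.search(line)
    if match is None or match.group(1) not in APOSTROPHES:
        return None
    index = APOSTROPHES.index(match.group(1))
    return {
        "timezone": match.group(2) == "/",
        "domain_list": bool(index & 2),
        "keyword_list": bool(index & 1),
    }


if __name__ == "__main__":
    line = encode_date_line("2026-06-30", True, False, True)
    print(line)
    print([f"U+{ord(ch):04X}" for ch in line if ord(ch) > 127])
    print(decode_date_line(line))
```

Printing the line shows nothing unusual to the eye. Only listing the non-ASCII code points, as the second print does, reveals that a different character is sitting where the apostrophe should be.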

Let me translate that for the non-technical reader. Claude Code was hiding a secret message inside what looked like an ordinary date stamp. The kind of trick you would see in a spy novel. Except this was running on the machines of developers who trusted the tool with filesystem access, shell access, and in many cases, access to production codebases. As one GitHub commenter put it: “For a coding agent with filesystem and shell access, this is unacceptable without explicit disclosure and user consent.” That is the understatement of 2026.

To be fair, and I am going to be fair, even though I use Anthropic’s Claude in my own writing and should probably declare that as a conflict of interest, the context behind this decision is not trivial. In February 2026, Anthropic disclosed that three Chinese AI labs, DeepSeek, Moonshot AI, and MiniMax, had collectively run more than 16 million exchanges with Claude through approximately 24,000 fraudulent accounts, using the output to train competing models. In June 2026, Anthropic separately accused Alibaba-affiliated entities of orchestrating approximately 29 million exchanges through roughly 25,000 fraudulent accounts, all aimed at extracting Claude’s capabilities for use in competing AI systems.

So Anthropic was being robbed. Systematically. At scale. By state-affiliated actors running what the company described to the US Congress as a national security problem, not a terms-of-service dispute. I understand the impulse to do something about that. If someone were siphoning gigabytes of data from my network infrastructure, I would want to fingerprint the traffic, too. That is a legitimate operational security goal. But here is the thing they got wrong, and it is the same thing that gets IT administrators fired when they do it without authorisation.

As I see it, disclosed telemetry is something developers can evaluate, block, or consent to. A modification to invisible prompt characters is something developers cannot inspect without reverse-engineering the binary. The moment you decide to hide the mechanism, you have crossed from a security measure into deception. An Anthropic employee, when confronted, described the mechanism as an “experiment,” a framing that did not go down well in the developer community. You don’t get to call it that after the fact. Experiments have consent forms. This one had XOR obfuscation.

Anthropic acknowledged the code was present and said a new version would be released to remove it. That version, 2.1.197, was published early Wednesday, though the official changelog contained no mention of the steganographic code’s removal. They fixed it silently. The same way they shipped it.

I have managed change control processes in oil and gas IT environments where a single undocumented configuration change could trigger a compliance audit. The principle is simple: if you change something, you document what you changed and why. Removing surveillance code from your product without noting it in the changelog is not a fix. It is a continuation of the same communication problem that created the controversy in the first place.

Now, what does this mean for African developers? That is the angle I want my readers in Buea, Lagos, Nairobi, Yaoundé, Accra, Kigali and everywhere else in Africa to sit with for a moment. The entire framing of this story, Anthropic vs. Chinese AI labs, US national security, state-sponsored distillation attacks, is a geopolitical conflict between two superpowers. Africa is not a party to that conflict. But African developers using Claude Code are sitting in the middle of it, with no say and no disclosure.

An African developer building a fintech application in Lagos, routing their Claude Code requests through a VPN or proxy because African internet infrastructure sometimes requires creative routing to get reliable latency, could potentially have been flagged by this system. Not because they are stealing AI models. Because their network path looks unusual to a detection algorithm designed for a geopolitical conflict they have nothing to do with.

That is the collateral damage problem with covert fingerprinting. The algorithm does not know your intentions. It knows your timezone and your proxy URL. And if those look like the wrong answer, you get quietly tagged, and you will never know it happened. For a continent whose developers are already navigating the disadvantages of infrastructure gaps, bandwidth costs, and patchy cloud region coverage, the last thing needed is AI tools that treat non-Western network patterns as suspicious by default.

The incident is the latest in a pattern of undisclosed technical decisions that have put Anthropic, a company that markets itself as the safety-first alternative in AI, in the uncomfortable position of explaining why its actions diverged from its stated values. That framing is the one that matters most. Anthropic is not a random software company. It is the company that built its entire public identity around responsible AI, safety-first development, and trustworthy systems. It is the company that publishes Acceptable Use Policies, Constitutional AI frameworks, and detailed safety research, and then, quietly, ships a covert user fingerprinting mechanism in a CLI tool and calls it an experiment when caught.

Trust, in technology, is not a marketing position. It is an operational commitment. You either disclose what your software does on a user’s machine, or you do not. There is no middle ground called “responsible covert surveillance.” I say this as someone who uses Claude daily for refinements, recommends it to clients, and has built workflows around it. Precisely because I am that user, I expect better. The bar for an AI company claiming the safety-first crown is not “we eventually rolled it back.” It is “we would never have shipped it without telling you.” That bar, this week, was not cleared. So let me ask: do you still trust AI coding tools after this?
