The Machinery of "No"

What the standards bodies and infrastructure players are actually building

Part 1 - Too Small to Sue, Too Small to License, Ingested all the Same
Part 2 - The Answer Without the Author
Part 3 - The Machinery of "No"

The first two posts in this series described the problem: original work is ingested without consent or credit, and AI answers increasingly satisfy the reader in place of the source. The fair question is what is being built in response. The honest answer: more than you might think, and less than you would hope.

The fastest mover is infrastructure

In July 2025 Cloudflare, which sits in front of a large share of the web, flipped the default. New sites now block known AI crawlers unless the owner allows them. Its Pay Per Crawl marketplace lets a publisher allow, charge, or block each crawler individually; by Cloudflare's own April 2026 figures the network returns more than a billion "402 Payment Required" responses to AI crawlers every day. A companion Content Signals Policy adds machine-readable lines to robots.txt that separate three uses: search, ai-input, and ai-train. For the first time, "do not train on this" is a default a small publisher gets for free.
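To make the three signals concrete, here is a minimal Python sketch that reads them out of a robots.txt body. It assumes the `Content-Signal: search=yes, ai-train=no` line syntax shown in Cloudflare's published examples; the function name `parse_content_signals` and the sample file are illustrative, not part of any official tooling, and the sketch ignores which `User-agent` group a line belongs to.

```python
# Assumption: signals appear as "Content-Signal: key=yes|no, key=yes|no".
SIGNALS = ("search", "ai-input", "ai-train")


def parse_content_signals(robots_txt: str) -> dict[str, bool]:
    """Collect yes/no values from Content-Signal lines in a robots.txt body.

    A signal that is absent is left out of the result: absence means the
    publisher stated no preference, which is neither consent nor refusal.
    Simplification: User-agent grouping is ignored.
    """
    found: dict[str, bool] = {}
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "content-signal":
            continue
        for item in value.split(","):
            key, eq, answer = item.partition("=")
            key, answer = key.strip().lower(), answer.strip().lower()
            if eq and key in SIGNALS and answer in ("yes", "no"):
                found[key] = answer == "yes"
    return found


# Hypothetical robots.txt for a small publisher.
SAMPLE = """\
User-agent: *
Content-Signal: search=yes, ai-train=no
Allow: /
"""

signals = parse_content_signals(SAMPLE)
print(signals)  # {'search': True, 'ai-train': False}
```

Note that `ai-input` does not appear in the result: the sample file says nothing about it, and the sketch deliberately does not guess.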

A standard for licensing

Really Simple Licensing (RSL), from the people who built RSS, became an official standard in December 2025. It turns the old yes/no of robots.txt into machine-readable terms, including pay-per-crawl and pay-per-inference, and the RSL Collective exists to bundle the long tail (the small publishers who cannot negotiate alone) into something an AI company will actually transact with. More than 1,500 organisations have signed on, from the Associated Press to Yahoo, and there are now one-click plugins for common content management systems.

Preferences and provenance

Underneath, the IETF is standardising a shared vocabulary for AI-usage preferences through its AIPREF working group, so that a single "Content-Usage" signal can replace today's patchwork. It is real and it is moving, but it is not finished: the core drafts are still working toward consensus, with a standards milestone targeted for later this year. In parallel, C2PA "Content Credentials" attach cryptographic provenance to a file, now built into cameras, phones and major platforms, and the EU AI Act's machine-readable marking duties for AI-generated content take effect on 2 August 2026.
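As a rough illustration of what reading a single "Content-Usage" signal could look like, here is a Python sketch. Only the header name comes from the AIPREF work described above. The `train-ai` category name, the `y`/`n` values, and the comma-separated `key=value` layout are assumptions based on the drafts, which may change before publication, and `usage_preference` is a hypothetical helper.

```python
def usage_preference(headers: dict[str, str], category: str) -> str:
    """Return 'allow', 'deny', or 'unstated' for one usage category.

    Assumed header shape (draft, subject to change):
        Content-Usage: train-ai=n
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("content-usage", "")
    for item in raw.split(","):
        key, eq, value = item.partition("=")
        if eq and key.strip().lower() == category:
            flag = value.strip().lower()
            if flag == "y":
                return "allow"
            if flag == "n":
                return "deny"
    # No header, no matching category, or an unrecognised value.
    return "unstated"


# Hypothetical response headers from a publisher's server.
response_headers = {"Content-Type": "text/html", "Content-Usage": "train-ai=n"}

print(usage_preference(response_headers, "train-ai"))  # deny
print(usage_preference(response_headers, "search"))    # unstated
```

The three-way result is the point of the design: a missing preference is reported as unstated rather than silently treated as permission.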

Two honest caveats

So the machinery is arriving: a way to say no, a way to set terms, a way to mark provenance.

First, published is not the same as honoured. A signal only works if AI firms read and respect it, and adopting the signal is not the same as complying with it. Second, and sharper for researchers, NGOs and archives: none of this brings back the visit. A signal that blocks or prices a crawler is a defence. It is not discovery. Even on a fully licensed web, the answer engine still satisfies the reader without a click, and being attributed inside an AI answer is not the same as being read.

Necessary, not sufficient

Which is why, for mission-driven publishers, the standards are necessary but not sufficient. They are worth adopting: they are mostly free, and being legible to the emerging rights-and-provenance layer is quickly becoming table stakes. But they protect the boundary, not the relevance. Staying the source that AI systems reach for, name, and point people back to is a separate problem.

Next in this series

What the courts have actually settled, and what they have not.