Labelling AI-Generated Content: Turning a Code of Practice Into Code

On June 10, 2026, the European Commission published a Code of Practice on marking and labelling AI-generated content, part of a broader tech sovereignty package proposed the same month. A code of practice is softer than a regulation, but it signals direction clearly, and the direction is this: AI-generated content should carry a mark that says so. That sounds simple. As someone who builds pipelines, I can tell you it is one of those requirements that is easy to state and hard to implement well.

Marking is easy; durable marking is not

The naive version of "label AI content" is a visible watermark or a line of metadata. Both are trivial to add and trivial to remove. A screenshot strips a metadata tag. A crop removes a corner watermark. A re-encode can wipe an embedded signal. So the real engineering question is not "can I mark this?" but "will the mark survive the journey from generation to wherever the content actually lands?"
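One way to make the survival question measurable is a small harness that applies transformations to marked content and checks whether a detector still fires. The sketch below is a toy text analogue, not an image watermark: the zero-width mark, the embed and detect helpers, and the transform list are all assumptions made up for illustration.

```python
# Toy survival harness. ZW_MARK, embed, detect and TRANSFORMS are
# hypothetical names for illustration, not part of any standard.

ZW_MARK = "\u200b\u200c\u200b"  # invisible zero-width marker (assumed scheme)


def embed(text: str) -> str:
    """Append the invisible mark to the content."""
    return text + ZW_MARK


def detect(text: str) -> bool:
    """Report whether the complete mark is still present."""
    return ZW_MARK in text


# Each transform stands in for something content meets after generation.
TRANSFORMS = {
    "identity": lambda t: t,
    "uppercase": lambda t: t.upper(),
    "ascii_reencode": lambda t: t.encode("ascii", "ignore").decode("ascii"),
    "truncate_80pct": lambda t: t[: int(len(t) * 0.8)],
}


def survival_report(text: str) -> dict:
    """Mark the text, run every transform, and record whether the mark survived."""
    marked = embed(text)
    return {name: detect(fn(marked)) for name, fn in TRANSFORMS.items()}


report = survival_report("A generated caption for a product photo.")
survival_rate = sum(report.values()) / len(report)
print(report, survival_rate)
```

For this toy mark, the ASCII re-encode and the truncation both remove it, so only half of the transforms leave it detectable. For images the transform list would be compression, resizing, cropping, and format conversion, but the shape of the measurement stays the same.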

When I think about implementing labelling that is worth anything, the hard parts cluster here:

Robustness — does the mark survive compression, resizing, format conversion, and casual editing?
Provenance, not just a flag — a credible system records who or what generated content and when, ideally cryptographically signed, rather than a removable boolean.
Interoperability — a label is only useful if downstream platforms can read and trust it, which means leaning on shared standards rather than a bespoke scheme.
Adversarial resistance — anyone who wants to pass AI content off as human will actively try to strip the mark, so the threat model must assume motivated removal.
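To make the difference between a removable flag and a signed provenance record concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not a real labelling scheme: the record fields, the make_record and verify_record helpers, and the demo key are all hypothetical, and HMAC with a shared secret stands in for the asymmetric, certificate-backed signatures a production system (for example one following a shared standard such as C2PA) would more likely use.

```python
import hashlib
import hmac
import json

# Hypothetical demo key. A real deployment would use an asymmetric key pair
# held in a key management system, not a shared secret in source code.
SECRET_KEY = b"demo-key-not-for-production"


def _canonical(claim: dict) -> bytes:
    """Serialize the claim deterministically so signatures are reproducible."""
    return json.dumps(claim, sort_keys=True, separators=(",", ":")).encode("utf-8")


def make_record(content: bytes, generator: str, created_at: str) -> dict:
    """Build a provenance record: what generated the content, when, and its hash."""
    claim = {
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "generator": generator,
        "created_at": created_at,
    }
    signature = hmac.new(SECRET_KEY, _canonical(claim), hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}


def verify_record(content: bytes, record: dict) -> bool:
    """Check that the claim is untampered and still describes this content."""
    expected = hmac.new(SECRET_KEY, _canonical(record["claim"]), hashlib.sha256).hexdigest()
    signature_ok = hmac.compare_digest(expected, record["signature"])
    content_ok = record["claim"]["content_sha256"] == hashlib.sha256(content).hexdigest()
    return signature_ok and content_ok


image = b"fake image bytes"
record = make_record(image, "example-model-v1", "2026-06-10T00:00:00+00:00")

print(verify_record(image, record))         # True: content and claim match
print(verify_record(image + b"x", record))  # False: content no longer matches the hash
```

Note the trade-off this exposes: binding the record to an exact hash makes tampering evident, but any legitimate re-encode also breaks verification. That is the robustness problem from the first item, and it is why a signature alone does not settle the question.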

What a code of practice means for builders

Because this is a code of practice and not yet a hard mandate, I read it as the Commission setting expectations early and giving the industry room to converge before rules harden. That is the good case for engineers: there is time to adopt provenance standards inside the generation pipeline rather than bolting a label on at the edge.

If I were building or operating a generative system in scope, I would start treating provenance as a first-class output:

Sign generated artifacts at the point of creation, where I have the most context and the strongest guarantees.
Carry provenance metadata through every transformation stage instead of dropping it on the first re-encode.
Assume the mark will be attacked and measure how my labelling holds up under realistic tampering, not just clean test cases.
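The second item, carrying provenance through every stage, can be sketched as a history that travels with the artifact and gains one entry per transformation. This is a minimal sketch with hypothetical names (Artifact, generate, transform, chain_is_consistent); the entries are unsigned here to keep it short, whereas a real pipeline would sign each one at the point it is written.

```python
import hashlib
from dataclasses import dataclass, field


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


@dataclass
class Artifact:
    """Content plus the provenance history that travels with it."""
    data: bytes
    history: list = field(default_factory=list)


def generate(data: bytes, generator: str) -> Artifact:
    """Record provenance at creation, where the pipeline has the most context."""
    entry = {"step": "generate", "actor": generator,
             "input_sha256": None, "output_sha256": sha256_hex(data)}
    return Artifact(data, [entry])


def transform(artifact: Artifact, step: str, fn) -> Artifact:
    """Apply fn and append an entry instead of dropping the history."""
    out = fn(artifact.data)
    entry = {"step": step, "actor": "pipeline",
             "input_sha256": sha256_hex(artifact.data),
             "output_sha256": sha256_hex(out)}
    return Artifact(out, artifact.history + [entry])


def chain_is_consistent(artifact: Artifact) -> bool:
    """Each step must start from the previous output and end at the current data."""
    history = artifact.history
    if not history:
        return False
    for prev, cur in zip(history, history[1:]):
        if cur["input_sha256"] != prev["output_sha256"]:
            return False
    return history[-1]["output_sha256"] == sha256_hex(artifact.data)


original = generate(b"generated pixels", "example-model-v1")
resized = transform(original, "resize", lambda d: d[:8])       # stand-in for a resize
reencoded = transform(resized, "reencode", lambda d: d.upper())  # stand-in for a re-encode

print([e["step"] for e in reencoded.history])
print(chain_is_consistent(reencoded))

# Swapping the content while keeping the old history breaks the chain.
swapped = Artifact(b"something else", reencoded.history)
print(chain_is_consistent(swapped))
```

The design choice is that a transformation stage returns a new artifact with a longer history rather than bare bytes, so forgetting to propagate provenance becomes a type-level mistake instead of a silent omission.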

The systems that age well here are the ones where labelling is part of how content is produced, not a compliance sticker applied afterward.

The take

I like that the EU is pushing on provenance, because the underlying problem is real: a world where you cannot tell synthetic from authentic is a worse world to build trustworthy systems in. But I am clear-eyed that marking content is an arms race, not a checkbox. A removable label gives the illusion of transparency while delivering very little. If labelling is going to mean anything, it has to be durable, signed, and interoperable, and that is the part that lands on engineers. A code of practice is the cheap moment to get it right. Once it is a hard requirement, the same work happens under pressure.

Sources: Regulatory Framework for AI.