Every project starts with a tagging spec. Six months later, nobody trusts it. Here's how I built a system where the documentation writes itself — and stays true forever.
If you've worked in digital analytics long enough, you've been in this exact meeting. Someone asks which parameters the generate_lead event collects. You open the tagging spec. It says "email, subject, budget". You check GA4. Only "subject" and "budget" are showing up. You open the HTML. There it is — email was hashed and renamed to email_sha256 three sprints ago. The spec was never updated.
The document is three months old, two product iterations behind, and now actively misleading. And nobody feels at fault — because this is just how analytics documentation works in most projects.
The real cost: teams waste hours reconciling what the spec says with what's actually in GA4. Analysts make decisions on data they don't fully trust. Audits reveal the same inconsistencies every time.
The root cause isn't negligence. It's that documentation is treated as a deliverable, not a system. You write it once, ship it, and it starts drifting from reality the moment the first product change lands.
The approach I've taken in this project flips the model. Instead of writing documentation separately and trying to keep it in sync, I designed a system where the tracking implementation and the documentation are the same artifact.
Three principles drive this: the payload lives in the markup, PII is filtered by default, and every claim is backed by a screenshot.
Rather than writing JavaScript event handlers for each element, every trackable link in the site carries its analytics payload directly in the markup:
<!-- The tracking parameters live here, in the HTML itself -->
<a href="blog.html"
   data-track='{"content_type":"nav_link",
                "content_id":"blog",
                "content_name":"Blog",
                "item_list_name":"nav"}'>
  Blog
</a>
A single delegation listener in analytics.js intercepts all clicks on elements with a data-track attribute, parses the JSON, and pushes to the dataLayer. No per-element event handlers. No risk of parameters drifting between the HTML and some external document — they are the HTML.
Adding tracking to a new element doesn't require writing JavaScript. You add the attribute with the right parameters, and the delegation listener handles the rest. The spec updates itself.
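The core of that delegation listener fits in a few lines. This is a sketch of the pattern rather than the project's actual analytics.js: `parseDataTrack` is a name I'm using for illustration, and the `select_content` event name is an assumption.

```javascript
// Parse a data-track attribute value into a dataLayer payload.
// Returns null on malformed JSON so a bad attribute never breaks a click.
function parseDataTrack(raw) {
  try {
    return JSON.parse(raw);
  } catch {
    return null;
  }
}

// One listener for the whole document: with event delegation, a new
// element is tracked the moment it carries a data-track attribute.
if (typeof document !== 'undefined') {
  document.addEventListener('click', (event) => {
    const el = event.target.closest('[data-track]');
    if (!el) return;
    const payload = parseDataTrack(el.getAttribute('data-track'));
    if (!payload) return;
    window.dataLayer = window.dataLayer || [];
    // 'select_content' is an assumed event name; the real script may
    // derive the event from the payload instead.
    window.dataLayer.push({ event: 'select_content', ...payload });
  });
}
```

Because `closest('[data-track]')` walks up from the click target, the attribute can sit on a wrapper element and still catch clicks on its children.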
The central analytics script handles PII filtering at the system level, before any data touches the dataLayer. This isn't a checkbox in GTM — it's the default behavior of the code:
// Email → hashed with SHA-256 before it ever reaches the dataLayer
async function hashEmail(email) {
  const buf = await crypto.subtle.digest(
    'SHA-256',
    new TextEncoder().encode(email.trim().toLowerCase())
  );
  return Array.from(new Uint8Array(buf))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}
// Free-text fields → never captured raw, only metadata:
//   message content → message_word_count, message_char_count, message_filled
//   fields named name, phone, address, etc. → automatically excluded
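The metadata-only rule for free-text fields reduces to a small pure function. This is a sketch of the idea, not the project's actual code; `messageMetadata` is a hypothetical name:

```javascript
// Reduce a free-text field to safe metadata: counts only, never content.
function messageMetadata(text) {
  const trimmed = (text || '').trim();
  return {
    message_filled: trimmed.length > 0,
    message_char_count: trimmed.length,
    // Split on runs of whitespace; an empty string has zero words.
    message_word_count: trimmed === '' ? 0 : trimmed.split(/\s+/).length,
  };
}
```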
The dataLayer push for a form submission looks like this — the raw message is never in there:
window.dataLayer.push({
  event: 'generate_lead',
  form_name: 'contact',
  subject: 'analytics audit',
  budget: '5k-10k',
  message_word_count: 47,
  message_char_count: 214,
  message_filled: true,
  email_sha256: '3a7bd3e2...' // hashed, never raw
});
The part that makes this system genuinely different from a well-maintained spreadsheet is the visual layer. A Puppeteer script (capture_measurement.mjs) runs through the entire site and produces proof of what's being measured:
The output is a folder of 30+ screenshots, each showing exactly which element fires which event. When a designer moves a button, you re-run the script and the screenshots regenerate. No manual annotation, no outdated arrows on a slide deck.
This is auditable documentation: not "the spec says this button tracks X," but "here is a screenshot of that exact button, with a red box around it, taken automatically from the live site."
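A stripped-down version of that capture pass might look like the sketch below. The selector, the outline style, the local URL, and the file-naming helper are all assumptions for illustration, not the actual capture_measurement.mjs:

```javascript
// Hypothetical capture pass: highlight each tracked element, screenshot it.
// Assumes Puppeteer is installed and the site runs on a local dev server.
const SITE_URL = 'http://localhost:8000'; // assumed URL

// Build a stable filename from the element's tracking payload.
function screenshotName(payload, index) {
  const id = (payload.content_id || `element_${index}`)
    .replace(/[^a-z0-9_-]/gi, '_');
  return `shots/${payload.content_type || 'event'}__${id}.png`;
}

async function capture() {
  const { default: puppeteer } = await import('puppeteer');
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(SITE_URL, { waitUntil: 'networkidle0' });

  const elements = await page.$$('[data-track]');
  for (const [i, el] of elements.entries()) {
    const payload = JSON.parse(
      await el.evaluate((node) => node.getAttribute('data-track'))
    );
    // Draw the red box directly on the live element before shooting.
    await el.evaluate((node) => { node.style.outline = '3px solid red'; });
    await el.screenshot({ path: screenshotName(payload, i) });
    await el.evaluate((node) => { node.style.outline = ''; });
  }
  await browser.close();
}
```

Because the filenames are derived from the tracking payload itself, a renamed `content_id` automatically renames its screenshot on the next run.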
All of this feeds into an interactive HTML document — the measurement plan. It has eight tabs, one per event type. Each tab has a parameter table, example payloads, and the Puppeteer screenshots embedded inline.
Keeping it up to date requires no manual effort beyond the tracking implementation itself. When something changes:
1. Update the data-track attribute (or inline JS) in the HTML.
2. Run node capture_measurement.mjs — screenshots regenerate.
3. Run node build_standalone_measurement.mjs — the standalone file rebuilds with all images embedded as base64.
That standalone file — a single self-contained HTML — can be sent to any stakeholder, opened without a server, and read without any setup. It's the shareable artifact of record.
The measurement plan for this project documents six event types across 28 parameters.
The measurement plan itself is versioned. Before any significant update — new event added, existing event removed, parameter renamed — the current version is archived to measurement_plan/archive/measurement_plan_vN.html. The main document gets a version bump and a changelog entry.
Minor fixes (typos, refreshed screenshots, link corrections) don't trigger a version bump. Structural changes always do. This gives you a full audit trail: you can open the v1 archive and see exactly what was being tracked on launch day.
The immediate benefit is obvious: the documentation is always right. But the second-order effect is more interesting.
When updating the spec takes one command instead of an afternoon in a spreadsheet, people actually do it. The friction disappears. Documentation stops being a chore that competes with delivery, and becomes a natural side effect of doing the implementation work.
The audit trail also changes how conversations about data quality happen. Instead of "I think the event tracks X", you open the standalone HTML, navigate to the event tab, and show the screenshot of the exact element, taken from the live site, with the parameter table next to it. There's no ambiguity to argue about.
The goal isn't better documentation. It's making the cost of accurate documentation so low that it's never worth skipping.
This system was built as part of the TNK portfolio project — the same site you're reading this on. Every event described above is live and currently tracking. The measurement plan is available as a standalone HTML that reflects the current implementation, not the one from the launch sprint.
Guillermo García
Digital Analytics Engineer · TNK Design & Analytics