Intent debt: keeping your cognitive sovereignty in the age of AI


§1 · START HERE

The 90-second version

The translator stands between you and the machine. It speaks fluent English. It is helpful, confident, and frequently — though not always — right. And it is quietly erasing the mental model that used to let you override it when it isn’t.

  • The mental model that made override possible is going, starting with the logs. Every offload looks like a productivity win and is a model-erosion event at the same time.
    • Logs: operators who used to read raw signal now read AI summaries.
    • Topology: operators who used to walk the system in their head now click approve on recommended topologies.
    • Intuition: operators who used to feel something was off before the dashboard turned red now wait for the dashboard.
  • The cost is non-linear. It compounds at the rate of agent activity, not the rate of human review. There is a name for what accumulates: intent debt, the third leg of Margaret-Anne Storey’s 2026 Triple Debt Model. This article extends it from software engineering into IT operations.
  • The postmortem regime that worked when humans authored changes will not catch the loss. The primary witness is no longer human-shaped: a credible witness needs five qualities, and the agent fails each one. The testimony has to be recorded at the moment of the act, before the change touches production.
  • The framework that closes the gap is three disciplines, calibrated by work mode, that any operator can begin without organizational permission.
  • The bet — by June 30, 2027, both major-analyst recognition and a reported incident will exist. I am betting YES on both halves.

TERMINOLOGY: Generative solutions refers to the broad class of AI assistance in operations. Agents refers to the autonomous-action variants that can change production state. The distinction matters at gate time, where the gate fires on agent-initiated changes specifically.


§2 · ORIGINS

How the mental model used to get built

Before the loss, the question — what built the mental model in the first place?

There is a finding in cognitive science that most people refuse to believe when they first hear it. Smooth performance during practice predicts worse long-term retention. Effortful, error-prone, frustrating practice predicts better.

Robert Bjork named the principle desirable difficulties in 1994. It has been replicated across domains for three decades.

The implication for IT operations is direct

The operator who fought to bring up a service the first time owns the mental model of that service afterward.

The operator who watched a generative solution bring it up (smoothly, in one shot, with no friction) does not.

The smooth path felt productive. It produced no durable mental model.

What the model does — The mental model is the simulation the operator runs when a recommendation comes back and something does not match. The mismatch is detected by the model, not by reasoning. Reasoning comes later — when the operator has to explain to a room of executives why they overrode the agent.

This is the asymmetry the productivity metrics cannot see. Smooth practice produces faster ticket close rates this quarter and shallower mental models forever.

The quarter shows up on the dashboard. The mental model shows up exactly once, at the incident.

* * *

The generative solutions are replacing the model, segment by segment.

§3 · THE WALL

Where understanding is lost

The mechanism is easier to see if you start with the pattern and zoom in.

Tool | What it offloads
Spellcheck | Spelling recall
GPS | Spatial recall
Calculator | Number sense (contested)
Generative solution in IT ops | The ability to govern the infrastructure

The cognitive science is settled enough on the first two.

  • The Google effect — Sparrow, Liu & Wegner (2011) showed that when people expect future access to information, their recall of the information drops while their recall of where to find it rises.
  • GPS and spatial memory — Dahmani & Bohbot (2020) showed habitual GPS use correlates with worse self-guided navigation; a small (n=13) longitudinal sub-sample pointed to steeper hippocampal-dependent decline.
  • The calculator claim is debated. Treat it as such.

So the pattern is old. The novelty is the stake.

* * *

Zoom in one level. Three operators. Same story, different surface.

FIG 3.1 Three operators

The cloud architect

Was: Designed topologies in her head. Three regions, two failover paths, latency budgets memorized.

Now: Prompts a generative solution with the requirements and reviews the proposed topology. The output is correct, usually.

Lost: She has not walked the topology in her head in eighteen months. She understands the generative solution’s opinion of the system, not the system.

The security analyst

Was: Read raw firewall logs. Spotted the anomaly the rules did not yet match.

Now: Reviews summarized findings. The findings are usually well-organized and frequently right.

Lost: Her ability to spot the anomaly the rules don’t yet match has flattened to nothing. She understands the generative solution’s opinion of the network, not the network.

The sysadmin

Was: Knew which services were finicky: which crashed under specific loads, which needed restarting in a specific order, which had an undocumented dependency on the cron job.

Now: Asks an agent. The agent is helpful.

Lost: His tacit knowledge of the system’s personality has decayed to anecdote. He understands the generative solution’s opinion of the machine, not the machine.

* * *

Zoom in one more level. Picture a Tuesday morning at a mid-sized manufacturer — a composite, but every move in it is one you have seen.

The system stops accepting orders.

FIG 3.2 One incident, four agents

WEDNESDAY · The database administrator

Move: Pulls up a generative solution. The agent recommends a query rewrite. She applies it.

Result: Orders flow for an hour, then stop.

WEDNESDAY · The network architect

Move: Joins. His agent recommends a route table change. He applies it.

Result: Latency drops, then spikes.

THURSDAY · The security analyst

Move: Joins. Her agent recommends a firewall rule update. She applies it.

Result: Auth recovers, then breaks differently.

THURSDAY · The sysadmin

Move: Joins. His agent recommends a config rollback that touches three services. He applies it.

Result: The system limps along.

By Thursday it is up.

By Thursday afternoon, in the postmortem, no one on the call can answer the question: what changed?

Each operator applied an agent’s recommendation. Each recommendation interacted with the others. The answer lives only in the ephemeral context windows of four agents that have since been used for forty other tasks.

* * *

That is what manual override looks like after the model is gone.

The button still works. The operator can still press it. What is missing is the competing model that would tell the operator whether to press it, and what to press it toward.

Override without a model is not override. It is a guess that happens to disagree with another guess.

You have given up your cognitive sovereignty

§4 · THE PRIMITIVE

The primitive — intent debt

The thing you have given it up to has a name. The name is not original to this article.

In March 2026, Margaret-Anne Storey (University of Victoria) introduced the Triple Debt Model in a paper titled From Technical Debt to Cognitive and Intent Debt: Rethinking Software Health in the Age of AI. Three debts compound independently.

Debt | Accumulates in | Paid down by
Technical | Code | Refactoring
Cognitive | People | Friction-rich practice
Intent | Artifacts | A person rebuilding the missing rationale

Storey’s framing is software-engineering-focused. This article extends it into IT operations — because the unit of intent debt is different.

The unit, operationally — Every approved AI-initiated change to production where the operator clicks approve without the mental model to evaluate the action is one unit of intent debt. The change happens. The system state advances. The rationale exists only inside an ephemeral context window that will be cleared before the next ticket.

In software, intent debt looks like a codebase whose rationale has gone unrecorded — pull requests that no longer have authors, design decisions whose constraints have been forgotten.

In IT operations, the unit is operational, not architectural.

Technical debt accumulates in the system. Intent debt accumulates at the boundary between the system and the people who govern it.

* * *

Cognitive sovereignty is the operator-level outcome that both kinds of debt threaten. Cognitive debt is the operator’s atrophying mental model. Intent debt is the unwritten rationale of every change the model can no longer audit.

Both compound. Both compound non-linearly.

The curve is non-linear for two reasons.

Each change interacts with every prior change in ways that no individual rationale captures — n changes open on the order of n²/2 possible interactions, so the surface to audit grows quadratically while the changes themselves arrive linearly. And the operator’s ability to audit drops below the change rate — at some point, more changes arrive per hour than a human can read postmortems for.

Once those two curves cross, intent debt compounds at the rate of agent activity, not the rate of human review.
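The arithmetic behind the two curves can be sketched in a few lines. This is a minimal illustration, not a measurement: the function names and the hourly rates (ten changes arriving, four audited) are assumptions chosen for the example, and the pairwise count n(n-1)/2 is simply the exact form of the n²/2 approximation.

```python
def pairwise_interactions(n_changes: int) -> int:
    """Number of distinct pairs among n changes: n * (n - 1) / 2."""
    return n_changes * (n_changes - 1) // 2


def unaudited_backlog(change_rate: int, audit_rate: int, hours: int) -> int:
    """Changes left unreviewed after `hours`, given hourly arrival and audit rates.

    Both rates are hypothetical inputs; the backlog never drops below zero.
    """
    backlog = 0
    for _ in range(hours):
        backlog = max(0, backlog + change_rate - audit_rate)
    return backlog


if __name__ == "__main__":
    # Changes arrive linearly; the pairs to reason about do not.
    for n in (10, 100, 1000):
        print(n, "changes ->", pairwise_interactions(n), "possible pairs")
    # Illustrative rates only: 10 changes/hour arriving, 4 audited/hour.
    print("backlog after 8h:", unaudited_backlog(change_rate=10, audit_rate=4, hours=8))
```

Ten times the changes gives roughly a hundred times the pairs, and once arrivals exceed audits the backlog can only grow.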

FIG 4.2 — THE GAP

Figure | What it measures | Source
4.8M | Unfilled cybersecurity roles globally (the last gap figure published) | ISC2 (2024)
26% | US cybersecurity roles vacant (514,000 current openings) | CyberSeek (2026)
14% | Organizations that say they have the skilled people they need | WEF Global Cybersecurity Outlook (2025)

ISC2’s 2025 study dropped the 4.8M estimate, citing a shift from a headcount shortage to a skills shortage — which sharpens, not softens, the point.

The workforce dimension makes this worse.

The operators who carry the mental models built the systems before generative solutions existed. The operators arriving now skip the friction §2 described and reach productivity in their second week without the mental model.

Cybersecurity is the most-measured slice of IT operations. The rest of the discipline follows the same shape — the people who built the systems are aging out, the people coming up are arriving without the friction-built models, and the gap is being filled with generative solutions that produce changes faster than humans can read them.

The system becomes ungovernable not because it stops working, but because nobody knows how it works.

§5 · THE BET

A falsifiable prediction

The honest version of the bet is the compound one. The analysis is making two claims, not one — that AI velocity is rising, and that mental models are eroding. The giant issue is the product of those two trajectories crossing.

THE BET · PENDING

Position: YES · Horizon: June 30, 2027 · Checkable: an afternoon of reading.

On a 12-month horizon — by June 30, 2027 — both of the following will be true.

First. At least one major industry analyst firm (Gartner, Forrester, IDC, 451 Research, or RedMonk) will have published a public report, blog post, or research note acknowledging that the rate of AI-initiated production changes has outpaced human review capacity at a meaningful fraction of surveyed enterprises.

Second. At least one reported article in The Register, Ars Technica, Krebs on Security, The Information, or Wired will have described a specific incident in which that gap produced an operational failure.

I am betting YES on both halves. Checkable on June 30, 2027 with an afternoon of reading.

Both halves have to land for the bet to win. If only one lands — the analysts say it but no incident has been reported, or an incident has been reported but no analyst has named the underlying gap — the bet is half right and I count it as a loss.


§6 · THE COUNTER

Why the postmortem won’t save you

The bet handles whether. The objection that arrives next is forensics.

“Even if a giant issue does occur, we’ll catch it in the postmortem.”

The postmortem is, structurally, an interview. You interview the system, the operators, the logs — and from those witnesses you reconstruct what happened.

The discipline has worked for decades because the witnesses were human and the trail was traceable.

The AI agent is now the primary witness for any change it initiated. Five qualities a credible witness must have. The agent fails each one.

FIG 6.1 Five qualities of a credible witness

Memory

Quality: A credible witness remembers what they did and why.

Failure: The agent’s reasoning lived inside an ephemeral context window. The next forty tasks consumed it. By the time the postmortem convenes, the witness has no memory of the testimony you need.

Term: Erasure of rationale.

Selection

Quality: A credible witness can identify the relevant moment.

Failure: The agent ran four hundred decisions in the same minute as the one that broke production. It has no native way to single out the decision that mattered. Finding it again costs the same engineering hours the incident already cost.

Term: Context overlap.

Truth-telling

Quality: A credible witness tells the truth more compellingly than they tell a lie.

Failure: Sui, Duede, Wu & So (2024) showed hallucinated LLM outputs display increased narrativity and semantic coherence relative to truthful ones. The fabricated reconstruction reads better than the true one. The trap is not that the agent lies; the trap is that the agent’s lies are more polished than its truths.

Term: The trap of perfect-sounding logic.

Sequence

Quality: A credible witness can recount their actions in order.

Failure: The agent did not act once. It made five to ten corrections in seconds, each interacting with the next. The postmortem ships with the visible cause as the recorded cause, because the team has another incident waiting.

Term: The compounding error loop.

Consistency

Quality: A credible witness gives the same testimony when asked again.

Failure: Atil et al. (2025) measured accuracy variations up to 15% across runs of the same hosted LLM with the same input, with gaps up to 70% between best and worst run. He et al. (Thinking Machines Lab, 2025) traced the cause to floating-point non-associativity combined with dynamic batching in production inference. Re-run the agent with the inputs from the incident and it will not reproduce the outputs from the incident: the witness testifies, but to a different incident.

Term: Non-determinism.
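The floating-point effect named under Consistency is easy to reproduce on a small scale. The sketch below only illustrates non-associativity in ordinary Python floats; it is not a reproduction of any inference stack, and the helper name sum_in_order is an assumption made for the example.

```python
def sum_in_order(values):
    """Add floats strictly left to right, the way a simple reduction would."""
    total = 0.0
    for value in values:
        total += value
    return total


values = [0.1, 0.2, 0.3]
forward = sum_in_order(values)                   # (0.1 + 0.2) + 0.3
backward = sum_in_order(list(reversed(values)))  # (0.3 + 0.2) + 0.1

# Same numbers, different grouping: compare the two results.
print(forward == backward)
print(abs(forward - backward))
```

The difference is in the last bit, but a model that branches on such values can diverge from there. In a serving system the grouping can depend on how requests are batched, which is the mechanism the He et al. reference in this article points to.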

MEMORY LAYERS — Memory layers store outputs, not reasoning. They are themselves subject to the consistency failure on read. They are useful infrastructure. They are not the witness you would have wanted.

The agent is not a credible witness for the kind of incident the postmortem was designed to investigate.

The postmortem was designed for a human-shaped witness. The witness is no longer human-shaped.

Record the testimony at the moment of the act

The only viable response is to record the testimony at gate time — before the change touches production, while the witness still exists.
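One way to picture recording at gate time is an append-only ledger that is written before the change is applied. The sketch below is an assumption-heavy illustration, not the AgODR implementation: the Ledger class, the field names, and the hash chaining (used here so that later edits to a record are detectable) are all hypothetical choices.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


class Ledger:
    """Append-only list of entries; each entry's hash covers the previous hash."""

    def __init__(self):
        self.entries = []

    def append(self, record: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        payload = json.dumps(record, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
        self.entries.append({"prev": prev_hash, "record": record, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; any edited record breaks it."""
        prev_hash = GENESIS
        for entry in self.entries:
            payload = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


def record_then_apply(ledger: Ledger, action: str, rationale: str, apply_change) -> str:
    """Write the rationale to the ledger first, then run the change."""
    entry_hash = ledger.append({
        "action": action,
        "rationale": rationale,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    })
    apply_change()  # production is touched only after the testimony exists
    return entry_hash
```

The ordering is the point: the rationale is captured while the context that produced it still exists, and the postmortem reads the ledger instead of interviewing the agent.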

§7 · WHAT TO DO

The framework, and where it lives

The diagnosis names what is at risk. Closing the gap is three disciplines.

FIG 7.1 Three disciplines

First-contact sovereignty

Discipline: Form your independent prior before the generative solution’s prior becomes yours. Read raw signal. Sketch the topology. Form the mental picture against which every AI recommendation that follows will be judged.

Mode: Calibrated by urgency. Sixty seconds in a critical incident; a full workday at the start of a design.

The AgODR gate

Discipline: Every agent-initiated production change must produce a signed AgODR, with rationale and a reversibility class, before it touches production. A local pre-action hook checks the record: a pass appends it to the operations ledger and applies the change; a fail brings the operator into the loop.

Mode: Same mechanism in response, construction, and design. The unit of decision changes; the gate does not.

Adversarial prompting

Discipline: The operator writes the plan first. The generative solution attacks it. The AI becomes a gym, not an oracle: the operator’s mental model gets exercised against an adversary that does not tire.

Mode: Different prompts per mode, same posture: “Act as an adversarial incident commander. Find the failure modes my response plan misses.”

None of the three requires organizational permission to begin.

Each one trades velocity for governance on purpose. The gate slows an agent-initiated change to the speed at which a human can still own it — that is the cost, and on anything that touches production it is the cheap side of the trade. The change that ships in four seconds and cannot be explained in the postmortem is the expensive one.

For the director signing off on the trade: the cost is measured in changes-per-hour given up across a team; the return is that every change touching production stays auditable when the capital is yours to answer for. That is the line item — paid in velocity now, against an incident you cannot price later.

LINEAGE — AgDR — Agent Decision Record — was introduced by me2resh on GitHub (me2resh/agent-decision-record). AgDR is a software-engineering convention; the unit of decision is an architectural choice an agent makes during development. AgODR — the operations variant this publication is developing — extends the concept into IT operations, where the unit is an operational action, reversibility classification becomes an execution gate, and governance is the dominant driver. Both are emerging conventions. Neither is a ratified standard.
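The gate flow described for the AgODR discipline (check, then ledger and apply on a pass, operator on a fail) can be sketched as a single function. Everything concrete here is an assumption for illustration: the DecisionRecord fields, the three reversibility classes, the plain-string signature, and the rule that only reversible changes pass unattended. The actual AgODR template and taxonomy are not specified in this article.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Reversibility(Enum):
    # Hypothetical taxonomy; the article does not define the classes.
    REVERSIBLE = "reversible"
    COSTLY = "costly-to-reverse"
    IRREVERSIBLE = "irreversible"


@dataclass(frozen=True)
class DecisionRecord:
    action: str
    rationale: str
    reversibility: Reversibility
    signed_by: str  # stand-in for a real signature or co-sign


def pre_action_gate(
    record: DecisionRecord,
    ledger: List[DecisionRecord],
    apply_change: Callable[[DecisionRecord], None],
    escalate: Callable[[DecisionRecord, List[str]], None],
) -> bool:
    """Return True if the change was applied, False if it went to the operator."""
    problems = []
    if not record.signed_by.strip():
        problems.append("record is unsigned")
    if not record.rationale.strip():
        problems.append("rationale is empty")
    if record.reversibility is not Reversibility.REVERSIBLE:
        problems.append(f"reversibility class is {record.reversibility.value}")

    if problems:
        escalate(record, problems)  # fail: bring the operator into the loop
        return False

    ledger.append(record)           # pass: record first...
    apply_change(record)            # ...then touch production
    return True
```

A failed check does not block the change forever; it hands the record and the list of problems to a human, which is where the velocity-for-governance trade is paid.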

The disciplines exist in this article as names. The mechanics live in the workshops.

The AgODR workshop builds the gate — template, fields, reversibility class taxonomy, co-sign mechanics, local pre-action hook implementation.

The adversarial-prompting workshop builds the gym — per-archetype prompt libraries, deliberate-practice protocols, drills across response, construction, and design modes.

Both are forthcoming through CTRL ALT PRESS.

§8 · LAST WORD

Close

The central claim, once, clean.

Cognitive sovereignty — the operator’s earned mental model of the system, built through the friction §2 described — is the unit at risk.

Intent debt is what accumulates in its place. Storey’s term, March 2026, extended here from software engineering into IT operations.

The postmortem regime that worked when humans authored changes does not survive the substitution of agent for human.

The framework that closes the gap is three disciplines, calibrated by work mode, that any operator can begin without organizational permission.

* * *

The bet from §5 resolves on June 30, 2027.

The translator will still stand between you and the machine. It will still speak fluent English, and it will still be right most of the time. What you decide now is whether you can still hear yourself over it.

The system can be governable. The operator can keep the mental model. The override can stay possible.

None of that is guaranteed by the framework. The framework is the floor below which the loss becomes structural. Above the floor is what you build with what you do on Monday.

I will see you in the workshops.

Citations

§2 — Origins

  • Bjork, R. A. (1994). Memory and metamemory considerations in the training of human beings. In Metcalfe & Shimamura (Eds.), Metacognition: Knowing about Knowing. MIT Press, pp. 185–205.
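  • Bjork, E. L., & Bjork, R. A. (2011). Making things hard on yourself, but in a good way: Creating desirable difficulties to enhance learning. In Gernsbacher, Pew, Hough & Pomerantz (Eds.), Psychology and the Real World: Essays Illustrating Fundamental Contributions to Society. Worth Publishers, pp. 56–64.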

§3 — The Mechanism

  • Sparrow, B., Liu, J., & Wegner, D. M. (2011). Google Effects on Memory: Cognitive Consequences of Having Information at Our Fingertips. Science, 333(6043), 776–778.
  • Dahmani, L., & Bohbot, V. D. (2020). Habitual use of GPS negatively impacts spatial memory during self-guided navigation. Scientific Reports, 10:6310.
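  • Hembree, R., & Dessart, D. J. (1986). Effects of hand-held calculators in precollege mathematics education: A meta-analysis. Journal for Research in Mathematics Education, 17(2), 83–99. Cited as contested.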

§4 — The Primitive

  • Storey, M.-A. (2026). From Technical Debt to Cognitive and Intent Debt: Rethinking Software Health in the Age of AI. arXiv:2603.22106. Also discussed at ICSE 2026 panel “Technical Debt in the AI Era.”
  • ISC2 (2024). Cybersecurity Workforce Study.
  • CyberSeek (2026). NICE / NIST live data (cyberseek.org).
  • World Economic Forum (2025). Global Cybersecurity Outlook 2025.

§6 — The Counter-Objection

  • Sui, P., Duede, E., Wu, S., & So, R. (2024). Confabulation: The Surprising Value of Large Language Model Hallucinations. ACL 2024 (Long Papers), pp. 14274–14284.
  • Atil et al. (2025). Non-Determinism of “Deterministic” LLM System Settings in Hosted Environments. ACL Eval4NLP Workshop 2025.
  • He et al. (Thinking Machines Lab, 2025). Floating-point non-associativity + dynamic batching as root cause of LLM non-determinism.
  • Gond, Kamath, Ramjee & Panwar (Microsoft Research, 2026). LLM-42: Enabling Determinism in LLM Inference with Verified Speculation. arXiv:2601.17768.

§7 — The Framework

  • me2resh (Ahmed). Agent Decision Record (AgDR). GitHub: github.com/me2resh/agent-decision-record. Blog: me2resh.com/blog/agent-decision-records.

Show my work

v2.1 — 2026-06-03

Readability and visual-anchor pass — the piece was front-loading its components (§3 Perspectives, §4 Graph, §5 Bet) and then running as unbroken prose through the back half, exactly where readers were dropping. Four conversions, all on the existing component library, no new components. The §3 Tuesday-morning ERP cascade — a four-operator sequential narrative — became a Walkthrough (FIG 3.2), one step per operator, Move / Result facets with the failure beat tinted. The five witness failures in §6 finally got the card treatment the v2.0 note below promised but never shipped: a Walkthrough (FIG 6.1) of five steps, Quality (green) / Failure (red) / Term facets — the ‘five red cards’ realized as Facet tone rather than the retired TintedList. The §7 three disciplines became a Perspectives (FIG 7.1), Discipline / Mode lines, matching the §3 pattern. The §4 workforce numbers were lifted out of a blockquote into a captioned stat table (FIG 4.2) so the figures read as a scannable row. No argument changed; every glossary link, anchor, and citation survived.

v2.0 — 2026-05-26

Funnel-discipline pass and structural tightening. The framework section (§7) was a full curriculum in v1 — three disciplines with response/construction/design-mode breakdowns, the AgODR pipeline-gate ASCII diagram, the adversarial-prompting per-archetype prompt templates. All of that is what the workshops teach; carrying it in the public essay removed the reason to register. v2 keeps the diagnosis (the witness-failures argument is the load-bearing analytical move) and cuts the method to a two-paragraph signpost. The five witness failures in §6, previously written as a paragraph cascade, were restructured into a TintedList of five red cards with Beat anatomy (Quality / Failure / Term) — the AgDR-0015 sub-primitive generalizes from its original Signal/Tell/Move idiom to this analytical taxonomy. The compounding-curve placeholder in §4 was rendered as a schematic Graph with empty axis ticks, honoring the draft’s ‘no axis values’ brief. It reads complete at rest — the point sits at the crossover, annotated ‘audit rate < change rate’, and the caption carries the whole claim; the slider only lets a reader trace the curve green→amber→red, it does not move the crossover.

v1.0 — 2026-05-12

First longhand pass. The ‘translator’s trap’ framing and the Tuesday-morning ERP cascade were already present, but the piece had no name for the loss — ‘cognitive sovereignty’ arrived in this version, ‘intent debt’ did not. The §6 witness-failure analogy and the §5 falsification bet were not yet in the draft. The framework section ran as undifferentiated prose, before the response/construction/design-mode calibration emerged.
