A mutable log is a story you can rewrite. An append-only log is a witness you can call. We built Magistry on the second model — every action the agents take is a row, every row is signed, and no row is ever updated in place.
That sounds like a database design choice. It's actually a trust choice, and it's the one that decides whether an operator can hand the store keys to an agent and still sleep. The rest of this post is the case for why.
Provenance: the answer is a row, not a vibe
When the Head of Paid Media asks 'why did we scale that ad?', the honest answer in most AI tools is 'the model thought it was a winner.' That's not an answer you can take to a budget review. In Magistry the answer is a row — with the judge score, the evidence rows it pointed to, the executor that wrote it, and the operator who held the kill switch at that moment.
{
"id": 84193,
"agent": "campaign.specialist",
"action": "SCALE_BUDGET",
"trigger": "blended_roas > floor for 7d",
"evidence": ["ad_decision_log#21498", "metrics_window#9m2"],
"judge_score": 0.91,
"applied_to": "google_ads/campaign/771",
"reverse_op": "SET_BUDGET 1840.00",
"applied_at": "2026-05-18T07:42:11Z"
}Notice what's stamped on the row: the trigger that fired, the evidence it cited, and — critically — the reverse op. The row is self-describing. Six months later, no one has to reconstruct intent from logs and Slack threads. The intent is the record.
Reversibility: write the undo when you write the do
Every action is paired with its rollback op at the moment the action lands. Not computed later, not derived from a diff — written. To reverse, you read the row and run the op. The agent doesn't need to remember anything, and the rollback can't drift away from the action it undoes, because they were born together.
This is the difference between 'we can probably undo that' and 'undo is a row you can point at.' A rollback that's computed after the fact is a second guess. A rollback that's stored with the action is a guarantee.
A mutable log lets you ask what the system thinks now. An append-only log lets you ask what it knew then. Audits only ever care about the second question.
— Ines Park
Why mutability is where audits go to die
Tier-A cost confidence at decision time is not the same as Tier-A at read time. Costs get restated, suppliers change, a SKU gets recategorized. If your log mutates, the tier you read today is not the tier the agent acted on — and now your audit is fiction with good formatting.
- The tier is stamped on the row when the decision was made — what the agent saw is what the auditor sees.
- Corrections are new rows that reference the old one, never edits. The chain stays intact.
- Each row is HMAC-signed and chained to its predecessor, so a silent edit is detectable, not invisible.
The cost we pay for it
Append-only isn't free. Storage grows monotonically, 'current state' becomes a projection you compute rather than a column you read, and you give up the convenience of UPDATE. We took those costs deliberately. Storage is cheap; a board member's trust is not. And computing current state from an immutable history turned out to be a feature — it means every view of the store is reproducible from first principles.
If you only remember one thing: autonomy is not 'the agent is smart enough to trust.' Autonomy is 'every move the agent makes is a fact you can read, verify, and reverse.' Append-only is how you keep that promise even when no one is watching.
