ARD-0005: Security model inversion — contain the non-engineer + AI from production systems

Status: Accepted
Date: 2026-05-23
Deciders: Tom (Claude facilitating)
Amends: ARD-0001 (security framing) and ARD-0004 (implementation order)
Extended by: ARD-0009 (closes §3 codegen deferral), ARD-0011 (closes §4 egress deferral), ARD-0016 (extends containment past container boundary), ARD-0017 (agent workflow defaults), ARD-0018 (extension trust boundary), ARD-0019 (security model applies to second surface), ARD-0022 (silent execution as trust UX)
Related: [[ard-0001-v1-architecture]], [[ard-0004-shopify-first-as-dogfood-path]]

Context

ARD-0001 framed boring's security model as AI containment to prevent exfiltration of prod-shape data: egress allowlists, ephemeral DB volumes derived from data_sensitivity, observation-derived allowlist learning, audit logs of sensitive restores. That framing made sense for the Django-style use case — real customer data restored locally via dbx, a powerful agent loose in the same container, network egress the obvious leak channel.

ARD-0004 picked Shopify-first for v1. That decision quietly inverted the threat model and we hadn't named it yet.

Talking through the actual v1 users — an internal team editing a production Shopify theme repo, external collaborators doing the same, the maintainer simulating both — surfaced what the failure mode really looks like:

The repo being edited is Liquid templates, not prod data. There is nothing meaningful for an AI to exfiltrate.
The repo does deploy to a live storefront. Two-repo deploy gates frequently exist in Shopify projects (a source repo paired with a deploy repo Shopify auto-commits into), and project-level rules in CLAUDE.local.md are dense with "NEVER push directly to the deploy repo," "NEVER push to the live-preview branch unless explicitly asked." Those rules exist because pushing the wrong branch ships to production.
Those rules live in markdown that the AI reads and the human is told to read. Both are fallible. The non-engineer in particular has no priors for which git push is the dangerous one.

The v1 failure mode is therefore a non-engineer + AI accidentally damaging production systems, not an AI exfiltrating data. Both are real long-term threats, but for v1 exfiltration isn't load-bearing, and accidental production damage is the one that bites this week.

Naming this honestly is the point of the ARD. ARD-0001 wasn't wrong about the long-term security model; it was wrong about which threat v1 is actually defending against.

Decision

1. v1 commits to "contain the non-engineer + AI from prod," not "contain the AI from exfiltrating data"

Both threats are real and both will eventually be addressed. v1 explicitly picks the first as the load-bearing security story, because:

The Shopify-first dogfood (ARD-0004) puts no sensitive data in the container in the first place.
The deploy-gate failure mode is concrete, frequent, and currently mitigated only by markdown discipline.
A boring that prevents an accidental push to a deploy-mirror branch is materially safer than the status quo. A boring that prevents Liquid template exfiltration is not.

2. Guardrails are a first-class profile schema field

A new guardrails: block in .boring/profile.yaml, parsed by lib/profile.sh alongside mounts:, forward_ports:, and theme:, looks like this:

guardrails:
  forbid_branches:
    - main
    - dev-preview
  forbid_commands:
    - "gh pr merge"
    - "shopify theme push --live"
    - "git push origin main"
  allowed_claude_tools:
    - read
    - edit
    - grep
    - bash  # present but wrapped by forbid_commands (see Field semantics)

Field semantics:

forbid_branches: — branch names that the container's pre-push git hook refuses outright. Defaults derived from the theme: preset (e.g. theme: shopify seeds main, common deploy-preview branches, and any branch that maps to a project's deploy-mirror repo).
forbid_commands: — CLI invocations the in-container shell refuses via a wrapper that shadows the real binary on PATH. Prefix-match against the argv string; refusal is loud and audit-logged.
allowed_claude_tools: — restricted set of MCP/builtin tools Claude can use. Written into the container's ~/.claude/settings.json at build time. Tools omitted from the list are unavailable; tools listed but wrapped (e.g., bash plus forbid_commands) get the wrapper's restrictions.
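
The prefix-match rule for forbid_commands: can be illustrated with a small POSIX shell sketch. The function name is_forbidden and the FORBID_COMMANDS variable are hypothetical, not boring's actual generated code. Matching at a word boundary (so that "git push origin main" does not also catch "git push origin maintenance") is an assumption layered on top of the plain prefix match described above.

```shell
#!/bin/sh
# Hypothetical: one forbidden prefix per line, as a generator might render
# the profile's forbid_commands: list. Values mirror the example profile.
FORBID_COMMANDS='gh pr merge
shopify theme push --live
git push origin main'

# is_forbidden CMD [ARGS...]
# Returns 0 when the space-joined argv equals a forbidden entry or starts
# with one followed by a space (word-boundary prefix match, an assumption).
is_forbidden() {
  invocation="$*"
  while IFS= read -r entry; do
    case "$invocation" in
      "$entry" | "$entry "*) return 0 ;;
    esac
  done <<EOF
$FORBID_COMMANDS
EOF
  return 1
}
```

A shim for git could call is_forbidden git "$@" and refuse before handing off to the real binary. A pure prefix match is order-sensitive: a shopify theme push invocation with other arguments placed before --live would not match the entry above, so entries probably need to be written with argument order in mind.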

3. Enforcement lives in the container, not in boring's host process

boring (the host CLI) never sits in the loop at push-time, command-time, or tool-call-time. That loop has to be inside the container, because that's where the agent and the non-engineer actually do work, and boring's host process isn't watching when they do.

What boring generates at container-build time, from the resolved guardrails: block:

.git/hooks/pre-push — shell script that reads the resolved forbid_branches: list, inspects the refs being pushed, and exits non-zero with a clear message on match. Installed into every repo the container mounts. Honors core.hooksPath.
/usr/local/bin/<cmd> wrappers — for each entry in forbid_commands:, a shim script earlier on PATH than the real binary. Shim parses argv, refuses on match, otherwise execs the real tool.
~/.claude/settings.json — allowed_claude_tools: translated into Claude's tool-allowlist config. Per-profile, container-local, regenerated on rebuild.

Generated artifacts are owned by boring. Editing them by hand inside the container is supported but will not survive a rebuild — they regenerate from the profile, which is the source of truth.
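
As a rough illustration of the pre-push artifact, the core check might look like the sketch below. Git passes a pre-push hook one line per ref on standard input, in the form "<local ref> <local sha> <remote ref> <remote sha>". The names check_push and FORBID_BRANCHES are hypothetical, and the hard-coded list stands in for whatever the generator renders from the resolved profile.

```shell
#!/bin/sh
# Hypothetical: space-separated branch names rendered from forbid_branches:.
FORBID_BRANCHES='main dev-preview'

# check_push reads git's pre-push stdin format, one line per ref:
#   <local ref> <local sha> <remote ref> <remote sha>
# and returns non-zero on the first remote ref naming a forbidden branch.
check_push() {
  while read -r local_ref local_sha remote_ref remote_sha; do
    for branch in $FORBID_BRANCHES; do
      if [ "$remote_ref" = "refs/heads/$branch" ]; then
        echo "boring: push to '$branch' is blocked by guardrails.forbid_branches" >&2
        return 1
      fi
    done
  done
  return 0
}
```

In an installed hook, the script would end by calling check_push and exiting with its status. Matching on the remote ref rather than the local branch name means a push such as feature:main is caught as well.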

4. Egress allowlist is repositioned, not eliminated

ARD-0001's egress allowlist (and the --learn-mode observation flow) remains the right answer for the "AI exfiltrates data" model. For v1's Shopify case it isn't load-bearing because the code being edited isn't sensitive. It moves from v1 ship-blocker to v1.x.

This is a reasoned skip, not an oversight. ARD-0001's egress section stays valid as written; the work is deferred, not rejected. A cross-link from ARD-0001's egress section back to this ARD belongs on the next edit of ARD-0001.

Closed by ARD-0011. v0.4 ships egress enforcement (iptables-in-container with NET_ADMIN capability) together with --learn-mode; the deferral above is lifted by ARD-0008's release plan. The cross-link called for in the previous paragraph is now in place in ARD-0001's egress section.

5. Audience-specific credentials are a secret-URI concern, not a guardrails concern

Three audiences for the same Shopify theme profile:

| Audience | `SHOPIFY_THEME_TOKEN` resolution |
| --- | --- |
| Internal team | `!secret op://<org-vault>/<project>/THEME_TOKEN` |
| External collaborator | `!secret op://<their-vault>/<project>/THEME_TOKEN` (scoped, per-person) |
| Maintainer | No token. Host bind-mount of `~/.config/shopify` per ARD-0004. |

This is handled entirely by the ARD-0002 secret-resolver and ARD-0004's mounts: field. guardrails: is repo-state — it means the same thing for every user of the profile. Per-audience differentiation belongs in secret URIs and overlays, not in guardrails. (See Alternatives.)

Consequences

Positive

Honest about what v1 actually defends against. The v1 demo story matches v1 reality: "non-engineer and AI can iterate on the theme without accidentally shipping to prod."
Guardrails are concrete and enforceable. Pre-push hooks and command wrappers are mechanical, in-container, regenerated from the profile. Not policy in docs.
guardrails: is broadly reusable. Any profile (Django, Rails, internal tooling) gets the same field. Branch-gate and command-gate failure modes aren't Shopify-specific.
The "ARD-0001 was wrong about v1" admission strengthens the ARD habit. Designs evolve; ARDs track the evolution rather than papering over it. This is exactly what the convention exists for.

Negative

The egress allowlist — a real differentiator vs. "fancy devcontainer.json" — is deferred. v1 demos become even harder to distinguish from "a devcontainer with extra steps." The pitch narrows to "AI/non-engineer scoped access to existing repos," which is more honest but less impressive.
More upfront schema and codegen. guardrails: adds a third family of generated outputs (hooks, wrappers, ~/.claude/settings.json) on top of docker-compose.yml and devcontainer.json.

Neutral

ARD-0001's egress section stays valid. It's a v1.x feature now, not a v1 feature. Cross-linking it back to this ARD makes the deferral discoverable.
data_sensitivity and ephemeral DB volumes stay designed-but-unimplemented for v1, same as in ARD-0004. They wake up when the Django case wakes up.

Alternatives Considered (rejected)

Skip guardrails for v1; document branch rules in the profile's README. Rejected: docs rot, and accidental damage is the failure mode we're explicitly trying to prevent. Markdown is exactly what a project's CLAUDE.local.md already tries — adding more of the same isn't the fix.
Implement egress + guardrails together for v1. Rejected: egress is a multi-week iptables/proxy prototype (ARD-0001's open item #3); guardrails are a one-day pre-push-hook + command-wrapper feature. Pay the cheap, urgent cost now; defer the expensive, less-urgent one.
Per-user guardrails (internal team relaxed, external collaborators strict). Rejected: guardrails are repo-state: they live in the profile and mean the same thing for everyone using it. "No pushing to main" is a property of the repo, not of the human. Per-user behavior here is an anti-pattern; if a user needs to bypass, they fork the profile or use the user-local overlay, both of which are visible and reviewable.
Enforce guardrails in boring's host process. Rejected: boring isn't in the loop when the user pushes or runs a command inside the container. The enforcer has to live where the action happens.

Implementation Order (additions to ARD-0004's order)

Insert between ARD-0004's step #4 (cmd_open wiring) and step #5 (real Shopify theme dogfood):

4a. guardrails: schema parsing in lib/profile.sh — alongside mounts:, forward_ports:, theme:. Validation, overlay merge, normalized-JSON emit. Preset-derived defaults from theme: shopify (seeds forbid_branches: with the deploy-repo's protected refs).
4b. Compose generator emits guardrails artifacts into the container — .git/hooks/pre-push for every mounted repo, /usr/local/bin/ wrappers for forbid_commands:, ~/.claude/settings.json for allowed_claude_tools:. Generated at container-build time from the resolved profile.
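
For the wrapper half of step 4b, one detail a generated shim has to settle is how to find the real binary it shadows. The sketch below is one possible approach, not boring's implementation: the helper name find_real_binary and its two-argument interface are hypothetical. It walks PATH in order and skips the shim's own directory.

```shell
#!/bin/sh
# find_real_binary NAME SHIM_DIR (hypothetical helper)
# Prints the first executable named NAME on PATH that is not inside SHIM_DIR.
# Returns 1 when no such executable exists.
find_real_binary() {
  name=$1
  shim_dir=$2
  old_ifs=$IFS
  IFS=:                      # split PATH on colons for the loop below
  for dir in $PATH; do
    IFS=$old_ifs             # restore normal splitting inside the loop body
    [ -n "$dir" ] || dir=.   # an empty PATH entry means the current directory
    if [ "$dir" != "$shim_dir" ] && [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  IFS=$old_ifs
  return 1
}
```

A shim installed in /usr/local/bin could resolve its target with find_real_binary git /usr/local/bin, refuse when the invocation matches forbid_commands:, and otherwise exec the resolved path with the original arguments. Resolving at run time avoids hard-coding a path such as /usr/bin/git into the generated script, at the cost of a PATH walk on every call.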

ARD-0004's step #6 (egress enforcement mechanism) stays deferred to v1.x. The rest of ARD-0004's order is unchanged.