ARD-0005: Security model inversion — contain the non-engineer + AI from production systems
- Status: Accepted
- Date: 2026-05-23
- Deciders: Tom (Claude facilitating)
- Amends: ARD-0001 (security framing) and ARD-0004 (implementation order)
- Extended by: ARD-0009 (closes §3 codegen deferral), ARD-0011 (closes §4 egress deferral), ARD-0016 (extends containment past container boundary), ARD-0017 (agent workflow defaults), ARD-0018 (extension trust boundary), ARD-0019 (security model applies to second surface), ARD-0022 (silent execution as trust UX)
- Related: [[ard-0001-v1-architecture]], [[ard-0004-shopify-first-as-dogfood-path]]
Context
ARD-0001 framed boring's security model as AI containment to prevent exfiltration of prod-shape data: egress allowlists, ephemeral DB volumes derived from data_sensitivity, observation-derived allowlist learning, audit logs of sensitive restores. That framing made sense for the Django-style use case — real customer data restored locally via dbx, a powerful agent loose in the same container, network egress the obvious leak channel.
ARD-0004 picked Shopify-first for v1. That decision quietly inverted the threat model and we hadn't named it yet.
Talking through the actual v1 users — an internal team editing a production Shopify theme repo, external collaborators doing the same, the maintainer simulating both — surfaced what the failure mode really looks like:
- The repo being edited is Liquid templates, not prod data. There is nothing meaningful for an AI to exfiltrate.
- The repo does deploy to a live storefront. Two-repo deploy gates frequently exist in Shopify projects (a source repo paired with a deploy repo Shopify auto-commits into), and project-level rules in CLAUDE.local.md are dense with "NEVER push directly to the deploy repo," "NEVER push to the live-preview branch unless explicitly asked." Those rules exist because pushing the wrong branch ships to production.
- Those rules live in markdown that the AI reads and the human is told to read. Both are fallible. The non-engineer in particular has no priors for which git push is the dangerous one.
The v1 failure mode is therefore a non-engineer + AI accidentally damaging production systems, not an AI exfiltrating data. Both are real long-term threats — but for v1, the second one isn't load-bearing and the first one is the one that bites this week.
Naming this honestly is the point of the ARD. ARD-0001 wasn't wrong about the long-term security model; it was wrong about which threat v1 is actually defending against.
Decision
1. v1 commits to "contain the non-engineer + AI from prod," not "contain the AI from exfiltrating data"
Both threats are real and both will eventually be addressed. v1 explicitly picks the first as the load-bearing security story, because:
- The Shopify-first dogfood (ARD-0004) puts no sensitive data in the container in the first place.
- The deploy-gate failure mode is concrete, frequent, and currently mitigated only by markdown discipline.
- A boring that prevents an accidental push to a deploy-mirror branch is materially safer than the status quo. A boring that prevents Liquid template exfiltration is not.
2. Guardrails are a first-class profile schema field
New guardrails: block in .boring/profile.yaml, parsed by lib/profile.sh alongside mounts:, forward_ports:, and theme:, for example:
```yaml
guardrails:
  forbid_branches:
    - main
    - dev-preview
  forbid_commands:
    - "gh pr merge"
    - "shopify theme push --live"
    - "git push origin main"
  allowed_claude_tools:
    - read
    - edit
    - grep
    # bash present but wrapped (see below)
```
Field semantics:
- forbid_branches: branch names that the container's pre-push git hook refuses outright. Defaults are derived from the theme: preset (e.g. theme: shopify seeds main, common deploy-preview branches, and any branch that maps to a project's deploy-mirror repo).
- forbid_commands: CLI invocations the in-container shell refuses via a wrapper that shadows the real binary on PATH. Prefix-match against the argv string; refusal is loud and audit-logged.
- allowed_claude_tools: restricted set of MCP/builtin tools Claude can use. Written into the container's ~/.claude/settings.json at build time. Tools omitted from the list are unavailable; tools listed but wrapped (e.g., bash plus forbid_commands) get the wrapper's restrictions.
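The prefix-match rule for forbid_commands: can be sketched in a few lines of POSIX shell. This is a minimal illustration, not boring's actual wrapper: the function name is_forbidden is hypothetical, and requiring the match to end at an argument boundary is an assumption this ARD does not spell out.

```shell
#!/bin/sh
# Hypothetical helper (name and calling convention are assumptions).
# Succeeds (exit 0) when the invoked command line starts with one of the
# forbidden prefixes, fails (exit 1) otherwise.
# Usage: is_forbidden "<argv joined by spaces>" <prefix>...
is_forbidden() {
    invoked=$1
    shift
    for prefix in "$@"; do
        case "$invoked" in
            # Exact match, or the prefix followed by further arguments.
            # The trailing space keeps "git push origin main" from also
            # catching "git push origin maintenance" (assumed behavior).
            "$prefix" | "$prefix "*) return 0 ;;
        esac
    done
    return 1
}
```

With the sample profile above, is_forbidden "git push origin main" "gh pr merge" "git push origin main" succeeds (the command is blocked), while "git status" falls through. A prefix match alone does not catch a spelling such as git push origin HEAD:main; that case is left to forbid_branches: and the pre-push hook, which inspects the refs actually being pushed.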
3. Enforcement lives in the container, not in boring's host process
boring (the host CLI) never sits in the loop at push-time, command-time, or tool-call-time. That loop has to be inside the container, because that's where the agent and the non-engineer actually do work, and boring's host process isn't watching when they do.
What boring generates at container-build time, from the resolved guardrails: block:
- .git/hooks/pre-push: shell script that reads the resolved forbid_branches: list, inspects the refs being pushed, and exits non-zero with a clear message on match. Installed into every repo the container mounts. Honors core.hooksPath.
- /usr/local/bin/<cmd> wrappers: for each entry in forbid_commands:, a shim script earlier on PATH than the real binary. Shim parses argv, refuses on match, otherwise execs the real tool.
- ~/.claude/settings.json: allowed_claude_tools: translated into Claude's tool-allowlist config. Per-profile, container-local, regenerated on rebuild.
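The pre-push artifact could look roughly like the sketch below. It relies on git's documented pre-push contract: one line per ref on stdin, carrying the local ref, local object name, remote ref, and remote object name. The FORBID_BRANCHES variable and the check_refs function are hypothetical names; how boring actually bakes the resolved list into the hook, and how it handles core.hooksPath, is not shown here.

```shell
#!/bin/sh
# Sketch of a generated pre-push hook body.
# FORBID_BRANCHES is a hypothetical space-separated list that the
# generator would write in at container-build time.
FORBID_BRANCHES="${FORBID_BRANCHES:-main dev-preview}"

# Reads git's pre-push stdin format, one line per ref being pushed:
#   <local ref> <local sha> <remote ref> <remote sha>
# Returns 1 as soon as a remote ref targets a forbidden branch.
check_refs() {
    while read -r local_ref local_sha remote_ref remote_sha; do
        # Strip the refs/heads/ prefix; tags and other refs keep their
        # full name and therefore never equal a bare branch name.
        branch=${remote_ref#refs/heads/}
        for forbidden in $FORBID_BRANCHES; do
            if [ "$branch" = "$forbidden" ]; then
                echo "boring: push to '$branch' is blocked by guardrails.forbid_branches" >&2
                return 1
            fi
        done
    done
    return 0
}

# In an installed hook file the final line would be:
#   check_refs || exit 1
```

Because the check keys on the remote ref rather than the command line, it also covers pushes spelled as HEAD:main or pushes that delete a protected branch.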
Generated artifacts are owned by boring. Editing them by hand inside the container is supported but will not survive a rebuild — they regenerate from the profile, which is the source of truth.
4. Egress allowlist is repositioned, not eliminated
ARD-0001's egress allowlist (and the --learn-mode observation flow) remains the right answer for the "AI exfiltrates data" model. For v1's Shopify case it isn't load-bearing because the code being edited isn't sensitive. It moves from v1 ship-blocker to v1.x.
This is a reasoned skip, not an oversight. ARD-0001's egress section stays valid as written; the work is deferred, not rejected. A cross-link from ARD-0001's egress section back to this ARD belongs on the next edit of ARD-0001.
Closed by ARD-0011. v0.4 ships egress enforcement (iptables-in-container with NET_ADMIN capability) and --learn-mode together; the deferral above is lifted by ARD-0008's release plan. The cross-link this paragraph called for is now installed in ARD-0001's egress section.
5. Audience-specific credentials are a secret-URI concern, not a guardrails concern
Three audiences for the same Shopify theme profile:
| Audience | SHOPIFY_THEME_TOKEN resolution |
|---|---|
| Internal team | !secret op://<org-vault>/<project>/THEME_TOKEN |
| External collaborator | !secret op://<their-vault>/<project>/THEME_TOKEN (scoped, per-person) |
| Maintainer | No token. Host bind-mount of ~/.config/shopify per ARD-0004. |
This is handled entirely by the ARD-0002 secret-resolver and ARD-0004's mounts: field. guardrails: is repo-state — it means the same thing for every user of the profile. Per-audience differentiation belongs in secret URIs and overlays, not in guardrails. (See Alternatives.)
Consequences
Positive
- Honest about what v1 actually defends against. The v1 demo story matches v1 reality: "non-engineer and AI can iterate on the theme without accidentally shipping to prod."
- Guardrails are concrete and enforceable. Pre-push hooks and command wrappers are mechanical, in-container, regenerated from the profile. Not policy in docs.
- guardrails: is broadly reusable. Any profile (Django, Rails, internal tooling) gets the same field. Branch-gate and command-gate failure modes aren't Shopify-specific.
- The "ARD-0001 was wrong about v1" admission strengthens the ARD habit. Designs evolve; ARDs track the evolution rather than papering over it. This is exactly what the convention exists for.
Negative
- The egress allowlist — a real differentiator vs. "fancy devcontainer.json" — is deferred. v1 demos become even harder to distinguish from "a devcontainer with extra steps." The pitch narrows to "AI/non-engineer scoped access to existing repos," which is more honest but less impressive.
- More upfront schema and codegen. guardrails: adds a third generator output (hooks, wrappers, ~/.claude/settings.json) on top of docker-compose.yml and devcontainer.json.
Neutral
- ARD-0001's egress section stays valid. It's a v1.x feature now, not a v1 feature. Cross-linking it back to this ARD makes the deferral discoverable.
- data_sensitivity and ephemeral DB volumes stay designed-but-unimplemented for v1, same as in ARD-0004. They wake up when the Django case wakes up.
Alternatives Considered (rejected)
- Skip guardrails for v1; document branch rules in the profile's README. Rejected: docs rot, and accidental damage is the failure mode we're explicitly trying to prevent. Markdown is exactly what a project's CLAUDE.local.md already tries; adding more of the same isn't the fix.
- Implement egress + guardrails together for v1. Rejected: egress is a multi-week iptables/proxy prototype (ARD-0001's open item #3); guardrails are a one-day pre-push-hook + command-wrapper feature. Pay the cheap, urgent cost now; defer the expensive, less-urgent one.
- Per-user guardrails (audience 1 relaxed, audience 2 strict). Rejected: guardrails are repo-state: they live in the profile and mean the same thing for everyone using it. "No pushing to main" is a property of the repo, not of the human. Per-user behavior here is an anti-pattern; if a user needs to bypass, they fork the profile or use the user-local overlay, both of which are visible and reviewable.
- Enforce guardrails in boring's host process. Rejected: boring isn't in the loop when the user pushes or runs a command inside the container. The enforcer has to live where the action happens.
Implementation Order (additions to ARD-0004's order)
Insert between ARD-0004's step #4 (cmd_open wiring) and step #5 (real Shopify theme dogfood):
- 4a. guardrails: schema parsing in lib/profile.sh, alongside mounts:, forward_ports:, theme:. Validation, overlay merge, normalized-JSON emit. Preset-derived defaults from theme: shopify (seeds forbid_branches: with the deploy-repo's protected refs).
- 4b. Compose generator emits guardrails artifacts into the container: .git/hooks/pre-push for every mounted repo, /usr/local/bin/ wrappers for forbid_commands:, ~/.claude/settings.json for allowed_claude_tools:. Generated at container-build time from the resolved profile.
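Step 4b's wrapper codegen might be sketched as a small generator function. The name emit_shim, its argument order, and the exit status 126 are all assumptions made for illustration; audit logging and escaping of unusual characters in patterns (quotes, dollar signs) are left out.

```shell
#!/bin/sh
# Hypothetical generator: writes a shim for one command into a directory
# that is expected to sit earlier on PATH than the real binary.
# Usage: emit_shim <cmd> <real-binary-path> <out-dir> <forbidden-prefix>...
emit_shim() {
    cmd=$1 real=$2 out_dir=$3
    shift 3
    shim="$out_dir/$cmd"
    {
        echo '#!/bin/sh'
        echo '# Generated from guardrails.forbid_commands; regenerated on rebuild.'
        # The shim rebuilds the command line as "<cmd> <args...>".
        echo "invoked=\"$cmd \$*\""
        echo 'case "$invoked" in'
        for prefix in "$@"; do
            # One case arm per forbidden prefix: exact match, or the
            # prefix followed by more arguments.
            printf '    "%s" | "%s "*) echo "boring: blocked: $invoked" >&2; exit 126 ;;\n' "$prefix" "$prefix"
        done
        echo 'esac'
        # No arm matched: hand over to the real tool with the original argv.
        echo "exec \"$real\" \"\$@\""
    } > "$shim"
    chmod +x "$shim"
}
```

For example, emit_shim git /usr/bin/git /usr/local/bin "git push origin main" would write a git shim that refuses that one invocation and passes everything else through to /usr/bin/git. Generating one static case statement per command keeps the shim free of any runtime dependency on the profile file.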
ARD-0004's step #6 (egress enforcement mechanism) stays deferred to v1.x. The rest of ARD-0004's order is unchanged.