ARD-0023: A tasks: primitive for long-running processes inside the dev container
- Status: Proposed
- Date: 2026-05-24
- Deciders: Tom (Claude facilitating)
- Extends: ARD-0007. Adds a fifth profile primitive (tasks:) alongside services:, volumes:, setup:, and restore:; the setup: semantics are unchanged
- Related: [[ard-0001-v1-architecture]], [[ard-0006-profile-is-the-trust-anchor]], [[ard-0009-guardrails-codegen-architecture]], [[ard-0013-headless-boring-run]], [[ard-0017-agent-workflow-rules-derived-from-guardrails]]
Context
boring open today brings a sandbox to ready: dev container built, sidecars healthy, secrets resolved, setup: chain run, audit collector live. It does not start the application. For a contributor running boring open against (say) an Immich clone, the next thing they have to do is:
docker exec -it -u dev immich-example-dev-1 bash
# inside, in two separate panes:
pnpm --filter immich start:dev # API on :2283
pnpm --filter immich-web dev # web on :3000
This is the whole point of opening the repo, and boring leaves it as homework. Three forces collide:
- setup: is a one-shot postCreateCommand by design. Each entry runs sequentially, must return zero, and the chain ends with the setup-complete marker file. Putting pnpm --filter immich start:dev in setup: would block forever, the marker would never fire, _cmd_open_verify_setup would time out, and boring open would report failure even though the app is happily running. Backgrounding (&) makes it worse: nothing reaps the child, nothing surfaces its logs, and set -e doesn't catch its eventual crash.
- The dev container's main command is sleep infinity. That's deliberate: boring builds a sandbox, not a packaged service. The container exists for someone (a human or an agent) to attach to. Replacing sleep infinity with a process supervisor changes the abstraction.
- Editor-coupled solutions don't generalize. Immich's own .devcontainer/devcontainer.json solves this with VSCode tasks that have runOptions.runOn: folderOpen. That works if and only if the developer opens the folder in VSCode. Anyone using boring from the CLI, from a JetBrains IDE, with a remote-attach Vim, or under boring run (ARD-0013) gets nothing.
The "contributor sandbox" use case — which the v1.0 docs lean into via the Immich example and the broader code-as-thinking-medium framing from ARD-0008 — needs a primitive for "after setup, launch these processes and keep them running until boring tears down."
Decision
1. Add a tasks: block to the profile schema
A new top-level array, each entry a long-running process to launch in the dev container after setup: succeeds:
tasks:
  - name: api
    run: pnpm --filter immich start:dev
  - name: web
    run: pnpm --filter immich-web dev
    depends_on: [api]
Schema rules:
- name: required, kebab-case, unique within the profile, used as the tmux window name and the on-disk log file basename.
- run: required, a single shell command string. Runs as the dev user from /workspace with the same env the setup: chain sees. cd semantics match those of setup: (each task is its own subshell).
- depends_on: optional, an array of other task names. boring launches tasks in topological order; cycles are a hard schema error. No condition modes in v1 (no service_healthy equivalent); see Alternatives. A task that needs another task to be "ready" is responsible for retrying its own connections, the same way it would on a developer's laptop.
- Tasks are launched after setup: completes and after restore: runs (ARD-0012). The same lifecycle stage as the setup-complete marker, but downstream of it.
- A profile with no tasks: block behaves exactly as today: no behavioral change for the four existing examples.
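The schema rules above can be sketched as a small validator. This is an illustration only, not the actual lib/profile.sh code: the helper names is_kebab_case and task_launch_order are hypothetical, and the input format (one task per line as "<name> [dep ...]", already parsed out of the profile) is an assumption. It leans on tsort from coreutils for both the topological order and cycle detection, and needs bash 4 or newer for associative arrays.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers, not the real lib/profile.sh.
# Requires bash 4+ (associative arrays) and tsort from coreutils.

# True when the name is kebab-case: lowercase alphanumerics joined by single hyphens.
is_kebab_case() {
  [[ "$1" =~ ^[a-z0-9]+(-[a-z0-9]+)*$ ]]
}

# Reads one task per line on stdin as "<name> [dep ...]" (assumed to be
# pre-parsed from the profile) and prints the names in launch order.
# Returns non-zero on an invalid or duplicate name, an unknown or
# self-referencing dependency, or a dependency cycle.
task_launch_order() {
  local -A known=()
  local -a lines=() fields=()
  local line name dep pairs="" order

  # Pass 1: validate names and record which tasks exist.
  while IFS= read -r line; do
    [[ -z "$line" ]] && continue
    name=${line%% *}
    is_kebab_case "$name" || { echo "invalid task name: $name" >&2; return 1; }
    [[ -n "${known[$name]:-}" ]] && { echo "duplicate task name: $name" >&2; return 1; }
    known[$name]=1
    lines+=("$line")
  done

  # Pass 2: build "before after" pairs for tsort.
  for line in "${lines[@]}"; do
    read -r -a fields <<<"$line"
    name=${fields[0]}
    pairs+="$name $name"$'\n'   # a self-pair registers the node without ordering it
    for dep in "${fields[@]:1}"; do
      [[ -n "${known[$dep]:-}" ]] || { echo "unknown dependency: $dep" >&2; return 1; }
      [[ "$dep" == "$name" ]] && { echo "task depends on itself: $name" >&2; return 1; }
      pairs+="$dep $name"$'\n'  # dep must start before name
    done
  done

  # tsort exits non-zero when the input contains a loop.
  order=$(printf '%s' "$pairs" | tsort 2>/dev/null) \
    || { echo "dependency cycle in tasks" >&2; return 1; }
  [[ -z "$order" ]] || printf '%s\n' "$order"
}

# Usage: the two-task example from this section (web depends on api).
printf 'web api\napi\n' | task_launch_order
```

For the example profile this prints api and then web, which is the only order that satisfies depends_on: [api]. Delegating to tsort keeps the sketch short; a real implementation might prefer its own traversal so it can name the tasks involved in a cycle.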
2. Supervision via tmux inside the container
boring launches each task in a named window of a single tmux session (session: boring-tasks). tmux must be present in every preset's base image; apt-get install -y tmux is a one-line addition to each affected Dockerfile.
The supervisor shape:
- Per task: a tmux window named after task.name, running bash -lc "exec <run>" so the task replaces the login shell as the window's process and signals propagate cleanly. stdout/stderr go to tmux scrollback and are tee'd to /var/log/boring/tasks/<name>.log for boring logs <name> retrieval (see §3).
- Crash policy in v1: no auto-restart. If a task exits (cleanly or not), the tmux window stays open showing the exit code; the user re-runs the command manually, or fixes the bug and re-attaches. Auto-restart is intentionally deferred; see Consequences and Alternatives.
- Teardown: on boring close (a new command; see §4) or SIGINT to the boring open foreground, the audit-collector trap also sends tmux kill-session -t boring-tasks, which SIGHUPs every window. Sidecars come down via the existing docker compose down.
Why tmux specifically: it's ubiquitous, has minimal dependencies, gives the user a real terminal multiplexer to inspect/restart tasks by hand, and matches how developers already drive Procfile-style stacks (overmind, for one, is a tmux wrapper under the hood). Choosing tmux directly skips the supervisor-of-a-supervisor layer.
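As a rough illustration of the supervisor shape in this section, the sketch below generates the tmux commands a launch script might contain. Everything named here is an assumption rather than boring's actual codegen: emit_launch_script is a hypothetical helper, the input format (one task per line, already in launch order, as "<name><TAB><run>") is invented for the example, tmux pipe-pane is one possible way to get the tee'd log file, and remain-on-exit is one possible way to keep a dead window visible.

```shell
#!/usr/bin/env bash
# Sketch only: emit_launch_script is a hypothetical helper, not boring's real codegen.
# Input on stdin: one task per line, already in launch order, as "<name><TAB><run>".
# Arguments (both optional): tmux session name, log directory.

emit_launch_script() {
  local session=${1:-boring-tasks} log_dir=${2:-/var/log/boring/tasks}
  local name run first=1

  printf '#!/usr/bin/env bash\nset -eu\n'
  printf 'mkdir -p %q\n' "$log_dir"

  while IFS=$'\t' read -r name run; do
    [[ -z "$name" ]] && continue
    if (( first )); then
      # The first task creates the detached session; its window takes the task name.
      printf 'tmux new-session -d -s %q -n %q -c /workspace bash -lc %q\n' \
        "$session" "$name" "exec $run"
      # Keep exited windows around so the exit status stays visible (no auto-restart).
      printf 'tmux set-window-option -g remain-on-exit on\n'
      first=0
    else
      # Later tasks become additional windows in the same session.
      printf 'tmux new-window -d -t %q -n %q -c /workspace bash -lc %q\n' \
        "$session:" "$name" "exec $run"
    fi
    # Copy the window's output to the per-task log file.
    printf 'tmux pipe-pane -o -t %q %q\n' \
      "$session:$name" "cat >> $log_dir/$name.log"
  done
}

# Usage: print the script for the two example tasks (nothing is executed).
printf 'api\tpnpm --filter immich start:dev\nweb\tpnpm --filter immich-web dev\n' \
  | emit_launch_script
```

The sketch only prints commands, so it can be inspected without tmux installed. Note that pipe-pane attaches after the window starts, so the first few lines of output may not reach the log file; a real implementation would need to decide whether that matters.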
3. CLI surface: boring open --tasks/--no-tasks, boring attach, boring logs
Three CLI changes, each small:
- boring open gains a --no-tasks flag that runs setup but skips task launch. Default behavior with a profile that declares tasks: is to launch them. A profile with no tasks: block sees no change either way.
- boring attach (new command): execs tmux attach -t boring-tasks inside the dev container. This is the primary affordance: after boring open prints "Ready," the user runs boring attach in another terminal and lands in the task session.
- boring logs <name> (new command): tails /var/log/boring/tasks/<name>.log from the host without entering the container. Useful for non-interactive consumers, such as scripts or agents reading their own task output.
boring run (ARD-0013) explicitly does not launch tasks: — it's a fresh-container one-shot Claude invocation and has no use for long-running side processes. The tasks: block is silently ignored under boring run.
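The launch rules in this section reduce to a small predicate. The sketch below is illustrative only; should_launch_tasks and its argument order are hypothetical and not part of boring's dispatcher.

```shell
#!/usr/bin/env bash
# Sketch only: should_launch_tasks is a hypothetical helper.
# Usage: should_launch_tasks <subcommand> <task-count> [flags...]
# Returns 0 when tasks should be launched, 1 otherwise.

should_launch_tasks() {
  local subcommand=$1 task_count=$2 flag
  shift 2

  # Only boring open launches tasks; boring run ignores the tasks: block.
  [[ "$subcommand" == "open" ]] || return 1

  # A profile without tasks: behaves exactly as before.
  (( task_count > 0 )) || return 1

  # --no-tasks runs setup but skips task launch.
  for flag in "$@"; do
    [[ "$flag" == "--no-tasks" ]] && return 1
  done
  return 0
}

# Usage: a profile with two tasks, opened with default flags.
if should_launch_tasks open 2; then echo "launch"; else echo "skip"; fi
```

Keeping the decision in one place makes the "silently ignored under boring run" behavior easy to test and to document.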
4. boring close as the explicit teardown verb
Today, tearing down a boring open sandbox means Ctrl-C in the boring open foreground (which the audit-collector trap catches). That's fine for the no-tasks case but becomes confusing when there's a tmux session to clean up: the user might attach in terminal B, detach (not exit), and then Ctrl-C in terminal A — they'd expect their tmux session to die, and it does, but the affordance is muddled.
Add boring close [path]: a new command that finds the running compose project for the profile, sends tmux kill-session -t boring-tasks to the dev container, runs docker compose down, stops the audit collector, and exits. Ctrl-C in boring open's foreground does the same thing (existing behavior preserved); boring close lets the user tear down from a different shell without finding the original boring open process.
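The teardown sequence described above might look roughly like the following. This is a sketch under stated assumptions: boring_close_sketch and run are hypothetical helpers, the compose project name and dev container name are passed in rather than discovered, and the audit-collector step is left as a comment because this ARD does not specify how the collector is stopped.

```shell
#!/usr/bin/env bash
# Sketch only: boring_close_sketch and both names passed to it are hypothetical.

# Run a command, or just print it when DRY_RUN is set.
run() {
  if [[ -n "${DRY_RUN:-}" ]]; then printf '%s\n' "$*"; else "$@"; fi
}

boring_close_sketch() {
  local project=$1 dev_container=$2

  # 1. Kill the task session. A profile without tasks: has no session,
  #    so a failure here is not treated as an error.
  run docker exec -u dev "$dev_container" tmux kill-session -t boring-tasks || true

  # 2. Bring the dev container and sidecars down.
  run docker compose -p "$project" down

  # 3. Stopping the audit collector is omitted: the ARD does not say how it is signalled.
}

# Usage (dry run): print the commands instead of executing them.
DRY_RUN=1 boring_close_sketch immich-example immich-example-dev-1
```

Killing the tmux session before docker compose down gives each task a SIGHUP and a chance to exit on its own terms before the container is stopped.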
5. The agent-workflow snippet (ARD-0017) gains a "tasks are running in tmux" line
ARD-0017's per-profile snippet is regenerated whenever the profile changes. When the profile declares tasks:, append a one-line note to the generated /usr/local/boring/agent/workflow.md per-profile section:
Long-running processes for this profile are running in a tmux session named boring-tasks. List them with tmux list-windows -t boring-tasks; tail logs with boring logs <name> from the host. Do not kill the session; the user owns lifecycle.
This keeps in-container agents from being confused when they see pnpm processes they didn't start, and gives them the right vocabulary for reading task output.
Consequences
Positive
- Closes the "boring open and the app isn't running" gap. The Immich example, and every future example for a non-trivial codebase, becomes one-command-and-attach instead of one-command-plus-three-manual-steps. The code-as-thinking-medium story (ARD-0008) gets meaningfully stronger when the thinking medium boots up running.
- Decouples from the editor. Same workflow regardless of VSCode / JetBrains / Vim / pure CLI / agent. Immich's existing .devcontainer/devcontainer.json solution only works in VSCode; ours works everywhere.
- Schema addition is small and additive. No existing primitive changes semantics, no existing profile breaks. The four current examples need no edits.
- tmux is the right amount of abstraction. The user can tmux attach, fix a broken pane by hand, scroll back through logs, restart a process, all using muscle memory they already have. We don't reinvent a supervisor.
- Logs persist on the host. /var/log/boring/tasks/<name>.log (tee'd from each window) survives container restarts and is greppable without docker exec. boring logs <name> is a friendly wrapper.
Negative
- Adds a primitive to a deliberately-lean schema. Per ARD-0001, the profile should be small enough to fit on one screen. tasks: is the fifth top-level array; we're approaching the point where the schema starts feeling busy.
- Couples boring to tmux. Anyone debugging task lifecycle ends up learning tmux semantics (window vs. pane, attach vs. detach, kill-window vs. kill-session). That's a small ramp but it's non-zero; users who already hate tmux will hate this.
- "No auto-restart in v1" is going to bite someone. A pnpm dev server that crashes during a hot-reload sometimes wants to be restarted. We're deferring this on purpose (see Alternatives §3) but the first issue filed against tasks: will be "my task crashed and didn't come back."
- Teardown has a new verb (boring close) that didn't exist before. Two ways to tear down (Ctrl-C in foreground, boring close from elsewhere) is mild API surface bloat. The alternative (keep Ctrl-C as the only way) pins users to the original boring open terminal.
- Headless flows (boring run, CI) silently skip tasks:. Documented, but a footgun for someone who expects their headless test run to have a dev API server available. We'll need an example in the boring run docs.
Neutral
- tasks: overlaps conceptually with services: but they're operationally distinct. services: is for compose-managed sidecars (Postgres, Redis, ML); tasks: is for processes inside the dev container that share its filesystem and have direct access to the bind-mounted source tree. Trying to unify them would require either putting application code into a separate compose service (heavy, breaks bind-mount editor flow) or letting compose services share the dev container's network/PID namespace (a network_mode: service:dev hack that breaks the trust model). They stay separate.
- The five-primitive schema (services:, volumes:, setup:, restore:, tasks:) maps to a natural mental model: infrastructure I depend on, data I keep, one-shots to prepare the sandbox, prod-shape data to load, long-running things to start. Each primitive has one clear answer to "should this go here?"
Alternatives Considered
1. Lean on a Procfile + overmind/foreman inside the container
Add tasks: as name: cmd pairs that get emitted to a generated Procfile, then run overmind start (or foreman start) inside the container. Two real advantages: a richer ecosystem (auto-restart, structured logging, overmind connect <name> for attaching to one process), and tasks running under overmind look identical to the developer's local laptop setup if they already use it.
Rejected because:
- Adds a runtime dependency (overmind is Go-binary, foreman is a Ruby gem, honcho is Python). Each preset's Dockerfile grows.
- The supervisor's behavior becomes part of boring's contract — we'd inherit overmind's quirks (its tmux-window-naming choices, its env-handling, its log-tee format) without being able to fix them.
- "tmux directly" is what overmind is under the hood (it drives tmux itself). We can skip the wrapper.
Reconsider in a future ARD if tasks: grows enough surface (per-task healthchecks, structured restart policies, per-task env overrides) that owning the supervisor logic becomes unattractive.
2. Per-task compose services with network_mode: service:dev
Model each task as its own docker-compose service that shares the dev container's network/PID namespace, so localhost:2283 from the dev container reaches the API task. Compose already handles supervision (restart:), logging (docker compose logs), and lifecycle.
Rejected because:
- The bind-mount story breaks. Tasks need read/write access to /workspace, which means every task service has to redeclare the same volumes block as the dev container. Drift waiting to happen.
- The trust model gets harder. Each task is its own container with its own permissions; the simple "the dev container runs as UID 1000 with these caps" story splinters.
- Compose services that share namespaces are awkward (the restart: policy doesn't combine well with network_mode: service:dev; docker compose down ordering gets fragile).
- More fundamentally: tasks aren't services. They're processes that share the workspace. Modeling them as services conflates two concerns the existing schema already separates.
3. Add auto-restart in v1
A restart: field per task (never, on-failure, always) with sensible defaults.
Deferred (not rejected) because:
- v1 wants to ship and the manual-restart story (tmux attach, re-run the command in the dead pane) is acceptable.
- Restart policies invite knobs (max_retries, backoff, restart_delay) that turn tasks: into a mini systemd. Better to ship the minimum surface, see how it's used, and add the right knobs in v1.x based on real complaints than guess them upfront.
Revisit in a follow-up ARD once tasks: has been in real-world use for a release cycle.
4. Punt entirely: document the manual workflow in each example README
Status quo. The README tells users "after boring open, run pnpm --filter immich start:dev and pnpm --filter immich-web dev." Cost: zero engineering. Value: zero, because that's what we have today and the question that prompted this ARD is "why doesn't boring do that for me."
Rejected because: the gap is real and the framing in ARD-0008 (boring as thinking-medium) becomes hollow when the sandbox boots ready but the app doesn't.
Implementation Order
This ARD ships as a single coherent slice; pieces are not independently useful.
- Schema: add tasks: validation in lib/profile.sh (cycle detection in depends_on, name uniqueness, run: non-empty). Tests in tests/ covering both happy path and each rejection class.
- Preset Dockerfiles: apt-get install -y tmux added to the affected Dockerfiles (shopify, django-node, node-postgres, node, python). One layer change per preset, no version-pin contention.
- Codegen: emit /etc/boring/tasks/launch.sh from lib/compose.sh (or new lib/tasks.sh): a script that opens the tmux session and creates one window per task in topological order. Bind-mounted RO into the container alongside boring-runtime/.
- cmd_open integration: after _cmd_open_verify_setup confirms the marker, exec bash /etc/boring/tasks/launch.sh in the dev container if tasks: is non-empty. Append tmux kill-session to the audit-collector trap.
- CLI: boring attach and boring logs <name> as new subcommands in the top-level dispatcher. boring close as the third new verb (and the alternative to Ctrl-C).
- ARD-0017 integration: extend the per-profile snippet codegen to append the "tasks are in tmux" hint when tasks: is non-empty.
- Examples: update examples/immich/.boring/profile.yaml to declare tasks: for the API and web servers; update its README to point at boring attach instead of the manual pnpm --filter lines. The other three examples (minimal, django-postgres, node-with-redis) gain no tasks: block; they're intentionally smaller demos.
- boring doctor: add a tmux-present check for any profile declaring tasks:.
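The boring doctor item could be as small as the check below. It is a sketch, not the real doctor code: check_tmux_for_tasks is a hypothetical helper, and the optional "runner" prefix (for instance docker exec plus a container name) is an assumption about how doctor would reach into the dev container.

```shell
#!/usr/bin/env bash
# Sketch only: check_tmux_for_tasks is a hypothetical helper.
# Usage: check_tmux_for_tasks <task-count> [runner...]
# The runner, if given, is a command prefix that executes the probe
# somewhere else, e.g.: docker exec <dev-container>

check_tmux_for_tasks() {
  local task_count=$1
  shift

  # Profiles without tasks: do not need tmux, so there is nothing to check.
  (( task_count > 0 )) || return 0

  if "$@" sh -c 'command -v tmux >/dev/null 2>&1'; then
    echo "ok: tmux present"
  else
    echo "fail: profile declares tasks: but tmux is missing" >&2
    return 1
  fi
}

# Usage: a profile with no tasks skips the check entirely.
check_tmux_for_tasks 0 && echo "no tasks declared, check skipped"
```

Passing the runner as a prefix keeps the probe itself identical whether it runs on the host or inside the container.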
Target release: v0.7 (between ARD-0017's v0.6 codegen slice and the v1.0 cut).