ARD-0023: A `tasks:` primitive for long-running processes inside the dev container

Status: Proposed
Date: 2026-05-24
Deciders: Tom (Claude facilitating)
Extends: ARD-0007 — adds a fifth profile primitive (tasks:) alongside services:, volumes:, setup:, and restore:; the setup: semantics are unchanged
Related: [[ard-0001-v1-architecture]], [[ard-0006-profile-is-the-trust-anchor]], [[ard-0009-guardrails-codegen-architecture]], [[ard-0013-headless-boring-run]], [[ard-0017-agent-workflow-rules-derived-from-guardrails]]

Context

Today, boring open brings a sandbox to ready: dev container built, sidecars healthy, secrets resolved, setup: chain run, audit collector live. It does not start the application. A contributor running boring open against (say) an Immich clone still has to do this next:

docker exec -it -u dev immich-example-dev-1 bash
# inside, in two separate panes:
pnpm --filter immich start:dev      # API on :2283
pnpm --filter immich-web dev        # web on :3000

This is the whole point of opening the repo, and boring leaves it as homework. Three forces collide:

setup: is a one-shot postCreateCommand by design. Each entry runs sequentially, must return zero, and the chain ends with the setup-complete marker file. Putting pnpm --filter immich start:dev in setup: would block forever, the marker would never fire, _cmd_open_verify_setup would time out, and boring open would report failure even though the app is happily running. Backgrounding (&) makes it worse: nothing reaps the child, nothing surfaces its logs, and set -e doesn't catch its eventual crash.
The dev container's main command is sleep infinity. That's deliberate — boring builds a sandbox, not a packaged service. The container exists for someone (a human or an agent) to attach to. Replacing sleep infinity with a process supervisor changes the abstraction.
Editor-coupled solutions don't generalize. Immich's own .devcontainer/devcontainer.json solves this with VSCode tasks that have runOptions.runOn: folderOpen. That works if and only if the developer opens the folder in VSCode. Anyone using boring from the CLI, from a JetBrains IDE, with a remote-attach Vim, or under boring run (ARD-0013) gets nothing.
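The backgrounding failure mode from the first point can be reproduced in isolation. The sketch below is a stand-in, not boring's actual setup runner: run_setup_chain is a hypothetical name, and a subshell that exits with status 7 plays the part of the crashing dev server.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a setup: chain that backgrounds a long-running process.
run_setup_chain() {
  bash -c '
    set -e
    (sleep 0.2; exit 7) &     # "dev server" that crashes shortly after launch
    echo "setup-complete"     # the marker step still runs
  '
}

out="$(run_setup_chain)"
status=$?
echo "marker: $out, chain exit status: $status"
```

The chain reports status 0 because set -e only looks at foreground commands; nothing ever collects the background child's exit code.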

The "contributor sandbox" use case — which the v1.0 docs lean into via the Immich example and the broader code-as-thinking-medium framing from ARD-0008 — needs a primitive for "after setup, launch these processes and keep them running until boring tears down."

Decision

1. Add a `tasks:` block to the profile schema

A new top-level array, each entry a long-running process to launch in the dev container after setup: succeeds:

tasks:
  - name: api
    run: pnpm --filter immich start:dev
  - name: web
    run: pnpm --filter immich-web dev
    depends_on: [api]

Schema rules:

name: required, kebab-case, unique within the profile, used as the tmux window name and the on-disk log file basename.
run: required, a single shell command string. Runs as the dev user from /workspace with the same env the setup: chain sees. cd semantics match setup: — each task is its own subshell.
depends_on: optional, an array of other task names. boring launches tasks in topological order; cycles are a hard schema error. No condition modes in v1 (no service_healthy equivalent) — see Alternatives. A task that needs another task to be "ready" is responsible for retrying its own connections, the same way it would on a developer's laptop.
Tasks are launched after setup: completes and after restore: runs (ARD-0012). They belong to the same lifecycle stage as the setup-complete marker but run downstream of it.
A profile with no tasks: block behaves exactly as today — no behavioral change for the four existing examples.
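The name and depends_on rules above can be sketched with the standard tsort utility. This is a minimal illustration, not boring's lib/profile.sh: the function names and the "name [dep ...]" line format are hypothetical, and the uniqueness and unknown-dependency checks are left out for brevity.

```shell
#!/usr/bin/env bash
# Sketch only: these function names and the "name [dep ...]" line format are
# hypothetical, not boring's real lib/profile.sh interface.

# Succeeds when $1 is kebab-case: lowercase alphanumerics joined by single hyphens.
is_kebab_case() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]+(-[a-z0-9]+)*$'
}

# Turns "name [dep ...]" lines on stdin into tsort pairs ("before after").
task_pairs() {
  while read -r name deps; do
    [ -n "$name" ] || continue
    is_kebab_case "$name" || { echo "invalid task name: $name" >&2; return 1; }
    echo "$name $name"              # identical pair: registers the task, no ordering
    for dep in $deps; do
      echo "$dep $name"             # dep must launch before name
    done
  done
}

# Prints one valid launch order, one task per line; fails on a bad name or a cycle.
task_launch_order() {
  pairs="$(task_pairs)" || return 1
  # GNU tsort exits non-zero when the input contains a loop.
  order="$(printf '%s\n' "$pairs" | tsort 2>/dev/null)" ||
    { echo "depends_on contains a cycle" >&2; return 1; }
  printf '%s\n' "$order"
}

# The example profile above: web depends on api, api depends on nothing.
printf 'web api\napi\n' | task_launch_order
```

For the two-task example this prints api, then web. Relying on tsort keeps the cycle check to a single exit-status test, at the cost of assuming GNU-style behavior on loops.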

2. Supervision via tmux inside the container

boring launches each task in a named window of a single tmux session (session: boring-tasks). tmux has to be present in every preset's base image; apt-get install -y tmux is a one-line addition to each affected preset Dockerfile.

The supervisor shape:

Per task: a tmux window named after task.name, running bash -lc "exec <run>" so the task replaces the wrapper shell, becomes the window's own process, and receives signals directly. stdout/stderr go to tmux scrollback and are tee'd to /var/log/boring/tasks/<name>.log for retrieval with boring logs <name> (see §3).
Crash policy in v1: no auto-restart. If a task exits (cleanly or not), the tmux window stays open showing the exit code; the user re-runs the command manually or fixes the bug and re-attaches. Auto-restart is intentionally deferred — see Consequences and Alternatives.
Teardown: on boring close (a new command — see §4) or SIGINT to the boring open foreground, the audit-collector trap also sends tmux kill-session -t boring-tasks, which SIGHUPs every window. Sidecars come down via the existing docker compose down.
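The supervisor shape above could be generated roughly as follows. This is a sketch under stated assumptions: emit_launch_script and its "name<TAB>command" input are hypothetical, names are assumed to be already validated and in launch order, and tmux pipe-pane is only one possible way to get the log tee.

```shell
#!/usr/bin/env bash
# Sketch only: emit_launch_script and its "name<TAB>command" input are hypothetical.
# Assumes names are already validated and sorted into launch order.

SESSION=boring-tasks
LOG_DIR=/var/log/boring/tasks
TAB="$(printf '\t')"

# Wraps $1 in single quotes, escaping embedded single quotes for the shell.
sq() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Reads "name<TAB>command" lines on stdin and prints a launch script on stdout.
emit_launch_script() {
  first=1
  printf '%s\n' '#!/usr/bin/env bash' 'set -euo pipefail' "mkdir -p $LOG_DIR"
  while IFS="$TAB" read -r name run; do
    cmd="bash -lc $(sq "exec $run")"
    if [ "$first" -eq 1 ]; then
      # First task creates the detached session; later tasks add windows to it.
      printf '%s\n' "tmux new-session -d -s $SESSION -n $name $cmd"
      # Keep exited windows around so the exit status stays visible.
      printf '%s\n' "tmux set-window-option -g remain-on-exit on"
      first=0
    else
      printf '%s\n' "tmux new-window -d -t $SESSION: -n $name $cmd"
    fi
    # Mirror the pane's output into the per-task log file.
    printf '%s\n' "tmux pipe-pane -o -t $SESSION:$name 'cat >> $LOG_DIR/$name.log'"
  done
}

printf 'api\tpnpm --filter immich start:dev\nweb\tpnpm --filter immich-web dev\n' |
  emit_launch_script
```

One trade-off of this shape: pipe-pane attaches just after the window starts, so the very first lines of output may reach scrollback but not the log. A real implementation might redirect inside the wrapped command instead.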

Why tmux specifically: it's ubiquitous, has a small dependency footprint, gives the user a real terminal multiplexer to inspect/restart tasks by hand, and matches how developers already drive Procfile-style stacks (overmind, for one, is a tmux wrapper under the hood). Choosing tmux directly skips the supervisor-of-a-supervisor layer.

3. CLI surface: `boring open --tasks/--no-tasks`, `boring attach`, `boring logs`

Three CLI changes, each small:

boring open gains a --no-tasks flag that runs setup but skips task launch. Default behavior with a profile that declares tasks: is to launch them. A profile with no tasks: block sees no change either way.
boring attach (new command): execs tmux attach -t boring-tasks inside the dev container. This is the primary affordance — after boring open prints "Ready," the user runs boring attach in another terminal and lands in the task session.
boring logs <name> (new command): tails /var/log/boring/tasks/<name>.log from the host without entering the container. Useful for non-interactive contexts (CI, boring run, agents reading their own task output).

boring run (ARD-0013) explicitly does not launch tasks: — it's a fresh-container one-shot Claude invocation and has no use for long-running side processes. The tasks: block is silently ignored under boring run.
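A minimal sketch of how the two new subcommands might resolve to container commands. The function names are hypothetical, the container name is taken from the Immich example in Context, and reading the log through docker exec is an assumption; with a bind-mounted log directory, boring logs could tail a host path instead.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical. DRY_RUN=1 prints commands
# instead of running them, so the shape can be inspected without Docker.

run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Maps a task name to its log file; rejects anything that is not a plain
# lowercase/digit/hyphen token so a name can never escape the log directory.
task_log_path() {
  case "$1" in
    ''|*[!a-z0-9-]*) echo "invalid task name: $1" >&2; return 1 ;;
  esac
  printf '/var/log/boring/tasks/%s.log\n' "$1"
}

# $1 = dev container name, e.g. immich-example-dev-1
boring_attach() {
  run docker exec -it -u dev "$1" tmux attach -t boring-tasks
}

# $1 = dev container name, $2 = task name
boring_logs() {
  path="$(task_log_path "$2")" || return 1
  # Assumption: the log is read through docker exec rather than a host path.
  run docker exec "$1" tail -n 50 -f "$path"
}

DRY_RUN=1
boring_logs immich-example-dev-1 api
```

Validating the name before building the path matters because name doubles as the log file basename; a value such as ../etc/passwd is rejected rather than interpolated.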

4. `boring close` as the explicit teardown verb

Today, tearing down a boring open sandbox means Ctrl-C in the boring open foreground (which the audit-collector trap catches). That's fine for the no-tasks case but becomes confusing when there's a tmux session to clean up: the user might attach in terminal B, detach (not exit), and then Ctrl-C in terminal A — they'd expect their tmux session to die, and it does, but the affordance is muddled.

Add boring close [path]: a new command that finds the running compose project for the profile, sends tmux kill-session -t boring-tasks to the dev container, runs docker compose down, stops the audit collector, and exits. Ctrl-C in boring open's foreground does the same thing (existing behavior preserved); boring close lets the user tear down from a different shell without finding the original boring open process.
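The teardown order described above, sketched as a dry-runnable function. boring_close's signature and stop_audit_collector are hypothetical, and the compose project name immich-example is inferred from the container name in the Context example rather than stated anywhere in this ARD.

```shell
#!/usr/bin/env bash
# Sketch only: boring_close's signature and stop_audit_collector are hypothetical.
# DRY_RUN=1 prints each step instead of executing it.

run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Placeholder: how boring stops its audit collector is not specified here.
stop_audit_collector() {
  printf '%s\n' "stop-audit-collector (placeholder)"
}

# $1 = compose project name, $2 = dev container name
boring_close() {
  # The session may already be gone (no tasks, or killed by hand), so a failure
  # here must not block the rest of the teardown.
  run docker exec "$2" tmux kill-session -t boring-tasks || true
  run docker compose -p "$1" down
  stop_audit_collector
}

DRY_RUN=1
boring_close immich-example immich-example-dev-1
```

Killing the tmux session before docker compose down gives tasks a SIGHUP and a chance to exit on their own before the container is stopped.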

5. The agent-workflow snippet (ARD-0017) gains a "tasks are running in tmux" line

ARD-0017's per-profile snippet is regenerated whenever the profile changes. When the profile declares tasks:, append a one-line note to the generated /usr/local/boring/agent/workflow.md per-profile section:

Long-running processes for this profile are running in a tmux session named boring-tasks. List them with tmux list-windows -t boring-tasks; tail logs with boring logs <name> from the host. Do not kill the session — the user owns lifecycle.

This keeps in-container agents from being confused when they see pnpm processes they didn't start, and gives them the right vocabulary for reading task output.
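A minimal sketch of the conditional append, assuming the codegen knows the task count and the path of the generated file. append_tasks_note and its arguments are hypothetical, not the real ARD-0017 entry point, and only the first sentence of the note is used here.

```shell
#!/usr/bin/env bash
# Sketch only: append_tasks_note and its arguments are hypothetical, not the
# real ARD-0017 codegen entry point.

# $1 = number of declared tasks, $2 = generated workflow file, $3 = note text
append_tasks_note() {
  [ "$1" -gt 0 ] || return 0                        # no tasks: block, leave the file alone
  grep -qxF -- "$3" "$2" 2>/dev/null && return 0    # already present, stay idempotent
  printf '%s\n' "$3" >> "$2"
}

workflow="$(mktemp)"
note='Long-running processes for this profile are running in a tmux session named boring-tasks.'
append_tasks_note 2 "$workflow" "$note"
append_tasks_note 2 "$workflow" "$note"             # second call is a no-op
grep -c 'boring-tasks' "$workflow"                  # prints 1
```

The idempotency check is optional if the section is always regenerated from scratch; it only matters if the codegen ever appends to an existing file.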

Consequences

Positive

Closes the "boring open and the app isn't running" gap. The Immich example, and every future example for a non-trivial codebase, becomes one-command-and-attach instead of one-command-plus-three-manual-steps. The code-as-thinking-medium story (ARD-0008) gets meaningfully stronger when the thinking medium boots up running.
Decouples from the editor. Same workflow regardless of VSCode / JetBrains / Vim / pure CLI / agent. Immich's existing .devcontainer/devcontainer.json solution only works in VSCode; ours works everywhere.
Schema addition is small and additive. No existing primitive changes semantics, no existing profile breaks. The four current examples need no edits.
tmux is the right amount of abstraction. The user can tmux attach, fix a broken pane by hand, scroll back through logs, restart a process — all using muscle memory they already have. We don't reinvent a supervisor.
Logs persist on the host. /var/log/boring/tasks/<name>.log (tee'd from each window) survives container restarts and is greppable without docker exec, provided the log directory is host-backed (a bind mount or volume). boring logs <name> is a friendly wrapper.

Negative

Adds a primitive to a deliberately-lean schema. Per ARD-0001, the profile should be small enough to fit on one screen. tasks: is the fifth top-level array; we're approaching the point where the schema starts feeling busy.
Couples boring to tmux. Anyone debugging task lifecycle ends up learning tmux semantics (window vs. pane, attach vs. detach, kill-window vs. kill-session). That's a small ramp but it's non-zero; users who already hate tmux will hate this.
"No auto-restart in v1" is going to bite someone. A pnpm dev server that crashes during a hot-reload sometimes wants to be restarted. We're deferring this on purpose (see Alternatives §3) but the first issue filed against tasks: will be "my task crashed and didn't come back."
Teardown has a new verb (boring close) that didn't exist before. Two ways to tear down (Ctrl-C in foreground, boring close from elsewhere) is mild API surface bloat. The alternative — keep Ctrl-C as the only way — pins users to the original boring open terminal.
Headless flows (boring run, CI) silently skip tasks:. Documented, but a footgun for someone who expects their headless test run to have a dev API server available. We'll need an example in the boring run docs.

Neutral

tasks: overlaps conceptually with services: but they're operationally distinct. services: is for compose-managed sidecars (Postgres, Redis, ML); tasks: is for processes inside the dev container that share its filesystem and have direct access to the bind-mounted source tree. Trying to unify them would require either putting application code into a separate compose service (heavy, breaks bind-mount editor flow) or letting compose services share the dev container's network/PID namespace (a network_mode: service:dev hack that breaks the trust model). They stay separate.
The five-primitive schema (services:, volumes:, setup:, restore:, tasks:) maps to a natural mental model: infrastructure I depend on, data I keep, one-shots to prepare the sandbox, prod-shape data to load, long-running things to start. Each primitive has one clear answer to "should this go here?"

Alternatives Considered

1. Lean on a Procfile + overmind/foreman inside the container

Add tasks: as name: cmd pairs that get emitted to a generated Procfile, then run overmind start (or foreman start) inside the container. Two real advantages: a richer ecosystem (auto-restart, structured logging, overmind connect <name> for attaching to one process), and tasks running under overmind look identical to the developer's local laptop setup if they already use it.

Rejected because:
- Adds a runtime dependency (overmind is a Go binary, foreman is a Ruby gem, honcho is Python). Each preset's Dockerfile grows.
- The supervisor's behavior becomes part of boring's contract: we'd inherit overmind's quirks (its tmux-window-naming choices, its env-handling, its log-tee format) without being able to fix them.
- "tmux directly" is what overmind is under the hood. We can skip the wrapper.

Reconsider in a future ARD if tasks: grows enough surface (per-task healthchecks, structured restart policies, per-task env overrides) that owning the supervisor logic becomes unattractive.

2. Per-task compose services with `network_mode: service:dev`

Model each task as its own docker-compose service that shares the dev container's network/PID namespace, so localhost:2283 from the dev container reaches the API task. Compose already handles supervision (restart:), logging (docker compose logs), and lifecycle.

Rejected because:
- The bind-mount story breaks. Tasks need read/write access to /workspace, which means every task service has to redeclare the same volumes block as the dev container. Drift waiting to happen.
- The trust model gets harder. Each task is its own container with its own permissions; the simple "the dev container runs as UID 1000 with these caps" story splinters.
- Compose services that share namespaces are awkward (the restart: policy doesn't compose well with network_mode: service:dev; docker compose down ordering gets fragile).
- More fundamentally: tasks aren't services. They're processes that share the workspace. Modeling them as services conflates two concerns the existing schema already separates.

3. Add auto-restart in v1

A restart: field per task (never, on-failure, always) with sensible defaults.

Deferred (not rejected) because:
- v1 wants to ship and the manual-restart story (tmux attach, re-run the command in the dead pane) is acceptable.
- Restart policies invite knobs (max_retries, backoff, restart_delay) that turn tasks: into a mini systemd. Better to ship the minimum surface, see how it's used, and add the right knobs in v1.x based on real complaints than guess them upfront.

Revisit in a follow-up ARD once tasks: has been in real-world use for a release cycle.

4. Punt entirely: document the manual workflow in each example README

Status quo. The README tells users "after boring open, run pnpm --filter immich start:dev and pnpm --filter immich-web dev." Cost: zero engineering. Value: zero, because that's what we have today and the question that prompted this ARD is "why doesn't boring do that for me."

Rejected because: the gap is real and the framing in ARD-0008 (boring as thinking-medium) becomes hollow when the sandbox boots ready but the app doesn't.

Implementation Order

This ARD ships as a single coherent slice; pieces are not independently useful.

Schema: add tasks: validation in lib/profile.sh (cycle detection in depends_on, name uniqueness, run: non-empty). Tests in tests/ covering both happy path and each rejection class.
Preset Dockerfiles: apt-get install -y tmux added to each affected preset Dockerfile (shopify, django-node, node-postgres, node, python). One layer change per preset, no version-pin contention.
Codegen: emit /etc/boring/tasks/launch.sh from lib/compose.sh (or new lib/tasks.sh) — a script that opens the tmux session and creates one window per task in topological order. Bind-mounted RO into the container alongside boring-runtime/.
cmd_open integration: after _cmd_open_verify_setup confirms the marker, exec bash /etc/boring/tasks/launch.sh in the dev container if tasks: is non-empty. Append tmux kill-session to the audit-collector trap.
CLI: boring attach and boring logs <name> as new subcommands in the top-level dispatcher. boring close as the third new verb (and the alternative to Ctrl-C).
ARD-0017 integration: extend the per-profile snippet codegen to append the "tasks are in tmux" hint when tasks: is non-empty.
Examples: update examples/immich/.boring/profile.yaml to declare tasks: for the API and web servers; update its README to point at boring attach instead of the manual pnpm --filter lines. The other three examples (minimal, django-postgres, node-with-redis) gain no tasks: block — they're intentionally smaller demos.
boring doctor: add a tmux-present check for any profile declaring tasks:.

Target release: v0.7 (between ARD-0017's v0.6 codegen slice and the v1.0 cut).
