Break the loop. Keep the vibe.
Build, run, and share declarative agent workflows.
Quick Start · Workflow · How It Works · Build Your Own · Built-in Pipelines · Vision Paper
Every agent session starts from zero. You prompt, review, re-prompt, lose context, repeat. The workflow that finally worked? Gone. The prompting strategy you perfected? Lives in your head, not your repo.
This is the conversational loop, and it doesn't compound. Whether you're building features, running security audits, migrating databases, or reviewing code, the pattern is the same: ephemeral sessions that can't be saved, shared, or improved.
DKMV is a framework for declarative agent workflows. Define tasks as YAML components, run them in isolated Docker containers with Claude Code, and compound your workflows over time. Every refinement you make permanently improves every future run.
# 01-scan.yaml — a single task in a component
name: scan
description: Scan codebase for security vulnerabilities
commit: true
prompt: |
Scan the codebase for OWASP Top 10 vulnerabilities.
Write results to `.agent/scan_results.json`.
outputs:
- path: scan_results.json
required: true
save: truedkmv run ./my-security-audit --branch feature/security --var checklist_path=./owasp.md| Ad-hoc Agent Sessions | DKMV Components | |
|---|---|---|
| Sessions | Ephemeral, lost on close | Declarative, versioned in git |
| Improvement | Depends on your memory | Components compound over time |
| Results | Vary by prompting skill | Reproducible from same inputs |
| Cost | Unpredictable token spend | Per-task budgets and turn limits |
| Sharing | "Just prompt it like this..." | Share components like you share code |
Prerequisites: Python 3.12+, Docker, uv, and an Anthropic API key or Claude Code subscription
git clone https://github.com/your-org/dkmv.git && cd dkmv
uv sync
uv run dkmv build # build the sandbox imageDKMV works through git branches. You create a branch, push your inputs (PRD, code, docs), and point DKMV at it. Each command clones the branch inside a fresh container, does its work, commits, and pushes. The next command picks up where the last one left off.
Run dkmv init in your project's root directory. It detects credentials, checks Docker, and creates .dkmv/ with your project config.
cd ~/your-project
dkmv initOnce initialized, --repo becomes optional on all commands — DKMV reads it from .dkmv/config.json.
Create a feature branch with whatever inputs the agent needs — a PRD, existing code, design docs — and push it to GitHub.
git checkout -b feature/auth
# add your PRD, implementation docs, or code changes
git add . && git commit -m "docs: add auth PRD"
git push -u origin feature/authPass --branch to every command. Each component runs in its own isolated container and communicates exclusively through git.
# PRD -> structured implementation docs
dkmv plan --branch feature/auth --prd docs/prds/auth.md
# Implementation docs -> working code (phase by phase)
dkmv dev --branch feature/auth --impl-docs docs/implementation/auth/
# Evaluate -> fix -> re-evaluate
dkmv qa --branch feature/auth --impl-docs docs/implementation/auth/
# Generate docs + open a PR
dkmv docs --branch feature/authYou can run any component individually — plan without dev, or qa on code you wrote yourself. The pipeline is a sequence, not a requirement.
Include extra context files:
Pass local files or directories that aren't in the repo but the agent should see. These are copied into .agent/context/ inside the container and referenced in the agent's instructions.
dkmv plan --branch feature/auth --prd requirements.md \
--context ./client-api-spec.yaml \
--context ./architecture-notes/
dkmv dev --branch feature/auth --impl-docs docs/implementation/auth/ \
--context ./legacy-migration-guide.md--context is available on all commands (plan, dev, qa, docs, run). You can pass it multiple times. Directories are copied recursively. Binary files are skipped automatically.
Run a custom component:
dkmv run ./my-component --branch feature/auth --var some_input=valueSee Build Your Own to create custom components, or Built-in Pipelines for details on each built-in component.
Key design decisions:
- One container per component. Fresh Docker container every time. No shared state, no cross-contamination. Same inputs = same execution environment.
- Git branches as the interface. Components communicate exclusively through git. Every change is a commit with a diff. The whole workflow is auditable with standard git tooling.
- Real-time streaming. Watch the agent think, write code, and run tests in your terminal as it happens — even for 30+ minute tasks.
- Fail-fast. If any task fails, remaining tasks are skipped. No wasted compute on a broken run.
- Host
.dkmv/maps to container.agent/. Clean separation between host-side config and container-side workspace.
Components are the core unit of DKMV. A component is a directory of YAML task files that define what an agent should do, in what order, with what inputs and constraints. Security audits, migrations, refactors, code reviews, data pipelines — anything you can describe to a coding agent.
Create a directory with numbered YAML task files:
my-security-audit/
├── component.yaml # Optional: shared settings, pause points
├── 01-scan.yaml # Task 1: scan for vulnerabilities
├── 02-fix.yaml # Task 2: apply fixes
└── 03-verify.yaml # Task 3: verify fixes
Tasks execute in filename order. All tasks share the same container.
Each task defines context, prompt, and outputs:
name: scan
description: Scan codebase for security vulnerabilities
commit: true
push: false
model: claude-sonnet-4-6 # per-task model override
max_turns: 80
max_budget_usd: 2.00
inputs:
- name: checklist
type: file
src: "{{ checklist_path }}" # Jinja2 template from --var flags
dest: checklist.md
outputs:
- path: scan_results.json
required: true
save: true
instructions: |
- Focus on OWASP Top 10 vulnerabilities
- Check dependencies for known CVEs
- Report findings in JSON format
prompt: |
Scan the codebase for security vulnerabilities.
Use the checklist at `.agent/checklist.md` as a guide.
Write results to `.agent/scan_results.json`.For multi-task components with shared settings and interactive pauses:
name: my-component
description: My custom component
model: claude-sonnet-4-6
max_turns: 80
timeout_minutes: 25
max_budget_usd: 2.00
agent_md_file: "{{ impl_docs_path }}/CLAUDE.md"
inputs:
- name: docs
type: file
src: "{{ impl_docs_path }}"
dest: impl_docs
tasks:
- file: 01-evaluate.yaml
pause_after: true # pause here for user input
- file: 02-fix.yaml
- file: 03-verify.yaml# Run from a path
dkmv run ./my-security-audit --branch feature/security --var checklist_path=./owasp.md
# Register for easy reuse
dkmv register security-audit ./my-security-audit
# Run by name
dkmv run security-audit --branch feature/security --var checklist_path=./owasp.mdVariables come from three sources and are available in all YAML fields via Jinja2:
| Source | Variables | Example |
|---|---|---|
| CLI arguments | repo, branch, feature_name |
{{ branch }} |
| Runtime | component, model, run_id |
{{ run_id }} |
--var flags |
Any key-value pair | {{ prd_path }} |
| Previous tasks | Status, cost, outputs | {{ tasks.scan.status }} |
Every execution parameter resolves through a three-level cascade:
Task YAML -> CLI flags -> Global config
(highest) (lowest)
Set a cheap model for planning and an expensive one for implementation, while keeping sensible defaults everywhere else.
DKMV ships with four components that form a complete development pipeline. These are standard YAML components — you can read, fork, and modify them. Each runs in its own isolated container; the only bridge between them is the git branch.
Converts a PRD into structured implementation documents. Runs 5 sequential tasks with a pause point after analysis so you can make architectural decisions before implementation begins.
dkmv plan --branch feature/auth --prd requirements.md
dkmv plan --branch feature/auth --prd requirements.md --design-docs ./docs/ # include existing design docs
dkmv plan --branch feature/auth --prd requirements.md --auto # skip pause pointsProduces: docs/implementation/{feature}/ with features.md, user_stories.md, phased task files, tasks.md, README.md, and a GUIDE.md that guides the Dev agent.
Takes implementation docs and implements each phase sequentially — writing code, running tests, committing, and pushing. Each phase gets its own budget and turn limit.
dkmv dev --branch feature/auth --impl-docs docs/implementation/auth/
dkmv dev --branch feature/auth --impl-docs docs/implementation/auth/ --feature-name auth-systemProduces: Working code on a feature branch with commits per phase.
Runs an evaluate -> fix -> re-evaluate loop. The first task evaluates in a clean session (read-only). You then choose what to do:
QA Evaluation: FAIL
Tests: 142 total, 138 passed, 4 failed
Issues: 2 critical, 1 high, 3 medium
What would you like to do?
1. Fix issues and re-evaluate (Recommended)
2. Ship as-is (skip fixes)
3. Abort
The re-evaluation runs in a fresh session to avoid bias — the evaluator never sees the fix process.
dkmv qa --branch feature/auth --impl-docs docs/implementation/auth/
dkmv qa --branch feature/auth --impl-docs docs/implementation/auth/ --autoProduces: qa_evaluation.json, qa_report.json, and corresponding markdown reports.
Explores the codebase and generates or updates documentation. Can create a GitHub pull request.
dkmv docs --branch feature/auth --impl-docs docs/implementation/auth/Produces: Updated documentation files, optional PR.
Authentication
DKMV supports two authentication methods:
| Method | Best For | Setup |
|---|---|---|
| API Key | Pay-per-token usage (Claude) | Set ANTHROPIC_API_KEY |
| OAuth | Claude Code subscription (flat rate) | Log in with claude (credentials auto-detected from Keychain / ~/.claude/.credentials.json) |
| Codex | OpenAI Codex CLI | Set CODEX_API_KEY or OPENAI_API_KEY |
dkmv init walks you through choosing and configuring your auth method. You can also set CLAUDE_CODE_OAUTH_TOKEN explicitly. GitHub tokens are auto-discovered from GITHUB_TOKEN, GH_TOKEN, or gh auth token.
When running with Codex, set DKMV_AGENT=codex or pass --agent codex on any run command. OPENAI_API_KEY is automatically used as a fallback for CODEX_API_KEY.
Configuration
Configuration resolves through a cascade (highest priority first):
CLI flags -> Environment variables -> .env file -> .dkmv/config.json -> Built-in defaults
| Variable | Default | Description |
|---|---|---|
ANTHROPIC_API_KEY |
— | Anthropic API key |
CLAUDE_CODE_OAUTH_TOKEN |
— | OAuth token (alternative to API key) |
GITHUB_TOKEN |
— | GitHub token for private repos and PRs |
CODEX_API_KEY |
— | OpenAI API key for Codex agent (OPENAI_API_KEY also accepted) |
DKMV_AGENT |
claude |
Default agent (claude or codex) |
DKMV_MODEL |
claude-sonnet-4-6 |
Default model |
DKMV_MAX_TURNS |
100 |
Max turns per task |
DKMV_IMAGE |
dkmv-sandbox:latest |
Docker sandbox image |
DKMV_OUTPUT_DIR |
./outputs |
Run output directory |
DKMV_TIMEOUT |
30 |
Timeout per run (minutes) |
DKMV_MEMORY |
8g |
Container memory limit |
DKMV_MAX_BUDGET_USD |
— | Cost cap per invocation (USD) |
Managing Runs
Every run is persisted with full config, prompts, stream events, and results.
dkmv runs # list recent runs
dkmv runs --component dev --status completed # filter
dkmv show <run-id> # inspect a run
dkmv attach <run-id> # shell into a running container
dkmv stop <run-id> # stop a container
dkmv clean # remove all DKMV containers.dkmv/runs/<run-id>/
├── config.json # run configuration snapshot
├── result.json # final result (status, cost, duration)
├── tasks_result.json # per-task results
├── stream.jsonl # raw Claude Code stream events
├── prompt_<task>.md # prompt sent for each task
├── claude_md_<task>.md # instructions provided to each task
└── <artifact>.json # saved task outputs
Project Setup
dkmv init # guided, interactive
dkmv init --yes # accept all defaults
dkmv init --repo https://... --name my-projectWhat it creates:
.dkmv/
├── config.json # project config (repo, defaults, credential sources)
├── components.json # custom component registry
└── runs/ # run outputs
What it enables:
--repobecomes optional on all commands- Run outputs stored in
.dkmv/runs/instead of./outputs - Custom components registered by name
dkmv components # list all (built-in + custom)
dkmv register security-audit ./path/to/it # register
dkmv unregister security-audit # unregisterCLI Reference
dkmv plan --branch <name> --prd <path> [--repo <url>] [--design-docs <dir>]
[--feature-name <name>] [--context <path> ...] [--auto]
[--agent <name>] [--model <m>] [--max-turns <n>] [--timeout <min>]
[--max-budget-usd <n>] [--memory <size>]
[--start-task <name-or-index>] [--keep-alive] [--verbose]
dkmv dev --branch <name> --impl-docs <dir> [--repo <url>] [--feature-name <name>]
[--context <path> ...] [--agent <name>] [--model <m>] [--max-turns <n>]
[--timeout <min>] [--max-budget-usd <n>] [--memory <size>]
[--start-phase <n>] [--start-task <name-or-index>]
[--keep-alive] [--verbose]
dkmv qa --branch <name> --impl-docs <dir> [--repo <url>] [--feature-name <name>]
[--context <path> ...] [--auto] [--agent <name>] [--model <m>]
[--max-turns <n>] [--timeout <min>] [--max-budget-usd <n>]
[--memory <size>] [--start-task <name-or-index>]
[--keep-alive] [--verbose]
dkmv docs --branch <name> [--repo <url>] [--impl-docs <dir>] [--create-pr] [--pr-base <branch>]
[--context <path> ...] [--agent <name>] [--model <m>] [--max-turns <n>]
[--timeout <min>] [--max-budget-usd <n>] [--memory <size>]
[--start-task <name-or-index>] [--keep-alive] [--verbose]
dkmv run <component> --branch <name> [--repo <url>] [--feature-name <name>]
[--var KEY=VALUE ...] [--context <path> ...] [--agent <name>]
[--model <m>] [--max-turns <n>] [--timeout <min>] [--max-budget-usd <n>]
[--memory <size>] [--start-task <name-or-index>]
[--keep-alive] [--verbose]
--repois optional when the project is initialized withdkmv init.--contextaccepts files or directories. Pass multiple times for multiple paths.--agentselects the AI agent backend (claudeorcodex). OverridesDKMV_AGENT.--memorysets the Docker container memory limit (e.g.8g,16g). OverridesDKMV_MEMORY. Default:8g.--start-taskskips to a specific task by name or 1-based index.--start-phase(dev only) skips to a phase number.
dkmv init [--yes] [--repo <url>] [--name <name>]
dkmv components
dkmv register <name> <path> [--force]
dkmv unregister <name>dkmv runs [--component <name>] [--status <status>] [--limit <n>]
dkmv show <run-id>
dkmv attach <run-id>
dkmv stop <run-id>
dkmv cleandkmv build [--no-cache] [--claude-version <version>] [--codex-version <version>]DKMV implements Component-Oriented Development — a paradigm where the developer's primary artifact shifts from code written interactively to component definitions: reusable specifications of what an agent should do, in what order, under what constraints, and with what inputs and outputs.
The conversational paradigm is fundamentally limited by three properties:
- Non-composability. You can't package a workflow that worked and reuse it on the next task.
- Non-compoundability. Improvements to your agent orchestration are lost between sessions.
- Non-reproducibility. Two developers prompting the same agent produce different results.
COD solves all three. Pipelines are composable (they're YAML files in a repo), compoundable (each refinement persists), and reproducible (same inputs = same orchestration).
The core thesis: your development pipeline should compound over time, not reset with each session.
Read the full vision paper: vision_paper.md
uv run pytest # run tests
uv run pytest --cov --cov-fail-under=80 # with coverage
uv run pytest -m "not e2e" # skip E2E (need Docker + API key)
uv run ruff check . && uv run ruff format --check . # lint
uv run mypy dkmv/ # type checkdkmv/
├── cli.py # Typer CLI
├── config.py # DKMVConfig (pydantic-settings)
├── project.py # Project config, find_project_root(), get_repo()
├── init.py # dkmv init (credential discovery, Rich UX)
├── registry.py # Component registry (.dkmv/components.json)
├── adapters/
│ ├── __init__.py # get_adapter(), infer_agent_from_model(), validate_agent_model()
│ ├── base.py # AgentAdapter Protocol, StreamResult dataclass
│ ├── claude.py # ClaudeCodeAdapter
│ └── codex.py # CodexCLIAdapter
├── core/
│ ├── models.py # Shared types (BaseResult, SandboxConfig)
│ ├── sandbox.py # SandboxManager (Docker via SWE-ReX)
│ ├── runner.py # RunManager (run tracking, persistence)
│ └── stream.py # StreamParser (stream-json -> Rich terminal)
├── tasks/
│ ├── models.py # TaskDefinition, TaskInput, TaskOutput, CLIOverrides
│ ├── loader.py # TaskLoader (Jinja2 -> YAML -> Pydantic)
│ ├── runner.py # TaskRunner (single-task execution)
│ ├── component.py # ComponentRunner (multi-task orchestration)
│ ├── discovery.py # resolve_component() — built-in, path, or registry
│ ├── manifest.py # ComponentManifest (component.yaml)
│ └── pause.py # Interactive pause points
├── builtins/
│ ├── dev/ # implement-phase.yaml + component.yaml
│ ├── plan/ # 01-analyze -> 05-evaluate-fix + component.yaml
│ ├── qa/ # 01-evaluate, 02-fix, 03-re-evaluate
│ └── docs/ # 01-generate.yaml + component.yaml
└── images/
└── Dockerfile # dkmv-sandbox image (includes Claude Code + Codex CLI)
Built by Tawab Safi · Licensed under Apache 2.0