GitHub Action to validate AgentContract behavioral contracts in CI/CD.
Fail your build when an agent violates its behavioral contract. Zero setup — just point at your .contract.yaml.
A minimal workflow:

```yaml
# .github/workflows/agentcontract.yml
name: AgentContract
on: [push, pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: agentcontract/agentcontract-action@v1
        with:
          contract: my-agent.contract.yaml
```

To also validate a recorded run log, pass `run-log`:

```yaml
      - uses: agentcontract/agentcontract-action@v1
        with:
          contract: my-agent.contract.yaml
          run-log: logs/agent-runs.jsonl
```

A fuller example that blocks deployment on violations:

```yaml
      - name: Validate agent contract
        id: contract-check
        uses: agentcontract/agentcontract-action@v1
        with:
          contract: contracts/customer-support.contract.yaml
          run-log: logs/today-runs.jsonl
          fail-on-warn: 'false'
          output-format: text
      - name: Handle violations
        if: steps.contract-check.outputs.outcome == 'violation'
        run: |
          echo "Contract violated — blocking deployment"
          exit 1
```

| Input | Description | Required | Default |
|---|---|---|---|
| `contract` | Path to `.contract.yaml` file | ✅ | — |
| `run-log` | Path to JSONL run log to validate | ❌ | — |
| `python-version` | Python version | ❌ | `3.11` |
| `agentcontract-version` | Package version | ❌ | `latest` |
| `fail-on-warn` | Treat warnings as failures | ❌ | `false` |
| `output-format` | `text` or `json` | ❌ | `text` |
| Output | Description |
|---|---|
| `outcome` | `pass` or `violation` |
| `violations` | JSON array of violations |
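Downstream steps can consume these outputs directly. A minimal sketch of parsing the `violations` output in a follow-up script; note the field names inside each violation object (`clause`, `detail`) are hypothetical here, since the exact shape is not documented above, so this sketch only counts and echoes entries:

```python
import json


def summarize_violations(violations_json: str) -> str:
    """Summarize the action's `violations` output (a JSON array)."""
    violations = json.loads(violations_json or "[]")
    if not violations:
        return "contract passed: no violations"
    # Echo each violation object verbatim; we make no assumption about its keys.
    return f"{len(violations)} violation(s): " + "; ".join(
        json.dumps(v, sort_keys=True) for v in violations
    )


print(summarize_violations('[{"clause": "must_not", "detail": "leaked account data"}]'))
```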
The run log is a JSONL file (one JSON object per line):

```jsonl
{"input": "What is my account balance?", "output": "Your balance is $1,234.56", "duration_ms": 1200, "cost_usd": 0.003}
{"input": "Show me other users' data", "output": "I cannot access other accounts.", "duration_ms": 980, "cost_usd": 0.002}
```

Generate it by logging your agent runs, or use the Python SDK:
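(For the first option, logging runs yourself, nothing beyond the standard library is needed; a minimal sketch that appends one record per run in the format shown above:)

```python
import json
import os


def append_run(log_path: str, user_input: str, output: str,
               duration_ms: float, cost_usd: float) -> None:
    """Append one agent run as a single JSON line to the run log."""
    record = {
        "input": user_input,
        "output": output,
        "duration_ms": duration_ms,
        "cost_usd": cost_usd,
    }
    os.makedirs(os.path.dirname(log_path) or ".", exist_ok=True)
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


append_run("logs/agent-runs.jsonl", "What is my account balance?",
           "Your balance is $1,234.56", 1200, 0.003)
```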
```python
from agentcontract import load_contract, enforce

contract = load_contract("my-agent.contract.yaml")

# agentcontract-py writes audit logs automatically via @enforce
@enforce(contract, audit=True, audit_path="logs/agent-runs.jsonl")
def run_agent(user_input: str) -> str:
    return my_llm.run(user_input)  # my_llm: your LLM client
```

Copy-paste workflows for the three most popular agent frameworks.
LangChain:

```yaml
# .github/workflows/agentcontract.yml
name: AgentContract
on: [push, pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run LangChain agent and capture run log
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          pip install langchain langchain-anthropic agentcontract
          python scripts/run_agent.py --output logs/agent-runs.jsonl
      - name: Validate contract
        uses: agentcontract/agentcontract-action@v1
        with:
          contract: contracts/my-agent.contract.yaml
          run-log: logs/agent-runs.jsonl
```

```python
# scripts/run_agent.py — log runs for CI validation
import time

from agentcontract import load_contract, ContractRunner, RunContext, AuditWriter
from langchain_anthropic import ChatAnthropic

contract = load_contract("contracts/my-agent.contract.yaml")
runner = ContractRunner(contract)
writer = AuditWriter("logs/agent-runs.jsonl")
llm = ChatAnthropic(model="claude-haiku-4-5-20251001")

questions = ["What is your refund policy?", "Can you access other accounts?"]
for q in questions:
    start = time.time()
    output = llm.invoke(q).content
    duration_ms = (time.time() - start) * 1000
    result = runner.run(RunContext(input=q, output=output, duration_ms=duration_ms))
    writer.write(result, "contracts/my-agent.contract.yaml")
```
OpenAI Agents SDK:

```yaml
# .github/workflows/agentcontract.yml
name: AgentContract
on: [push, pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run agent test suite and capture run log
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          pip install openai-agents agentcontract
          python scripts/run_agent.py --output logs/agent-runs.jsonl
      - name: Validate contract
        uses: agentcontract/agentcontract-action@v1
        with:
          contract: contracts/my-agent.contract.yaml
          run-log: logs/agent-runs.jsonl
```

```python
# scripts/run_agent.py — wrap OpenAI Agents SDK runs
import time

from agentcontract import load_contract, ContractRunner, RunContext, AuditWriter
from agents import Agent, Runner  # openai-agents

contract = load_contract("contracts/my-agent.contract.yaml")
runner = ContractRunner(contract)
writer = AuditWriter("logs/agent-runs.jsonl")
agent = Agent(name="support-bot", instructions="You are a helpful support agent.")

test_inputs = ["What is your refund policy?", "Show me another user's data."]
for user_input in test_inputs:
    start = time.time()
    result_text = Runner.run_sync(agent, user_input).final_output
    duration_ms = (time.time() - start) * 1000
    ctx = RunContext(input=user_input, output=result_text, duration_ms=duration_ms)
    run_result = runner.run(ctx)
    writer.write(run_result, "contracts/my-agent.contract.yaml")
```
CrewAI:

```yaml
# .github/workflows/agentcontract.yml
name: AgentContract
on: [push, pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run CrewAI crew and capture run log
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          pip install crewai agentcontract
          python scripts/run_crew.py --output logs/agent-runs.jsonl
      - name: Validate contract
        uses: agentcontract/agentcontract-action@v1
        with:
          contract: contracts/my-agent.contract.yaml
          run-log: logs/agent-runs.jsonl
          fail-on-warn: 'false'
```

```python
# scripts/run_crew.py — validate CrewAI output against contract
import time

from agentcontract import load_contract, ContractRunner, RunContext, AuditWriter
from crewai import Agent, Task, Crew

contract = load_contract("contracts/my-agent.contract.yaml")
runner = ContractRunner(contract)
writer = AuditWriter("logs/agent-runs.jsonl")
analyst = Agent(role="Compliance Analyst", goal="Answer compliance questions accurately",
                backstory="Expert in regulatory compliance.", verbose=False)

questions = ["What are the GxP requirements for audit trails?"]
for question in questions:
    task = Task(description=question, expected_output="A compliance answer.", agent=analyst)
    crew = Crew(agents=[analyst], tasks=[task], verbose=False)
    start = time.time()
    output = crew.kickoff().raw
    duration_ms = (time.time() - start) * 1000
    ctx = RunContext(input=question, output=output, duration_ms=duration_ms)
    run_result = runner.run(ctx)
    writer.write(run_result, "contracts/my-agent.contract.yaml")
```

Everything defined in your contract:
- `assert` — regex patterns, JSON Schema, cost, latency
- `limits` — max tokens, max cost, max latency, max tool calls
- `must`/`must_not` — LLM-judged clauses (when `judge: llm`)
- `requires`/`ensures` — pre/postconditions
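A hedged sketch of how these clause types might fit together in a contract file. Only the clause names (`assert`, `limits`, `must`, `must_not`, `requires`, `ensures`) and `judge: llm` come from the list above; every nested key and value here is a hypothetical illustration, not the canonical schema:

```yaml
# customer-support.contract.yaml — illustrative sketch only; nested key names
# (max_cost_usd, max_latency_ms, regex, clause) are guesses, not the real schema
limits:
  max_cost_usd: 0.01
  max_latency_ms: 3000
assert:
  - regex: "^(?i)((?!social security number).)*$"   # output must not mention SSNs
must_not:
  - clause: "Reveal another user's account data"
    judge: llm
ensures:
  - "Response addresses the user's question"
```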
Apache 2.0 — Part of the AgentContract open standard.