| name | systematic-debugging |
|---|---|
| description | Use when debugging a failing test, build error, or runtime issue that isn't immediately obvious. Guides a 4-phase root cause analysis instead of random fix attempts. |
| user-invocable | true |
| argument-hint | <error description or failing test> |
Structured 4-phase debugging process. Stops you from guessing and forces root cause identification.
- Test failures with unclear cause
- Build errors after dependency or schema changes
- Runtime exceptions in backend, frontend, or ingestion
- Issues that persist after 2+ fix attempts
Do not change any code yet. Only observe and collect data.
- Read the full error — not just the last line. Stack traces, build logs, test output.
- Reproduce reliably:
# Java mvn test -pl <module> -Dtest=FailingTest#failingMethod # Python source env/bin/activate && cd ingestion python -m pytest tests/unit/path/to/test.py::test_name -v # TypeScript cd openmetadata-ui/src/main/resources/ui yarn test path/to/test.spec.ts # E2E yarn playwright:run --grep "test name"
- Check what changed recently:
git log --oneline -20 git diff HEAD~5 -- <relevant-directory>
- Document findings: Write down the exact error, when it started, and what you've observed.
Based on the evidence, list 2-3 possible causes ranked by likelihood:
1. [Most likely] Schema change in X broke generated model Y
Evidence: error mentions field Z, git log shows schema modified 3 commits ago
2. [Possible] Dependency version mismatch after recent upgrade
Evidence: ClassNotFoundException for new API
3. [Unlikely] Test environment state pollution
Evidence: test passes in isolation but fails in suite
Do not jump to fixing yet. Verify the top hypothesis first.
Test your top hypothesis with the smallest possible experiment:
- Read the suspected code — don't guess what it does
- Add targeted logging or use a debugger if needed
- Check one hypothesis at a time — don't make multiple changes simultaneously
| Symptom | Likely Cause | Verification |
|---|---|---|
ImportError in Python |
Models not regenerated after schema change | make generate then retry |
TypeScript compile error on generated types |
Schema changed, types not rebuilt | yarn parse-schema then retry |
Java NullPointerException in entity |
Missing field in Flyway migration | Check bootstrap/sql/migrations/ |
| Test passes alone, fails in suite | Shared state mutation between tests | Run with --forked (Java) or -x (pytest) |
404 Not Found on API endpoint |
Missing @Path annotation or route config |
Check resource class registration |
| UI component renders blank | Missing i18n key | Check locale/languages/en-us.json |
Connection refused in integration test |
Docker service not running | docker compose -f docker/development/docker-compose.yml ps |
Now — and only now — write the fix:
- Fix only the root cause. Don't "improve" surrounding code.
- Run the failing test — it must pass now.
- Run the broader test suite for the affected module to check for regressions.
- Explain why the fix works in your commit message or response.
- After 3 failed fix attempts: Stop trying random fixes. Re-examine your hypotheses. The root cause is likely different from what you think.
- After 5 failed attempts: Question your assumptions about the architecture. Read more broadly — adjacent modules, recent PRs, dependency changelogs.
- If the fix requires changing test infrastructure: Pause and discuss with the user. The problem might be in the test setup, not the code.
- Shotgun debugging: Changing multiple things at once hoping something works.
- Symptom treating: Adding
try/catchorif nullchecks to suppress errors instead of fixing the cause. - Blame the framework: Before assuming a bug in Dropwizard, Pydantic, or React, verify your own code is correct.
- Skipping reproduction: "I think I know what it is" — prove it with a failing test first.