Summary
Build an automated end-to-end test suite for the SQL Sanitizer plugin (originally delivered via #1065, currently at plugins/sql_sanitizer/) that drives the plugin management API to activate, configure, and deactivate the plugin for a given user/tenant, and verifies that SQL injection patterns in tool arguments and outputs are handled correctly across the full matrix of configuration settings.
Packaging prerequisite: before (or alongside) this work, the plugin should be moved out of the main repo into the cpex-plugins repository as a standalone cpex-sql-sanitizer PyPI package (following the #3965 pattern used for pii_filter, secrets_detection, etc.), and hardened — audit of detection patterns against a current SQLi corpus, multi-dialect coverage review, clearer severity/mode semantics, and performance/latency characterization. The e2e suite should target the hardened, cpex-*-packaged version.
Motivation
Scope
Packaging & Hardening (prerequisite)
- Move
plugins/sql_sanitizer/ to the cpex-plugins repo as cpex-sql-sanitizer.
- Audit detection rules against a current SQLi corpus (union-based, boolean-based, time-based, stacked-query, second-order, NoSQL-injection variants where applicable).
- Review and document supported SQL dialects (PostgreSQL, MySQL, SQLite, MSSQL, Oracle) and their known gaps.
- Define explicit modes (
block, sanitize, flag-only) with unambiguous semantics.
- Benchmark per-invocation overhead; publish in the plugin README.
- Add the package to the
[plugins] extra in pyproject.toml; update any kind: references and docs.
Test Harness
- Stand up a ContextForge instance (test container or in-process app) with observability enabled.
- Create a test user/tenant, mint a JWT, and drive all plugin state changes through the plugin management / bindings API — no static YAML edits.
- Wrap common actions (enable plugin, set config, bind to tool, invoke tool, assert on response) in reusable pytest fixtures / helpers.
Configuration Matrix
At minimum, each scenario should vary and assert on:
- Plugin state: disabled → enabled → disabled (confirm state transitions take effect on subsequent invocations without restart).
- Mode:
block, sanitize, flag-only (or whatever modes the hardened plugin exposes).
- SQL dialect: PostgreSQL, MySQL, SQLite, MSSQL (at minimum) — tested individually where dialect config is supported.
- Attack categories: classic union-based, boolean-blind, time-based-blind, stacked queries, comment-injection, encoded payloads — tested individually and in combinations.
- Scope / binding: plugin bound globally vs. per-tool vs. per-tenant; confirm non-bound tools/tenants are unaffected.
- Hook coverage:
tool_pre_invoke (argument sanitization) and tool_post_invoke (output scrubbing of reflected SQL) — verify handling on both inbound args and outbound responses as configured.
- False-positive guard: benign queries and natural-language strings that look SQL-adjacent must pass through unaltered in
sanitize / flag-only modes.
Assertions
For each scenario:
- The response / tool-call payload matches the expected block / sanitize / flag behavior.
- Violations (when applicable) are recorded with the expected category, dialect, and confidence.
- Disabling the plugin mid-test restores pass-through behavior on the next invocation.
- Other users / tools with different bindings are unaffected (isolation check).
- Observability signals (spans, structured logs) reflect plugin activity — useful smoke test that the plugin ran at all.
Proposed Location
Acceptance Criteria
References
Summary
Build an automated end-to-end test suite for the SQL Sanitizer plugin (originally delivered via #1065, currently at
plugins/sql_sanitizer/) that drives the plugin management API to activate, configure, and deactivate the plugin for a given user/tenant, and verifies that SQL injection patterns in tool arguments and outputs are handled correctly across the full matrix of configuration settings.Motivation
Scope
Packaging & Hardening (prerequisite)
plugins/sql_sanitizer/to thecpex-pluginsrepo ascpex-sql-sanitizer.block,sanitize,flag-only) with unambiguous semantics.[plugins]extra inpyproject.toml; update anykind:references and docs.Test Harness
Configuration Matrix
At minimum, each scenario should vary and assert on:
block,sanitize,flag-only(or whatever modes the hardened plugin exposes).tool_pre_invoke(argument sanitization) andtool_post_invoke(output scrubbing of reflected SQL) — verify handling on both inbound args and outbound responses as configured.sanitize/flag-onlymodes.Assertions
For each scenario:
Proposed Location
tests/e2e/plugins/test_sql_sanitizer_e2e.py(new), with shared fixtures intests/e2e/plugins/conftest.py(reusing the harness established for the PII e2e suite in [TESTING]: Automated end-to-end tests for PII filter plugin via plugin management API #4221).make test-e2e-pluginstarget (or extension of existing e2e target) so the suite can be run independently of unit tests.Acceptance Criteria
plugins/sql_sanitizer/relocated tocpex-pluginsrepo ascpex-sql-sanitizer, published to PyPI, and pinned in the[plugins]extra.References
cpex-*plugin packaging patternbinding_reference_idand expanded plugin config schemas