Skip to content

[BUG]: Silent success-metric traps in A2A 200-OK path and plugin TOOL_POST_INVOKE mutation #4211

@jonpspri

Description

@jonpspri

🐞 Bug Report

Two pre-existing silent-success traps surfaced during the PR #4204 review cycle for #4202. Both cause metrics_buffer.record_tool_metric(success=True) to be recorded for responses the caller sees as errors, inflating tool success rates and masking real failures in observability. Neither is a #4202 regression — they predate that PR — but they share the same "success flag diverges from tool_result.is_error" root cause the #4202 REST-branch fix addressed, and are natural follow-ups.

🔍 Trap 1 — A2A 200-OK branch ignores JSONRPC error envelopes

Location: mcpgateway/services/tool_service.py — the A2A branch around the elif tool_integration_type == "A2A" dispatch (grep for a2a_agent_endpoint_url). The 200-OK branch sets success = True and tool_result = ToolResult(..., is_error=False) unconditionally, regardless of whether the agent's response_data carries a JSONRPC {"error": {...}} envelope or an application-level failure payload.

Impact: Any A2A agent that legitimately surfaces errors in the body of a 200-OK response (common for JSONRPC-over-HTTP) has every call recorded as success=True. The non-200 HTTP branch also never re-sets success after initialising it to False, so it relies on scope fall-through rather than explicit assignment.

Proposed fix: Route A2A upstream responses through _coerce_to_tool_result (the canonical helper introduced in #4204 / #4210 step 0) and then compute success = not tool_result.is_error, symmetric with the fix applied to REST (success = not getattr(tool_result, "is_error", False) after the optional validator call) and direct-proxy (success = not tool_result.is_error after _coerce_to_tool_result). Separately, teach the A2A branch to detect {"error": ...} / {"isError": true} envelopes in the body and promote them to is_error=True before computing success.

Regression guard: Mirror test_rest_payload_shaped_error_records_success_false_in_metrics for A2A — mock the A2A HTTP client to return a 200 with a JSONRPC error body and assert record_tool_metric receives success=False.

🔍 Trap 2 — Plugin TOOL_POST_INVOKE hook can silently flip error state

Location: mcpgateway/services/tool_service.py — the TOOL_POST_INVOKE plugin invocation after the branch dispatch (roughly lines 5446-5473, grep for TOOL_POST_INVOKE). The plugin returns a modified_payload that can replace tool_result wholesale; the reconstruction at that site calls ToolResult(content=modified_result["content"], structured_content=structured) which drops is_error entirely (Pydantic defaults it to False).

Impact:

  • A plugin that legitimately transforms a successful result into an error payload (e.g. a validation plugin detecting an un-caught bad response) has its intent silently erased — is_error defaults to False and success — already computed earlier — isn't re-synced.
  • The reverse: a plugin that flips an upstream error into a successful recovery still records success=False because success was frozen before the plugin ran.

Proposed fix:

  1. Preserve plugin intent during reconstruction: is_error=modified_result.get("isError", modified_result.get("is_error", False)).
  2. After the post-invoke block, recompute success = not tool_result.is_error so metrics and span attributes reflect the plugin-modified state.

Regression guards: Two tests — one with a plugin that flips is_error True→False (asserts success=True), one that flips False→True (asserts success=False).

📎 Related

🧠 Scope note

Both traps are small, local fixes individually (a dozen lines each) but touching the A2A and plugin-post-invoke paths is enough of a behavioural shift to warrant a dedicated PR, a dedicated regression test, and explicit review — rather than folding them into an already-large #4202 PR.

Metadata

Metadata

Labels

SHOULDP2: Important but not vital; high-value items that are not crucial for the immediate releasebugSomething isn't workingobservabilityObservability, logging, monitoring

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions