[FEATURE]: Add plugin violation details to OpenTelemetry spans #4267

@vishu-bh

Description

🧭 Type of Feature

  • Enhancement to existing functionality
  • New feature or capability
  • New MCP-compliant server
  • New component or integration
  • Developer tooling or test improvement
  • Packaging, automation and deployment (ex: pypi, docker, quay.io, kubernetes, terraform)
  • Other (please describe below)

🧭 Epic

Title: Enhanced Observability for Plugin Violations
Goal: Provide detailed visibility into plugin violations through OpenTelemetry spans, enabling operators to understand why requests were blocked and troubleshoot policy enforcement issues.
Why now: Currently, only a boolean flag (plugin.had_violation=true) is sent to OTEL endpoints without any context about what rule was violated, why it was blocked, or what content triggered the violation. This makes debugging and monitoring plugin behavior difficult.


🧑🏻‍💻 User Story 1

As a: Platform operator monitoring ContextForge in production
I want: Detailed violation information in my observability platform (Jaeger/Zipkin/Langfuse)
So that: I can quickly identify which policies are blocking requests and understand the root cause without digging through application logs

✅ Acceptance Criteria

Scenario: Plugin violation details are captured in OTEL spans
  Given a plugin detects a policy violation
  When the plugin returns a PluginViolation result
  Then the OTEL span should include plugin.violation.reason, plugin.violation.code, and plugin.violation.description attributes
  And the span should be marked with error status
  And the violation details should be sanitized before export

Scenario: Violation details are visible in observability platform
  Given a request was blocked by a plugin
  When I query the trace in my observability platform
  Then I should see the violation code (e.g., "PROHIBITED_CONTENT")
  And I should see the violation reason and description
  And I should see which plugin detected the violation

🧑🏻‍💻 User Story 2

As a: Security engineer reviewing policy enforcement
I want: Aggregated metrics on violation types and frequencies
So that: I can identify patterns in blocked requests and tune policies accordingly

✅ Acceptance Criteria

Scenario: Violation metrics are queryable
  Given multiple plugin violations have occurred
  When I query spans by the plugin.violation.code attribute
  Then I should see all violations of that type
  And I should be able to aggregate by plugin.name and plugin.violation.code
  And I should see trends over time in my observability dashboard
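
The aggregation in this scenario could look roughly like the sketch below. It assumes spans have already been exported as plain attribute dictionaries and uses the plugin.violation.code key from the proposed implementation; the sample data and the count_violations helper are hypothetical, and a real observability backend would run this grouping in its own query language.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Tuple

# Hypothetical exported span attributes; a real backend would supply these.
spans = [
    {"plugin.name": "content_filter", "plugin.violation.code": "PROHIBITED_CONTENT"},
    {"plugin.name": "rate_limiter", "plugin.violation.code": "RATE_LIMIT"},
    {"plugin.name": "content_filter", "plugin.violation.code": "PROHIBITED_CONTENT"},
    {"plugin.name": "content_filter"},  # span without a violation
]


def count_violations(span_attrs: Iterable[Dict[str, Any]]) -> Counter:
    """Count violations grouped by (plugin name, violation code)."""
    counts: Counter = Counter()
    for attrs in span_attrs:
        code = attrs.get("plugin.violation.code")
        if code is None:
            continue  # no violation recorded on this span
        key: Tuple[str, str] = (attrs.get("plugin.name", "unknown"), code)
        counts[key] += 1
    return counts


violation_counts = count_violations(spans)
for (plugin, code), total in violation_counts.most_common():
    print(f"{plugin} {code} {total}")
```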

📐 Technical Implementation

Current State (mcpgateway/plugins/framework/manager.py:475-491):

# Only boolean flags are sent
otel_span.set_attribute("plugin.had_violation", result.violation is not None)
otel_span.set_attribute("plugin.modified_payload", result.modified_payload is not None)
otel_span.set_attribute("plugin.continue_processing", result.continue_processing)
otel_span.set_attribute("plugin.stopped_chain", not result.continue_processing)

Proposed Enhancement:

if otel_span is not None:
    otel_span.set_attribute("plugin.had_violation", result.violation is not None)
    otel_span.set_attribute("plugin.modified_payload", result.modified_payload is not None)
    otel_span.set_attribute("plugin.continue_processing", result.continue_processing)
    otel_span.set_attribute("plugin.stopped_chain", not result.continue_processing)
    
    # Add violation details when present
    if result.violation:
        set_span_attribute(otel_span, "plugin.violation.reason", result.violation.reason)
        set_span_attribute(otel_span, "plugin.violation.code", result.violation.code)
        set_span_attribute(otel_span, "plugin.violation.description", result.violation.description)
        
        # Add HTTP status code if present (e.g., 429 for rate limiting)
        if result.violation.http_status_code:
            set_span_attribute(otel_span, "plugin.violation.http_status_code", result.violation.http_status_code)
        
        # Add MCP error code if present
        if result.violation.mcp_error_code:
            set_span_attribute(otel_span, "plugin.violation.mcp_error_code", result.violation.mcp_error_code)
        
        # Sanitize and add violation details (avoid PII/sensitive data)
        if result.violation.details:
            # Use existing sanitize_trace_attribute_value for each detail
            sanitized_details = {
                k: sanitize_trace_attribute_value(f"plugin.violation.details.{k}", v)
                for k, v in result.violation.details.items()
            }
            set_span_attribute(otel_span, "plugin.violation.details", sanitized_details)
        
        # Mark span as error for better visibility
        set_span_error(otel_span, result.violation.description, record_exception=False)
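
One caveat about the plugin.violation.details line above: OpenTelemetry attribute values are limited to primitives (str, bool, int, float) and homogeneous sequences of them, so a dict value may be dropped or stringified depending on how set_span_attribute treats it, which is not shown here. A possible approach is to flatten the dict into dotted keys before setting attributes. The sketch below is only an illustration: flatten_details is a hypothetical helper, and sanitization is left out to keep the example short.

```python
import json
from typing import Any, Dict

_PRIMITIVES = (str, bool, int, float)


def flatten_details(prefix: str, details: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten a nested dict into dotted keys with attribute-safe values."""
    flat: Dict[str, Any] = {}
    for key, value in details.items():
        name = f"{prefix}.{key}"
        if isinstance(value, dict):
            # Recurse so nested keys become "prefix.outer.inner".
            flat.update(flatten_details(name, value))
        elif isinstance(value, _PRIMITIVES):
            flat[name] = value
        else:
            # Lists, None, and other objects are JSON-encoded as a fallback.
            flat[name] = json.dumps(value, default=str)
    return flat


details = {
    "matched_rule": "no_secrets",
    "limits": {"max": 10, "seen": 12},
    "tags": ["a", "b"],
}
flat = flatten_details("plugin.violation.details", details)
```

Each entry in flat could then be passed to set_span_attribute individually, which also keeps every detail filterable on its own key.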

Files to Modify:

  • mcpgateway/plugins/framework/manager.py - Add violation details to span attributes
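  • mcpgateway/observability.py - Ensure set_span_attribute handles nested dicts properly (OpenTelemetry attribute values must be primitives or homogeneous sequences of primitives, so plugin.violation.details needs flattening or serialization)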
  • tests/unit/mcpgateway/plugins/framework/test_manager.py - Add tests for violation attribute capture
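
A test for violation attribute capture might follow the shape below. This is a self-contained sketch, not the project's test code: FakeViolation, RecordingSpan, and record_violation are hypothetical stand-ins for PluginViolation, the OTEL span, and the manager logic, reduced to what the assertions need.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class FakeViolation:
    """Stand-in for PluginViolation with only the fields used here."""
    reason: str
    code: str
    description: str
    http_status_code: Optional[int] = None


class RecordingSpan:
    """Minimal span double that records attributes and an error message."""

    def __init__(self) -> None:
        self.attributes: Dict[str, Any] = {}
        self.error_message: Optional[str] = None

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


def record_violation(span: RecordingSpan, violation: Optional[FakeViolation]) -> None:
    """Reduced mirror of the proposed span logic (assumption, not repo code)."""
    span.set_attribute("plugin.had_violation", violation is not None)
    if violation is None:
        return
    span.set_attribute("plugin.violation.reason", violation.reason)
    span.set_attribute("plugin.violation.code", violation.code)
    span.set_attribute("plugin.violation.description", violation.description)
    if violation.http_status_code:
        span.set_attribute("plugin.violation.http_status_code", violation.http_status_code)
    span.error_message = violation.description  # stands in for set_span_error


def test_violation_attributes_are_captured() -> None:
    span = RecordingSpan()
    violation = FakeViolation("Rate limit hit", "RATE_LIMIT", "Too many requests", 429)
    record_violation(span, violation)
    assert span.attributes["plugin.had_violation"] is True
    assert span.attributes["plugin.violation.code"] == "RATE_LIMIT"
    assert span.attributes["plugin.violation.http_status_code"] == 429
    assert span.error_message == "Too many requests"


def test_no_violation_sets_only_flag() -> None:
    span = RecordingSpan()
    record_violation(span, None)
    assert span.attributes == {"plugin.had_violation": False}
    assert span.error_message is None
```

In the real test module the span double would be replaced by whatever mock the existing manager tests already use.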

🔗 MCP Standards Check

  • Change adheres to current MCP specifications
  • No breaking changes to existing MCP-compliant integrations
  • Uses existing observability infrastructure (set_span_attribute, set_span_error)

🔄 Alternatives Considered

  1. Log-only approach: Keep current behavior and rely on application logs
    • ❌ Rejected: Logs are separate from traces, making correlation difficult
  2. Add violation as span event instead of attributes: Use span.add_event("plugin.violation", attributes={...})
    • ⚠️ Possible alternative: Events are good for point-in-time occurrences, but attributes are better for filtering/aggregation
  3. Create separate violation spans: Create child spans for each violation
    • ❌ Rejected: Adds complexity and span overhead; attributes are sufficient
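
To make the trade-off in alternative 2 concrete, the sketch below records the same violation both ways on a stand-in span. SketchSpan is a hypothetical double whose add_event(name, attributes=...) and set_attribute(key, value) methods mirror the shape of the OpenTelemetry Python Span API; it is not the SDK itself, and record_as_event and record_as_attributes are illustrative names.

```python
import time
from typing import Any, Dict, List, Optional, Tuple


class SketchSpan:
    """Stand-in span that stores attributes and timestamped events."""

    def __init__(self) -> None:
        self.attributes: Dict[str, Any] = {}
        self.events: List[Tuple[str, Dict[str, Any], int]] = []

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value

    def add_event(self, name: str, attributes: Optional[Dict[str, Any]] = None) -> None:
        # Events carry their own timestamp and attribute set.
        self.events.append((name, dict(attributes or {}), time.time_ns()))


def record_as_event(span: SketchSpan, code: str, reason: str) -> None:
    """Alternative 2: one point-in-time event per violation."""
    span.add_event(
        "plugin.violation",
        attributes={"plugin.violation.code": code, "plugin.violation.reason": reason},
    )


def record_as_attributes(span: SketchSpan, code: str, reason: str) -> None:
    """Proposed approach: attributes directly on the span."""
    span.set_attribute("plugin.violation.code", code)
    span.set_attribute("plugin.violation.reason", reason)


event_span = SketchSpan()
record_as_event(event_span, "PROHIBITED_CONTENT", "Blocked term detected")

attr_span = SketchSpan()
record_as_attributes(attr_span, "PROHIBITED_CONTENT", "Blocked term detected")
```

With events, the violation data lives one level below the span, so queries that filter spans by attribute would not see it unless the backend also indexes event attributes.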

📓 Additional Context

Related Code:

  • PluginViolation model: mcpgateway/plugins/framework/models.py:1265-1330
  • Observability service: mcpgateway/services/observability_service.py
  • Existing sanitization: mcpgateway/utils/trace_redaction.py

Security Considerations:

  • Must sanitize violation details before export to prevent PII leakage
  • Use existing sanitize_trace_attribute_value() and sanitize_trace_text() functions
  • Respect OTEL_CAPTURE_IDENTITY_ATTRIBUTES and OTEL_EMIT_LANGFUSE_ATTRIBUTES settings
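
As a rough picture of what sanitizing violation details involves, the sketch below redacts values by key name and truncates long strings. It is an assumption-laden stand-in, not the behavior of sanitize_trace_attribute_value() or sanitize_trace_text(): redact_detail, the key pattern, and the length limit are all hypothetical, and the real implementation should reuse the existing functions.

```python
import re
from typing import Any

# Hypothetical pattern for key names that suggest sensitive values.
_SENSITIVE_KEY = re.compile(r"(password|secret|token|api[_-]?key|authorization)", re.IGNORECASE)
_MAX_LEN = 256  # hypothetical cap on exported string length
_TRUNCATION_SUFFIX = "...[truncated]"


def redact_detail(key: str, value: Any) -> Any:
    """Return a value that is safer to export for the given detail key."""
    if _SENSITIVE_KEY.search(key):
        return "[REDACTED]"
    if isinstance(value, str) and len(value) > _MAX_LEN:
        return value[:_MAX_LEN] + _TRUNCATION_SUFFIX
    return value


details = {
    "matched_rule": "no_secrets",
    "api_key": "sk-123",
    "snippet": "x" * 300,
}
safe = {key: redact_detail(key, value) for key, value in details.items()}
```

Key-name matching alone will not catch sensitive content inside free-text values, which is one reason to prefer the project's existing sanitization over an ad hoc filter.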

Benefits:

  • Better debugging of policy enforcement
  • Improved security monitoring and alerting
  • Ability to create dashboards showing violation trends
  • Faster incident response when requests are blocked

Metadata

Labels: enhancement, observability, plugins, triage
