fix: redact file keys and node IDs from telemetry error messages#356
fix: redact file keys and node IDs from telemetry error messages#356shaun0927 wants to merge 4 commits intoGLips:mainfrom
Conversation
The telemetry rollout explicitly promised that file IDs would not be sent, but the current implementation forwards raw error messages after only credential redaction. That leaks file keys and node IDs through actionable endpoint and missing-node messages. This change redacts those identifier patterns before events leave the process and adds focused regression tests. Constraint: Error messages still need enough structure for grouping and debugging after identifier redaction Rejected: Drop error_message entirely | maintainers are actively investing in error analytics and still need sanitized context Confidence: high Scope-risk: narrow Reversibility: clean Directive: Any new telemetry pathway that sends error text must pass through the same identifier redaction rules Tested: ./node_modules/.bin/vitest run src/tests/telemetry-redaction.test.ts; ./node_modules/.bin/tsc --noEmit; ./node_modules/.bin/eslint src/telemetry/client.ts src/tests/telemetry-redaction.test.ts Not-tested: Full pnpm test suite; downstream PostHog dashboards after redaction
The initial regression tests covered file fetch and missing-node messages, but the same identifier leakage can also happen through image render endpoints that use `/images/<fileKey>?ids=...`. This adds that shape explicitly so the sanitizer contract matches the real error surfaces. Constraint: The tests should validate sanitized output without depending on live PostHog traffic Rejected: Fold this into a single broad snapshot test | targeted cases are clearer and easier to maintain as new error formats appear Confidence: high Scope-risk: narrow Reversibility: clean Directive: New telemetry error-message tests should cover each distinct identifier-bearing message shape that can reach captureEvent Tested: ./node_modules/.bin/vitest run src/tests/telemetry-redaction.test.ts; ./node_modules/.bin/tsc --noEmit; ./node_modules/.bin/eslint src/tests/telemetry-redaction.test.ts Not-tested: Full pnpm test suite; live PostHog ingestion after deployment
The sanitizer claims to handle both node-id and ids query parameters, so add explicit coverage for the ids shape too. That makes the branch easier to review and reduces the risk of a later regex tweak only preserving the currently asserted case. Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep telemetry redaction tests aligned with every identifier shape the sanitizer advertises Tested: /Users/jh0927/Workspace/figma-context-mcp-diff-audit/node_modules/.bin/vitest run src/tests/telemetry-redaction.test.ts; /Users/jh0927/Workspace/figma-context-mcp-diff-audit/node_modules/.bin/eslint src/tests/telemetry-redaction.test.ts Not-tested: Full pnpm test suite
|
Thanks for catching this — the privacy gap is real and worth closing. I'd like to take it in a slightly different shape before merging though. Redact at the source, not at the telemetry boundary. The codebase already has structured error construction ( Tests should exercise the real leak paths. The current tests construct synthetic strings shaped like the real errors. That couples them to the regex implementation rather than the actual contract. Better: feed an actual Concretely, the If you're up for restructuring along those lines I'd be glad to merge it. Happy to keep the regex pass as a belt-and-braces fallback, but I'd want the primary fix to be structural so the privacy claim isn't one copy-edit away from being false again. |
Maintainer feedback on GLips#356 asked for a structural fix: errors should carry a safe shape, and telemetry should emit that — rather than forwarding `error.message` and scrubbing it on the way out with a regex that chases current phrasings. This extends the existing error-meta contract with a `safe_message` template. Producers that know their error strings can contain Figma file keys or node IDs attach an identifier-free template at throw time: - `fetch-json.ts` tags HTTP errors with `Fetch failed with status {code}` - `fetch-json.ts` tags proxy/network wraps with `Network connection failed ({code})` - `figma.ts` tags the 429, 403, and generic request wraps with endpoint-free templates - `design-extractor.ts` tags the missing-node path with an ID-free template Telemetry's `captureEvent` now reads `meta.safe_message` in preference to `error.message`. The regex pass in `client.ts` stays in place as a belt-and-braces fallback for any error path that forgets to tag a template (or that we don't own, e.g. third-party libraries). Crucially, the test suite no longer constructs synthetic strings shaped like the real errors. It drives the actual producers — a mocked 403 through `FigmaService.getRawFile` / `getNodeRenderUrls`, and the real null-node path through `simplifyRawFigmaObject` — and asserts on what telemetry would capture. That way a future reword of any upstream error message (e.g. "Could not find node X") cannot silently reopen the leak: the contract the tests enforce is the template + category, not the regex. The original untagged-error regression test stays as a fourth case, explicitly covering the fallback path. Constraint: Keep user-facing `error.message` content actionable (endpoint context for 403, proxy hint for network errors) — only redact in the telemetry view Rejected: Drop `error_message` entirely from telemetry | maintainers want actionable error analytics; templated strings preserve that without identifiers Rejected: Scrub the message after the fact only | silent breakage on any message reword is the specific failure mode we're closing Confidence: high Scope-risk: narrow Reversibility: clean Directive: Any new thrown error whose message can contain Figma file keys, node IDs, or endpoint paths must attach a `safe_message` template via `tagError` Tested: ./node_modules/.bin/vitest run src/tests/telemetry-redaction.test.ts src/tests/error-meta.test.ts; ./node_modules/.bin/tsc --noEmit; ./node_modules/.bin/eslint src/telemetry/client.ts src/telemetry/capture.ts src/utils/error-meta.ts src/utils/fetch-json.ts src/services/figma.ts src/extractors/design-extractor.ts src/tests/telemetry-redaction.test.ts Not-tested: Full pnpm test suite (unrelated pre-existing stdio/server `spawn tsx ENOENT` failures present on the branch before this change); live PostHog ingestion after deployment
Summary
Closes #354
/files/<fileKey>and/images/<fileKey>fragments before telemetry leaves the processnode-id=...,ids=..., andNode <nodeId> was not foundmessage shapesids=...This preserves the current telemetry design (including actionable error analytics) while bringing the implementation back in line with the privacy claim from
#342.Why this scope
I kept the patch intentionally small:
That should make it easier to merge even if
#346continues evolving the broader PostHog integration.Test plan
./node_modules/.bin/vitest run src/tests/telemetry-redaction.test.ts./node_modules/.bin/tsc --noEmit./node_modules/.bin/eslint src/tests/telemetry-redaction.test.tspnpm test