Guard report_shape_db_stats against :noproc during deployments#4142
Guard report_shape_db_stats against :noproc during deployments#4142
Conversation
The telemetry poller calls ShapeDb.statistics/1, which resolves the GenServer pid and then issues a GenServer.call. During stack restarts the Statistics server can exit between those two steps, producing an uncaught :noproc exit from the poller. Mirror the existing try/catch pattern in report_retained_wal_size so these transient races no longer surface as Sentry errors. Refs electric-sql/stratovolt#1452
Claude Code ReviewSummaryThis PR guards What's Working Well
Issues FoundSuggestions (Nice to Have)Inconsistent error containment vs. File:
For a telemetry poller callback, silently swallowing (or logging) all errors is typically the right call — a reporting failure should never crash the poller. The existing pre-PR code also had this gap, so the PR doesn't regress anything, but it's worth making the two functions consistent: catch
:exit, {:noproc, _} ->
:ok
type, reason ->
Logger.warning(
"Failed to query shape DB stats\nError: #{Exception.format(type, reason)}",
stack_id: stack_id
)
endMissing
@spec report_shape_db_stats(Electric.stack_id(), map()) :: :okIssue ConformanceThe PR references Review iteration: 1 | 2026-04-21 |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4142 +/- ##
==========================================
+ Coverage 89.20% 89.24% +0.03%
==========================================
Files 25 25
Lines 2520 2520
Branches 636 635 -1
==========================================
+ Hits 2248 2249 +1
+ Misses 270 269 -1
Partials 2 2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Summary
Electric.StackSupervisor.Telemetry.report_shape_db_stats/2callsShapeDb.statistics/1, which resolves the GenServer pid and then issues aGenServer.call. During stack restarts the underlying server can exit between those two steps, producing an uncaught:exit, :noprocfrom the telemetry poller and flooding Sentry with errors on every deployment.The sibling function
report_retained_wal_sizealready handles this race. This change mirrors itstry/catch :exit, {:noproc, _}pattern. Surgical, no behavioural change on the happy path.Refs electric-sql/stratovolt#1452.
Test plan
mix compile --warnings-as-errorsis cleanmix test test/electric/stack_supervisor_test.exspasses