metrics: export grpc stats metrics for batch client#1934
metrics: export grpc stats metrics for batch client#1934zyguan wants to merge 3 commits intotikv:masterfrom
Conversation
Signed-off-by: zyguan <zhongyangguan@gmail.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughAdds a gRPC client-side stats.Handler for the TiKV BatchCommands RPC that assigns per-RPC stream IDs, updates per-target and per-stream Prometheus metrics (send/recv times, wire bytes, delayed picks, transparent-retries), registers new metrics, and integrates the handler into batch-enabled connection dialing with tests. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant GRPC as "gRPC runtime"
participant Handler as "batchClientStatsMonitor"
participant Metrics as "Prometheus"
Client->>Handler: TagRPC(full method = /tikvpb.Tikv/BatchCommands)
Handler->>Handler: allocate global stream ID, attach to context
Handler-->>Client: return context with stream ID
GRPC->>Handler: DelayedPickComplete
Handler->>Metrics: increment DelayedPick counter (target)
GRPC->>Handler: Begin (IsTransparentRetryAttempt=true)
Handler->>Metrics: increment TransparentRetry counter (target)
GRPC->>Handler: OutPayload (WireLength, SentTime)
Handler->>Metrics: add to SendWireBytes (target)
Handler->>Metrics: set SendTime (target, stream)
GRPC->>Handler: InPayload (WireLength, RecvTime)
Handler->>Metrics: add to RecvWireBytes (target)
Handler->>Metrics: set RecvTime (target, stream)
GRPC->>Handler: End
Handler->>Metrics: delete SendTime/RecvTime label entries for (target, stream)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/client/conn_monitor.go`:
- Around line 129-135: The metrics use whole-second precision via
ev.SentTime.Unix()/ev.RecvTime.Unix(), causing precision loss; update
batchClientStreamMetrics.onOutPayload and onInPayload to compute seconds with
sub-second precision by converting UnixNano() to float64 seconds (divide
UnixNano by 1e9 or by float64(time.Second)) and set sendTime/recvTime with that
value; ensure the time package is imported if needed and reference
stats.OutPayload, stats.InPayload, sendTime, recvTime, onOutPayload and
onInPayload when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 10c4c5e0-d3b9-4dea-8cf3-897c8cf24ac3
📒 Files selected for processing (4)
internal/client/conn_monitor.gointernal/client/conn_monitor_test.gointernal/client/conn_pool.gometrics/metrics.go
Signed-off-by: zyguan <zhongyangguan@gmail.com>
Signed-off-by: zyguan <zhongyangguan@gmail.com>
| if i.FullMethodName != batchCommandsMethod { | ||
| return ctx | ||
| } | ||
| return context.WithValue(ctx, batchClientStreamKey{}, atomic.AddUint64(&globalBatchClientStreamID, 1)) |
There was a problem hiding this comment.
Could we associate a conn index with streamID ?
[LGTM Timeline notifier]Timeline:
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: lcwangchao The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/hold |
This PR implements gRPC statistics monitoring for the BatchCommands streaming RPC used in TiKV's batch client. It adds a new
batchClientStatsMonitortype that implements thestats.Handlerinterface from gRPC.The monitor tracks per-stream metrics (send/receive timestamps) and per-target metrics (wire bytes sent/received, delayed pick count, transparent retry count). When a new stream is created, it assigns a unique global ID to track that stream's lifecycle. Metrics are emitted to Prometheus gauges and counters, and properly cleaned up when streams end.
This enables better observability for BatchCommands streams, allowing operators to monitor send/receive timing, traffic volume, and retry behavior.
Summary by CodeRabbit
New Features
Tests