feat(client): rate-limit and reuse RPC for connectionless instance APIs#35175
feat(client): rate-limit and reuse RPC for connectionless instance APIs#35175Pengrongkun wants to merge 4 commits intomainfrom
Conversation
- Share a single process-wide RPC client (lazy rpcOpen, closed in taos_cleanup) instead of opening/closing per call, to reduce overhead and churn. - Enforce a fixed sliding window: at most 100 calls per second (register + list combined); return TSDB_CODE_TSC_INSTANCE_API_RATE_LIMIT when exceeded. - Reset the window when the clock moves backwards (NTP) to avoid stuck state. - Always set terrno from taos_list_instances return value so successful calls clear a previous rate-limit errno. Public API remains connectionless (no TAOS*); management endpoints use firstEp/secondEp from client config.
There was a problem hiding this comment.
Code Review
This pull request introduces a client-side rate limiter (100 calls/sec) and a process-wide RPC singleton for instance-related APIs to protect the mnode and optimize resource usage. Review feedback identifies a performance bottleneck where a global mutex is held during blocking network operations, a potential race condition in the cleanup logic where the mutex is destroyed before the ready flag is reset, and inconsistent handling of RPC buffer deallocation between the registration and listing functions.
There was a problem hiding this comment.
Pull request overview
This PR optimizes connectionless client “instance” management APIs by reusing a process-wide RPC client and adding client-side rate limiting to reduce RPC open/close churn and protect mnode from bursts.
Changes:
- Add a new client error code/message for instance API rate limiting.
- Implement a global singleton RPC client for
taos_register_instance/taos_list_instances, cleaned up viataos_cleanup. - Add a fixed-window (100 calls / 1s) rate limiter shared by register+list, plus a unit/integration test.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| source/util/src/terror.c | Adds the error string for the new rate-limit error code. |
| include/util/taoserror.h | Defines TSDB_CODE_TSC_INSTANCE_API_RATE_LIMIT. |
| source/client/src/clientMain.c | Introduces global instance-RPC reuse + shared client-side rate limiting; wires cleanup into taos_cleanup. |
| source/client/test/instanceTest.cpp | Adds a test covering the new rate limiting behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
代码审查报告发现 4 个问题(1 高、1 中、2 低): 🔴 高优先级[1]
修复建议:若 🟡 中优先级[2] 限流器初始化失败会永久屏蔽所有实例 API 调用,且错误码误导
修复建议:限流器初始化失败时,记录日志后放行(fail-open)——限流器故障不应阻断合法 API 调用;若必须 fail-closed,改用 🔵 低优先级[3] 文件以 [4]
|
6953d4f to
912fcc4
Compare
There was a problem hiding this comment.
Pull request overview
Adds client-side safeguards and performance improvements for the connectionless instance management APIs by reusing a single process-wide RPC client and enforcing a per-process rate limit, while extending error reporting and tests accordingly.
Changes:
- Reuse a singleton RPC client for
taos_register_instance/taos_list_instances, with cleanup intaos_cleanup. - Add a fixed-window rate limiter (100 calls / 1s) for instance register/list calls and introduce a dedicated error code/message.
- Add a unit/integration test covering the new rate-limit behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| source/client/src/clientMain.c | Implements global instance RPC singleton + rate limiter; hooks singleton cleanup into taos_cleanup; updates register/list to use the shared RPC. |
| include/util/taoserror.h | Adds TSDB_CODE_TSC_INSTANCE_API_RATE_LIMIT error code. |
| source/util/src/terror.c | Adds human-readable error string for the new rate-limit error code. |
| source/client/test/instanceTest.cpp | Adds a new test case for instance API rate limiting. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| instanceRpcRelease(); | ||
| terrno = code; | ||
| return code; |
| static TdThreadMutex gInstRpcMutex; | ||
| static TdThreadCond gInstRpcCond; | ||
| static volatile int32_t gInstRpcMutexReady = 0; | ||
| static volatile int32_t gInstRpcCondReady = 0; | ||
| static void *gInstRpc = NULL; |
| taosMsleep(1000); | ||
|
|
||
| int32_t code = TSDB_CODE_SUCCESS; | ||
| int64_t t0 = taosGetTimestampMs(); | ||
| for (int i = 0; i < 100; i++) { | ||
| code = taos_register_instance("id-1", "taosadapter", "desc:test_instance", 100); | ||
| ASSERT_EQ(code, TSDB_CODE_SUCCESS); | ||
| } | ||
| int64_t t1 = taosGetTimestampMs(); | ||
|
|
||
| code = taos_register_instance("id-1", "taosadapter", "desc:test_instance", 100); | ||
| if (t1 - t0 >= 1000) { | ||
| ASSERT_EQ(code, TSDB_CODE_SUCCESS); | ||
| } else { | ||
| ASSERT_EQ(code, TSDB_CODE_TSC_INSTANCE_API_RATE_LIMIT); | ||
| } |
Public API remains connectionless (no TAOS*); management endpoints use firstEp/secondEp from client config.
Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.