Conversation
parser: 无主键时回退到第一个时间戳列 builtins: 放宽 elapsed 等函数主键检查 docs: 中文需求文档(行为对比/函数分类) test: 54条SQL + 298行期望输出(文件比较模式)
There was a problem hiding this comment.
Code Review
This pull request introduces the 'Timeline Fallback' feature, which allows timeline and window functions to automatically use the first available TIMESTAMP column when the primary key timestamp is missing. The changes include a new design document, modifications to the timestamp validation logic in builtins.c, and the addition of comprehensive test cases. Review feedback correctly identifies that several core implementation files described in the design document (such as parTranslater.c) are missing from the current patch. Additionally, suggestions were made to rename validation functions to reflect their expanded scope and to replace placeholder Jira IDs in the test documentation.
|
|
||
| ## 4. 实现要点 | ||
|
|
||
| ### 4.1 修改点1:`findAndSetTempTableColumn()` (parTranslater.c) |
| static int32_t checkPrimTS(SNode* pNode, bool* isMatch) { | ||
| int32_t code = TSDB_CODE_SUCCESS; | ||
| if (nodeType(pNode) != QUERY_NODE_COLUMN || !IS_TIMESTAMP_TYPE(getSDataTypeFromNode(pNode)->type) || | ||
| !((SColumnNode*)pNode)->isPrimTs) { | ||
| if (nodeType(pNode) != QUERY_NODE_COLUMN || !IS_TIMESTAMP_TYPE(getSDataTypeFromNode(pNode)->type)) { |
|
|
||
| Since: v3.4.1.0 | ||
| Labels: function, timeseries, timeline_fallback, subquery | ||
| Jira: TS-XXXX |
There was a problem hiding this comment.
Pull request overview
该 PR 为“时间线回退(Timeline Fallback)”补齐解析器/内置函数参数校验的支持,并新增对应的文档与基于文件对比的 SQL 回归用例,以便在子查询/UNION ALL 等无主键时间列场景下,尽可能回退到首个可用 TIMESTAMP 列建立时间线。
Changes:
- parser:临时表找不到
_rowts时回退到投射列表中的首个 TIMESTAMP 表达式;并放宽 ORDER BY / SESSION 窗口的时间线判定 - builtins:放宽
FUNC_PARAM_MUST_BE_PRIMTS的校验逻辑(允许非主键 TIMESTAMP) - tests/docs:新增中文需求设计文档与覆盖多场景的 SQL 文件对比测试,并接入 CI task
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| test/ci/cases.task | 将新增的 timeline fallback 用例纳入 CI 执行列表 |
| test/cases/11-Functions/04-Timeseries/test_fun_ts_timeline_fallback.py | 新增基于 .in/.ans 文件对比的 pytest 驱动用例 |
| test/cases/11-Functions/04-Timeseries/in/test_timeline_fallback.in | 新增 54 条 SQL,覆盖无主键/UNION/窗口/边界与仍需报错场景 |
| test/cases/11-Functions/04-Timeseries/ans/test_timeline_fallback.ans | 新增 298 行期望输出(含报错信息)用于回归对比 |
| source/libs/parser/src/parTranslater.c | 实现临时表 _rowts 回退、ORDER BY 时间线判定放宽、SESSION 列校验放宽 |
| source/libs/function/src/builtins.c | 放宽 checkPrimTS():仅要求 TIMESTAMP 列,不再要求 isPrimTs |
| docs/design/timeline-fallback-requirement.md | 新增中文需求/行为对比/实现点/测试覆盖说明文档 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (!isPrimaryKeyImpl((SNode*)pSession->pCol) && | ||
| !IS_TIMESTAMP_TYPE(((SColumnNode*)pSession->pCol)->node.resType.type)) { | ||
| PAR_ERR_RET(generateSyntaxErrMsg(&pCxt->msgBuf, TSDB_CODE_PAR_INTER_SESSION_COL)); |
There was a problem hiding this comment.
checkSessionWindow() now casts pSession->pCol to SColumnNode* to read node.resType.type without first verifying the node is a column. If pCol is ever not QUERY_NODE_COLUMN (e.g., parser produced an expression), this becomes undefined behavior and can crash. Add an explicit nodeType(pSession->pCol) == QUERY_NODE_COLUMN guard (and preferably use getSDataTypeFromNode() / IS_TIMESTAMP_TYPE() for the type check) before dereferencing.
| SNode* pOrder = ((SOrderByExprNode*)nodesListGetNode(pSelect->pOrderByList, 0))->pExpr; | ||
| if ((QUERY_NODE_TEMP_TABLE == nodeType(pSelect->pFromTable) && isPrimaryKeyImpl(pOrder)) || | ||
| (QUERY_NODE_TEMP_TABLE != nodeType(pSelect->pFromTable) && isPrimaryKeyImpl(pOrder))) { | ||
| (QUERY_NODE_TEMP_TABLE != nodeType(pSelect->pFromTable) && isPrimaryKeyImpl(pOrder)) || | ||
| TSDB_DATA_TYPE_TIMESTAMP == ((SExprNode*)pOrder)->resType.type) { | ||
| pSelect->timeLineResMode = TIME_LINE_GLOBAL; |
There was a problem hiding this comment.
resetResultTimeline() sets TIME_LINE_GLOBAL when the first ORDER BY expression has resType TIMESTAMP, even if the expression is a literal/constant or otherwise does not reference any input column. This can incorrectly treat queries like ORDER BY cast(0 as timestamp) as having a valid timeline, potentially allowing timeline/window functions to run with an undefined row order. Consider requiring the ORDER BY expr to reference at least one column (e.g., nodesExprHasColumn(pOrder)) and/or restricting this to timestamp columns (or expressions derived from columns) rather than constants.
| SNode* pOrder = ((SOrderByExprNode*)nodesListGetNode(pSetOperator->pOrderByList, 0))->pExpr; | ||
| if (isPrimaryKeyImpl(pOrder)) { | ||
| if (isPrimaryKeyImpl(pOrder) || TSDB_DATA_TYPE_TIMESTAMP == ((SExprNode*)pOrder)->resType.type) { | ||
| pSetOperator->timeLineResMode = TIME_LINE_GLOBAL; |
There was a problem hiding this comment.
translateSetOperOrderBy() uses the same resType.type == TIMESTAMP shortcut as resetResultTimeline(), which will also mark UNION/SET queries as TIME_LINE_GLOBAL even when ORDER BY is a constant timestamp expression (no column reference). That can make timeline-dependent functions/windows accept inputs with no well-defined ordering. Apply the same safeguard here (e.g., require nodesExprHasColumn(pOrder) or ensure ORDER BY is a timestamp column/expression derived from columns).
| // Track first TIMESTAMP column for fallback | ||
| if (pFirstTsExpr == NULL && TSDB_DATA_TYPE_TIMESTAMP == pExpr->resType.type) { | ||
| pFirstTsExpr = pExpr; | ||
| } | ||
| } | ||
| // Fallback: when _rowts not found but internal primary key col requested, use first TIMESTAMP col | ||
| if (!*pFound && isInternalPrimaryKey(pCol) && pFirstTsExpr != NULL) { | ||
| pFoundExpr = pFirstTsExpr; | ||
| *pFound = true; |
There was a problem hiding this comment.
In findAndSetTempTableColumn(), the fallback tracks the first projected expression whose resType is TIMESTAMP, but the requirement/doc says “第一个 TIMESTAMP 列”. As written, a computed/constant TIMESTAMP expression (e.g., now()/cast(...)) could be selected as the fallback timeline, which may not provide a meaningful or stable row order. Consider limiting the fallback to QUERY_NODE_COLUMN timestamp columns (or at least prefer real columns over non-column expressions).
| int32_t code = TSDB_CODE_SUCCESS; | ||
| if (nodeType(pNode) != QUERY_NODE_COLUMN || !IS_TIMESTAMP_TYPE(getSDataTypeFromNode(pNode)->type) || | ||
| !((SColumnNode*)pNode)->isPrimTs) { | ||
| if (nodeType(pNode) != QUERY_NODE_COLUMN || !IS_TIMESTAMP_TYPE(getSDataTypeFromNode(pNode)->type)) { |
There was a problem hiding this comment.
checkPrimTS() no longer enforces ((SColumnNode*)pNode)->isPrimTs, which broadens validation for all params marked FUNC_PARAM_MUST_BE_PRIMTS. In this file that attribute is used by elapsed() (intended) but also by the internal _irate_partial signature (see around builtins.c:3033-3046). If _irate_partial still requires an actual primary timestamp, this change will silently relax its contract. Consider introducing a separate param attribute/checker for “any TIMESTAMP column” (used by elapsed) instead of changing the semantics of MUST_BE_PRIMTS globally.
| if (nodeType(pNode) != QUERY_NODE_COLUMN || !IS_TIMESTAMP_TYPE(getSDataTypeFromNode(pNode)->type)) { | |
| if (nodeType(pNode) != QUERY_NODE_COLUMN || !IS_TIMESTAMP_TYPE(getSDataTypeFromNode(pNode)->type) || | |
| !((SColumnNode*)pNode)->isPrimTs) { |
parser: 无主键时回退到第一个时间戳列
builtins: 放宽 elapsed 等函数主键检查
docs: 中文需求文档(行为对比/函数分类)
test: 54条SQL + 298行期望输出(文件比较模式)
Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.