Merged
15 changes: 10 additions & 5 deletions tests/system-test/1-insert/boundary.py
@@ -26,6 +26,7 @@ def init(self, conn, logSql, replicaVar=1):
self.replicaVar = int(replicaVar)
tdLog.debug("start to execute %s" % __file__)
tdSql.init(conn.cursor())
tdSql.execute("alter local 'maxSQLLength' '1048576'")
self.boundary = DataBoundary()
self.dbname_length_boundary = self.boundary.DBNAME_MAX_LENGTH
self.tbname_length_boundary = self.boundary.TBNAME_MAX_LENGTH
@@ -313,6 +314,9 @@ def test_sql_length_boundary(self):
tdSql.execute("drop database if exists db_maxsql_large")
tdSql.execute("create database db_maxsql_large")
tdSql.execute("use db_maxsql_large")

# Fixed base timestamp (ms) to avoid duplicate ts from now+offset
base_ts_ms = 1704067200000 # 2024-01-01 00:00:00.000

cols = []
for i in range(100):
@@ -324,7 +328,7 @@ def test_sql_length_boundary(self):
insert_sql = "insert into t1 values "
values = []
for i in range(1000):
- val_cols = [f"now+{i}s"] + [str(j) for j in range(100)]
+ val_cols = [str(base_ts_ms + i)] + [str(j) for j in range(100)]
values.append(f"({', '.join(val_cols)})")
insert_sql += ", ".join(values)

@@ -342,7 +346,7 @@ def test_sql_length_boundary(self):
base_len = len(base_sql)
tdLog.info(f"10MB sql length: {base_len}")
num_values = (target_length - base_len) // 16
values = [f"(now+{i}s,{i})" for i in range(num_values)]
values = [f"({base_ts_ms + i},{i})" for i in range(num_values)]
Contributor (severity: medium)
The num_values used here is calculated (on line 348) using a hardcoded divisor of 16. With the switch to a fixed 13-digit timestamp, the average length of each value tuple (timestamp, value) is now significantly higher (approx 20-23 bytes). This makes the initial estimate for num_values inaccurate, leading to the generation of a much larger SQL string than intended before any trimming logic is applied. Consider updating the divisor to a more realistic value or calculating it dynamically based on the sample row length.
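The reviewer's "calculate it dynamically" suggestion could be sketched roughly as follows. This is a hypothetical snippet, not the test's actual code: `base_ts_ms`, `target_length`, and `base_len` mirror the test's names, and the sample-row trick is one possible way to derive the divisor.

```python
# Measure one representative tuple and derive the divisor from it instead
# of hardcoding 16. The concrete numbers here are illustrative.
base_ts_ms = 1704067200000
target_length = 10 * 1024 * 1024 - 100
base_sql = "insert into t2 values "
base_len = len(base_sql)

# A 13-digit epoch-ms timestamp, a small value, and the joining comma.
sample_row = f"({base_ts_ms},{0}),"
per_row = len(sample_row)  # 18 bytes for this sample; grows with larger i

num_values = (target_length - base_len) // per_row
```

Because the value `i` gains digits as it grows, `per_row` measured on a small sample still slightly overestimates `num_values`, but far less than a hardcoded 16 does, so the trimming loop afterwards has much less work.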

insert_sql = base_sql + ",".join(values)
current_len = len(insert_sql)
if current_len < target_length:
@@ -363,15 +367,16 @@ def test_sql_length_boundary(self):
tdSql.execute("alter local 'maxSQLLength' '10485760'")
tdSql.execute(insert_sql)
tdSql.query("select * from t2")
- tdSql.checkRows(509902)
+ tdSql.checkRows(460732)
Contributor (severity: medium)
The expected row count 460732 is hardcoded and highly sensitive to the specific string length of the timestamps and values. If the base_ts_ms or the value generation logic changes, this test will break again. It would be more robust to verify the row count dynamically (e.g., by checking against the actual number of elements in the values list after it has been adjusted to fit the target_length) rather than using a magic number.
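One way to realize this, sketched as a hypothetical helper rather than the test's actual code: build the SQL and count rows in the same loop, so the count used in the assertion is the number of tuples actually emitted, whatever the timestamp width or formatting.

```python
def build_insert(base_ts_ms, target_length):
    """Append tuples until the next one would exceed target_length;
    return the SQL string and the number of rows it actually contains."""
    base_sql = "insert into t2 values "
    parts, length = [], len(base_sql)
    i = 0
    while True:
        tup = f"({base_ts_ms + i},{i})"
        extra = len(tup) + (1 if parts else 0)  # +1 for the joining comma
        if length + extra > target_length:
            break
        parts.append(tup)
        length += extra
        i += 1
    return base_sql + ",".join(parts), len(parts)

sql, expected_rows = build_insert(1704067200000, 200)  # tiny illustrative target
# In the test, this count would replace the magic number:
# tdSql.execute(sql); tdSql.checkRows(expected_rows)
```

Any later change to `base_ts_ms` or the tuple format then changes `expected_rows` automatically instead of breaking a hardcoded assertion.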


# Test with 64MB limit - generate INSERT SQL with length slightly less than 64MB
target_length_64mb = 64 * 1024 * 1024 - 100 # 64MB minus 100 bytes (slightly less)
base_sql_64mb = "insert into t2 values "
base_len_64mb = len(base_sql_64mb)
tdLog.info(f"64MB sql length: {base_len_64mb}")
num_values_64mb = (target_length_64mb - base_len_64mb) // 16
- values_64mb = [f"(now+{i}s,{i})" for i in range(num_values_64mb)]
+ base_ts_64mb = base_ts_ms + num_values
+ values_64mb = [f"({base_ts_64mb + i},{i})" for i in range(num_values_64mb)]
insert_sql_64mb = base_sql_64mb + ",".join(values_64mb)
Comment on lines 375 to 380
Copilot AI Apr 15, 2026
Generating values_64mb builds a Python list with ~4.2M tuple strings before truncation, which can consume hundreds of MB of memory and significantly slow down or OOM CI. Consider constructing the SQL incrementally (e.g., via io.StringIO / chunked appends) and stop once the target length is reached, instead of materializing all values first.
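Copilot's `io.StringIO` suggestion could look something like this hypothetical helper (names are illustrative, not the test's): tuples are written straight into the buffer and generation stops the moment the target length is reached, so no multi-million-element list is ever materialized.

```python
import io

def build_sql_incremental(base_sql, base_ts_ms, target_length):
    """Stream tuples into a StringIO and stop at target_length."""
    buf = io.StringIO()
    buf.write(base_sql)
    length, rows, i = len(base_sql), 0, 0
    while True:
        piece = f"({base_ts_ms + i},{i})"
        if rows:
            piece = "," + piece  # joining comma for all but the first tuple
        if length + len(piece) > target_length:
            break
        buf.write(piece)
        length += len(piece)
        rows += 1
        i += 1
    return buf.getvalue(), rows

# Tiny illustrative target; the test would pass ~64 * 1024 * 1024 - 100.
sql, rows = build_sql_incremental("insert into t2 values ", 1704067200000, 4096)
```

Peak memory is then bounded by the final SQL string itself, and the returned `rows` doubles as the expected count for the subsequent `checkRows` assertion.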


# Adjust to target length
@@ -395,7 +400,7 @@ def test_sql_length_boundary(self):
tdSql.execute("alter local 'maxSQLLength' '67108864'")
tdSql.execute(insert_sql_64mb)
tdSql.query("select * from t2")
- tdSql.checkRows(3524291)
+ tdSql.checkRows(3303225)
Comment on lines 400 to +403
Copilot AI Apr 15, 2026
Same as above: this hardcoded total row count depends on the exact SQL string construction and previous insert size, making the test fragile across small formatting changes. Prefer asserting on computed expected counts (tuple count / count(*) delta) rather than fixed constants.

Contributor (severity: medium)
Similar to the 10MB case, the expected row count 3303225 is a hardcoded magic number that makes the test fragile. Consider calculating the expected count dynamically based on the number of rows actually included in the insert_sql_64mb.
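The "count(*) delta" variant both reviewers allude to could be sketched like this. It is self-contained for illustration only: the list stands in for table t2, and comments show the `tdSql` calls it would replace in the real test.

```python
# Illustrative size; the real list has millions of tuples.
values_64mb = [f"({1704067200000 + i},{i})" for i in range(50)]

rows = []                                # t2's existing contents
rows_before = len(rows)                  # tdSql.query("select count(*) from t2")
rows.extend(values_64mb)                 # tdSql.execute(insert_sql_64mb)
rows_after = len(rows)

# Assert on the delta, not on an absolute total like 3303225:
assert rows_after - rows_before == len(values_64mb)
```

The assertion then survives both formatting changes in this insert and changes to how many rows the earlier 10MB insert left in the table.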


# test out of boundary value
tdSql.error("alter local 'maxSQLLength' '67108865'")