Python: Add field and table name escaping for python SqlServer connector#13893
Python: Add field and table name escaping for python SqlServer connector#13893westey-m wants to merge 4 commits intomicrosoft:mainfrom
Conversation
There was a problem hiding this comment.
Automated Code Review
Reviewers: 4 | Confidence: 91%
✓ Correctness
This PR adds a consistent
escape_identifierstatic method toQueryBuilderthat properly escapes SQL Server delimited identifiers by replacing]with]]and wrapping in square brackets. All identifier interpolation points across the module are updated to use this method, preventing SQL injection through crafted identifier names. The PR also fixes a latent bug in_build_create_table_querywhere the PRIMARY KEY clause usedkey_field.nameinstead ofkey_field.storage_name or key_field.name, making it inconsistent with other field references. Tests are comprehensive and verify escaping, injection prevention, and updated query formats.
✓ Security Reliability
This diff introduces a proper SQL Server identifier escaping mechanism (
escape_identifier) that replaces]with]]and wraps identifiers in square brackets, then applies it consistently across all query-building functions. This is a solid security improvement that closes identifier injection vectors in table names, column names, and filter expressions. The implementation follows the standard SQL Server delimited identifier escaping convention, and the tests include explicit injection prevention checks. No security or reliability issues were found in the changed code.
✓ Test Coverage
This PR properly introduces identifier escaping for SQL Server to prevent SQL injection via bracket-quoted identifiers. The
escape_identifierstatic method is well-tested with unit tests including edge cases (empty string, nested brackets). All existing query-builder tests were updated with the new escaped output format. Three new tests were added specifically for escaping. However, there is a notable gap: none of the query-building integration tests exercise fields wherestorage_nameis set and differs fromname. This is especially relevant because the PRIMARY KEY clause in_build_create_table_querywas changed fromkey_field.nametokey_field.storage_name or key_field.name— a semantic change beyond just escaping — and this path is completely untested.
✓ Design Approach
The PR correctly centralizes SQL Server identifier escaping into a single
escape_identifierstatic method that handles the]→]]edge case and wraps identifiers in square brackets. This replaces an inconsistent mix of double-quote ("field") and unescaped bracket ([field]) quoting across the query-building functions, and also fixes a pre-existing bug in_build_create_table_querywhere the PRIMARY KEY declaration usedkey_field.namealone while the column definition usedkey_field.storage_name or key_field.name. The approach is well-placed onQueryBuilder(the SQL-building utility class), and all call sites from module-level functions usingQueryBuilder.escape_identifier(...)are valid sinceQueryBuilderis a module-level class defined before those functions. The hardcoded table aliasestandsin the MERGE query are not user input and correctly left unescaped. No design or correctness issues found.
Suggestions
- Add a query-builder test (e.g., for
_build_create_table_queryor_build_merge_query) with a key field whosestorage_namediffers fromname(ideally containing]). This would exercise both thestorage_name or namefallback logic and the escaping integration, since the PRIMARY KEY clause was changed fromkey_field.nametokey_field.storage_name or key_field.name— a semantic change beyond escaping that currently lacks dedicated test coverage.
Automated review by westey-m's agents
Python Test Coverage Report •
Python Unit Test Overview
|
||||||||||||||||||||||||||||||
There was a problem hiding this comment.
Pull request overview
Adds consistent SQL Server identifier escaping (table/field names) in the Python SqlServer connector to better handle special characters and reduce injection risk from untrusted identifiers, and updates unit tests to validate the new quoting behavior.
Changes:
- Introduced
QueryBuilder.escape_identifier()and applied it across query construction paths (DDL/DML/filter parsing). - Updated table-name building to escape schema/table components and adjusted generated SQL to use bracket-quoted identifiers.
- Expanded/updated unit tests to assert escaped output and bracket-quoting behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| python/semantic_kernel/connectors/sql_server.py | Adds identifier escaping helper and applies it throughout query building and filter parsing. |
| python/tests/unit/connectors/memory/test_sql_server.py | Updates expected SQL strings and adds tests for identifier escaping/bracket handling. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Automated Code Review
Reviewers: 4 | Confidence: 91%
✓ Correctness
This PR systematically adds proper SQL Server identifier escaping via a new
escape_identifierstatic method that bracket-quotes identifiers (escaping]to]]). It addresses all query-building functions: CREATE TABLE, MERGE (upsert), SELECT, DELETE, and VECTOR_DISTANCE search queries. The OBJECT_ID string-literal context is separately handled by additionally escaping single quotes. A pre-existing bug where the PRIMARY KEY clause usedkey_field.nameinstead ofkey_field.storage_name or key_field.nameis also fixed. The list-comprehension-for-side-effects anti-pattern is replaced with a proper for loop. All changes are well-tested, including injection and bracket-escaping edge cases. No correctness issues found.
✓ Security Reliability
This PR systematically adds SQL Server bracket-quoting via a new
escape_identifierhelper to all dynamically constructed identifiers (field names, schema/table names) throughout the SQL Server connector. It correctly addresses three previously reported issues: the OBJECT_ID single-quote injection, the list-comprehension-for-side-effects anti-pattern, and the duplicatedescape_identifiercall in UPDATE SET. The escaping logic (]→]]inside[…]) follows the SQL Server standard for delimited identifiers. The OBJECT_ID string-literal context additionally escapes'→''via a separate.replace()call. All interpolated non-identifier values (distance_function, dimensions, skip, top) come from controlled sources (hardcoded maps or Pydantic-validated integers). Test coverage for the new escaping behavior is adequate, including an injection-attempt test case. No new security or reliability issues were found.
✓ Test Coverage
The PR adds proper SQL Server identifier escaping via
escape_identifier()and updates all query-building functions to use it. Test coverage for the new escaping is solid: the newescape_identifierunit test covers edge cases, table name bracket-escaping and SQL injection prevention are tested, and the OBJECT_ID single-quote fix has a dedicated test. However, the diff also silently fixes a bug where PRIMARY KEY and OUTPUT clauses usedkey_field.nameinstead ofkey_field.storage_name or key_field.name, but no test exercises a key field whosestorage_namediffers fromname, leaving this correctness fix unverified.
✓ Design Approach
This PR correctly addresses a prior SQL injection concern by centralising identifier quoting through a new
escape_identifierstatic helper (bracket-quoting with]→]]) and fixing theOBJECT_IDstring-literal context by additionally escaping'→''on the formatted table ref. Previously flagged issues (list-comprehension side-effects, doubleescape_identifiercomputation in the UPDATE SET list) are also resolved. The design is sound:distance_functionis sourced from a hardcodedDISTANCE_FUNCTION_MAP,SCORE_FIELD_NAMEis a module-level constant, and thetable_identifieraliases (s/t) are hardcoded — none are user-controlled. No new design problems are introduced.
Suggestions
- Add a test for
_build_create_table_query(and/or_build_merge_query) where the key field has astorage_namedifferent fromname(e.g.,VectorStoreField("key", name="id", type="str", storage_name="pk_id")). The diff changeskey_field.name→key_field.storage_name or key_field.namein the PRIMARY KEY and OUTPUT clauses — this correctness fix currently has no test coverage and could regress unnoticed.
Automated review by westey-m's agents
Motivation and Context
Description
Contribution Checklist