Skip to content

Python: Add field and table name escaping for python SqlServer connector#13893

Open
westey-m wants to merge 4 commits intomicrosoft:mainfrom
westey-m:python-sql-escape
Open

Python: Add field and table name escaping for python SqlServer connector#13893
westey-m wants to merge 4 commits intomicrosoft:mainfrom
westey-m:python-sql-escape

Conversation

@westey-m
Copy link
Copy Markdown
Contributor

Motivation and Context

Description

  • Add field and table name escaping for python SqlServer connector

Contribution Checklist

Copilot AI review requested due to automatic review settings April 20, 2026 17:11
@westey-m westey-m requested a review from a team as a code owner April 20, 2026 17:11
@moonbox3 moonbox3 added the python Pull requests for the Python Semantic Kernel label Apr 20, 2026
@westey-m westey-m marked this pull request as draft April 20, 2026 17:12
@github-actions github-actions bot changed the title Add field and table name escaping for python SqlServer connector Python: Add field and table name escaping for python SqlServer connector Apr 20, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 91%

✓ Correctness

This PR adds a consistent escape_identifier static method to QueryBuilder that properly escapes SQL Server delimited identifiers by replacing ] with ]] and wrapping in square brackets. All identifier interpolation points across the module are updated to use this method, preventing SQL injection through crafted identifier names. The PR also fixes a latent bug in _build_create_table_query where the PRIMARY KEY clause used key_field.name instead of key_field.storage_name or key_field.name, making it inconsistent with other field references. Tests are comprehensive and verify escaping, injection prevention, and updated query formats.

✓ Security Reliability

This diff introduces a proper SQL Server identifier escaping mechanism (escape_identifier) that replaces ] with ]] and wraps identifiers in square brackets, then applies it consistently across all query-building functions. This is a solid security improvement that closes identifier injection vectors in table names, column names, and filter expressions. The implementation follows the standard SQL Server delimited identifier escaping convention, and the tests include explicit injection prevention checks. No security or reliability issues were found in the changed code.

✓ Test Coverage

This PR properly introduces identifier escaping for SQL Server to prevent SQL injection via bracket-quoted identifiers. The escape_identifier static method is well-tested with unit tests including edge cases (empty string, nested brackets). All existing query-builder tests were updated with the new escaped output format. Three new tests were added specifically for escaping. However, there is a notable gap: none of the query-building integration tests exercise fields where storage_name is set and differs from name. This is especially relevant because the PRIMARY KEY clause in _build_create_table_query was changed from key_field.name to key_field.storage_name or key_field.name — a semantic change beyond just escaping — and this path is completely untested.

✓ Design Approach

The PR correctly centralizes SQL Server identifier escaping into a single escape_identifier static method that handles the ]]] edge case and wraps identifiers in square brackets. This replaces an inconsistent mix of double-quote ("field") and unescaped bracket ([field]) quoting across the query-building functions, and also fixes a pre-existing bug in _build_create_table_query where the PRIMARY KEY declaration used key_field.name alone while the column definition used key_field.storage_name or key_field.name. The approach is well-placed on QueryBuilder (the SQL-building utility class), and all call sites from module-level functions using QueryBuilder.escape_identifier(...) are valid since QueryBuilder is a module-level class defined before those functions. The hardcoded table aliases t and s in the MERGE query are not user input and correctly left unescaped. No design or correctness issues found.

Suggestions

  • Add a query-builder test (e.g., for _build_create_table_query or _build_merge_query) with a key field whose storage_name differs from name (ideally containing ]). This would exercise both the storage_name or name fallback logic and the escaping integration, since the PRIMARY KEY clause was changed from key_field.name to key_field.storage_name or key_field.name — a semantic change beyond escaping that currently lacks dedicated test coverage.

Automated review by westey-m's agents

@moonbox3
Copy link
Copy Markdown
Collaborator

moonbox3 commented Apr 20, 2026

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
connectors
   sql_server.py5228982%275, 369–371, 391, 410, 421, 423, 446, 492–495, 511, 523–526, 548, 550, 594–600, 606–607, 609–610, 614–628, 631–636, 639–643, 648, 654–655, 658, 664–665, 667–670, 813, 832–834, 840, 846, 851, 856–859, 865, 900, 943, 950–952, 1121–1122, 1124, 1126, 1147
TOTAL28593564780% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
3874 23 💤 0 ❌ 0 🔥 1m 37s ⏱️

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds consistent SQL Server identifier escaping (table/field names) in the Python SqlServer connector to better handle special characters and reduce injection risk from untrusted identifiers, and updates unit tests to validate the new quoting behavior.

Changes:

  • Introduced QueryBuilder.escape_identifier() and applied it across query construction paths (DDL/DML/filter parsing).
  • Updated table-name building to escape schema/table components and adjusted generated SQL to use bracket-quoted identifiers.
  • Expanded/updated unit tests to assert escaped output and bracket-quoting behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
python/semantic_kernel/connectors/sql_server.py Adds identifier escaping helper and applies it throughout query building and filter parsing.
python/tests/unit/connectors/memory/test_sql_server.py Updates expected SQL strings and adds tests for identifier escaping/bracket handling.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread python/semantic_kernel/connectors/sql_server.py Outdated
Comment thread python/semantic_kernel/connectors/sql_server.py Outdated
Comment thread python/semantic_kernel/connectors/sql_server.py Outdated
@westey-m westey-m marked this pull request as ready for review April 21, 2026 08:39
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 91%

✓ Correctness

This PR systematically adds proper SQL Server identifier escaping via a new escape_identifier static method that bracket-quotes identifiers (escaping ] to ]]). It addresses all query-building functions: CREATE TABLE, MERGE (upsert), SELECT, DELETE, and VECTOR_DISTANCE search queries. The OBJECT_ID string-literal context is separately handled by additionally escaping single quotes. A pre-existing bug where the PRIMARY KEY clause used key_field.name instead of key_field.storage_name or key_field.name is also fixed. The list-comprehension-for-side-effects anti-pattern is replaced with a proper for loop. All changes are well-tested, including injection and bracket-escaping edge cases. No correctness issues found.

✓ Security Reliability

This PR systematically adds SQL Server bracket-quoting via a new escape_identifier helper to all dynamically constructed identifiers (field names, schema/table names) throughout the SQL Server connector. It correctly addresses three previously reported issues: the OBJECT_ID single-quote injection, the list-comprehension-for-side-effects anti-pattern, and the duplicated escape_identifier call in UPDATE SET. The escaping logic (]]] inside […]) follows the SQL Server standard for delimited identifiers. The OBJECT_ID string-literal context additionally escapes ''' via a separate .replace() call. All interpolated non-identifier values (distance_function, dimensions, skip, top) come from controlled sources (hardcoded maps or Pydantic-validated integers). Test coverage for the new escaping behavior is adequate, including an injection-attempt test case. No new security or reliability issues were found.

✓ Test Coverage

The PR adds proper SQL Server identifier escaping via escape_identifier() and updates all query-building functions to use it. Test coverage for the new escaping is solid: the new escape_identifier unit test covers edge cases, table name bracket-escaping and SQL injection prevention are tested, and the OBJECT_ID single-quote fix has a dedicated test. However, the diff also silently fixes a bug where PRIMARY KEY and OUTPUT clauses used key_field.name instead of key_field.storage_name or key_field.name, but no test exercises a key field whose storage_name differs from name, leaving this correctness fix unverified.

✓ Design Approach

This PR correctly addresses a prior SQL injection concern by centralising identifier quoting through a new escape_identifier static helper (bracket-quoting with ]]]) and fixing the OBJECT_ID string-literal context by additionally escaping ''' on the formatted table ref. Previously flagged issues (list-comprehension side-effects, double escape_identifier computation in the UPDATE SET list) are also resolved. The design is sound: distance_function is sourced from a hardcoded DISTANCE_FUNCTION_MAP, SCORE_FIELD_NAME is a module-level constant, and the table_identifier aliases (s/t) are hardcoded — none are user-controlled. No new design problems are introduced.

Suggestions

  • Add a test for _build_create_table_query (and/or _build_merge_query) where the key field has a storage_name different from name (e.g., VectorStoreField("key", name="id", type="str", storage_name="pk_id")). The diff changes key_field.namekey_field.storage_name or key_field.name in the PRIMARY KEY and OUTPUT clauses — this correctness fix currently has no test coverage and could regress unnoticed.

Automated review by westey-m's agents

Comment thread python/tests/unit/connectors/memory/test_sql_server.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python Pull requests for the Python Semantic Kernel

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants