feat: add column mapping support for add_column #2389

Open
william-ch-databricks wants to merge 4 commits into delta-io:main from william-ch-databricks:stack/alter-table-5-column-mapping-add
william-ch-databricks (Contributor) commented Apr 14, 2026

🥞 Stacked PR

Use this link to review incremental changes.


What changes are proposed in this pull request?

Extends ALTER TABLE ... ADD COLUMN to work on tables with column mapping enabled. New columns get a fresh column ID and a UUID-based physical name assigned, and delta.columnMapping.maxColumnId is updated in table properties.

How was this change tested?

  • Unit tests covering name mode, id mode, and the missing-maxColumnId error case.
  • Integration tests that add a column to a column-mapping table and verify the metadata is assigned correctly, plus round-trip data read after add.
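
For reviewers who want a concrete picture of the bookkeeping described above, here is a minimal sketch, not the code in this PR. It assumes the table properties are a plain `HashMap<String, String>`. `SketchError`, `AssignedMapping`, and `assign_new_column` are hypothetical names, and the physical-name generator is passed in as a closure so the sketch stays free of a UUID dependency.

```rust
use std::collections::HashMap;

/// Table property named in the PR description.
const MAX_COLUMN_ID_KEY: &str = "delta.columnMapping.maxColumnId";

/// Hypothetical error type for this sketch (not the kernel's real error enum).
#[derive(Debug, PartialEq)]
enum SketchError {
    MissingMaxColumnId,
    InvalidMaxColumnId(String),
}

/// Hypothetical column-mapping metadata attached to a newly added column.
#[derive(Debug, PartialEq)]
struct AssignedMapping {
    id: i64,
    physical_name: String,
}

/// Reserve the next column ID and record the new maximum in the table properties.
/// `new_physical_name` stands in for a UUID-based name generator (assumption).
fn assign_new_column(
    config: &mut HashMap<String, String>,
    new_physical_name: impl FnOnce() -> String,
) -> Result<AssignedMapping, SketchError> {
    // A column-mapping table without the property is treated as an error.
    let raw = config
        .get(MAX_COLUMN_ID_KEY)
        .ok_or(SketchError::MissingMaxColumnId)?;
    let max_id: i64 = raw
        .parse()
        .map_err(|_| SketchError::InvalidMaxColumnId(raw.clone()))?;

    // The new column takes the next ID, which becomes the new maximum.
    let id = max_id + 1;
    config.insert(MAX_COLUMN_ID_KEY.to_string(), id.to_string());

    Ok(AssignedMapping {
        id,
        physical_name: new_physical_name(),
    })
}
```

Calling `assign_new_column` on a config whose `maxColumnId` is `4` yields ID `5` and rewrites the property to `5`. A config without the property returns `MissingMaxColumnId`, which mirrors the error case listed in the unit tests.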

william-ch-databricks force-pushed the stack/alter-table-5-column-mapping-add branch 17 times, most recently from 36cd0da to d943ca2, on April 17, 2026 21:15
github-merge-queue Bot pushed a commit that referenced this pull request Apr 17, 2026
…2385)

## Stacked PR
Use this [link](https://github.com/delta-io/delta-kernel-rs/pull/2385/files) to review incremental changes.
- [**stack/alter-table-1-refactor-state**](#2385) [[Files changed](https://github.com/delta-io/delta-kernel-rs/pull/2385/files)]
- [stack/alter-table-2-supports-data-files](#2386) [[Files changed](https://github.com/delta-io/delta-kernel-rs/pull/2386/files/029d6672081d32caea314dca61f3688e841f3036..50130c30c4378997fbf3d0ef7426ce23992e5c85)]
- [stack/alter-table-3-framework-add-column](#2387) [[Files changed](https://github.com/delta-io/delta-kernel-rs/pull/2387/files/50130c30c4378997fbf3d0ef7426ce23992e5c85..eaa5277d025434aa78b79beed1a7cedfe82aa621)]
- [stack/alter-table-4-set-nullable](#2388) [[Files changed](https://github.com/delta-io/delta-kernel-rs/pull/2388/files/eaa5277d025434aa78b79beed1a7cedfe82aa621..7506e1273ef349fbf628e4a6db5d0e7c1b4cbd8b)]
- [stack/alter-table-5-column-mapping-add](#2389) [[Files changed](https://github.com/delta-io/delta-kernel-rs/pull/2389/files/7506e1273ef349fbf628e4a6db5d0e7c1b4cbd8b..d943ca2945da1ec9c354db57c889d93456381dee)]
- [stack/alter-table-6-drop-column](#2390) [[Files changed](https://github.com/delta-io/delta-kernel-rs/pull/2390/files/d943ca2945da1ec9c354db57c889d93456381dee..40e3a1ff9e2915b456d0f0a76a50eb67f27a1ab1)]
- [stack/alter-table-7-rename-column](#2391) [[Files changed](https://github.com/delta-io/delta-kernel-rs/pull/2391/files/40e3a1ff9e2915b456d0f0a76a50eb67f27a1ab1..628af3c02767f008a4e3c80a146aec7e5a3ac0b3)]

---------
## What changes are proposed in this pull request?

Splits `Transaction`'s snapshot into two concerns:
- `read_snapshot_opt: Option<SnapshotRef>` -- the pre-commit table state (None for CREATE TABLE)
- `effective_table_config: TableConfiguration` -- the config this commit will produce

This separates "what did I read?" (conflict detection, post-commit snapshots) from "what will this commit produce?" (schema, protocol, stats, write context). Write-path call sites read from `effective_table_config`; read-path call sites use `read_snapshot()`.

Also adds `should_emit_protocol` / `should_emit_metadata` flags to replace the old `is_create_table()` checks for Protocol/Metadata action emission, and replaces the synthetic pre-commit snapshot in CREATE TABLE with direct `TableConfiguration` construction.

This is a pure refactor with no behaviour change.
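
The split can be pictured with a simplified sketch. The field and flag names come from the description above; everything else is an assumption made for illustration: the stand-in `Snapshot` and `TableConfiguration` bodies, `SnapshotRef` as an `Arc`, the two constructors, `write_schema`, and the `Option` return type of `read_snapshot()`.

```rust
use std::sync::Arc;

/// Stand-in for the kernel's TableConfiguration (simplified for this sketch).
#[derive(Debug, Clone, PartialEq)]
struct TableConfiguration {
    schema_fields: Vec<String>,
}

/// Stand-in for the kernel's Snapshot (simplified for this sketch).
#[derive(Debug)]
struct Snapshot {
    version: u64,
    table_configuration: TableConfiguration,
}

/// Assumed alias; the sketch only needs shared ownership.
type SnapshotRef = Arc<Snapshot>;

struct Transaction {
    /// "What did I read?" None for CREATE TABLE.
    read_snapshot_opt: Option<SnapshotRef>,
    /// "What will this commit produce?"
    effective_table_config: TableConfiguration,
    should_emit_protocol: bool,
    should_emit_metadata: bool,
}

impl Transaction {
    /// Hypothetical constructor: CREATE TABLE has no pre-commit state and
    /// must emit both Protocol and Metadata actions.
    fn for_create_table(config: TableConfiguration) -> Self {
        Self {
            read_snapshot_opt: None,
            effective_table_config: config,
            should_emit_protocol: true,
            should_emit_metadata: true,
        }
    }

    /// Hypothetical constructor: a plain write starts from the config it read
    /// and leaves Protocol and Metadata untouched.
    fn for_existing_table(snapshot: SnapshotRef) -> Self {
        Self {
            effective_table_config: snapshot.table_configuration.clone(),
            read_snapshot_opt: Some(snapshot),
            should_emit_protocol: false,
            should_emit_metadata: false,
        }
    }

    /// Read path: conflict detection and post-commit snapshots start here.
    fn read_snapshot(&self) -> Option<&SnapshotRef> {
        self.read_snapshot_opt.as_ref()
    }

    /// Write path: schema comes from the effective config, never the snapshot.
    fn write_schema(&self) -> &[String] {
        &self.effective_table_config.schema_fields
    }
}
```

In this shape, a metadata-changing transaction could set `effective_table_config` to an evolved configuration while `read_snapshot_opt` still points at the state it read.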

## How was this change tested?

All existing tests pass. Added unit tests for `LogSegment::new_for_version_zero` (valid input, non-zero version rejection, non-commit file rejection).

Introduce a SupportsDataFiles marker trait implemented by ExistingTable
and CreateTable (but not the future AlterTable). Move public data file
methods (add_files_schema, stats_schema, stats_columns,
get_write_context, add_files) behind this trait bound.

Internal generate_* methods remain on impl<S> since they handle empty
data gracefully and are called from the shared commit path.

This enables compile-time prevention of data file operations on
metadata-only transaction types like AlterTable.
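
A minimal sketch of the marker-trait technique follows. The trait and marker names are taken from the commit message; the `Transaction<S>` body, the `new` constructor, and the simplified `add_files` and `generate_add_actions` signatures are assumptions for illustration.

```rust
use std::marker::PhantomData;

// Marker types for the transaction kinds named in the commit message.
struct ExistingTable;
struct CreateTable;
struct AlterTable;

/// Marker trait: only transaction kinds that may carry data files implement it.
trait SupportsDataFiles {}
impl SupportsDataFiles for ExistingTable {}
impl SupportsDataFiles for CreateTable {}
// Deliberately no impl for AlterTable.

/// Simplified stand-in for the kernel's Transaction type.
struct Transaction<S> {
    added_files: Vec<String>,
    _state: PhantomData<S>,
}

impl<S> Transaction<S> {
    fn new() -> Self {
        Self {
            added_files: Vec::new(),
            _state: PhantomData,
        }
    }

    /// Shared commit path: available for every state and fine with no files.
    fn generate_add_actions(&self) -> Vec<String> {
        self.added_files.iter().map(|p| format!("add:{p}")).collect()
    }
}

impl<S: SupportsDataFiles> Transaction<S> {
    /// Public data-file API: only exists when S implements the marker trait.
    fn add_files(&mut self, paths: &[&str]) {
        self.added_files.extend(paths.iter().map(|p| p.to_string()));
    }
}
```

With these definitions, `Transaction::<AlterTable>::new().add_files(&["f.parquet"])` is rejected by the compiler because `AlterTable` does not satisfy the `SupportsDataFiles` bound, while `generate_add_actions` stays callable and returns an empty list.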

william-ch-databricks force-pushed the stack/alter-table-5-column-mapping-add branch from d943ca2 to 8e8b06a on April 17, 2026 21:59

Introduce the ALTER TABLE transaction framework with type-state builder
pattern for schema evolution operations.

New types:
- AlterTable marker type (does not implement SupportsDataFiles)
- AlterTableTransaction type alias
- AlterTableTransactionBuilder with Ready/Modifying/Renaming states
- SchemaOperation enum (AddColumn, DropColumn, RenameColumn, SetNullable)
- SchemaEvolutionResult for evolved schema + metadata

Entry point: Snapshot::alter_table() returns AlterTableTransactionBuilder

Type-state enforcement:
- Ready -> add_column/drop_column/set_nullable -> Modifying (chainable)
- Ready -> rename_column -> Renaming (terminal)
- Modifying/Renaming -> build() -> AlterTableTransaction

This PR implements AddColumn on non-column-mapping tables only.
Remaining operations (set_nullable, drop_column, rename_column) and
column mapping support will follow in subsequent PRs.
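
The state transitions listed above can be sketched as follows. The state, builder, and operation names follow the commit message. The helper traits `Chainable` and `Buildable`, the string-based method arguments, the standalone `new` constructor, and the flat `operations` list are hypothetical simplifications; in the PR the entry point is `Snapshot::alter_table()`.

```rust
use std::marker::PhantomData;

// Builder states from the commit message.
struct Ready;
struct Modifying;
struct Renaming;

#[derive(Debug, Clone, PartialEq)]
enum SchemaOperation {
    AddColumn(String),
    DropColumn(String),
    RenameColumn { from: String, to: String },
    SetNullable(String),
}

struct AlterTableTransaction {
    operations: Vec<SchemaOperation>,
}

struct AlterTableTransactionBuilder<State> {
    operations: Vec<SchemaOperation>,
    _state: PhantomData<State>,
}

impl<State> AlterTableTransactionBuilder<State> {
    /// Record an operation and move to the next state.
    fn transition<Next>(mut self, op: SchemaOperation) -> AlterTableTransactionBuilder<Next> {
        self.operations.push(op);
        AlterTableTransactionBuilder {
            operations: self.operations,
            _state: PhantomData,
        }
    }
}

/// Hypothetical helper: states from which chainable operations are allowed.
trait Chainable {}
impl Chainable for Ready {}
impl Chainable for Modifying {}

impl<S: Chainable> AlterTableTransactionBuilder<S> {
    fn add_column(self, name: &str) -> AlterTableTransactionBuilder<Modifying> {
        self.transition(SchemaOperation::AddColumn(name.to_string()))
    }
    fn drop_column(self, name: &str) -> AlterTableTransactionBuilder<Modifying> {
        self.transition(SchemaOperation::DropColumn(name.to_string()))
    }
    fn set_nullable(self, name: &str) -> AlterTableTransactionBuilder<Modifying> {
        self.transition(SchemaOperation::SetNullable(name.to_string()))
    }
}

impl AlterTableTransactionBuilder<Ready> {
    /// Stand-in for Snapshot::alter_table().
    fn new() -> Self {
        Self {
            operations: Vec::new(),
            _state: PhantomData,
        }
    }

    /// Rename is terminal: Renaming offers nothing but build().
    fn rename_column(self, from: &str, to: &str) -> AlterTableTransactionBuilder<Renaming> {
        self.transition(SchemaOperation::RenameColumn {
            from: from.to_string(),
            to: to.to_string(),
        })
    }
}

/// Hypothetical helper: states from which build() is allowed.
trait Buildable {}
impl Buildable for Modifying {}
impl Buildable for Renaming {}

impl<S: Buildable> AlterTableTransactionBuilder<S> {
    fn build(self) -> AlterTableTransaction {
        AlterTableTransaction {
            operations: self.operations,
        }
    }
}
```

Under this sketch, `new().add_column("a").set_nullable("b").build()` compiles, while calling `build()` on a `Ready` builder or `add_column` after `rename_column` does not, because those methods are not defined for those states.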

william-ch-databricks force-pushed the stack/alter-table-5-column-mapping-add branch from 8e8b06a to 892f04d on April 21, 2026 04:18

Implement the SetNullable schema operation which changes a column's
nullability from NOT NULL to nullable. If the column is already nullable,
this is a no-op.

Only the safe direction is allowed (NOT NULL -> nullable) per the Delta
protocol.
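
As a rough illustration of the rule, the sketch below relaxes nullability on a flat list of fields. `Field` and the free function `set_nullable` are hypothetical, and returning an error for an unknown column is an assumption that the commit message does not state.

```rust
/// Hypothetical flat field representation for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

/// Return a copy of `schema` in which `column` is nullable.
/// There is no way to request the opposite direction through this function,
/// so only the safe NOT NULL -> nullable change can be expressed.
fn set_nullable(schema: &[Field], column: &str) -> Result<Vec<Field>, String> {
    // Assumption: an unknown column is reported as an error.
    if !schema.iter().any(|f| f.name == column) {
        return Err(format!("column not found: {column}"));
    }
    Ok(schema
        .iter()
        .cloned()
        .map(|mut f| {
            if f.name == column {
                // Already-nullable columns pass through unchanged (no-op).
                f.nullable = true;
            }
            f
        })
        .collect())
}
```

Applying `set_nullable` to an already nullable column returns a schema equal to the input, which matches the no-op behaviour described above.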

william-ch-databricks force-pushed the stack/alter-table-5-column-mapping-add branch from 892f04d to 19fbe4b on April 22, 2026 04:30

When column mapping is enabled (mode = name or id), newly added columns
via ALTER TABLE ADD COLUMN are now assigned column mapping metadata
(unique ID and UUID-based physical name). The maxColumnId table property
is updated in the evolved metadata configuration.

Reuses the existing assign_field_column_mapping function (now pub(crate))
which handles recursive assignment for nested types.
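
The recursive part can be sketched as below. This is not the body of `assign_field_column_mapping`: `DataType`, `Field`, `assign_field`, and `assign_nested` are simplified, hypothetical stand-ins. The parent-before-children ID order is an assumption, and the physical-name generator is a caller-supplied closure in place of a UUID library.

```rust
/// Hypothetical, simplified type model for this sketch.
#[derive(Debug)]
enum DataType {
    Primitive,
    Struct(Vec<Field>),
    Array(Box<DataType>),
}

#[derive(Debug)]
struct Field {
    name: String,
    data_type: DataType,
    id: Option<i64>,
    physical_name: Option<String>,
}

/// Assign an ID and physical name to `field`, then to every nested struct field.
/// `max_id` is the running maxColumnId and ends up as the value to write back.
fn assign_field(
    field: &mut Field,
    max_id: &mut i64,
    new_name: &mut impl FnMut() -> String,
) {
    *max_id += 1;
    field.id = Some(*max_id);
    field.physical_name = Some(new_name());
    assign_nested(&mut field.data_type, max_id, new_name);
}

/// Walk a data type and assign mapping metadata to any struct fields inside it.
fn assign_nested(
    data_type: &mut DataType,
    max_id: &mut i64,
    new_name: &mut impl FnMut() -> String,
) {
    match data_type {
        DataType::Primitive => {}
        DataType::Struct(children) => {
            for child in children.iter_mut() {
                assign_field(child, max_id, new_name);
            }
        }
        // Arrays carry no ID of their own here; only structs inside them do.
        DataType::Array(element) => assign_nested(element, max_id, new_name),
    }
}
```

After the call, the caller would write the final `max_id` back to the `maxColumnId` table property, so a struct column with two children consumes three IDs in one ADD COLUMN.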

william-ch-databricks force-pushed the stack/alter-table-5-column-mapping-add branch from 19fbe4b to ca02602 on April 22, 2026 05:06
william-ch-databricks marked this pull request as ready for review on April 22, 2026 05:07