Fix startup Delta package resolution for Spark 4.x runtimes #25

Open
rjain21 wants to merge 1 commit into delta-io:main from rjain21:main

Conversation

@rjain21 rjain21 commented Apr 11, 2026

Summary

This PR fixes the default startup path for the Docker image so Delta is launched with the correct Spark-compatible Maven artifact at runtime.

Problem

The startup flow used a hardcoded Delta artifact/version combination that did not match Spark 4.1 runtime behavior.

In Spark 4.x, Delta artifacts are Spark-version-specific. Using the generic artifact caused JVM linkage failures (NoSuchMethodError) during Delta write initialization.

Root Cause

  • Startup script hardcoded Delta version and artifact naming.
  • Spark 4.1 requires a Spark-specific artifact coordinate.

Changes

  • Removed hardcoded Delta version assignment in startup.sh.
  • Added environment-overridable default for Delta version:
    • DELTA_SPARK_VERSION defaults to 4.1.0.
  • Added runtime Spark major.minor detection.
  • Added artifact selection logic:
    • Spark 4.1 -> delta-spark_4.1_2.13:${DELTA_SPARK_VERSION}
    • Spark 4.0 -> delta-spark_4.0_2.13:${DELTA_SPARK_VERSION}
    • fallback -> delta-spark_2.13:${DELTA_SPARK_VERSION}
  • Updated README with a minimal correction for Spark 4.x artifact guidance.
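The selection logic above can be sketched as follows. This is a minimal illustration, not the actual startup.sh: only DELTA_SPARK_VERSION, the default of 4.1.0, and the three artifact coordinates come from this PR; the `io.delta` group id is standard for Delta releases, and the function and variable names here are hypothetical.

```shell
# Env-overridable Delta version, defaulting to 4.1.0 as described above.
DELTA_SPARK_VERSION="${DELTA_SPARK_VERSION:-4.1.0}"

# Map a Spark major.minor version to a Delta Maven coordinate
# (function name is illustrative, not from the real startup.sh).
select_delta_package() {
  case "$1" in
    4.1) echo "io.delta:delta-spark_4.1_2.13:${DELTA_SPARK_VERSION}" ;;
    4.0) echo "io.delta:delta-spark_4.0_2.13:${DELTA_SPARK_VERSION}" ;;
    *)   echo "io.delta:delta-spark_2.13:${DELTA_SPARK_VERSION}" ;;
  esac
}

# Derive major.minor from a full version string, e.g. "4.1.0" -> "4.1".
spark_version="4.1.0"   # in startup.sh this would come from the Spark runtime
spark_major_minor="${spark_version%.*}"
delta_package="$(select_delta_package "$spark_major_minor")"
```

The `${spark_version%.*}` parameter expansion strips the trailing patch component, which is why only major.minor detection is needed for the mapping.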

Validation

  • Added integration tests in tests/test_docker.sh to validate startup artifact selection for:
    • simulated Spark 4.1 runtime
    • simulated Spark 4.0 runtime
  • Built image locally and ran test suite:
    • 24 passed, 0 failed
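One plausible way to "simulate" a Spark runtime in such tests, sketched under assumptions (the helper name, paths, and mechanism are not taken from tests/test_docker.sh), is to place a fake spark-submit on PATH that reports a chosen version, so the startup script's version probe sees it:

```shell
# Hypothetical test helper: install a mock spark-submit reporting version $1.
mock_spark() {
  mkdir -p /tmp/mock-spark-bin
  # Real spark-submit prints its version banner to stderr, so the mock does too.
  cat > /tmp/mock-spark-bin/spark-submit <<EOF
#!/bin/sh
echo "version $1" >&2
EOF
  chmod +x /tmp/mock-spark-bin/spark-submit
  PATH="/tmp/mock-spark-bin:$PATH"
  export PATH
}

# Simulate a Spark 4.1 runtime; the startup script's detection now sees it.
mock_spark 4.1.0
detected="$(spark-submit --version 2>&1)"
```

After mocking, the test can run startup.sh and assert that the 4.1-specific artifact coordinate appears in the launch command.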

Copilot AI review requested due to automatic review settings April 11, 2026 23:55
* Remove hardcoded Delta package/version from startup path.

* Default DELTA_SPARK_VERSION to 4.1.0 while allowing env override.

* Detect Spark major.minor at runtime and choose the artifact:
  * 4.1 -> delta-spark_4.1_2.13
  * 4.0 -> delta-spark_4.0_2.13
  * fallback -> delta-spark_2.13

* Keep existing Spark session configs unchanged.

* Update README with Spark-specific artifact guidance.

* Add integration tests to validate startup artifact selection for Spark 4.1 and 4.0.

Signed-off-by: Rajesh Jain <73859950+rjain21@users.noreply.github.com>

Copilot AI left a comment

Pull request overview

This PR updates the Docker image entrypoint (startup.sh) to choose a Spark-compatible Delta Lake Maven artifact at runtime (avoiding Spark 4.x linkage failures) and adds integration tests + README guidance to validate/document the selection behavior.

Changes:

  • Detects the runtime Spark major.minor version in startup.sh and selects an appropriate delta-spark_* artifact coordinate.
  • Adds Docker integration tests to validate artifact selection for simulated Spark 4.1 and 4.0 runtimes.
  • Updates README guidance for Spark 4.x Delta artifact coordinates.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Files reviewed:

  • startup.sh — Adds Spark version detection and selects Spark-specific Delta Maven artifacts for Spark 4.0/4.1.
  • tests/test_docker.sh — Adds integration tests that mock Spark binaries to verify startup.sh package selection logic.
  • README.md — Updates documentation to guide users toward Spark-version-specific Delta artifacts for Spark 4.x.
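The README guidance presumably steers users toward invocations along these lines. This is a hedged sketch, not the README's actual text: the `io.delta` group id and the two `--conf` entries are the standard Delta session setup, and only the artifact coordinate and DELTA_SPARK_VERSION come from this PR.

```shell
# Assemble a PySpark launch command using the Spark-4.1-specific Delta
# artifact, honoring the env-overridable version this PR introduces.
DELTA_SPARK_VERSION="${DELTA_SPARK_VERSION:-4.1.0}"
launch_cmd="pyspark --packages io.delta:delta-spark_4.1_2.13:${DELTA_SPARK_VERSION} \
  --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension \
  --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaSparkSessionCatalog"
echo "$launch_cmd"
```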

Comment thread: startup.sh
