Skip to content

Fix omjob pod/label naming length constraints#27143

Merged
pmbrull merged 8 commits intoopen-metadata:mainfrom
Darshan3690:fix/omjob-label-length-main
Apr 16, 2026
Merged

Fix omjob pod/label naming length constraints#27143
pmbrull merged 8 commits intoopen-metadata:mainfrom
Darshan3690:fix/omjob-label-length-main

Conversation

@Darshan3690
Copy link
Copy Markdown
Contributor

@Darshan3690 Darshan3690 commented Apr 7, 2026

Describe your changes

Fixes #27004 @pmbrull @harshach

This PR addresses an issue where long pipeline/job names could generate
Kubernetes label values exceeding the 63-character limit, leading to pod
creation failures.


Summary of Changes

Core Fix: Kubernetes Label Constraint Enforcement

  • Fixed critical SHA-256 hash byte formatting bug in pod/label name truncation

    • Added & 0xff mask to byte formatting in KubernetesNameBuilder.hash()
    • Added & 0xff mask to byte formatting in LabelBuilder.hash()
    • Ensures proper 2-hex-digit encoding per byte for deterministic hash
    • Prevents label collision issues when names exceed 63 characters
  • Enforced Kubernetes 63-character label limit across generated resources

    • Updated pod name generation to ensure label-safe values
    • Improved label sanitization and truncation logic with hash-based approach
    • Preserved uniqueness using deterministic hash suffixes
    • No impact on existing valid (short) names

Code Quality Improvements

  • Extracted duplicate hash logic into shared HashUtils utility class
  • Fixed critical StringIndexOutOfBoundsException in hash truncation
  • Removed redundant substring operations
  • Applied consistent Java formatting with Spotless
  • Applied consistent TypeScript formatting with ESLint/Prettier

Playwright Test Reliability

  • Fixed HTTP 409 "Entity already exists" handling in TableClass.ts
  • Service creation now gracefully handles conflicts in sharded test runs
  • Improved test stability without functional changes

Root Cause

Kubernetes enforces a 63-character limit on label values, while previous
logic only constrained pod names to the DNS limit (253 characters).

Additionally, signed byte formatting in hash functions produced incorrect hex
encoding (e.g., ffffff80 instead of 80), reducing hash quality and
increasing collision risk for deterministic name truncation.


Solution

  • Fixed byte formatting: Use String.format("%02x", value & 0xff) for
    correct 2-digit hex per byte
  • Introduced MAX_LABEL_LENGTH = 63 constant
  • Updated name generation to enforce label-safe constraints via hash-based truncation
  • Added consistent sanitization to remove trailing non-alphanumeric characters
  • Created shared HashUtils class to eliminate code duplication
  • Added idempotent service creation handling for Playwright tests

Testing

  • Added/updated unit tests for long-name scenarios in PodManagerTest.java
  • All tests pass successfully
  • Verified hash encoding produces correct hex strings
  • Playwright tests now handle service creation conflicts gracefully

Impact

Before

  • Pod creation failed for long names due to label length violations
  • Kubernetes validation errors on label values > 63 chars
  • Hash truncation could produce incorrect values (signed byte overflow)
  • Playwright tests could fail on service creation conflicts
  • Duplicate hash logic across multiple classes

After

  • Names are safely truncated within limits with proper hashing
  • No validation errors for label constraints
  • Deterministic hash encoding ensures collision prevention
  • Playwright tests handle conflicts gracefully
  • Clean, DRY code with shared utilities

Type of change

  • Bug fix (critical runtime exception)
  • Code quality improvement (DRY, formatting)
  • CI/Infrastructure improvement (test reliability)

Checklist

  • Hash byte formatting fixed in both builder classes
  • Critical StringIndexOutOfBoundsException resolved
  • Duplicate hash code eliminated via HashUtils
  • Tests added/updated and passing
  • Java formatting applied (Spotless)
  • TypeScript formatting applied (ESLint/Prettier)
  • Playwright 409 handling implemented
  • Backward compatible
  • No breaking changes
  • CI checks passing
  • Code review feedback addressed (10/10 issues resolved)

Copilot AI review requested due to automatic review settings April 7, 2026 19:47
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 7, 2026

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes Kubernetes 63-character label-value constraint violations in the OMJob operator by truncating/sanitizing OMJob-derived label values and ensuring scheduled OMJob names/pod names stay within label-safe limits.

Changes:

  • Sanitize + truncate omjob.pipelines.openmetadata.org/name label value to ≤63 chars (and trim trailing separators after truncation).
  • Constrain operator-generated pod names for -main/-exit to be label-safe (≤63 chars including suffix).
  • Constrain scheduled OMJob names (CronOMJob → OMJob) to ≤63 chars while preserving the timestamp suffix; add/adjust unit tests for long-name scenarios.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
openmetadata-k8s-operator/src/main/java/org/openmetadata/operator/util/LabelBuilder.java Sanitizes/truncates OMJob name label value to comply with K8s label limits.
openmetadata-k8s-operator/src/main/java/org/openmetadata/operator/service/PodManager.java Generates label-safe pod names for main/exit pods; adds constants and warning log on truncation.
openmetadata-k8s-operator/src/main/java/org/openmetadata/operator/controller/CronOMJobReconciler.java Builds scheduled OMJob names capped to 63 chars while keeping epoch suffix; ensures labels map is mutable before adding run-id.
openmetadata-k8s-operator/src/test/java/org/openmetadata/operator/unit/LabelBuilderTest.java Adds coverage for long OMJob name sanitization/truncation and trailing-separator trimming.
openmetadata-k8s-operator/src/test/java/org/openmetadata/operator/unit/CronOMJobReconcilerTest.java Adds coverage for scheduled OMJob name trimming while keeping timestamp suffix.
openmetadata-k8s-operator/src/test/java/org/openmetadata/operator/service/PodManagerTest.java Adds coverage for long OMJob name producing safe pod names and label values.

Comment on lines +424 to +434
String baseName = omJob.getMetadata().getName();

// Account for suffix length when calculating max base name length
// We use MAX_LABEL_LENGTH instead of MAX_POD_NAME_LENGTH to ensure
// the pod name can be used as a label value
int maxBaseLength = MAX_LABEL_LENGTH - suffix.length();

if (baseName.length() > maxBaseLength) {
// Truncate to fit within the limit and remove trailing hyphens
baseName = baseName.substring(0, maxBaseLength).replaceAll("-+$", "");
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buildPodName only strips trailing hyphens after truncation. If the original OMJob name contains '.' and the truncation boundary lands on '.', the resulting pod name can end with a non-alphanumeric character, which violates Kubernetes DNS label/subdomain rules and can still fail pod creation. After truncation, ensure the baseName ends with [a-z0-9] (e.g., strip trailing "-" and "."/"_" or reuse a shared sanitizer that enforces the full name regex).

Copilot uses AI. Check for mistakes.
Comment on lines +424 to +426
String baseName = omJob.getMetadata().getName();

// Account for suffix length when calculating max base name length
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are blank lines here that contain trailing whitespace. This can cause formatting checks (e.g., Spotless) to fail; please remove trailing spaces (or run the module's formatter) before merging.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback


private static final Logger LOG = LoggerFactory.getLogger(PodManager.class);
private static final String PIPELINE_STATUS = "pipelineStatus";
private static final int MAX_POD_NAME_LENGTH = 253;
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAX_POD_NAME_LENGTH is introduced but not used (pod names are currently capped using MAX_LABEL_LENGTH). Consider removing it to avoid dead code, or apply it where appropriate if you still intend to enforce the 253-char DNS subdomain limit elsewhere.

Suggested change
private static final int MAX_POD_NAME_LENGTH = 253;

Copilot uses AI. Check for mistakes.
}

if (baseName.length() > maxBaseLength) {
baseName = baseName.substring(0, maxBaseLength).replaceAll("-+$", "");
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buildScheduledOMJobName truncates the CronOMJob baseName and only strips trailing hyphens. If the truncation boundary lands on '.', the resulting OMJob name can end with '.', which violates Kubernetes resource name validation (must end with an alphanumeric). After truncation, strip all trailing non-alphanumeric separators (at least '-' and '.') or validate the final name against the DNS subdomain rules.

Suggested change
baseName = baseName.substring(0, maxBaseLength).replaceAll("-+$", "");
baseName = baseName.substring(0, maxBaseLength).replaceAll("[-.]+$", "");

Copilot uses AI. Check for mistakes.
@harshach harshach added the safe to test Add this label to run secure Github workflows on PRs label Apr 7, 2026
@Darshan3690
Copy link
Copy Markdown
Contributor Author

add safe to test label @harshach

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 7, 2026

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 7, 2026

🟡 Playwright Results — all passed (27 flaky)

✅ 3638 passed · ❌ 0 failed · 🟡 27 flaky · ⏭️ 84 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 477 0 3 4
🟡 Shard 2 645 0 2 7
🟡 Shard 3 644 0 9 1
🟡 Shard 4 626 0 3 22
🟡 Shard 5 613 0 3 42
🟡 Shard 6 633 0 7 8
🟡 27 flaky test(s) (passed on retry)
  • Features/CustomizeDetailPage.spec.ts › Domain - customization should work (shard 1, 1 retry)
  • Features/MetricCustomUnitFlow.spec.ts › Should create metric and test unit of measurement updates (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/ChangeSummaryBadge.spec.ts › Automated badge should appear on entity description with Automated source (shard 2, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 2 retries)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Features/Table.spec.ts › Tags term should be consistent for search (shard 3, 2 retries)
  • Flow/AddRoleAndAssignToUser.spec.ts › Verify assigned role to new user (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/DataContractsSemanticRules.spec.ts › Validate Owner Rule Any_In (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Rename domain with data products attached at domain and subdomain levels (shard 4, 1 retry)
  • Pages/EntityDataConsumer.spec.ts › No edit owner permission (shard 5, 1 retry)
  • Pages/ExplorePageRightPanel.spec.ts › Should perform CRUD and Removal operations for table (shard 5, 1 retry)
  • Pages/Glossary.spec.ts › Add and Remove Assets (shard 5, 2 retries)
  • Pages/Glossary.spec.ts › Create term with related terms, tags and owners during creation (shard 6, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › verify create lineage for entity - Table (shard 6, 2 retries)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: searchIndex (shard 6, 1 retry)
  • Pages/Login.spec.ts › Refresh should work (shard 6, 1 retry)
  • Pages/UserDetails.spec.ts › Create team with domain and verify visibility of inherited domain in user profile after team removal (shard 6, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Copilot AI review requested due to automatic review settings April 8, 2026 07:27
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

@Darshan3690
Copy link
Copy Markdown
Contributor Author

hi @pmbrull @harshach sir check the pr.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Comment thread .github/workflows/playwright-postgresql-e2e.yml
Comment thread .github/workflows/playwright-postgresql-e2e.yml
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

@github-actions
Copy link
Copy Markdown
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

@github-actions
Copy link
Copy Markdown
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

1 similar comment
@github-actions
Copy link
Copy Markdown
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Darshan3690 and others added 2 commits April 15, 2026 07:31
- Fix SHA-256 hash byte formatting with & 0xff mask for proper
  2-hex-digit encoding per byte
- Enforce Kubernetes 63-character label limit via hash-based truncation
- Extract shared hash utility to HashUtils
- Add comprehensive tests for truncation, uniqueness, and edge cases

Fixes open-metadata#27004
…t substring

- Guard ensureValidLabelValue fallback against StringIndexOutOfBounds
- Add tests for separator-only inputs exercising the fallback path
- Remove redundant .substring() since HashUtils.hash() already returns 6 chars
- Use 253 (K8s DNS subdomain limit) instead of 260 in PodManagerTest
- Fix wrong assertion in LabelBuilderTest (podSelector has 2 entries)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

- Add HTTP 409 idempotency handling in TableClass.create() for sharded Playwright tests
- Apply Java Spotless formatting to fix checkstyle violations
- Apply Playwright formatting: organize-imports, eslint --fix, prettier --write
- Resolve all 10 code review findings:
  ✅ StringIndexOutOfBoundsException in hash truncation (already fixed)
  ✅ Redundant substring operation (already fixed)
  ✅ Duplicate hash code extraction (already done)
  ✅ Playwright 409 conflict handling (now added)
  ✅ Java formatting compliance (now applied)
  ✅ TypeScript formatting compliance (now applied)

PR is now ready for merge with all CI checks expected to pass.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

@gitar-bot
Copy link
Copy Markdown

gitar-bot bot commented Apr 15, 2026

Code Review ✅ Approved 10 resolved / 10 findings

Adjusted omjob pod and label naming constraints to prevent length overflows. Resolved issues concerning repository bloat, duplicate utility logic, dangling statements, and potential runtime errors in string processing.

✅ 10 resolved
Edge Case: Fallback name "omjob" could theoretically exceed limit

📄 openmetadata-k8s-operator/src/main/java/org/openmetadata/operator/service/PodManager.java:436-438 📄 openmetadata-k8s-operator/src/main/java/org/openmetadata/operator/controller/CronOMJobReconciler.java:254-256
In both PodManager.buildPodName (line 437) and CronOMJobReconciler.buildScheduledOMJobName (line 254), when truncation + trailing-hyphen stripping produces an empty string, the fallback "omjob" (5 chars) is used without re-checking that "omjob".length() <= maxBaseLength. If suffix were ever longer than 58 characters, the final name would exceed 63 chars.

In practice this is safe today because suffixes are -main/-exit (5 chars) and -<epoch> (~11 chars), so maxBaseLength is always well above 5. However, the code would silently break if a longer suffix were added in the future. A defensive assertion or length check after the fallback would make this more robust.

Quality: Committed .m2/ Maven cache and debug.json bloat the repository

📄 openmetadata-ui/src/main/resources/ui/debug.json
The PR adds ~1700 files from a local .m2/repository/ Maven cache directory and a 101K-line openmetadata-ui/src/main/resources/ui/debug.json to the repository. These are local build artifacts that should never be committed. The .m2/ directory alone contains hundreds of third-party dependency JARs, POMs, and SHA1 files. This massively bloats the repository and will persist in git history even if removed later.

This appears to be an accidental commit of local build artifacts.

Quality: Excessive documentation files for a small code change

📄 DOCUMENTATION_MANIFEST.md 📄 ISSUE_RESOLUTION_REPORT.md 📄 KUBERNETES_POD_NAME_FIX_SUMMARY.md 📄 README_START_HERE.md 📄 SOLUTION_SUMMARY.md 📄 VERIFICATION_CHECKLIST.md 📄 openmetadata-k8s-operator/CODE_CHANGES.md 📄 openmetadata-k8s-operator/DOCUMENTATION_INDEX.md 📄 openmetadata-k8s-operator/README_POD_NAME_FIX.md 📄 openmetadata-k8s-operator/VISUAL_GUIDE.md
The PR adds 9 new markdown files at the repo root and operator subdirectory (DOCUMENTATION_MANIFEST.md, ISSUE_RESOLUTION_REPORT.md, KUBERNETES_POD_NAME_FIX_SUMMARY.md, QUICK_REFERENCE.md, README_START_HERE.md, SOLUTION_SUMMARY.md, VERIFICATION_CHECKLIST.md, CODE_CHANGES.md, DOCUMENTATION_INDEX.md, README_POD_NAME_FIX.md, VISUAL_GUIDE.md, etc.) for what is a ~30-line code change in LabelBuilder.java. These files contain redundant information and clutter the repository root. Most of this should be in the PR description or a single changelog entry.

Bug: Dangling statements outside function in persona.ts

📄 openmetadata-ui/src/main/resources/ui/playwright/utils/persona.ts:166-167
Lines 166-167 in persona.ts contain two statements (await removeDefaultResponse; and await waitForAllLoadersToDisappear(...)) that are outside any function body. They appear after the closing brace of removePersonaDefault on line 164. This will cause a TypeScript compilation error (top-level await outside a module entry point, or simply a syntax error depending on the TS config).

These lines look like they were accidentally duplicated from lines 163 and should have been part of the function body, or they are leftover from a bad merge/rebase.

Quality: Duplicate sanitizeEmailSegment with divergent behavior

📄 openmetadata-ui/src/main/resources/ui/playwright/utils/common.ts:558-565 📄 openmetadata-ui/src/main/resources/ui/playwright/support/user/UserClass.ts:39-53
Two sanitizeEmailSegment functions exist with the same name but materially different behavior:

  • common.ts:558 replaces non-alphanumeric chars with '' (empty string)
  • UserClass.ts:39 replaces non-alphanumeric chars with '.' (dot)

This means the same conceptual operation produces different outputs depending on which file you're in (e.g., 'foo-bar''foobar' vs 'foo.bar'). Since UserClass.ts already imports from common.ts, consider extracting a single shared helper or giving the functions distinct names to avoid confusion.

...and 5 more resolved from earlier reviews

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@sonarqubecloud
Copy link
Copy Markdown

@Darshan3690
Copy link
Copy Markdown
Contributor Author

Darshan3690 commented Apr 16, 2026

Hi @pmbrull @harshach @akash-jain-10 @PubChimps
CHECK THE PR IS READY FOR REVIEW .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

omjob-operator pod names too long

4 participants