Fixes #27004: enforce Kubernetes 63-char label limit in LabelBuilder with hash-preserving truncation by mohitjeswani01 · Pull Request #27181 · open-metadata/OpenMetadata

mohitjeswani01 · 2026-04-08T19:49:44Z

Description :

I worked on the omjob-operator label generation logic because long pipeline names caused
Kubernetes pod creation to fail with a 422 error — Kubernetes enforces a strict 63-character
limit on label values, but OMJob names were being used as label values without any length
enforcement.

Root Cause

The LabelBuilder.buildBaseLabels() method was setting raw OMJob names as label values:

LABEL_OMJOB_NAME → set to raw omJob.getMetadata().getName() (no limit check)
LABEL_APP_PIPELINE → set to raw pipelineName (no limit check)

A sanitizeLabelValue() method already existed in LabelBuilder.java but was never called
for label values anywhere in the codebase.

Additionally, buildPodSelector() used the same raw name — meaning even if labels were fixed,
selectors would stop matching them, causing pod lookups to silently fail.

What I Changed

LabelBuilder.java only — surgical, focused fix:

Upgraded sanitizeLabelValue() to use hash-preserving truncation instead of plain
prefix cut:
- Values ≤ 63 chars → unchanged (zero impact on existing pipelines)
- Values > 63 chars → 55-char prefix + - + 7-char MD5 hash = always ≤ 63 chars
- Deterministic: same input always produces same output
- Collision-resistant: two similarly-named pipelines produce different label values
Fixed buildBaseLabels() — wrapped both raw label values with sanitizeLabelValue()
Fixed buildPodSelector() — applies same sanitization so selectors always match
labels (critical for findPod() and deletePods() to work correctly after truncation)
Added constants MAX_LABEL_LENGTH = 63 and MAX_LABEL_PREFIX_LENGTH = 55

Why hash-preserving truncation?

Plain prefix truncation (what other PRs use) causes collision risk — two pipelines named
om-job-data-mart-warehouse-production-alpha and om-job-data-mart-warehouse-production-beta
would truncate to the same label value, causing the operator to select or delete the wrong pods.
The MD5 suffix makes each truncated value unique and deterministic.

How I tested

Added 10 new unit tests to LabelBuilderTest.java covering:

Short names pass through unchanged
Long names truncated to ≤ 63 chars
Result always ends with alphanumeric (valid K8s label)
Determinism — same input → same output every time
Collision resistance — different long names → different outputs
buildBaseLabels() produces valid label values for long OMJob names
buildPodSelector() value always matches buildBaseLabels() value (consistency)
null and empty string inputs handled safely

Type of change:

Bug fix

Checklist:

I have read the CONTRIBUTING document.
My PR title is Fixes #27004: enforce Kubernetes 63-char label limit in LabelBuilder with hash-preserving truncation
I have commented on my code, particularly in hard-to-understand areas.
For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

I have added a test that covers the exact scenario we are fixing.

…reserving truncation

github-actions · 2026-04-08T19:50:12Z

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

Copilot

Pull request overview

This PR updates the openmetadata-k8s-operator label generation to enforce Kubernetes’ 63-character label value limit by introducing hash-preserving truncation in LabelBuilder, and ensures pod selectors use the same sanitized values so pod lookup/cleanup continues to work.

Changes:

Apply sanitizeLabelValue() to OMJob name and pipeline name labels in buildBaseLabels().
Apply the same sanitization in buildPodSelector() to keep selectors aligned with labels.
Extend unit tests around truncation/determinism/collision-resistance behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
openmetadata-k8s-operator/src/main/java/org/openmetadata/operator/util/LabelBuilder.java	Enforces 63-char label value limit via hash-suffix truncation and applies sanitization to labels/selectors.
openmetadata-k8s-operator/src/test/java/org/openmetadata/operator/unit/LabelBuilderTest.java	Adds/updates unit tests covering sanitization behavior, truncation, and selector/label consistency.

Copilot · 2026-04-08T19:52:02Z

  public static String sanitizeLabelValue(String value) {
    if (value == null || value.isEmpty()) {
      return "";
    }

-    // Replace invalid characters with hyphens and truncate to 63 chars
+    // Replace invalid characters with hyphens, collapse runs, strip leading/trailing hyphens
    String sanitized =
        value.replaceAll("[^a-zA-Z0-9\\-_.]", "-").replaceAll("-+", "-").replaceAll("^-|-$", "");

-    return sanitized.length() > 63 ? sanitized.substring(0, 63) : sanitized;
+    if (sanitized.length() <= MAX_LABEL_LENGTH) {
+      return sanitized;
+    }
+
+    // Truncate to prefix length and strip trailing non-alphanumeric chars
+    String prefix =
+        sanitized.substring(0, MAX_LABEL_PREFIX_LENGTH).replaceAll("[^a-zA-Z0-9]+$", "");
+    String hash = md5Hash(value).substring(0, 7);
+    return prefix + "-" + hash;


sanitizeLabelValue() docstring states outputs match the Kubernetes label value regex (must start/end with an alphanumeric), but the implementation only strips leading/trailing hyphens. Inputs like ".abc" or "abc" will return values that still start with '.'/'' and therefore violate the regex. For long values, prefix can also become empty (e.g., if the first 55 chars are non-alphanumeric), producing an output starting with '-' which is invalid. Consider stripping leading/trailing non-alphanumeric characters (not just '-') and adding a fallback when the sanitized/prefix becomes empty so the result always matches the K8s regex.

Fixed in latest commit — now strips all leading/trailing non-alphanumeric
characters (not just hyphens) to ensure full K8s label regex compliance.
Added empty-string fallback to "omjob" for both sanitized value and prefix.

Copilot · 2026-04-08T19:52:03Z

+  // --- New tests for hash-preserving truncation (issue #27004) ---
+
+  @Test
+  void testSanitizeLabelValue_shortName_unchanged() {
+    String input = "my-pipeline-name";
+    String result = LabelBuilder.sanitizeLabelValue(input);
+    assertEquals(input, result);
+  }
+
+  @Test
+  void testSanitizeLabelValue_exactly63Chars_unchanged() {
+    String input = "a".repeat(63);
+    String result = LabelBuilder.sanitizeLabelValue(input);
+    assertEquals(63, result.length());
+    assertEquals(input, result);
+  }
+
+  @Test
+  void testSanitizeLabelValue_longName_truncatedTo63() {
+    String input = "om-job-data-mart-warehouse-production-metadata-jwef2ikn-ada040a6";
+    String result = LabelBuilder.sanitizeLabelValue(input);
+    assertTrue(
+        result.length() <= 63,
+        "Label value must be <= 63 chars, got: " + result.length());
+  }
+
+  @Test
+  void testSanitizeLabelValue_longName_endsWithAlphanumeric() {
+    String input = "om-job-data-mart-warehouse-production-metadata-jwef2ikn-ada040a6";
+    String result = LabelBuilder.sanitizeLabelValue(input);
+    assertTrue(
+        result.matches(".*[a-zA-Z0-9]$"),
+        "Label value must end with alphanumeric, got: " + result);
+  }
+
+  @Test
+  void testSanitizeLabelValue_deterministic() {
+    String input = "om-job-data-mart-warehouse-production-metadata-jwef2ikn-ada040a6";
+    String result1 = LabelBuilder.sanitizeLabelValue(input);
+    String result2 = LabelBuilder.sanitizeLabelValue(input);
+    assertEquals(result1, result2, "Same input must always produce same output");
+  }
+
+  @Test
+  void testSanitizeLabelValue_differentLongNames_differentOutputs() {
+    String input1 = "om-job-data-mart-warehouse-production-metadata-pipeline-alpha-01";
+    String input2 = "om-job-data-mart-warehouse-production-metadata-pipeline-beta-02";
+    String result1 = LabelBuilder.sanitizeLabelValue(input1);
+    String result2 = LabelBuilder.sanitizeLabelValue(input2);
+    assertNotEquals(
+        result1, result2, "Different long names must not collide after truncation");
+  }


Current tests cover length limits and trailing alphanumeric for a specific long name, but they don't assert the documented K8s label regex constraint that values must also start with an alphanumeric. Adding cases like ".starts-with-dot" / "_starts-with-underscore" (and a case where sanitization would otherwise produce an empty prefix) would prevent regressions and would have caught the current non-compliance in sanitizeLabelValue().

Added two new tests covering inputs starting with '.' and '_' to prevent
regressions on leading non-alphanumeric characters.

mohitjeswani01 · 2026-04-08T19:54:28Z

@harshach @pmbrull could you please add the safe to test
label to trigger CI? Thank you! 🙏

…ests - Strip leading/trailing non-alphanumeric chars to ensure label values match K8s regex [a-zA-Z0-9]([a-zA-Z0-9._-]*[a-zA-Z0-9])? - Add fallback to 'omjob' when sanitized value or prefix is empty - Add tests for inputs starting with '.' and '_'

github-actions · 2026-04-08T20:02:43Z

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

github-actions · 2026-04-08T20:12:37Z

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

github-actions · 2026-04-08T22:11:37Z

🟡 Playwright Results — all passed (20 flaky)

✅ 3598 passed · ❌ 0 failed · 🟡 20 flaky · ⏭️ 207 skipped

Shard	Passed	Flaky	Skipped
🟡 Shard 1	450	7	2
🟡 Shard 2	641	1	32
🟡 Shard 3	649	2	26
🟡 Shard 4	618	4	47
🟡 Shard 5	606	1	67
🟡 Shard 6	634	5	33

🟡 20 flaky test(s) (passed on retry)

Features/DataAssetRulesDisabled.spec.ts › Verify the SearchIndex entity item action after rules disabled (shard 1, 1 retry)
Features/CustomizeDetailPage.spec.ts › Ml Model - customization should work (shard 1, 1 retry)
Flow/Metric.spec.ts › Verify Related Metrics Update (shard 1, 1 retry)
Flow/Tour.spec.ts › Tour should work from help section (shard 1, 1 retry)
Flow/Tour.spec.ts › Tour should work from welcome screen (shard 1, 1 retry)
Pages/SearchIndexApplication.spec.ts › Search Index Application (shard 1, 1 retry)
Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
Flow/ExploreDiscovery.spec.ts › Should not display soft deleted assets in search suggestions (shard 3, 1 retry)
Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
Pages/DataContractInheritance.spec.ts › Delete Asset Contract - Falls back to showing inherited contract from Data Product (shard 4, 1 retry)
Pages/DataContractsSemanticRules.spec.ts › Validate Owner Rule Not_In (shard 4, 1 retry)
Pages/Domains.spec.ts › Comprehensive domain rename with ALL relationships preserved (shard 4, 1 retry)
Pages/ExploreTree.spec.ts › Verify Database and Database Schema available in explore tree (shard 5, 1 retry)
Pages/Lineage/DataAssetLineage.spec.ts › verify create lineage for entity - Metric (shard 6, 2 retries)
Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
Pages/ProfilerConfigurationPage.spec.ts › Non admin user (shard 6, 1 retry)
Pages/UserDetails.spec.ts › Admin user can edit teams from the user profile (shard 6, 1 retry)
Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)

📦 Download artifacts

How to debug locally

# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

github-actions · 2026-04-09T07:14:29Z

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

gitar-bot · 2026-04-09T07:16:20Z

Code Review ✅ Approved

Enforces Kubernetes 63-character label limit in LabelBuilder using hash-preserving truncation to fix #27004. No issues found.

Options

Display: compact → Showing less information.

Comment with these commands to change:

`Compact`
`gitar display:verbose`

_{Was this helpful? React with 👍 / 👎 | Gitar}

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

Copilot · 2026-04-09T07:16:31Z

+  // Kubernetes label value limits
+  private static final int MAX_LABEL_LENGTH = 63;
+  private static final int MAX_LABEL_PREFIX_LENGTH = 55; // 55 + "-" + 7 hash chars = 63


MAX_LABEL_PREFIX_LENGTH is a derived value (max label length minus separator minus hash length). Keeping it hard-coded risks drift if the hash length/format changes later. Consider introducing a HASH_SUFFIX_LENGTH constant and computing MAX_LABEL_PREFIX_LENGTH from MAX_LABEL_LENGTH to make the truncation invariant self-maintaining.

Copilot · 2026-04-09T07:16:31Z

+   * Sanitize label value to be Kubernetes-compliant.
+   *
+   * <p>Kubernetes label values must be at most 63 characters and match
+   * [a-zA-Z0-9]([a-zA-Z0-9._-]*[a-zA-Z0-9])?. When truncation is needed,
+   * a 7-character MD5 hash suffix is appended to avoid collisions between
+   * similarly-named pipelines.


The Javadoc regex shown doesn’t reflect Kubernetes’ allowance for an empty label value (empty is valid; if non-empty it must match the regex). Since sanitizeLabelValue returns "" for null/empty input, consider updating the comment to clarify the empty-string case to avoid confusion for future callers.

Copilot · 2026-04-09T07:16:32Z

+  @Test
+  void testSanitizeLabelValue_nullInput_returnsEmpty() {
+    assertEquals("", LabelBuilder.sanitizeLabelValue(null));
+  }
+
+  @Test
+  void testSanitizeLabelValue_emptyInput_returnsEmpty() {
+    assertEquals("", LabelBuilder.sanitizeLabelValue(""));
+  }


These null/empty input behaviors are already asserted in testSanitizeLabelValue() above. Duplicating them as separate tests adds noise without increasing coverage; consider removing the redundant tests or converting testSanitizeLabelValue() into a parameterized test that includes all cases in one place.

github-actions · 2026-04-09T07:16:55Z

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

PubChimps · 2026-04-15T23:18:01Z

Thank you @mohitjeswani01, but we will be closing this PR as we already had a submission here, please make sure that you are assigned an issue and that there is not an existing PR out for an issue before you submit something for it, we will reopen this PR if necessary

mohitjeswani01 · 2026-04-16T03:08:19Z

Thank you @mohitjeswani01, but we will be closing this PR as we already had a submission here, please make sure that you are assigned an issue and that there is not an existing PR out for an issue before you submit something for it, we will reopen this PR if necessary

Understood @PubChimps thanks alot for your review ! would love to dive in if this reopens by any chance.

fix(k8s-operator): enforce Kubernetes 63-char label limit with hash-p…

4c88c63

…reserving truncation

Copilot AI review requested due to automatic review settings April 8, 2026 19:49

Copilot started reviewing on behalf of mohitjeswani01 April 8, 2026 19:50 View session

Copilot AI reviewed Apr 8, 2026

View reviewed changes

mohitjeswani01 mentioned this pull request Apr 8, 2026

omjob-operator pod names too long #27004

Closed

harshach added the safe to test Add this label to run secure Github workflows on PRs label Apr 8, 2026

harshach temporarily deployed to test April 8, 2026 20:20 — with GitHub Actions Inactive

Merge branch 'main' into fix/27004-omjob-pod-name-length

62ea6b5

Copilot AI review requested due to automatic review settings April 9, 2026 07:12

Copilot started reviewing on behalf of mohitjeswani01 April 9, 2026 07:12 View session

Merge branch 'main' into fix/27004-omjob-pod-name-length

2d84d4c

Copilot AI reviewed Apr 9, 2026

View reviewed changes

mohitjeswani01 temporarily deployed to test April 9, 2026 07:25 — with GitHub Actions Inactive

PubChimps closed this Apr 15, 2026

Conversation

mohitjeswani01 commented Apr 8, 2026

Description :

Root Cause

What I Changed

Why hash-preserving truncation?

How I tested

Type of change:

Checklist:

Uh oh!

github-actions Bot commented Apr 8, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

mohitjeswani01 Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

mohitjeswani01 Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

mohitjeswani01 commented Apr 8, 2026

Uh oh!

github-actions Bot commented Apr 8, 2026

Uh oh!

github-actions Bot commented Apr 8, 2026

Uh oh!

github-actions Bot commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🟡 Playwright Results — all passed (20 flaky)

Uh oh!

github-actions Bot commented Apr 9, 2026

Uh oh!

gitar-bot Bot commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Apr 9, 2026

Uh oh!

PubChimps commented Apr 15, 2026

Uh oh!

mohitjeswani01 commented Apr 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

github-actions Bot commented Apr 8, 2026 •

edited

Loading

gitar-bot Bot commented Apr 9, 2026 •

edited

Loading