Issue#27307 by karthik120710 · Pull Request #27505 · open-metadata/OpenMetadata

karthik120710 · 2026-04-18T03:44:42Z

I improved the live search indexing pipeline to reduce database write amplification
and improve recovery behavior during ES/OS outages.

Changes made:

- SearchIndexRetryQueue: replaced the per-failure DB upsert with an in-memory
  ConcurrentLinkedQueue buffer. A daemon flusher drains it every 500 ms or when
  50 entries accumulate, writing them in a single @SqlBatch upsert. Added a new
  SEARCH_UNAVAILABLE status so entries written during ES/OS downtime are
  distinguishable from real mapping/data failures. The buffer is capped at 2000
  entries to prevent memory exhaustion; overflow is dropped and counted via a
  Micrometer counter.
- SearchIndexRetryWorker: drives the flusher lifecycle (startFlusher on
  start, stopFlusher on stop to ensure a final flush before shutdown). Tracks
  ES/OS availability transitions: when the cluster recovers, it bulk-resets all
  SEARCH_UNAVAILABLE rows back to PENDING for immediate retry. Added
  STATUS_SEARCH_UNAVAILABLE to PURGEABLE_QUEUE_STATUSES so it is cleaned up
  when a full reindex suspends streaming.
- ReindexingOrchestrator: added cleanupRetryQueuePreFlight(), called from
  preflightFixes(). For a full reindex it deletes all purgeable statuses; for a
  partial reindex it deletes only rows matching the selected entity types. This
  prevents the retry worker from racing against the reindex job.
- CollectionDAO.SearchIndexRetryQueueDAO: added batchUpsert() (@SqlBatch
  with MySQL/Postgres variants), resetSearchUnavailableToPending(), and
  deleteByEntityTypes().
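
For reviewers who want a concrete picture of the buffering behavior described above, here is a minimal sketch and not the PR's actual code. It assumes a hypothetical class `BufferedRetryQueueSketch`, uses a plain `Consumer` as a stand-in for the `@SqlBatch` upsert and an `AtomicLong` as a stand-in for the Micrometer counter, and runs the size-triggered flush on the caller's thread for simplicity, which may differ from how the PR wakes its daemon flusher.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

/** Illustrative sketch only; all names here are hypothetical, not the PR's classes. */
final class BufferedRetryQueueSketch {
  record RetryEntry(String entityId, String entityType, String status) {}

  private final ConcurrentLinkedQueue<RetryEntry> buffer = new ConcurrentLinkedQueue<>();
  private final AtomicInteger size = new AtomicInteger(); // tracked separately: queue.size() is O(n)
  private final AtomicLong dropped = new AtomicLong();    // stand-in for the Micrometer counter
  private final Consumer<List<RetryEntry>> batchWriter;   // stand-in for the @SqlBatch upsert
  private final int maxBufferSize;                        // 2000 in the PR description
  private final int flushThreshold;                       // 50 in the PR description
  private ScheduledExecutorService flusher;

  BufferedRetryQueueSketch(
      Consumer<List<RetryEntry>> batchWriter, int maxBufferSize, int flushThreshold) {
    this.batchWriter = batchWriter;
    this.maxBufferSize = maxBufferSize;
    this.flushThreshold = flushThreshold;
  }

  /** Buffers an entry; returns false if the cap was hit and the entry was dropped. */
  boolean enqueue(RetryEntry entry) {
    if (size.incrementAndGet() > maxBufferSize) {
      size.decrementAndGet();
      dropped.incrementAndGet();
      return false;
    }
    buffer.add(entry);
    if (size.get() >= flushThreshold) {
      flush(); // simplification: the size trigger runs on the caller's thread
    }
    return true;
  }

  /** Drains everything currently buffered and hands it to the writer as one batch. */
  synchronized void flush() {
    List<RetryEntry> batch = new ArrayList<>();
    for (RetryEntry e = buffer.poll(); e != null; e = buffer.poll()) {
      size.decrementAndGet();
      batch.add(e);
    }
    if (!batch.isEmpty()) {
      batchWriter.accept(batch);
    }
  }

  /** Starts a daemon thread that flushes on a fixed delay (500 ms in the PR description). */
  synchronized void startFlusher(long intervalMs) {
    if (flusher != null) {
      return;
    }
    flusher = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "retry-queue-flusher");
      t.setDaemon(true);
      return t;
    });
    flusher.scheduleWithFixedDelay(this::flush, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
  }

  /** Stops the periodic flusher, then performs one final flush before returning. */
  void stopFlusher() {
    ScheduledExecutorService toStop;
    synchronized (this) {
      toStop = flusher;
      flusher = null;
    }
    if (toStop != null) {
      toStop.shutdown();
      try {
        toStop.awaitTermination(5, TimeUnit.SECONDS);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
      }
    }
    flush();
  }

  long droppedCount() {
    return dropped.get();
  }

  public static void main(String[] args) {
    BufferedRetryQueueSketch queue = new BufferedRetryQueueSketch(
        batch -> System.out.println("flushed " + batch.size() + " entries"), 2000, 50);
    queue.startFlusher(500);
    for (int i = 0; i < 120; i++) {
      queue.enqueue(new RetryEntry("id-" + i, "table", "SEARCH_UNAVAILABLE"));
    }
    queue.stopFlusher();
    System.out.println("dropped: " + queue.droppedCount());
  }
}
```

The cap and threshold are constructor parameters in this sketch so that overflow and flush behavior can be exercised with small values in a unit test.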

No schema migration is needed: the status column is VARCHAR(32) with no check
constraint, so SEARCH_UNAVAILABLE fits without any DDL change.

Testing: Added unit tests covering buffer enqueue/flush behavior, overflow
protection, status assignment, flusher lifecycle, availability transition resets,
and preflight cleanup logic.
#27307
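
The availability-transition reset described for SearchIndexRetryWorker can be pictured with the following sketch. It is an in-memory model under stated assumptions: `AvailabilityResetSketch`, `RetryRow`, and `onAvailabilityCheck` are hypothetical names, and a plain list stands in for the search_index_retry_queue table that the real resetSearchUnavailableToPending() DAO method would update.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; an in-memory model, not the PR's worker. */
final class AvailabilityResetSketch {
  static final String PENDING = "PENDING";
  static final String SEARCH_UNAVAILABLE = "SEARCH_UNAVAILABLE";

  /** Hypothetical stand-in for one retry queue row. */
  static final class RetryRow {
    final String entityId;
    String status;

    RetryRow(String entityId, String status) {
      this.entityId = entityId;
      this.status = status;
    }
  }

  private final List<RetryRow> table; // stand-in for the retry queue table
  private boolean wasAvailable = true;

  AvailabilityResetSketch(List<RetryRow> table) {
    this.table = table;
  }

  /**
   * Records the latest availability probe. Only a down-to-up transition
   * triggers the bulk reset; returns the number of rows reset.
   */
  int onAvailabilityCheck(boolean availableNow) {
    boolean recovered = availableNow && !wasAvailable;
    wasAvailable = availableNow;
    return recovered ? resetSearchUnavailableToPending() : 0;
  }

  /** Moves every SEARCH_UNAVAILABLE row back to PENDING; other statuses are untouched. */
  private int resetSearchUnavailableToPending() {
    int reset = 0;
    for (RetryRow row : table) {
      if (SEARCH_UNAVAILABLE.equals(row.status)) {
        row.status = PENDING;
        reset++;
      }
    }
    return reset;
  }

  public static void main(String[] args) {
    List<RetryRow> rows = new ArrayList<>();
    rows.add(new RetryRow("a", SEARCH_UNAVAILABLE));
    rows.add(new RetryRow("b", "IN_PROGRESS"));
    AvailabilityResetSketch worker = new AvailabilityResetSketch(rows);
    worker.onAvailabilityCheck(false); // cluster goes down
    System.out.println("rows reset on recovery: " + worker.onAvailabilityCheck(true));
  }
}
```

Resetting only on the transition, rather than on every successful probe, keeps the bulk update from running repeatedly while the cluster stays healthy.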

Summary by Gitar

New RDF Indexing Infrastructure:
- Added RdfIndexJobDAO, RdfIndexPartitionDAO, RdfReindexLockDAO, and RdfIndexServerStatsDAO to support distributed RDF index jobs.
- Implemented necessary record types, row mappers, and connection-aware SQL methods for managing RDF job lifecycle and statistics.
Database Enhancements:
- Added countByRelationType and listAllStatesForInstance queries to CollectionDAO.
- Optimized deleteLineageBySourcePipeline to correctly handle pipeline.id and toId relations.
- Added markRunningEntriesFailedByName for managing application status transitions.



github-actions · 2026-04-18T03:45:05Z

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!




github-actions · 2026-04-20T12:10:24Z

The Java checkstyle failed.

Please run `mvn spotless:apply` in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with `make install_test precommit_install`.

github-actions · 2026-04-20T14:13:27Z

🟡 Playwright Results — all passed (20 flaky)

✅ 3667 passed · ❌ 0 failed · 🟡 20 flaky · ⏭️ 89 skipped

Shard	Passed	Flaky	Skipped
🟡 Shard 1	480	1	4
🟡 Shard 2	648	5	7
🟡 Shard 3	652	7	1
🟡 Shard 4	632	2	27
🟡 Shard 5	610	1	42
🟡 Shard 6	645	4	8

🟡 20 flaky test(s) (passed on retry)

Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
Features/ColumnBulkOperations.spec.ts › should discard changes when closing drawer without saving (shard 2, 1 retry)
Features/DataProductDomainMigration.spec.ts › Data product with no assets can change domain without confirmation (shard 2, 1 retry)
Features/DataProductPersonaCustomization.spec.ts › Data Product - customize tab label should only render if it's customized by user (shard 2, 1 retry)
Features/Glossary/GlossaryHierarchy.spec.ts › should cancel move operation (shard 2, 1 retry)
Features/IncidentManager.spec.ts › Complete Incident lifecycle with table owner (shard 3, 2 retries)
Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 2 retries)
Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
Flow/CustomizeWidgets.spec.ts › My Tasks Widget (shard 3, 1 retry)
Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
Pages/DataContracts.spec.ts › Create Data Contract and validate for Store Procedure (shard 4, 1 retry)
Pages/Glossary.spec.ts › Add and Remove Assets (shard 5, 1 retry)
Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: searchIndex (shard 6, 1 retry)
Pages/ODCSImportExport.spec.ts › Multi-object ODCS contract - object selector shows all schema objects (shard 6, 1 retry)
Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)

📦 Download artifacts

How to debug locally

# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

mohityadav766 · 2026-04-21T11:50:31Z

@karthik120710 there are integration test failures; can you check?
Also, please add integration tests for the scenarios that are being fixed.


gitar-bot · 2026-04-21T14:39:12Z

Code Review ✅ Approved 1 resolved / 1 findings

Updated deleteByEntityTypes to exclude IN_PROGRESS rows, aligning it with the full-reindex path logic. No issues found.

✅ 1 resolved

✅ Bug: deleteByEntityTypes deletes IN_PROGRESS rows, unlike full-reindex path

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/CollectionDAO.java:11043-11044
📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/ReindexingOrchestrator.java:179-182
For a full reindex, cleanupRetryQueuePreFlight() calls deleteByStatuses(ALL_PURGEABLE_STATUSES) which correctly preserves IN_PROGRESS and COMPLETED rows. For a partial reindex, it calls deleteByEntityTypes(...) whose SQL is DELETE FROM search_index_retry_queue WHERE entityType IN (...) — this unconditionally deletes all rows for those entity types, including IN_PROGRESS entries that the SearchIndexRetryWorker is actively processing. This inconsistency means a partial reindex can silently discard work-in-flight, leading to entities that are never indexed and never retried.
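
The difference between the two cleanup paths, and the effect of adding a status filter to the partial path, can be modeled in memory as below. This is a sketch under stated assumptions rather than the DAO's SQL: `PreflightCleanupSketch`, `Row`, and `cleanup` are hypothetical names, and the contents of the purgeable set (PENDING and SEARCH_UNAVAILABLE) are assumed for illustration; the review only states that IN_PROGRESS and COMPLETED must be preserved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; an in-memory model of the preflight cleanup rules. */
final class PreflightCleanupSketch {
  record Row(String entityType, String status) {}

  // Assumed purgeable statuses; IN_PROGRESS and COMPLETED are deliberately absent.
  static final Set<String> PURGEABLE = Set.of("PENDING", "SEARCH_UNAVAILABLE");

  /**
   * Full reindex: removes every purgeable row. Partial reindex: removes only
   * purgeable rows for the selected entity types. In-flight rows always survive.
   * Returns the number of rows removed.
   */
  static int cleanup(List<Row> rows, Set<String> entityTypes, boolean fullReindex) {
    int before = rows.size();
    rows.removeIf(row ->
        PURGEABLE.contains(row.status())
            && (fullReindex || entityTypes.contains(row.entityType())));
    return before - rows.size();
  }

  public static void main(String[] args) {
    List<Row> rows = new ArrayList<>(List.of(
        new Row("table", "PENDING"),
        new Row("table", "IN_PROGRESS"),
        new Row("dashboard", "SEARCH_UNAVAILABLE")));
    int removed = cleanup(rows, Set.of("table"), false);
    System.out.println("removed " + removed + ", remaining " + rows);
  }
}
```

Applying the same status predicate in both branches is what keeps the partial path from discarding rows the retry worker is still processing.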


sonarqubecloud · 2026-04-21T15:41:50Z

Quality Gate passed for 'open-metadata-ingestion'

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

karthik120710 added 2 commits April 18, 2026 08:39

feat(search): Implement batch upsert for search index retry queue and…

8417f89

… enhance preflight cleanup

feat(search): Enhance search index retry queue with preflight cleanup…

3134a27

… and add unit tests

gitar-bot Bot reviewed Apr 18, 2026

View reviewed changes

Comment thread openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/CollectionDAO.java Outdated

feat(search): Update delete method to remove entries by entity types …

bdc6f81

…and statuses in search index retry queue

mohityadav766 added the "safe to test" label Apr 20, 2026

Merge branch 'main' into issue#27307

2927bc9

mohityadav766 temporarily deployed to test April 20, 2026 12:19 — with GitHub Actions Inactive

feat(search): Enhance SearchIndexRetryQueue with new status handling …

b38db39

…and unit tests for search unavailable scenarios

karthik120710 temporarily deployed to test April 21, 2026 14:47 — with GitHub Actions Inactive

karthik120710 wants to merge 5 commits into open-metadata:main from karthik120710:issue#27307
