… enhance preflight cleanup
… and add unit tests
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
…and statuses in search index retry queue
The Java checkstyle failed. Please run You can install the pre-commit hooks with |
🟡 Playwright Results: all passed (20 flaky)
✅ 3667 passed · ❌ 0 failed · 🟡 20 flaky · ⏭️ 89 skipped
🟡 20 flaky test(s) (passed on retry)
How to debug locally:
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip # view trace
@karthik120710 there are integration test failures, can you check?
…and unit tests for search unavailable scenarios
Code Review ✅ Approved
1 resolved / 1 findings
Updated `deleteByEntityTypes` to exclude `IN_PROGRESS` rows, aligning it with the full-reindex path logic. No issues found.
✅ 1 resolved
✅ Bug: `deleteByEntityTypes` deletes `IN_PROGRESS` rows, unlike full-reindex path



I improved the live search indexing pipeline to reduce database write amplification
and improve recovery behavior during ES/OS outages.
Changes made:
`SearchIndexRetryQueue`: replaced the per-failure DB upsert with an in-memory `ConcurrentLinkedQueue` buffer. A daemon flusher drains it every 500 ms or when 50 entries accumulate, writing them in a single `@SqlBatch` upsert. Added a new `SEARCH_UNAVAILABLE` status so entries written during ES/OS downtime are distinguishable from real mapping/data failures. The buffer is capped at 2000 entries to prevent memory exhaustion; overflow is dropped and counted via a Micrometer counter.
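
The buffering behavior described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `SearchIndexRetryQueue`: the class name `BufferedRetryQueue`, the `String` entry type, the `Consumer` standing in for the `@SqlBatch` upsert, and the plain `AtomicLong` standing in for the Micrometer counter are all assumptions. For brevity, the 50-entry threshold flush runs on the calling thread here, whereas the description suggests the daemon flusher does the draining.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical sketch of a capped in-memory buffer with batched flushing.
class BufferedRetryQueue {
  static final int MAX_BUFFER = 2000;        // cap from the PR description
  static final int FLUSH_THRESHOLD = 50;     // early-flush size from the PR description
  static final long FLUSH_INTERVAL_MS = 500; // periodic flush interval from the PR description

  private final ConcurrentLinkedQueue<String> buffer = new ConcurrentLinkedQueue<>();
  private final AtomicInteger size = new AtomicInteger(); // ConcurrentLinkedQueue.size() is O(n)
  private final AtomicLong dropped = new AtomicLong();    // stand-in for a Micrometer counter
  private final Consumer<List<String>> batchWriter;       // stand-in for the batch upsert
  private volatile ScheduledExecutorService flusher;

  BufferedRetryQueue(Consumer<List<String>> batchWriter) {
    this.batchWriter = batchWriter;
  }

  /** Buffers an entry; returns false when the cap is hit and the entry is dropped. */
  boolean enqueue(String entry) {
    if (size.incrementAndGet() > MAX_BUFFER) {
      size.decrementAndGet();
      dropped.incrementAndGet();
      return false;
    }
    buffer.add(entry);
    // Simplification: flush inline once the threshold is reached and the flusher is running.
    if (flusher != null && size.get() >= FLUSH_THRESHOLD) {
      flush();
    }
    return true;
  }

  /** Drains everything currently buffered and writes it as one batch. */
  synchronized void flush() {
    List<String> batch = new ArrayList<>();
    String entry;
    while ((entry = buffer.poll()) != null) {
      batch.add(entry);
      size.decrementAndGet();
    }
    if (!batch.isEmpty()) {
      batchWriter.accept(batch);
    }
  }

  synchronized void startFlusher() {
    if (flusher != null) {
      return;
    }
    ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "retry-queue-flusher");
      t.setDaemon(true);
      return t;
    });
    executor.scheduleWithFixedDelay(
        this::flush, FLUSH_INTERVAL_MS, FLUSH_INTERVAL_MS, TimeUnit.MILLISECONDS);
    flusher = executor;
  }

  /** Stops the periodic flusher, then performs a final flush so nothing buffered is lost. */
  synchronized void stopFlusher() {
    if (flusher != null) {
      flusher.shutdown();
      flusher = null;
    }
    flush();
  }

  long droppedCount() {
    return dropped.get();
  }
}
```

Tracking the size in a separate `AtomicInteger` keeps the cap check cheap, since `ConcurrentLinkedQueue.size()` walks the whole queue.
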
`SearchIndexRetryWorker`: drives the flusher lifecycle (`startFlusher` on start, `stopFlusher` on stop to ensure a final flush before shutdown). Tracks ES/OS availability transitions: when the cluster recovers, it bulk-resets all `SEARCH_UNAVAILABLE` rows back to `PENDING` for immediate retry. Added `STATUS_SEARCH_UNAVAILABLE` to `PURGEABLE_QUEUE_STATUSES` so it is cleaned up when a full reindex suspends streaming.
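
The availability-transition logic could look something like the sketch below. It is illustrative only: `AvailabilityTracker`, `onProbe`, and the `IntSupplier` standing in for `resetSearchUnavailableToPending()` are hypothetical names, and the assumption that the reset returns a row count is mine, not something the PR states.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch: fire the bulk reset only on an unavailable -> available transition.
class AvailabilityTracker {
  private final IntSupplier resetUnavailableToPending; // stand-in for the DAO reset call
  private boolean lastAvailable = true;                // assume the cluster starts out healthy

  AvailabilityTracker(IntSupplier resetUnavailableToPending) {
    this.resetUnavailableToPending = resetUnavailableToPending;
  }

  /** Records the latest probe result; returns the rows reset, or 0 when there was no recovery. */
  synchronized int onProbe(boolean availableNow) {
    boolean recovered = !lastAvailable && availableNow;
    lastAvailable = availableNow;
    return recovered ? resetUnavailableToPending.getAsInt() : 0;
  }
}
```

Comparing against the previous probe, rather than resetting on every healthy probe, avoids issuing the bulk update on each worker cycle while the cluster is up.
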
`ReindexingOrchestrator`: added `cleanupRetryQueuePreFlight()`, called from `preflightFixes()`. For a full reindex it deletes all purgeable statuses; for a partial reindex it deletes only rows matching the selected entity types. This prevents the retry worker from racing against the reindex job.
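
As a rough model of the full versus partial cleanup decision, the sketch below runs against an in-memory list instead of the real DAO. `RetryRow`, `PreflightCleanup`, and the contents of the `PURGEABLE` set are assumptions: the description only confirms that `SEARCH_UNAVAILABLE` is purgeable, and the review note on this PR indicates that `IN_PROGRESS` rows are excluded from deletion on both paths.

```java
import java.util.List;
import java.util.Set;

// Hypothetical row shape; the real table has more columns.
class RetryRow {
  final String entityType;
  final String status;

  RetryRow(String entityType, String status) {
    this.entityType = entityType;
    this.status = status;
  }
}

class PreflightCleanup {
  // Assumed membership: only SEARCH_UNAVAILABLE is confirmed by the PR description.
  static final Set<String> PURGEABLE = Set.of("PENDING", "FAILED", "SEARCH_UNAVAILABLE");

  /** Removes the rows a reindex would purge from the given mutable list; returns how many. */
  static int cleanup(List<RetryRow> queue, boolean fullReindex, Set<String> selectedEntityTypes) {
    int before = queue.size();
    queue.removeIf(row -> shouldDelete(row, fullReindex, selectedEntityTypes));
    return before - queue.size();
  }

  static boolean shouldDelete(RetryRow row, boolean fullReindex, Set<String> selectedEntityTypes) {
    if (fullReindex) {
      // Full reindex: every purgeable status goes, regardless of entity type.
      return PURGEABLE.contains(row.status);
    }
    // Partial reindex: only the selected entity types, and never rows being processed.
    return selectedEntityTypes.contains(row.entityType) && !"IN_PROGRESS".equals(row.status);
  }
}
```
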
`CollectionDAO.SearchIndexRetryQueueDAO`: added `batchUpsert()` (`@SqlBatch` with MySQL/Postgres variants), `resetSearchUnavailableToPending()`, and `deleteByEntityTypes()`.
No schema migration needed: the `status` column is `VARCHAR(32)` with no check constraint, so `SEARCH_UNAVAILABLE` fits without any DDL change.
Testing: Added unit tests covering buffer enqueue/flush behavior, overflow protection, status assignment, flusher lifecycle, availability transition resets, and preflight cleanup logic.
#27307
Summary by Gitar
`RdfIndexJobDAO`, `RdfIndexPartitionDAO`, `RdfReindexLockDAO`, and `RdfIndexServerStatsDAO` to support distributed RDF index jobs.
`countByRelationType` and `listAllStatesForInstance` queries to `CollectionDAO`.
`deleteLineageBySourcePipeline` to correctly handle `pipeline.id` and `toId` relations.
`markRunningEntriesFailedByName` for managing application status transitions.
This will update automatically on new commits.