Fixes #27040: Guard postDelete() with hardDelete check in IngestionPipeline, DataContract, App#27043
Conversation
…ataContract, App repositories Wraps destructive time series cleanup and external service calls in if (hardDelete) blocks to prevent data loss on soft delete. Previously, soft-deleting these entities permanently destroyed: - IngestionPipeline: pipeline status time series + deployed Airflow pipeline - DataContract: contract result time series + associated test suite - App: run status history Since restoreEntity() only flips the deleted flag, a soft-delete/restore cycle caused irrecoverable data loss. Fixes open-metadata#27040
|
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
Code Review ✅ ApprovedGuards postDelete() with hardDelete check in IngestionPipeline, DataContract, and App repositories to prevent unintended deletions during soft-delete operations. No issues found. OptionsDisplay: compact → Showing less information. Comment with these commands to change:
Was this helpful? React with 👍 / 👎 | Gitar |
There was a problem hiding this comment.
Pull request overview
This PR fixes a soft-delete data loss bug in the Java service layer by ensuring entity-specific postDelete() cleanup only runs for hard deletes, aligning repository behavior with EntityRepository.postDelete() semantics.
Changes:
- Guarded
IngestionPipelineRepository.postDelete()pipeline-service teardown + pipeline status time-series deletion behindif (hardDelete). - Guarded
DataContractRepository.postDelete()test-suite deletion + contract result time-series deletion behindif (hardDelete). - Guarded
AppRepository.postDelete()app run status history deletion behindif (hardDelete).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/IngestionPipelineRepository.java | Prevents pipeline status history + external deployed pipeline deletion on soft delete. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/DataContractRepository.java | Prevents contract result history + associated test suite deletion on soft delete. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/AppRepository.java | Prevents app run status history deletion on soft delete. |
| protected void postDelete(IngestionPipeline entity, boolean hardDelete) { | ||
| super.postDelete(entity, hardDelete); | ||
| // Delete deployed pipeline in the Pipeline Service Client | ||
| if (pipelineServiceClient != null) { | ||
| pipelineServiceClient.deletePipeline(entity); | ||
| } else { | ||
| LOG.debug( | ||
| "Skipping pipeline service delete for '{}' because pipeline service client is not configured.", | ||
| entity.getFullyQualifiedName()); | ||
| if (hardDelete) { | ||
| // Delete deployed pipeline in the Pipeline Service Client | ||
| if (pipelineServiceClient != null) { | ||
| pipelineServiceClient.deletePipeline(entity); | ||
| } else { | ||
| LOG.debug( | ||
| "Skipping pipeline service delete for '{}' because pipeline service client is not configured.", | ||
| entity.getFullyQualifiedName()); | ||
| } | ||
| // Clean pipeline status | ||
| daoCollection | ||
| .entityExtensionTimeSeriesDao() | ||
| .delete(entity.getFullyQualifiedName(), PIPELINE_STATUS_EXTENSION); | ||
| } |
There was a problem hiding this comment.
There’s no test coverage validating that a soft delete (hardDelete=false) no longer triggers the pipeline service deletion or the PIPELINE_STATUS time series cleanup. Since this change is meant to prevent irreversible data loss on soft-delete→restore, add a unit/integration test that (a) performs soft delete + restore and asserts time series entries remain, and (b) verifies the hard delete path still removes them.
| protected void postDelete(DataContract dataContract, boolean hardDelete) { | ||
| super.postDelete(dataContract, hardDelete); | ||
| if (!nullOrEmpty(dataContract.getQualityExpectations())) { | ||
| deleteTestSuite(dataContract); | ||
| if (hardDelete) { | ||
| if (!nullOrEmpty(dataContract.getQualityExpectations())) { | ||
| deleteTestSuite(dataContract); | ||
| } | ||
| // Clean status | ||
| daoCollection | ||
| .entityExtensionTimeSeriesDao() | ||
| .delete(dataContract.getFullyQualifiedName(), RESULT_EXTENSION); | ||
| } |
There was a problem hiding this comment.
The PR description states a test was added for this scenario, but I couldn’t find any new/updated test referencing this behavior. Either add a test that proves soft delete preserves DataContract result time series + test suite (and hard delete removes them), or update the PR description/checklist to reflect the current state.
| protected void postDelete(App entity, boolean hardDelete) { | ||
| super.postDelete(entity, hardDelete); | ||
| // Delete the status stored in the app extension | ||
| // Note that we don't want to delete the LIMITS, since we want to keep them | ||
| // between different app installations | ||
| daoCollection | ||
| .appExtensionTimeSeriesDao() | ||
| .delete(entity.getId().toString(), AppExtension.ExtensionType.STATUS.toString()); | ||
| if (hardDelete) { | ||
| // Delete the status stored in the app extension | ||
| // Note that we don't want to delete the LIMITS, since we want to keep them | ||
| // between different app installations | ||
| daoCollection | ||
| .appExtensionTimeSeriesDao() | ||
| .delete(entity.getId().toString(), AppExtension.ExtensionType.STATUS.toString()); | ||
| } |
There was a problem hiding this comment.
Consider adding a test to ensure that on soft delete (hardDelete=false) the app run STATUS time series is preserved (so restore brings back run history), while hard delete still cleans it up. This guards the regression this PR is addressing.
Describe your changes:
Fixes #27040
What: Wrapped destructive operations in
postDelete()withif (hardDelete)guards in three entity repositories:IngestionPipelineRepositoryDataContractRepositoryAppRepositoryWhy: These three subclasses unconditionally delete time series data and external resources in their
postDelete()methods — even on soft delete (hardDelete=false). SincerestoreEntity()only flips thedeletedflag back without restoring time series data, a soft-delete → restore cycle causes permanent, irrecoverable data loss.The base
EntityRepository.postDelete()correctly guards its destructive operation (RDF deletion) withif (hardDelete), but these subclasses did not follow the same pattern.How I tested: Verified that the three modified files compile cleanly and that spotless formatting passes. The fix is a straightforward conditional guard — same pattern used by the base class and other repositories like
PipelineRepository(fixed in #27037).Type of change:
Checklist:
Fixes <issue-number>: <short explanation>