[#10713] improvement(core,iceberg): Fix event ordering by standardizing dispatcher wiring#10834
[#10713] improvement(core,iceberg): Fix event ordering by standardizing dispatcher wiring#10834mehakmeet wants to merge 2 commits intoapache:mainfrom
Conversation
…ardizing dispatcher wiring Standardize dispatcher wiring in GravitinoEnv and RESTService so that all object types follow Hook(Event(Normalize(Manager))), matching the existing Policy/AccessControl pattern. This ensures CREATE events are published before SET_OWNER events for all object types. Changes: - GravitinoEnv: Flip wiring for Metalake, Catalog, Schema, Table, Fileset, Topic, Model, Tag, Job dispatchers. - RESTService: Flip wiring for Iceberg Table, View, Namespace dispatchers. - HookDispatchers: Ensure setOwner failures are gracefully handled via try-catch-log (no creation failure on ownership failure). Tests: - Added setOwner-failure tests for all core HookDispatchers. - Added setOwner-failure tests for all Iceberg HookDispatchers. - Tested in Gravitino dev environment. All core and iceberg tests pass: ./gradlew :core:test :iceberg:iceberg-rest-server:test -PskipITs
e675c1f to
63707f0
Compare
Code Coverage Report
Files
|
bharos
left a comment
There was a problem hiding this comment.
Overall +ve, some non-blocking comments to consider
| FilesetNormalizeDispatcher filesetNormalizeDispatcher = | ||
| new FilesetNormalizeDispatcher(filesetHookDispatcher, catalogManager); | ||
| this.filesetDispatcher = new FilesetEventDispatcher(eventBus, filesetNormalizeDispatcher); | ||
| new FilesetNormalizeDispatcher(filesetOperationDispatcher, catalogManager); |
There was a problem hiding this comment.
The PR description claims all types follow Hook(Event(Normalize(Manager))), but the actual wiring is:
| Pattern | Entity Types |
|---|---|
| Hook → Normalize → Event → Manager | Metalake, Catalog, Schema |
| Hook → Event → Normalize → Manager | Table, Fileset, Topic, Model |
Both achieve CREATE before SET_OWNER, so this doesn't break the PR's goal. However, the position of NormalizeDispatcher relative to EventDispatcher affects whether events see pre-normalization or post-normalization input data. This internal inconsistency could cause subtle surprises later.
We can use consistent pattern if that's possible
| importSchema(context.catalogName(), createRequest.namespace()); | ||
| IcebergOwnershipUtils.setSchemaOwner( | ||
| metalake, |
There was a problem hiding this comment.
The try-catch wraps both importSchema and setSchemaOwner. If importSchema fails, the namespace exists in the external catalog but is never registered in Gravitino's entity store, and the failure is silently logged with a misleading "set owner" message. These should be separate try-catch blocks, or importSchema failure should propagate.
| importTable(context.catalogName(), namespace, registerTableRequest.name()); | ||
|
|
||
| // Set the owner of the registered table to the current user | ||
| IcebergOwnershipUtils.setTableOwner( |
There was a problem hiding this comment.
Similar comment as IcebergNamespaceHookDispatcher.createNamespace, please check if we should fix these
| accessControlDispatcher.addUser(ident.name(), PrincipalUtils.getCurrentUserName()); | ||
| } |
There was a problem hiding this comment.
addUser is NOT wrapped in try-catch — can still cause 500 error
Do we need similar treatment like setOwner here ?
There was a problem hiding this comment.
This is a pre-existing issue, not introduced by this PR, but is now more visible since the event ordering change surfaces it.
| Owner.Type.USER); | ||
| } | ||
| } catch (Exception e) { | ||
| LOG.warn("Fail to set owner for metalake {}, metalake exists without owner", ident, e); |
There was a problem hiding this comment.
nit : "Fail to set owner" → "Failed to set owner" on all the messages?
| if (ownerManager != null) { | ||
| ownerManager.setOwner( | ||
| ident.namespace().level(0), | ||
| NameIdentifierUtil.toMetadataObject(ident, Entity.EntityType.TABLE), |
There was a problem hiding this comment.
With the new dispatcher chain order (HookDispatcher → EventDispatcher → NormalizeDispatcher → Manager), ident here is the raw caller input — it has not yet passed through NormalizeDispatcher. For TABLE scope, both the leaf name and the schema name inside ident.namespace() can be normalized, but the entity is stored under the normalized names. This setOwner call may therefore silently fail (exception swallowed by the catch block).
| if (ownerManager != null) { | ||
| ownerManager.setOwner( | ||
| ident.namespace().level(0), | ||
| NameIdentifierUtil.toMetadataObject(ident, Entity.EntityType.SCHEMA), |
There was a problem hiding this comment.
Same normalization issue as in TableHookDispatcher#createTable. For SCHEMA scope, only the leaf (schema) name is affected — the namespace metalake.catalog is not normalized by NormalizeDispatcher.
| if (ownerManager != null) { | ||
| ownerManager.setOwner( | ||
| ident.namespace().level(0), | ||
| NameIdentifierUtil.toMetadataObject(ident, Entity.EntityType.FILESET), |
There was a problem hiding this comment.
Same normalization issue as in TableHookDispatcher#createTable.
| if (ownerManager != null) { | ||
| ownerManager.setOwner( | ||
| ident.namespace().level(0), | ||
| NameIdentifierUtil.toMetadataObject(ident, Entity.EntityType.TOPIC), |
There was a problem hiding this comment.
Same normalization issue as in TableHookDispatcher#createTable.
| if (ownerManager != null) { | ||
| ownerManager.setOwner( | ||
| ident.namespace().level(0), | ||
| NameIdentifierUtil.toMetadataObject(ident, Entity.EntityType.MODEL), |
There was a problem hiding this comment.
Same normalization issue as in TableHookDispatcher#createTable.
| OwnerDispatcher ownerManager = GravitinoEnv.getInstance().ownerDispatcher(); | ||
| if (ownerManager != null) { | ||
| ownerManager.setOwner( | ||
| ident.name(), |
There was a problem hiding this comment.
(Pre-existing, unrelated to this PR) The first argument to setOwner should be the metalake name (ident.namespace().level(0)), not the model name (ident.name()).
What changes were proposed in this pull request?
Standardized dispatcher wiring so that all object types follow the Hook(Event(Normalize(Manager))) pattern, matching the existing Policy/AccessControl wiring. This ensures CREATE events are published before SET_OWNER events for all object types.
GravitinoEnv: Flipped wiring for Metalake, Catalog, Schema, Table, Fileset, Topic, Model, Tag, Job dispatchers.
RESTService: Flipped wiring for Iceberg Table, View, Namespace dispatchers.
HookDispatchers: Ensured setOwner failures are gracefully handled via try-catch-log so that ownership failure does not cause creation failure.
Why are the changes needed?
Currently, most object types wrap HookDispatcher inside EventDispatcher (Event(Hook(Manager))). Since HookDispatcher auto-sets the creator as owner after creation, the SET_OWNER event is published before the CREATE event. This results in non-causal event ordering — external consumers see SET_OWNER for an object that hasn't been "created" yet from the event stream's perspective.
Policy already uses the correct pattern where HookDispatcher wraps EventDispatcher. This PR standardizes all object types to match.
Fix: #10713
Does this PR introduce any user-facing change?
No. This is an internal event ordering fix. The API behavior is unchanged. Events are now published in causal order (CREATE before SET_OWNER).
How was this patch tested?
Added setOwner-failure unit tests for all affected HookDispatchers to verify that creation succeeds even when setOwner throws:
Core (10 tests): TestMetalakeHookDispatcher, TestSchemaHookDispatcher, TestTableHookDispatcher, TestFilesetHookDispatcher, TestTopicHookDispatcher, TestModelHookDispatcher, TestTagHookDispatcher, TestPolicyHookDispatcher, TestAccessControlHookDispatcher, TestJobHookDispatcher
Iceberg (3 tests): TestIcebergTableHookDispatcher, TestIcebergViewHookDispatcher, TestIcebergNamespaceHookDispatcher
Tested in Gravitino dev environment. All core and iceberg tests pass:
./gradlew :core:test :iceberg:iceberg-rest-server:test -PskipITs