RDF, cleanup relations and remove unnecessary bindings, add distributed mode for RDF reindex#26902
Conversation
…ed mode for RDF reindex
✅ TypeScript Types Auto-UpdatedThe generated TypeScript types have been automatically updated based on JSON schema changes in this PR. |
There was a problem hiding this comment.
Pull request overview
This PR extends the RDF knowledge graph/reindexing feature set by adding UI/schema support for RDF indexing configuration (including distributed mode), improving runtime behavior around app pages, and wiring RDF job status updates through WebSockets.
Changes:
- Added a new RDF indexing application schema and made application-run configuration UI resilient to missing schemas.
- Extended RDF graph explore API types/params to support entity/relationship filtering and filter options in responses.
- Added an RDF WebSocket broadcast channel and subscribed the App Runs History UI to RDF job status updates; optimized Airflow status fetching based on route.
Reviewed changes
Copilot reviewed 42 out of 43 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| openmetadata-ui/src/main/resources/ui/src/utils/ApplicationSchemas/RdfIndexApp.json | New UI schema for RDF indexing configuration (incl. distributed options). |
| openmetadata-ui/src/main/resources/ui/src/rest/rdfAPI.ts | Adds typed graph filter params/options and updates entity-graph API signature. |
| openmetadata-ui/src/main/resources/ui/src/components/KnowledgeGraph/KnowledgeGraph.interface.ts | Adds filter option types to KnowledgeGraph data model. |
| openmetadata-ui/src/main/resources/ui/src/context/AirflowStatusProvider/AirflowStatusProvider.tsx | Skips Airflow status fetch on routes where it’s not needed. |
| openmetadata-ui/src/main/resources/ui/src/context/AirflowStatusProvider/AirflowStatusProvider.test.tsx | Adds route-aware tests via MemoryRouter and validates skipping behavior. |
| openmetadata-ui/src/main/resources/ui/src/constants/constants.ts | Adds RDF_INDEX_JOB_BROADCAST_CHANNEL socket event constant. |
| openmetadata-ui/src/main/resources/ui/src/components/Settings/Applications/AppSchedule/AppScheduleProps.interface.ts | Makes jsonSchema optional to support apps without a schema. |
| openmetadata-ui/src/main/resources/ui/src/components/Settings/Applications/AppRunsHistory/AppRunsHistory.interface.ts | Makes jsonSchema optional to support schema-less runs/history UI. |
| openmetadata-ui/src/main/resources/ui/src/components/Settings/Applications/AppRunsHistory/AppRunsHistory.component.tsx | Disables config UI when schema missing and subscribes to RDF job WebSocket updates. |
| openmetadata-ui/src/main/resources/ui/src/components/Settings/Applications/AppRunsHistory/AppRunsHistory.test.tsx | Adds WebSocket mocking, schema-unavailable test, and RDF channel subscription test. |
| openmetadata-ui/src/main/resources/ui/src/components/Settings/Applications/AppDetails/AppDetails.component.tsx | Handles missing schema imports by clearing schema and showing a toast. |
| openmetadata-ui/src/main/resources/ui/src/components/Settings/Applications/AppDetails/ApplicationsClassBase.test.ts | Verifies RDF schema import works and includes new distributed fields. |
| openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/rdf/distributed/DistributedRdfIndexCoordinator.java | Implements distributed RDF indexing coordinator flow (partition claim/aggregate refresh). |
| openmetadata-service/src/main/java/org/openmetadata/service/OpenMetadataApplication.java | Registers the distributed RDF job participant at application startup. |
| CollectionDAO collectionDAO = jdbi.onDemand(CollectionDAO.class); | ||
| RdfDistributedJobParticipant participant = new RdfDistributedJobParticipant(collectionDAO); | ||
| environment.lifecycle().manage(participant); |
There was a problem hiding this comment.
The catch block logs only ex.getMessage() and drops the stack trace, which makes diagnosing startup failures in distributed RDF job participant registration difficult. Log the exception object as well (e.g., LOG.warn(..., ex)) or rethrow if registration is required for correctness.
There was a problem hiding this comment.
@copilot apply changes based on this feedback
| try { | ||
| const schema = await applicationsClassBase.importSchema(fqn); | ||
| setJsonSchema(schema); | ||
| } catch (_) { | ||
| setJsonSchema(undefined); | ||
| showErrorToast( | ||
| t('message.no-application-schema-found', { appName: fqn }) | ||
| ); | ||
| } |
There was a problem hiding this comment.
The schema import error is swallowed (catch (_)) and always shows the generic "no schema" toast. If the dynamic import fails for other reasons (e.g., chunk load error), this message can be misleading and makes debugging harder. Consider capturing the error and either passing it to showErrorToast (with the fallback message) or at least logging it for diagnosis.
| ] | ||
| }, | ||
| "default": [ | ||
| "all" | ||
| ], | ||
| "uiFieldType": "treeSelect", | ||
| "uniqueItems": true |
There was a problem hiding this comment.
entities.items.enum does not include the value "all", but entities.default is set to ["all"]. This makes the default invalid against the schema and can break form defaults/validation. Either add "all" to the enum or change the default to valid enum values.
| package org.openmetadata.service.apps.bundles.rdf.distributed; | ||
|
|
||
| import static org.junit.jupiter.api.Assertions.assertEquals; | ||
| import static org.junit.jupiter.api.Assertions.assertNull; | ||
| import static org.junit.jupiter.api.Assertions.assertThrows; | ||
| import static org.junit.jupiter.api.Assertions.assertTrue; |
There was a problem hiding this comment.
This new test file is missing the standard Apache 2.0 license header that other service test classes include. Please add the project license header block at the top to keep licensing checks and file headers consistent.
| public List<RdfIndexPartition> calculatePartitions(UUID jobId, Set<String> entityTypes) { | ||
| List<RdfIndexPartition> partitions = new ArrayList<>(); | ||
| for (String entityType : entityTypes) { | ||
| partitions.addAll(calculatePartitionsForEntity(jobId, entityType)); | ||
| } | ||
| return partitions; | ||
| } |
There was a problem hiding this comment.
RdfPartitionCalculator currently creates partitions without enforcing an upper bound on total partition count. Search-index partitioning enforces a MAX_TOTAL_PARTITIONS safety limit to avoid overwhelming the DB/job tables at extreme scale; consider adding a similar guard here (or otherwise bounding/streaming partition creation) to prevent creating an unbounded number of rdf_index_partition rows.
| return buffer.toString(StandardCharsets.UTF_8); | ||
| } | ||
|
|
There was a problem hiding this comment.
BufferedServletResponseWrapper tracks a configurable characterEncoding, but getCapturedBody() always decodes the buffer using UTF-8. If a handler sets a different encoding (or writes via getWriter() with a different encoding), the captured error body can be garbled in logs/exceptions. Decode using the wrapper’s current characterEncoding instead of hardcoding UTF-8.
| return buffer.toString(StandardCharsets.UTF_8); | |
| } | |
| return buffer.toString(getCurrentCharset()); | |
| } | |
| private Charset getCurrentCharset() { | |
| return characterEncoding == null | |
| ? StandardCharsets.UTF_8 | |
| : Charset.forName(characterEncoding); | |
| } |
| isRunLoading: boolean; | ||
| isDeployLoading: boolean; | ||
| }; | ||
| jsonSchema: RJSFSchema; | ||
| jsonSchema?: RJSFSchema; | ||
| onSave: (cron: string) => Promise<void>; | ||
| onDemandTrigger: () => Promise<void>; |
There was a problem hiding this comment.
Now that jsonSchema is optional, please ensure downstream consumers properly react when it transitions from undefined → defined (schema import is async). For example, AppSchedule memoizes the AppRunsHistory element but doesn’t include jsonSchema in its useMemo dependency list, which can cause the config UI to stay disabled even after the schema loads.
| $$section | ||
| ### entities $(id="entities") | ||
|
|
||
| $$ | ||
|
|
||
| $$section | ||
| ### Recreate RDF Store $(id="recreateIndex") | ||
|
|
||
| Clear the RDF store before indexing. | ||
|
|
There was a problem hiding this comment.
The markdown doc’s entities section is currently empty (it opens/close the section without any descriptive text). Since this file is user-facing documentation for the app schema, please add a short description for what the entities field does (and how an empty selection behaves) so the rendered schema docs aren’t blank for this field.
| } | ||
|
|
||
| public RdfPartitionCalculator(int partitionSize) { | ||
| this.partitionSize = Math.clamp(partitionSize, MIN_PARTITION_SIZE, MAX_PARTITION_SIZE); |
There was a problem hiding this comment.
Math.clamp(...) is not a valid JDK 21 API, so this will not compile. Replace with explicit min/max clamping (or a local helper) to bound partitionSize between MIN_PARTITION_SIZE and MAX_PARTITION_SIZE.
| this.partitionSize = Math.clamp(partitionSize, MIN_PARTITION_SIZE, MAX_PARTITION_SIZE); | |
| this.partitionSize = Math.max(MIN_PARTITION_SIZE, Math.min(partitionSize, MAX_PARTITION_SIZE)); |
| root: 'tw:focus-visible:outline-2 tw:focus-visible:outline-offset-2 tw:relative tw:overflow-hidden tw:rounded-xl tw:transition tw:duration-100', | ||
| }, |
There was a problem hiding this comment.
cardStyles.common.root dropped tw:outline-focus-ring, while other interactive components in ui-core-components consistently include it with focus-visible:outline-*. Without it, the Card focus indicator may become inconsistent or lose the intended theme styling. Consider adding tw:outline-focus-ring back to cardStyles.common.root (or ensuring an equivalent focus-ring utility is applied).
| default: { | ||
| root: 'tw:ring-1 tw:ring-inset tw:ring-secondary tw:bg-primary', | ||
| root: 'tw:border-1 tw:border-secondary tw:bg-primary', | ||
| }, | ||
| elevated: { | ||
| root: 'tw:ring-1 tw:ring-inset tw:ring-secondary tw:bg-primary tw:shadow-md', | ||
| root: 'tw:border-1 tw:border-secondary tw:bg-primary tw:shadow-md', | ||
| }, | ||
| outlined: { root: 'tw:ring-2 tw:ring-inset tw:ring-primary tw:bg-primary' }, | ||
| outlined: { root: 'tw:border-2 tw:border-primary tw:bg-primary' }, | ||
| ghost: { root: 'tw:bg-transparent' }, | ||
| }, | ||
| colors: { | ||
| default: { root: '' }, | ||
| brand: { | ||
| root: 'tw:bg-utility-brand-50 tw:ring-1 tw:ring-inset tw:ring-utility-brand-200', | ||
| root: 'tw:bg-utility-brand-50 tw:border-1 tw:border-utility-brand-200', | ||
| }, | ||
| error: { | ||
| root: 'tw:bg-utility-error-50 tw:ring-1 tw:ring-inset tw:ring-utility-error-200', | ||
| root: 'tw:bg-utility-error-50 tw:border-1 tw:border-utility-error-200', | ||
| }, | ||
| warning: { | ||
| root: 'tw:bg-utility-warning-50 tw:ring-1 tw:ring-inset tw:ring-utility-warning-200', | ||
| root: 'tw:bg-utility-warning-50 tw:border-1 tw:border-utility-warning-200', | ||
| }, | ||
| success: { | ||
| root: 'tw:bg-utility-success-50 tw:ring-1 tw:ring-inset tw:ring-utility-success-200', | ||
| root: 'tw:bg-utility-success-50 tw:border-1 tw:border-utility-success-200', | ||
| }, | ||
| }, | ||
| interactive: { root: 'tw:cursor-pointer' }, | ||
| interactiveVariants: { | ||
| default: { root: 'tw:hover:bg-primary_hover tw:hover:ring-primary' }, | ||
| default: { root: 'tw:hover:bg-primary_hover tw:hover:border-primary' }, | ||
| elevated: { root: 'tw:hover:bg-primary_hover tw:hover:shadow-lg' }, | ||
| outlined: { root: 'tw:hover:bg-secondary tw:hover:ring-primary' }, | ||
| outlined: { root: 'tw:hover:bg-secondary tw:hover:border-primary' }, | ||
| ghost: { | ||
| root: 'tw:hover:bg-secondary tw:hover:ring-1 tw:hover:ring-inset tw:hover:ring-secondary', | ||
| root: 'tw:hover:bg-secondary tw:hover:border-1 tw:hover:border-inset tw:hover:border-secondary', | ||
| }, |
There was a problem hiding this comment.
The card styles introduce Tailwind classes like tw:border-1 and tw:hover:border-inset. border-inset is not a standard Tailwind utility (Tailwind supports ring-inset, but not border-inset), and border-1 is typically border/border-[1px]. If these utilities aren't defined, the card border/hover styles will silently not apply. Consider switching to standard utilities (e.g., tw:border or tw:border-[1px], and remove/replace tw:hover:border-inset).
| outlined: { root: 'tw:hover:bg-secondary tw:hover:border-primary' }, | ||
| ghost: { | ||
| root: 'tw:hover:bg-secondary tw:hover:ring-1 tw:hover:ring-inset tw:hover:ring-secondary', | ||
| root: 'tw:hover:bg-secondary tw:hover:border-1 tw:hover:border-inset tw:hover:border-secondary', |
There was a problem hiding this comment.
⚠️ Bug: Mechanical ring→border replace produces invalid border-inset class
The ghost card hover style was changed from tw:hover:ring-inset to tw:hover:border-inset. In Tailwind CSS, ring-inset is a valid utility that renders box-shadow rings inward, but border-inset maps to border-style: inset which produces a 3D beveled border — not the intended visual effect. This appears to be a mechanical find-and-replace error during the ring→border migration.
Since borders don't need an "inset" modifier (they are always part of the box model), the tw:hover:border-inset class should simply be removed.
Suggested fix:
ghost: {
root: 'tw:hover:bg-secondary tw:hover:border-1 tw:hover:border-secondary',
},
Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
| } | ||
| }; | ||
|
|
||
| // The main effect that initializes the graph after data is loaded and whenever |
There was a problem hiding this comment.
💡 Quality: Truncated comment on useEffect — sentence ends mid-thought
The comment on line 304 reads: // The main effect that initializes the graph after data is loaded and whenever — the sentence is incomplete. This was likely cut off during editing. Please finish the thought (e.g., "...and whenever the layout, depth, or filters change.").
Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
|
Code Review
|
| Compact |
|
Was this helpful? React with 👍 / 👎 | Gitar
|
🔴 Playwright Results — 1 failure(s), 45 flaky✅ 3610 passed · ❌ 1 failed · 🟡 45 flaky · ⏭️ 84 skipped
Genuine Failures (failed on all attempts)❌
|



…
Describe your changes:
Fixes
I worked on ... because ...
Type of change:
Checklist:
Fixes <issue-number>: <short explanation>Summary by Gitar
exploreEntityGraphandexportEntityGraphmethodsclampGraphDepthutility method to constrain user-provided depth parameterscompletedAttimestamp assignment to always set current time for terminal job statescompletedAtinitialization inDistributedRdfIndexCoordinatorThis will update automatically on new commits.