Skip to content

feat: provided-style extensions and manifest-libs#791

Open
jbachorik wants to merge 26 commits intodevelopfrom
jb/configurations
Open

feat: provided-style extensions and manifest-libs#791
jbachorik wants to merge 26 commits intodevelopfrom
jb/configurations

Conversation

@jbachorik
Copy link
Copy Markdown
Collaborator

@jbachorik jbachorik commented Jan 13, 2026

Summary

This PR introduces provided-style extensions and fat agent deployment for BTrace, enabling application class tracing without classpath pollution and single-JAR deployment for distributed environments.

Features

Provided-Style Extensions

  • Application class tracing without classpath injection - Access classes loaded by application classloader through reflection
  • Runtime class discovery - ClassLoadingUtil loads classes from application context at trace time
  • Safe method handles - MethodHandleCache provides efficient, cached reflective access
  • Manifest-driven library management - New declarative approach replaces error-prone libs= parameter

Embedded Extensions & Fat Agent

  • Single-JAR deployment - Bundle agent + extensions into one JAR for Spark/Hadoop/Kubernetes
  • ClassDataLoader - Runtime loading of impl classes from .classdata resources with thread-safe parallel class loading (registerAsParallelCapable + per-class locking)
  • EmbeddedExtensionRepository - Discovers extensions from JAR manifest with pre-compiled Pattern validation for class names (prevents regex backtracking)
  • Gradle Plugin (org.openjdk.btrace.fat-agent) - Auto-discovers extensions, supports project/maven/file sources
  • Maven Plugin (btrace-maven-plugin) - Equivalent functionality for Maven users

Bug Fixes

  • Bootstrap classloader NPE in Main and EmbeddedExtensionRepository
  • Probe listing after disconnect and test isolation
  • Thread retransform after startup scripts for early hooks
  • Java 8 compatibility (List.of → Collections.emptyList)
  • ClassDataLoader thread safety — fix double-define race in findClass() by adding registerAsParallelCapable() and synchronized(getClassLoadingLock(name)) with double-checked locking to prevent LinkageError from concurrent class definitions
  • Regex safety — fix catastrophic backtracking in EmbeddedExtensionRepository.isValidClassName() using possessive quantifiers and pre-compiled static Pattern
  • JDK 25 compatibility — exclude new Thread accessor methods (inheritableThreadLocals, getAndClearInterrupt, dispatchUncaughtException, etc.) from instrumentation to prevent infinite recursion
  • Fat agent config — update to use btraceJar task (masked JAR architecture)
  • Test port isolationManifestLibsTests uses port 2022 to avoid contention with other tests on default port 2020

Documentation

  • Updated all docs to reference btrace.jar (masked JAR) instead of legacy btrace-agent.jar/btrace-boot.jar
  • Fixed broken cross-references between doc files
  • Added JDK 25 Thread method exclusion documentation
  • Added ClassDataLoader thread-safety documentation

Documentation

Document Description
Fat Agent Plugin Architecture Single-JAR deployment design
Provided-Style Extensions API on bootstrap, impl via TCCL
Migration Guide Migrating from deprecated libs/profiles
Extension Development Guide Building extensions
Gradle Plugin README Both extension and fat-agent plugins

Example Extensions

Usage

Fat Agent (Gradle)

plugins { id 'org.openjdk.btrace.fat-agent' }
btraceFatAgent {
    embedExtensions {
        project(':btrace-metrics')
        maven('io.btrace:btrace-statsd:2.3.0')
    }
}

Fat Agent (Maven)

<plugin>
    <groupId>org.openjdk.btrace</groupId>
    <artifactId>btrace-maven-plugin</artifactId>
    <configuration>
        <extensions>
            <extension>io.btrace:btrace-metrics:${btrace.version}</extension>
        </extensions>
    </configuration>
</plugin>

Breaking Changes

  • libs= and profiles= deprecated (removal planned N+2)

🤖 Generated with Claude Code


This change is Reviewable

@jbachorik jbachorik added the AI AI-generated code label Jan 13, 2026
Copy link
Copy Markdown

@reviewabot reviewabot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the pull request diff is too large to be displayed directly on GitHub. Here are some general recommendations for handling large pull requests:

  1. Break Down the PR: Try to break down the pull request into smaller, more manageable chunks. This makes it easier to review and ensures that each part can be thoroughly checked.

  2. Review Locally: Clone the repository and check out the branch locally. This allows you to review the changes using your preferred tools and environment.

  3. Focus on Key Areas: Identify and focus on the key areas of the code that are most critical or have the highest impact. This can help prioritize the review process.

  4. Automated Tools: Use automated code review tools and linters to catch common issues and enforce coding standards.

  5. Collaborate with the Team: If possible, involve multiple reviewers to share the workload and get different perspectives on the changes.

If you can provide a smaller subset of the changes or specific files that need review, I'd be happy to help with a more detailed review.

@jbachorik jbachorik changed the title fix(agent,client,tests): probe listing after disconnect and test isolation feat: manifest-libs, provided-style extensions, permission removal, and fixes Jan 13, 2026
@jbachorik jbachorik changed the title feat: manifest-libs, provided-style extensions, permission removal, and fixes feat: provided-style extensions and manifest-libs Jan 13, 2026
@jbachorik jbachorik requested a review from Copilot January 13, 2026 16:29
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces provided-style extensions and manifest-driven library loading to enable tracing application-specific classes without polluting the application classpath. The core innovation is using reflection and object hand-off patterns to access application classes at runtime, keeping BTrace isolated.

Changes:

  • New extension utilities (ClassLoadingUtil, MethodHandleCache) for runtime class resolution and reflective method access
  • Manifest-driven library path resolution (AgentManifestLibs) with security restrictions to BTRACE_HOME
  • Refactored agent classpath processing with deduplication and better error handling
  • Example extensions for Spark and Hadoop demonstrating the provided-style pattern
  • Comprehensive architecture documentation and migration guides
  • Deprecation of libs/profiles mechanism with escape hatch for edge cases

Reviewed changes

Copilot reviewed 25 out of 26 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
settings.gradle Added example extension modules (btrace-spark, btrace-hadoop)
integration-tests/src/test/java/tests/RuntimeTest.java Fixed debug flag insertion and added timing delays for test stability
integration-tests/src/test/java/tests/ManifestLibsTests.java New test suite for manifest-libs feature
integration-tests/src/test/java/tests/BTraceFunctionalTests.java Improved unattended test with better synchronization and cleanup
integration-tests/build.gradle Added test property for skipping preconfigured libs
docs/examples/README.md Guide for building and configuring provided-style extension examples
docs/architecture/provided-style-extensions.md Comprehensive guide on provided-style extension pattern
docs/architecture/migrating-from-libs-profiles.md Migration guide from deprecated libs/profiles to extensions
docs/architecture/agent-manifest-libs.md Design document for manifest-driven library loading
compile.log Build artifact that should not be committed
btrace-extensions/examples/btrace-spark/* Example Spark extension with API and provided-style implementation
btrace-extensions/examples/btrace-hadoop/* Example Hadoop extension with API and provided-style implementation
btrace-extension/src/main/java/org/openjdk/btrace/extension/util/MethodHandleCache.java Cache for reflective method handles to reduce lookup overhead
btrace-extension/src/main/java/org/openjdk/btrace/extension/util/ClassLoadingUtil.java Helper utilities for TCCL-based class loading and service discovery
btrace-client/src/main/java/org/openjdk/btrace/client/Main.java Removed exit hooks for list-only operations to prevent unintended probe removal
btrace-client/src/main/java/org/openjdk/btrace/client/Client.java Added agent-already-running detection and system property pass-through
btrace-agent/src/main/java/org/openjdk/btrace/agent/RemoteClient.java Improved disconnect handling with disconnecting flag
btrace-agent/src/main/java/org/openjdk/btrace/agent/Main.java Refactored classpath processing with manifest support, deduplication, and escape hatch for single-jar injection
btrace-agent/src/main/java/org/openjdk/btrace/agent/AgentManifestLibs.java New utility for manifest-driven library resolution with security checks
README.md Added deprecation notice for libs/profiles with migration guidance

Comment thread btrace-agent/src/main/java/org/openjdk/btrace/agent/AgentManifestLibs.java Outdated
Comment thread btrace-agent/src/main/java/org/openjdk/btrace/agent/AgentManifestLibs.java Outdated
Comment thread btrace-agent/src/main/java/org/openjdk/btrace/agent/Main.java Outdated
Copy link
Copy Markdown
Contributor

Copilot AI commented Jan 13, 2026

@jbachorik I've opened a new pull request, #792, to work on those changes. Once the pull request is ready, I'll request review from you.

Copy link
Copy Markdown
Contributor

Copilot AI commented Jan 13, 2026

@jbachorik I've opened a new pull request, #793, to work on those changes. Once the pull request is ready, I'll request review from you.

Copy link
Copy Markdown
Contributor

Copilot AI commented Jan 13, 2026

@jbachorik I've opened a new pull request, #794, to work on those changes. Once the pull request is ready, I'll request review from you.

@jbachorik jbachorik closed this Mar 15, 2026
@jbachorik jbachorik reopened this Apr 6, 2026
@jbachorik
Copy link
Copy Markdown
Collaborator Author

ReDoS in EmbeddedExtensionRepository.isValidClassName — fixed in dbd3964

Addresses the CodeQL ReDoS alert (js/redos, alert #15) on the class-name validation pattern.

The regex had three ambiguities the engine could explore:

  1. Inner char class [a-zA-Z0-9_$]* inside the .-group and the $-group both allowed $
  2. The outer (\$[a-zA-Z_][a-zA-Z0-9_$]*)* group also started with $
  3. All three repetitions were standard (backtracking) *

On an input like A$A$A$A…! (trailing mismatch) this produced exponential partition searching before failing.

Fix: switch the three repetitions to possessive quantifiers (*+). Each commit is final, so the engine cannot re-partition $-separated tail segments. Match semantics are preserved for every accepted and rejected input in SecurityValidationTest > Class name validation (valid / invalid / special-chars / code-injection / null-and-empty — all green).

Stress-check: A$A × 30 + ! now rejects in ~0 ms instead of hanging.

Items 1–5 of the earlier review (unused Enumeration import, redundant homeStr null check, three zip-slip sites in FatAgentMojo) were already addressed in prior commits on this branch.

@jbachorik jbachorik marked this pull request as ready for review April 12, 2026 18:08
@jbachorik jbachorik requested a review from Copilot April 12, 2026 18:23
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 54 out of 55 changed files in this pull request and generated 12 comments.

Comment on lines 995 to +1009
public void runBTrace(String[] args, ProcessOutputProcessor outputProcessor) throws Exception {
List<String> argVals =
new ArrayList<>(
Arrays.asList(
javaHome + "/bin/java",
"-cp",
cp,
"org.openjdk.btrace.boot.Loader",
debugBTrace ? "-v" : "",
"-cp",
eventsClassPath,
"-d",
Paths.get(System.getProperty("java.io.tmpdir"), "btrace-test").toString()));
if (debugBTrace) {
argVals.add(4, "-v"); // insert after Main class name
Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argVals is built with debugBTrace ? "-v" : "" and then (when debugBTrace=true) another -v is inserted at index 4. This results in a duplicate -v argument, and when debugBTrace=false it leaves an empty-string argument in the command line. Consider removing the placeholder element and only adding -v conditionally (without inserting empty args).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in eaf1686 — removed the debugBTrace ? "-v" : "" ternary placeholder from the initial Arrays.asList(); -v is now added once via the existing conditional add(4, "-v") when debugBTrace=true.

Comment on lines +795 to +799
// Allow time for traced app to produce additional output after "ready:"
// BTrace INFO logs during agent init can exhaust stdoutLatch before app work begins
try {
Thread.sleep(1000L);
} catch (InterruptedException ie) {
Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This additional fixed Thread.sleep(1000L) adds another full second of delay even though there is already a 1s sleep immediately above for output flushing. Using two unconditional sleeps increases test runtime and can still be flaky; prefer waiting on a concrete condition (e.g., latch for expected app output) or reusing the existing flush wait.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentionally kept — the two sleeps serve distinct purposes: the first allows late BTrace output to flush in on-startup mode, the second allows the traced app to produce output after ready:. Collapsing them into one risks missing app output. A latch-based approach would be preferable but requires broader refactoring of the test harness.

Comment thread docs/Troubleshooting.md Outdated

This guide helps you diagnose and resolve common BTrace issues.

> **Note:** Examples use `btrace.jar` -- the single masked JAR. If using a legacy multi-JAR distribution, substitute `btrace.jar`.
Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The note says legacy multi-JAR users should “substitute btrace.jar”, which is the same value used in the example and doesn’t explain what to substitute. This looks like a copy/paste error; it should point to the legacy jar(s) (e.g., btrace-agent.jar / btrace-boot.jar) or otherwise clarify the substitution.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in eaf1686 — note now reads: "If using a legacy multi-JAR distribution, replace btrace.jar with btrace-agent.jar (and add -Xbootclasspath/a:btrace-boot.jar where needed)."

Comment thread docs/BTraceTutorial.md Outdated
@@ -1,5 +1,7 @@
# BTrace Tutorial (BTrace 2.3.0)

> **Note:** Examples use `btrace.jar` -- the single masked JAR. If using a legacy multi-JAR distribution, substitute `btrace.jar`.
Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The note says legacy multi-JAR users should “substitute btrace.jar”, which is the same value used in the example. This looks incorrect/ambiguous; update it to reference the actual legacy jar name(s) or clarify what should be substituted.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in eaf1686 — same correction applied to BTraceTutorial.md.

Comment thread docs/README.md
Comment on lines +47 to 54
1. **JFR Integration** → [Getting Started: JFR Integration](GettingStarted.md#advanced-jfr-integration), [Tutorial Lesson 5](BTraceTutorial.md)
2. **Sampling** → [Quick Reference: @Sampled](QuickReference.md#sampled), [FAQ: Performance](FAQ.md#performance-issues)
3. **Aggregations** → [Quick Reference: Aggregation Functions](QuickReference.md#aggregation-functions)
4. **Cloud Deployments** → [Getting Started: K8s](GettingStarted.md#btrace-in-containers-and-kubernetes), [FAQ: K8s](FAQ.md#can-i-use-btrace-with-microservices)
5. **Fat Agent JAR** → [Getting Started: Fat Agent](GettingStarted.md#fat-agent-jar-single-jar-deployment) for single-JAR deployment
6. **Level Filtering** → [Quick Reference: @Level](QuickReference.md#level)
7. **Extensions Architecture** → [Extension invokedynamic Bridge](architecture/extension-invokedynamic-bridge.md)

Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link target architecture/extension-invokedynamic-bridge.md does not exist in docs/architecture/ (the file present is ExtensionInvokeDynamicBridge.md). Also, this section uses a different filename casing (GettingStarted.md, QuickReference.md, etc.) than other links earlier in this doc; please make link casing consistent across the whole file to avoid broken links on case-sensitive filesystems.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in eaf1686 — link target corrected to architecture/ExtensionInvokeDynamicBridge.md; link casing standardized throughout the "Documentation by Topic" section to match the capitalized filenames used elsewhere.

* does not emit a misleading "Loaded bundled probe" info line.
*/
private static boolean loadEmbeddedProbe(String resourcePath, String probeName, boolean traceToStdOut) {
try (InputStream is = Main.class.getClassLoader().getResourceAsStream(resourcePath)) {
Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main.class.getClassLoader() can be null when loaded from the bootstrap classloader; calling getResourceAsStream(...) on it will throw an NPE. Use the same null-safe resource loading approach used elsewhere (fallback to ClassLoader.getSystemResourceAsStream) so embedded probe lookup works reliably.

Suggested change
try (InputStream is = Main.class.getClassLoader().getResourceAsStream(resourcePath)) {
ClassLoader loader = Main.class.getClassLoader();
try (InputStream is =
loader != null
? loader.getResourceAsStream(resourcePath)
: ClassLoader.getSystemResourceAsStream(resourcePath)) {

Copilot uses AI. Check for mistakes.
public ClassDataLoader(String extensionId, ClassLoader resourceLoader, ClassLoader parent) {
super(parent);
this.extensionId = extensionId;
this.resourceLoader = resourceLoader;
Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ClassDataLoader stores resourceLoader as-is and later unconditionally calls resourceLoader.getResourceAsStream(...). If the loader is null (bootstrap), this will throw an NPE. Consider normalizing null to ClassLoader.getSystemClassLoader() (as done in EmbeddedExtensionRepository) or guarding at use sites.

Suggested change
this.resourceLoader = resourceLoader;
this.resourceLoader =
resourceLoader != null ? resourceLoader : ClassLoader.getSystemClassLoader();

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in eaf1686 — constructor now normalizes: this.resourceLoader = resourceLoader != null ? resourceLoader : ClassLoader.getSystemClassLoader();

Comment on lines +425 to +427
List<Path> boot = List.of(tempDir.resolve("boot.jar"));
List<Path> system = List.of(tempDir.resolve("sys.jar"));

Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test uses List.of(...) (Java 9+) which will fail if unit tests are ever executed on Java 8 (and contradicts the PR’s stated Java 8 compatibility goal). Prefer Collections.singletonList(...) / Arrays.asList(...) to keep tests Java-8-runtime compatible.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in eaf1686 — replaced with Collections.singletonList() and added the required import java.util.Collections;.

Comment on lines +6 to +12
group = 'org.openjdk.btrace'
version = '2.3.0-SNAPSHOT'

java {
sourceCompatibility = JavaVersion.VERSION_11
targetCompatibility = JavaVersion.VERSION_11
}
Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module hard-codes version = '2.3.0-SNAPSHOT', which can diverge from the repository’s root version and lead to inconsistent published artifacts. Prefer inheriting rootProject.version (or a shared version property) so the Maven plugin version stays aligned with the rest of the build.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in eaf1686version = '2.3.0-SNAPSHOT' replaced with version = rootProject.version.

Comment on lines 1133 to +1147
List<String> argVals =
new ArrayList<>(
Arrays.asList(
javaHome + "/bin/java",
"-cp",
cp,
"org.openjdk.btrace.boot.Loader",
debugBTrace ? "-v" : "",
"-cp",
eventsClassPath,
"-d",
Paths.get(System.getProperty("java.io.tmpdir"), "btrace-test").toString()));
if (debugBTrace) {
argVals.add(4, "-v"); // insert after Main class name
}
Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runBTrace(..., checkLines, ...) helper still builds argVals with debugBTrace ? "-v" : "" and then conditionally inserts another -v at index 4. This duplicates -v when enabled and leaves an empty-string arg when disabled. Remove the placeholder element and only add -v once when needed.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in eaf1686 — same fix applied to the second runBTrace helper (lines ~1136–1147).

jbachorik and others added 5 commits April 17, 2026 23:49
…ibs deprecation; provided-style helpers; examples; tests\n\n- agent: manifest-driven libs (flagged), centralized classpath append with dedup + diagnostics; deprecate libs (debug log); add single-jar system CL escape hatch; test knob scoped to manifest-libs\n- client: pass-through of btrace.* props to agent using $ system props\n- extension: add ClassLoadingUtil and MethodHandleCache for provided-style extensions\n- examples: add btrace-extensions/examples/{btrace-spark,btrace-hadoop} with README\n- docs: research + provided-style guide + migration + examples index; README deprecation note\n- tests: add ManifestLibsTests; gradle knob for btrace.test.skipLibs\n\nBREAKING CHANGE: libs/profiles deprecated with planned removal after N+2; prefer extensions
- Proactively retransform java.lang.Thread after startup scripts load
- Move initExtensions() after transformer is installed
- Add delay in RuntimeTest to allow traced app output after ready marker

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ation

- Add disconnecting flag to RemoteClient for immediate probe visibility
- Skip loadAgent when agent already running (check btrace.port)
- Add try-finally cleanup in tests to prevent port conflicts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…il/ClassLoadingUtil.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI and others added 18 commits April 17, 2026 23:49
…usion

Co-authored-by: jbachorik <738413+jbachorik@users.noreply.github.com>
… compatibility

Co-authored-by: jbachorik <738413+jbachorik@users.noreply.github.com>
- Add ExtensionConfigurator interface for environment-aware probe selection
- Add ProbeConfiguration and RuntimeEnvironment classes
- Add ClassDataLoader for loading .classdata resources (dd-trace pattern)
- Add EmbeddedExtensionRepository for discovering embedded extensions
- Extend ExtensionDescriptorDTO with embedded, resourceBasePath, configuratorClass, bundledProbes fields
- Register EmbeddedExtensionRepository in ExtensionLoader (lowest priority)
- Handle embedded extensions in ExtensionLoaderImpl via ClassDataLoader

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- gradle: BTraceFatAgentPlugin with auto-discovery, project/maven/file sources
- maven: btrace-maven-plugin with fat-agent goal (Gradle-built module)
- fix: bootstrap classloader NPE in Main and EmbeddedExtensionRepository
- docs: fat-agent-plugin architecture, getting started, migration guide

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Zip Slip protection in FatAgentMojo (path traversal prevention)
- Add extension ID validation to prevent path traversal attacks
- Add bytecode validation in ClassDataLoader (magic number check)
- Add class name validation for services and configurators
- Validate ProbeConfiguration.setOutput() throws on invalid values
- Improve ClassFilter documentation for sensitive method filtering
- Add AgentManifestLibsTest (27 tests)
- Add BTraceFatAgentPluginTest using Gradle TestKit (10 tests)
- Add FatAgentMojoTest (26 tests)
- Add SecurityValidationTest (17 tests)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The class-name validation pattern in EmbeddedExtensionRepository used
ambiguous quantifiers where the inner character class allowed '$' and
the outer group matched '$'-prefixed segments. On input like 'A$A$A$A'
with a trailing mismatch, the engine explored exponentially many
partitions (CodeQL js/redos, alert #15).

Switch the three repetitions to possessive quantifiers (`*+`) so each
commit step is final. Semantics are preserved for all accepted and
rejected class names (verified against SecurityValidationTest).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…and plugins

Applies 16 fixes surfaced by a multi-scholic review of the jb/configurations
branch. Each item below was verified against current HEAD before editing.

agent:
- Main.loadEmbeddedProbe: close the probe InputStream via try-with-resources
  and stop returning true from a stub that installs nothing; log an operator
  warn so the probes= argument no longer silently succeeds.
- Main.appendBootJar / appendSystemJar: synchronize the
  contains/appendTo/add block on the respective Set so two threads cannot
  both pass the dedup check and double-append the same jar.
- Main libs deprecation: emit at log.warn (was hidden behind isDebugEnabled).
- AgentManifestLibs.scanLibTree: close the Files.walk stream via
  try-with-resources (Files.walk is backed by a DirectoryStream).
- AgentManifestLibs.locateAgentPath: broaden the catch to include
  IllegalArgumentException and FileSystemNotFoundException (both thrown by
  Paths.get(URI) for non-file code sources), and drop the dead
  Files.isDirectory/else branch that returned the same value in both arms.
- AgentManifestLibs.filterAndNormalize: when toRealPath() fails, compare np
  against home.toAbsolutePath().normalize() instead of the raw home path so
  the fallback is at the same normalization level as np, and emit a warn so
  operators know symlink resolution was bypassed.
- RemoteClient.reconnect: publish this.disconnecting=false last so the
  volatile write acts as a release fence for the prior sock/ois/oos writes.

extension:
- EmbeddedExtensionRepository.readManifestIndex: trim each id after splitting
  the BTrace-Embedded-Extensions manifest value so a human-written
  "ext1, ext2" no longer produces a leading-space id that isValidExtensionId
  rejects.
- EmbeddedExtensionRepository.discoverBundledProbes: read the bundled probe
  list from the extension's properties file (probes=Probe1,Probe2) instead
  of always returning an empty list, so ExtensionDescriptorDTO actually
  carries the probes the extension declares.
- MethodHandleCache: add a negative cache so a repeated lookup for a missing
  method fails fast via the cached LookupRuntimeException instead of
  re-entering the reflective machinery on every call.

core:
- ProbeConfiguration.setOutput: use toLowerCase(Locale.ROOT) instead of
  default-locale toLowerCase() so Turkish dotted/dotless-I does not make
  "JFR" / "FILE" fall through to IllegalArgumentException.

maven plugin:
- FatAgentMojo.execute: wrap the staging-directory lifetime in try-finally
  so deleteDirectory runs on every exit path, not just the happy one.

gradle plugin:
- BTraceFatAgentExtension Maven/File extractFromZip: switch from
  project.buildDir (deprecated in Gradle 7, removed in Gradle 9) to
  project.layout.buildDirectory.dir(...).get().asFile.

housekeeping:
- Remove the accidentally-committed top-level compile.log and add it to
  .gitignore so it cannot be reintroduced.

Tests: full builds of btrace-core, btrace-extension, btrace-agent,
btrace-maven-plugin, and btrace-gradle-plugin are green, including
FatAgentMojoTest, AgentManifestLibsTest, SecurityValidationTest,
MainTest, and ExtensionBridgeImplPolicyTest.

The ClassDataLoader double-define race also surfaced in the review but the
chosen fix is still under discussion (see inline PR comment on
ClassDataLoader.findClass re: registerAsParallelCapable vs. synchronized
block on getClassLoadingLock) and is deliberately not included in this
commit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two threads racing findClass for the same name could both reach
defineClass, which the JVM rejects with LinkageError. Guard the
cache-check + defineClass block with synchronized(getClassLoadingLock(name))
so only one thread per name defines, while a lock-free fast path still
handles the warm-cache case for already-loaded classes.

The synchronized block is chosen over registerAsParallelCapable because
it protects direct findClass callers in addition to the standard
loadClass delegation path, and keeps the invariant locally visible in
findClass itself.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ClassDataLoader: fix double-define race in findClass() by adding
  registerAsParallelCapable() and synchronized(getClassLoadingLock(name))
  with double-checked locking to prevent LinkageError from concurrent
  class definitions
- EmbeddedExtensionRepository: fix regex backtracking in isValidClassName()
  by removing $ from package-segment character class, and extract static
  Pattern field to avoid recompilation on each call

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…build.gradle

A rebase conflict resolution left a commit message fragment appended to
a systemProperty line, causing Groovy to fail parsing at line 77.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…itecture)

The masked JAR architecture merged agent + boot into a single `btraceJar`
task. The fat agent config still referenced the old `agentJar` and
`bootJar` tasks which no longer exist, causing build failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ient.java

- Main.java:898: commit message fragment appended to closing brace
- RemoteClient.java:351: reference to non-existent `disconnecting` field
  from pre-rebase branch (develop uses `disconnected`)

Both were leftovers from the rebase conflict resolution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rebase conflict resolution left a spurious `try {` block wrapping
testOnMethodUnattended() without a matching catch/finally, causing
a compile error.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ManifestLibsTests was failing on CI with "Port 2020 unavailable" because
a previous test class (BTraceFunctionalTests) left the default BTrace
port briefly occupied. Use port 2022 to avoid contention.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The btrace CLI process launched by RuntimeTest.attach() was not receiving
the configured port — it defaulted to 2020 regardless of btrace.port.
Pass -Dbtrace.port and -p <port> to the spawned java process so
ManifestLibsTests (port 2022) avoids contention with other tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace btrace-agent.jar/btrace-boot.jar references with btrace.jar
  across all docs (README, GettingStarted, FAQ, Troubleshooting,
  QuickReference, BTraceTutorial, fat-agent-plugin arch doc)
- Fix broken doc links (casing: btrace-extension-development-guide.md
  → BTraceExtensionDevelopmentGuide.md; remove nonexistent
  extension-configuration.md reference)
- Update fat agent plugin docs: agentJar/bootJar → btraceJar task,
  document ClassDataLoader thread-safety (registerAsParallelCapable,
  per-class locking)
- Update Gradle plugin README: remove obsolete bootJarTask config
- Add JDK 25 Thread method exclusions to Troubleshooting.md
- Add test port isolation note to CLAUDE.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Applied via /muse:implement (direct mode):
- DC-08/DC-19: Remove duplicate -v flag and empty-string arg from RuntimeTest.java runBTrace helpers
- DC-10/DC-11: Fix confusing legacy JAR note in docs (substitute → replace with legacy jar names)
- DC-12: Fix broken link in docs/README.md (ExtensionInvokeDynamicBridge.md) + standardize link casing
- DC-13/DC-14: Fix BOOT_ADDED/SYSTEM_ADDED ordering — add to set AFTER successful append to prevent stale state on failure
- DC-15: Null-safe classloader in loadEmbeddedProbe (bootstrap classloader guard)
- DC-16: Normalize null resourceLoader in ClassDataLoader constructor to system classloader
- DC-17: Replace List.of() with Collections.singletonList() for Java 8 compatibility
- DC-18: Use rootProject.version in btrace-maven-plugin/build.gradle

DC-01/02/03/04/05/06/07 were already resolved in the branch.
DC-09 deferred (removing sleep risks test flakiness without a concrete condition mechanism).

<!-- muse-session:impl-20260417-225428 -->

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
jbachorik added a commit that referenced this pull request Apr 17, 2026
After rebasing onto develop (which added PR #805's btracePort field and
conditional -p block), the unconditional -p inserted by PR #791 collides
with the conditional block, producing two -p flags when a test sets
btracePort. The btrace CLI rejects the duplicate and prints usage,
causing ExtensionLifecycleIntegrationTest to fail on every JDK.

Drop the unconditional pair from runBTraceDynamic so only the existing
conditional block adds -p when btracePort > 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rebasing onto develop brought in PR #805's conditional
  if (btracePort > 0) { add -p btracePort }
block, which collides with PR #791's unconditional
  "-p", String.valueOf(getBTracePort())
in runBTraceDynamic — producing two -p flags when a test overrides
btracePort.

Keep the unconditional form, since getBTracePort() already honours
both the develop-style btracePort field (ExtensionLifecycleIntegrationTest)
and the PR-style btrace.port system property (ManifestLibsTests), and
drop the now-redundant conditional block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jbachorik and others added 2 commits April 21, 2026 20:40
Operational logging & error handling:
- Main.java: log bootstrap/system jar append failures at warn (was debug),
  Thread.class retransform failure at warn (was debug), and warn when both
  deprecated libs= and manifest-based libs are in use
- AgentManifestLibs: upgrade skip/validation log levels from DEBUG to INFO
  (non-existent / non-jar entries) and WARN (path resolution exceptions)
- Main.java: add FIXME/TODO marker on loadBundledProbes() placeholder

Exception handling & resource lifecycle:
- ClassDataLoader.findClass: preserve IOException cause when throwing
  ClassNotFoundException
- MethodHandleCache: drop the negative cache; a failed lookup no longer
  permanently poisons the cache, so classes that become available later
  are found on retry
- FatAgentMojo.createJar: wrap Files.walk() in try-with-resources to
  release file handles on all exit paths

Classloader & instrumentation:
- ClassDataLoader: add canonical-pattern comment above
  registerAsParallelCapable() with JDK javadoc reference
- ClassFilter: expand inline comment on Thread-method exclusions to
  explain why getUncaughtExceptionHandler stays in the exclusion set
  (dispatchUncaughtException is the recursion source but probes hooking
  the getter could still re-enter the dispatcher path)
- Main.java: add comment on Thread.class retransform ordering clarifying
  that the current post-startScripts / pre-initExtensions placement is
  intentional

Extension repository:
- EmbeddedExtensionRepository.readManifestIndex: iterate all
  BTrace-Embedded-Extensions manifests instead of returning on first
  match; dedupe extension IDs and warn on duplicates with the offending
  manifest URL
- EmbeddedExtensionRepository: introduce a BOOTSTRAP_SENTINEL classloader
  + isBootstrap flag so null-classloader semantics are explicit rather
  than coalesced

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- AgentManifestLibs.filterAndNormalize: javadoc describes that the
  returned list preserves LinkedHashSet insertion order (manifest
  declaration order), that callers must supply the set in desired
  precedence order, and that the first entry wins for duplicate class
  names.
- ExtensionLoaderImpl.loadEmbedded: comment at the ClassDataLoader
  construction site explains that a null parentClassLoader is
  intentional and well-defined — ClassLoader(ClassLoader) treats null
  as "use the bootstrap classloader as parent", which is the correct
  delegation chain when BTrace runs on the bootstrap classpath.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jbachorik
Copy link
Copy Markdown
Collaborator Author

Review-fix commits pushed

Addressed 17 of 23 accepted findings from the muse chorus review (session review-20260421-095158), in two commits:

Highlights

  • Main.java: bootstrap/system jar append failures now log at warn; Thread.class retransform failure warns; explicit warn when both deprecated libs= and manifest-libs are in use; FIXME marker on loadBundledProbes() placeholder; comment clarifying Thread-class retransform ordering is intentional.
  • ClassDataLoader.findClass: IOException cause is preserved when converting to ClassNotFoundException.
  • MethodHandleCache: negative cache removed — failed lookups no longer permanently poison the cache.
  • FatAgentMojo.createJar: Files.walk() now in try-with-resources.
  • AgentManifestLibs: skip/validation log levels upgraded (DEBUG → INFO / WARN) so operators see non-existent / invalid entries.
  • EmbeddedExtensionRepository.readManifestIndex: iterates all BTrace-Embedded-Extensions manifests instead of returning on first match; dedupes extension IDs and warns on duplicates. Null-classloader handling now explicit via BOOTSTRAP_SENTINEL + isBootstrap flag.
  • ClassFilter: expanded rationale comment for JDK-25 Thread-method exclusions — kept getUncaughtExceptionHandler in the exclusion set because probes on the getter could re-enter the dispatcher path.

Verification

  • ./gradlew :btrace-agent:compileJava :btrace-extension:compileJava :btrace-instr:compileJava :btrace-maven-plugin:compileJava — green
  • ./gradlew spotlessCheck — green
  • Unit tests: 930/930 pass (920 module + 10 gradle-plugin), 0 failures, 0 errors

Deferred for follow-up

6 findings were not addressed in these commits because they need architectural or design decisions best handled separately:

  1. ClassDataLoader.java:98 — add ProtectionDomain/CodeSource on defineClass. Planner concertato recommended deferring: post-JDK 17 removal of SecurityManager makes a correct ProtectionDomain non-trivial.
  2. FatAgentMojo.java:~6055 — extract duplicated isValidExtensionId logic. Needs a module-boundary decision (new btrace-plugin-common vs. intentional duplication).
  3. FatAgentMojoTest.java:189 — replace "no-behavior-assertion" tests with concrete assertions.
  4. SecurityValidationTest.java:3392 — replace mock-based bytecode magic test with a real ClassDataLoader integration test.
  5. BTraceFatAgentPluginTest.java:~5117 — test duplication (phantom line numbers — requires locating the actual duplicated test code).
  6. BTraceFatAgentPlugin.groovy:~4722 — unify Gradle and Maven plugin extension-staging logic. Planner concertato flagged as out of scope for this PR.

The user also dismissed two earlier findings with explicit rationale: ClassDataLoader double-check-lock (intentional lockfree fast path) and ClassDataLoader:2468 concurrency (assumed harmless race, to be validated).

🤖 Generated with Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI AI-generated code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants