codeflash-agent/plugin/languages/java/references/library-replacement.md
mashraf-222 270cb56cee
Feat/java language support (#12)
2026-04-14 18:49:41 -05:00


Library Boundary Breaking -- Java

Domain agents treat external libraries as walls they can't cross. The deep agent doesn't. When profiling shows an external library dominating runtime and domain agents have plateaued, the deep agent has authority to replace library calls with focused JDK stdlib implementations that only cover the subset the codebase actually uses.

When to consider this

All three conditions must hold:

  1. Profiling evidence: The library accounts for >15% of sampled CPU time (JFR execution samples, counted inclusively of callees), AND the cost is in the library's internal machinery (reflection, tree building, generalized parsing), not in your code's usage of it
  2. Plateau evidence: A domain agent already tried to optimize around the library -- caching results, reducing call frequency, batching -- and still plateaued because the remaining calls are essential but the library's implementation is heavy
  3. Narrow usage surface: The codebase uses a small fraction of the library's API. If you're using 5 methods out of 200, a focused replacement is feasible. If you're using most of the API, it's not worth it
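
The profiling condition can be checked directly against a recording. The following is a minimal sketch, assuming JDK 11+ and a recording that contains jdk.ExecutionSample events; the class name LibraryShare and its methods are hypothetical and not part of any existing tool. It reports both the inclusive share (the library appears anywhere on the stack) and the self share (the library owns the top frame), since a high self share is a rough hint that the cost sits in the library's own machinery.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordedStackTrace;
import jdk.jfr.consumer.RecordingFile;

/** Hypothetical helper: estimates how much sampled CPU time passes through a library package. */
final class LibraryShare {
    private LibraryShare() {}

    /** Fraction of stacks with at least one frame whose class name starts with the prefix. */
    static double share(List<List<String>> stacks, String packagePrefix) {
        if (stacks.isEmpty()) {
            return 0.0;
        }
        long hits = stacks.stream()
                .filter(frames -> frames.stream().anyMatch(c -> c.startsWith(packagePrefix)))
                .count();
        return (double) hits / stacks.size();
    }

    /** Fraction of stacks whose top frame (index 0) belongs to the prefix. */
    static double selfShare(List<List<String>> stacks, String packagePrefix) {
        if (stacks.isEmpty()) {
            return 0.0;
        }
        long hits = stacks.stream()
                .filter(frames -> !frames.isEmpty() && frames.get(0).startsWith(packagePrefix))
                .count();
        return (double) hits / stacks.size();
    }

    /** Reads the frame class names of every jdk.ExecutionSample event in a recording. */
    static List<List<String>> loadCpuSamples(Path recording) throws IOException {
        List<List<String>> stacks = new ArrayList<>();
        for (RecordedEvent event : RecordingFile.readAllEvents(recording)) {
            if (!"jdk.ExecutionSample".equals(event.getEventType().getName())) {
                continue;
            }
            RecordedStackTrace trace = event.getStackTrace();
            if (trace == null) {
                continue; // sample recorded without a stack trace
            }
            List<String> classes = new ArrayList<>();
            for (RecordedFrame frame : trace.getFrames()) {
                classes.add(frame.getMethod().getType().getName());
            }
            stacks.add(classes);
        }
        return stacks;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: LibraryShare <recording.jfr> <package.prefix>");
            return;
        }
        List<List<String>> stacks = loadCpuSamples(Path.of(args[0]));
        System.out.printf("inclusive: %.1f%%, self: %.1f%%%n",
                share(stacks, args[1]) * 100, selfShare(stacks, args[1]) * 100);
    }
}
```

The counting logic is kept separate from the JFR loading so it can be exercised without a recording on disk.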

How to assess feasibility

Step 1 -- Audit the actual API surface

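# Which library classes does the codebase import, and how often?
# (-h and -o drop file names and line noise so identical imports can be counted)
grep -rhoE "import (static )?com\.google\.common\.[A-Za-z0-9_.]+" --include="*.java" --include="*.kt" src/ | sort | uniq -c | sort -rn
grep -rhoE "import (static )?org\.apache\.commons\.[A-Za-z0-9_.]+" --include="*.java" --include="*.kt" src/ | sort | uniq -c | sort -rn

# Where are those classes actually called? (file:line for each call site)
grep -rnE "(Preconditions|ImmutableList|ImmutableMap|Strings)\." --include="*.java" --include="*.kt" src/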

Step 2 -- Classify each usage

For each call site, determine:

  • What does it need? (null check, immutable collection, string manipulation, date conversion)
  • What subset of the library's type system does it touch?
  • Could JDK stdlib handle this use case? (check minimum JDK version from setup.md)
  • Does it depend on library-specific features (e.g., Guava's @VisibleForTesting, custom serialization)?

Step 3 -- Map the replacement boundary

  • Replace: Uses where JDK stdlib provides equivalent functionality (collection factories, string checks, null guards)
  • Keep: Uses where the library provides functionality JDK lacks (e.g., Guava's Cache with TTL, Commons CSV parsing)
  • Hybrid: Replace read-only/simple uses, keep complex uses

Step 4 -- Estimate effort vs payoff

A focused replacement is worth it when:

  • The library calls being replaced account for >20% of total runtime
  • The replacement uses JDK stdlib only -- no new dependencies
  • The API surface being replaced is <10 methods/classes
  • Correctness can be verified: run both library path and replacement, diff results

Common Java replacement patterns

Guava -> JDK stdlib

| Guava API | JDK Replacement | Min JDK |
| --- | --- | --- |
| ImmutableList.of(a, b, c) | List.of(a, b, c) | 9 |
| ImmutableList.copyOf(col) | List.copyOf(col) | 10 |
| ImmutableMap.of(k, v) | Map.of(k, v) | 9 |
| ImmutableSet.of(a, b) | Set.of(a, b) | 9 |
| Preconditions.checkNotNull(x, msg) | Objects.requireNonNull(x, msg) | 7 |
| Preconditions.checkArgument(cond, msg) | if (!cond) throw new IllegalArgumentException(msg) | 1 |
| Strings.isNullOrEmpty(s) | s == null \|\| s.isEmpty() | 6 |
| Strings.nullToEmpty(s) | s == null ? "" : s | 1 |
| Joiner.on(",").join(items) | String.join(",", items) (CharSequence elements only) | 8 |
| Splitter.on(",").split(s) | s.split(",") (note: argument is a regex, and trailing empty strings are dropped) | 1.4 |
| FluentIterable.from(col).transform(f) | col.stream().map(f).collect(Collectors.toList()) | 8 |
| Optional (Guava) | Optional (JDK) | 8 |
| Iterables.getOnlyElement(col) | manual: check size == 1, then take the single element | 1.2 |

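Caution: List.of(), Set.of(), and Map.of() reject null elements at construction, as Guava's immutable collections do, and both families throw UnsupportedOperationException on mutation, so construction is a safe swap. The differences are elsewhere: List.of(...).contains(null) throws NullPointerException where ImmutableList returns false; Map.of() and Set.of() have unspecified iteration order where ImmutableMap and ImmutableSet preserve insertion order; and Set.of() throws on duplicate elements where ImmutableSet.of() silently drops them. Code that relies on any of these will break.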
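
For the rows that have no one-line JDK equivalent, a small project-local helper keeps call sites readable. The sketch below is illustrative only: the class name Checks is hypothetical, and the exception types for getOnlyElement follow Guava's documented behavior (NoSuchElementException when empty, IllegalArgumentException when there is more than one element).

```java
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical project-local helpers covering a narrow slice of Guava. */
final class Checks {
    private Checks() {}

    /** Stand-in for Preconditions.checkArgument(cond, msg). */
    static void checkArgument(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    /** Stand-in for Strings.isNullOrEmpty(s). */
    static boolean isNullOrEmpty(String s) {
        return s == null || s.isEmpty();
    }

    /** Stand-in for Iterables.getOnlyElement(col): exactly one element, otherwise an exception. */
    static <T> T getOnlyElement(Collection<T> col) {
        Iterator<T> it = col.iterator();
        if (!it.hasNext()) {
            throw new NoSuchElementException("expected one element but the collection was empty");
        }
        T first = it.next();
        if (it.hasNext()) {
            throw new IllegalArgumentException("expected one element but found " + col.size());
        }
        return first;
    }

    public static void main(String[] args) {
        checkArgument(!isNullOrEmpty("config.yaml"), "file name must not be empty");
        System.out.println(getOnlyElement(List.of("only")));
    }
}
```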

Apache Commons Lang -> JDK stdlib

| Commons API | JDK Replacement | Min JDK |
| --- | --- | --- |
| StringUtils.isBlank(s) | s == null \|\| s.isBlank() | 11 |
| StringUtils.isEmpty(s) | s == null \|\| s.isEmpty() | 6 |
| StringUtils.strip(s) | s == null ? null : s.strip() | 11 |
| StringUtils.trimToNull(s) | s == null \|\| s.trim().isEmpty() ? null : s.trim() | 6 |
| StringUtils.join(arr, sep) | String.join(sep, arr) (CharSequence elements only) | 8 |
| StringUtils.defaultIfBlank(s, def) | s == null \|\| s.isBlank() ? def : s | 11 |
| ObjectUtils.defaultIfNull(obj, def) | Objects.requireNonNullElse(obj, def) (throws NullPointerException if def is also null; otherwise use obj != null ? obj : def) | 9 |
| ObjectUtils.firstNonNull(a, b, c) | Stream.of(a, b, c).filter(Objects::nonNull).findFirst().orElse(null) | 8 |
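
Commons Lang methods accept null arguments, so a replacement has to carry the null guard with it. A minimal sketch of null-safe stand-ins follows, assuming JDK 11+; the class name Strs is hypothetical, and the methods cover only the behavior described in the table above.

```java
/** Hypothetical null-safe stand-ins for a few Commons Lang calls. */
final class Strs {
    private Strs() {}

    /** Like StringUtils.strip(s): null in, null out. */
    static String strip(String s) {
        return s == null ? null : s.strip();
    }

    /** Like StringUtils.trimToNull(s): trims with String.trim() and maps empty results to null. */
    static String trimToNull(String s) {
        if (s == null) {
            return null;
        }
        String trimmed = s.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }

    /** Like StringUtils.defaultIfBlank(s, def). */
    static String defaultIfBlank(String s, String def) {
        return s == null || s.isBlank() ? def : s;
    }

    /** Like ObjectUtils.defaultIfNull(obj, def): unlike Objects.requireNonNullElse, def may be null. */
    static <T> T defaultIfNull(T obj, T def) {
        return obj != null ? obj : def;
    }

    public static void main(String[] args) {
        System.out.println(defaultIfBlank(trimToNull("   "), "fallback"));
    }
}
```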

Apache Commons Collections -> JDK

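| Commons API | JDK Replacement | Min JDK |
| --- | --- | --- |
| CollectionUtils.isEmpty(col) | col == null \|\| col.isEmpty() | 1.2 |
| CollectionUtils.isNotEmpty(col) | col != null && !col.isEmpty() | 1.2 |
| IterableUtils.forEach(iter, closure) | iter.forEach(closure::execute) (or pass the lambda directly as a Consumer) | 8 |
| MapUtils.getInteger(map, key, def) | (Integer) map.getOrDefault(key, def) (only when values are already Integer; MapUtils also converts Number and String values) | 8 |
| CollectionUtils.select(col, pred) | col.stream().filter(pred::evaluate).collect(Collectors.toList()) | 8 |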
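
MapUtils.getInteger does more than a cast: it tolerates a null map and converts values that are not already Integer. Where call sites rely on that, a focused helper is safer than a bare getOrDefault. The sketch below is an approximation under stated assumptions, not a port of the Commons implementation: the class name MapInts is hypothetical, String values are parsed with Integer.valueOf, and anything unparseable falls back to the default.

```java
import java.util.Map;

/** Hypothetical stand-in for MapUtils.getInteger(map, key, def). */
final class MapInts {
    private MapInts() {}

    static Integer getInteger(Map<?, ?> map, Object key, Integer def) {
        if (map == null) {
            return def;
        }
        Object value = map.get(key);
        if (value instanceof Number) {
            // Covers Integer, Long, Double, ...; narrowing follows Number.intValue().
            return ((Number) value).intValue();
        }
        if (value instanceof String) {
            try {
                return Integer.valueOf((String) value);
            } catch (NumberFormatException e) {
                return def; // assumption: unparseable text falls back to the default
            }
        }
        return def; // missing key, null value, or unsupported type
    }

    public static void main(String[] args) {
        Map<String, Object> settings = Map.of("retries", "3", "timeout", 30L);
        System.out.println(getInteger(settings, "retries", 1));
        System.out.println(getInteger(settings, "timeout", 10));
        System.out.println(getInteger(settings, "missing", 5));
    }
}
```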

Jackson/Gson: full tree vs streaming

When profiling shows Jackson ObjectMapper.readTree() or readValue() dominating:

  • If you only need 2-3 fields from a large JSON: use JsonParser (streaming API) to extract them without building the full tree
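  • If you deserialize the same type repeatedly: build the ObjectReader once (objectMapper.readerFor(MyClass.class)) and reuse it. ObjectReader is immutable and thread-safe, and it resolves the root deserializer up front, whereas ObjectMapper.readValue() repeats type resolution and the deserializer lookup on every call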
  • If serialization is the bottleneck: use JsonGenerator for targeted output instead of objectMapper.writeValueAsString()

Joda-Time -> java.time

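| Joda-Time | java.time | Min JDK |
| --- | --- | --- |
| DateTime | ZonedDateTime | 8 |
| LocalDate (Joda) | LocalDate (JDK) | 8 |
| LocalTime (Joda) | LocalTime (JDK) | 8 |
| Duration (Joda) | Duration (JDK) | 8 |
| Period (Joda) | Period (JDK; date fields only, so time-based fields move to Duration) | 8 |
| DateTimeFormat.forPattern(p) | DateTimeFormatter.ofPattern(p) | 8 |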
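
The java.time side of this mapping can be sketched as follows. This is an illustrative example only: the class name TimeMigration, the pattern, and the sample timestamps are assumptions, and real migrations should re-check each pattern string, since not every Joda pattern letter carries over unchanged.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical example of the java.time types that take over from Joda-Time. */
final class TimeMigration {
    private TimeMigration() {}

    // DateTimeFormatter is immutable and thread-safe, so a single shared instance is enough.
    static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");

    /** Parses a zone-less timestamp and pins it to the given zone (role of Joda's DateTime). */
    static ZonedDateTime parse(String text, ZoneId zone) {
        return LocalDateTime.parse(text, FORMAT).atZone(zone);
    }

    /** Elapsed time between two instants on the time-line (role of Joda's Duration). */
    static Duration between(ZonedDateTime start, ZonedDateTime end) {
        return Duration.between(start, end);
    }

    public static void main(String[] args) {
        ZoneId utc = ZoneId.of("UTC");
        ZonedDateTime start = parse("2024-03-01 12:30", utc);
        ZonedDateTime end = parse("2024-03-01 14:00", utc);
        System.out.println(FORMAT.format(start) + " + " + between(start, end).toMinutes() + " min");
    }
}
```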

Verification requirements

Library replacements are high-reward but high-risk. Always verify:

  1. Diff test: Run both the library path and your replacement with representative inputs. Outputs must match exactly
  2. Edge cases: null inputs, empty collections, empty strings, concurrent access, very large inputs
  3. JDK version: Verify the project's minimum JDK version (from setup.md) supports the replacement API. List.of() needs JDK 9+, String.isBlank() needs JDK 11+
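  4. Serialization: If replaced types are serialized (Jackson, Java serialization, protobuf), verify wire compatibility. The collections returned by List.of() are Serializable, but their Java serialized form is a JDK-internal one and is not interchangeable with Guava's ImmutableList, so previously persisted or cross-service payloads may no longer deserialize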
  5. Behavioral differences: Some replacements have subtle differences:
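    • String.split(",") treats its argument as a regex and drops trailing empty strings (use split(",", -1) to keep them); Splitter.on(",") matches literally and keeps empty strings unless omitEmptyStrings() is set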
    • List.of() throws on null elements; Arrays.asList() allows them
    • Map.of() is limited to 10 entries; use Map.ofEntries() for more
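
The diff test in item 1 can be sketched as a small harness that feeds the same inputs to both paths and reports every disagreement, including cases where only one side throws. This is a minimal sketch: the class name DiffTest is hypothetical, and in main the reference lambda is a stand-in for the real library call, which is not on the classpath here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical differential test harness for a library path and its replacement. */
final class DiffTest {
    private DiffTest() {}

    /** Returns one description per input on which the two paths disagree. */
    static <I, O> List<String> mismatches(
            Function<I, O> reference, Function<I, O> candidate, List<I> inputs) {
        List<String> problems = new ArrayList<>();
        for (I input : inputs) {
            Object[] expected = outcome(reference, input);
            Object[] actual = outcome(candidate, input);
            if (!Arrays.equals(expected, actual)) {
                problems.add("input=" + input
                        + " reference=" + Arrays.toString(expected)
                        + " candidate=" + Arrays.toString(actual));
            }
        }
        return problems;
    }

    /** Captures {returned value, exception class name} so thrown errors are compared too. */
    private static <I, O> Object[] outcome(Function<I, O> fn, I input) {
        try {
            return new Object[] {fn.apply(input), null};
        } catch (RuntimeException e) {
            return new Object[] {null, e.getClass().getName()};
        }
    }

    public static void main(String[] args) {
        // Stand-in for a null-safe library call such as StringUtils.strip(s).
        Function<String, String> reference = s -> s == null ? null : s.strip();
        // Naive replacement that forgets the null guard.
        Function<String, String> candidate = String::strip;
        // Arrays.asList is used because List.of rejects null elements.
        List<String> inputs = Arrays.asList(" a ", "", "\tb", null);
        mismatches(reference, candidate, inputs).forEach(System.out::println);
    }
}
```

Running the example reports a single mismatch, for the null input, which is the kind of edge case item 2 asks for.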