mirror of https://github.com/codeflash-ai/codeflash-agent.git synced 2026-05-04 18:25:19 +00:00

Mohamed Ashraf 5b2b94fd71 refactor: move Java-specific content out of shared files into language overlay

Review feedback: shared experiment-loop-base.md and pre-submit-review.md
contained Java/Kotlin-specific content that all languages inherit. This
broke step numbering for non-Java agents and polluted cross-language files.

Changes:
- Revert experiment-loop-base.md to language-neutral (18-step original)
- Revert pre-submit-review.md to language-neutral (remove Java section)
- Create plugin/languages/java/references/pre-submit-review.md following
  the same pattern as the existing Python pre-submit-review.md
- Reduce duplication in all 4 domain agents (cpu, memory, async, deep):
  replace inlined benchmark-validity and correctness-verification content
  with concise references, keeping only domain-specific additions
- Add pre-submit-review.md to Deep References in all agents

No content was removed — all JMH validation, correctness verification,
mechanism explanation, milestone sanity check, and JDK compatibility
requirements remain in the Java language overlay. They are now referenced
from the single-source-of-truth files instead of being duplicated inline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-16 11:22:38 +00:00

2.9 KiB


Pre-Submit Self-Review — Java/Kotlin

Java/Kotlin-specific checks for the pre-submit review. Read ${CLAUDE_PLUGIN_ROOT}/references/shared/pre-submit-review.md first for the language-agnostic checklist.

JMH Benchmark Validity

For every KEEP, confirm the benchmark was validated per benchmark-validity.md:

Results consumed (anti-DCE): every @Benchmark method returns its result or passes it to Blackhole.consume()
Inputs dynamic (anti-constant-folding): all inputs from @State or @Param, not literals
Warmup meets floors: >=5 iterations for fast ops, >=3 for slow
Fork count: @Fork(2) minimum, @Fork(3) for GC-sensitive or marginal improvements
Error bars do NOT overlap between baseline and optimized

If any KEEP lacks valid JMH evidence, re-run the benchmark now.
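
The error-bar check above can be made mechanical. The following is a minimal sketch, assuming each run is summarized as JMH's "Score ± Error" pair in the same mode and unit. The class name `JmhInterval` and the sample numbers are hypothetical and not part of the plugin.

```java
/** One JMH result summarized as "Score ± Error". Hypothetical helper, not a JMH type. */
final class JmhInterval {
    final double score;
    final double error;

    JmhInterval(double score, double error) {
        if (error < 0) {
            throw new IllegalArgumentException("error must be non-negative");
        }
        this.score = score;
        this.error = error;
    }

    double low()  { return score - error; }
    double high() { return score + error; }

    /** True when the two intervals share no point, i.e. the error bars do not overlap. */
    static boolean separated(JmhInterval a, JmhInterval b) {
        return a.high() < b.low() || b.high() < a.low();
    }

    public static void main(String[] args) {
        // Made-up ns/op values, for illustration only.
        JmhInterval baseline  = new JmhInterval(120.0, 4.0); // [116, 124]
        JmhInterval optimized = new JmhInterval(100.0, 5.0); // [95, 105]
        JmhInterval marginal  = new JmhInterval(113.0, 5.0); // [108, 118]
        System.out.println(separated(baseline, optimized));  // no shared point
        System.out.println(separated(baseline, marginal));   // intervals intersect
    }
}
```

Separation alone does not say which run is faster; the direction still has to be read from the benchmark mode (lower is better for average time, higher for throughput).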

Mechanism Explanation

Every KEEP commit message must contain a one-paragraph explanation of WHY the optimization is faster at the JVM level — the specific mechanism (e.g., "eliminates autoboxing allocations" or "replaces O(n^2) nested loop with HashMap index"). If you wrote "improved performance" without explaining the mechanism, fix the commit message.

JDK Version Compatibility

Verify every optimization uses only APIs available on the project's minimum JDK version (from .codeflash/setup.md). Common traps:

API / Feature	Minimum JDK
`List.of()`, `Map.of()`, `Set.of()`	9
`String.isBlank()`, `String.strip()`, `String.repeat()`	11
`String.formatted()`	15
`Stream.toList()`	16
`record` types	16
`sealed` classes	17
Virtual threads (`Thread.ofVirtual()`)	21
`SequencedCollection`, `SequencedMap`	21

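An optimization that uses an API newer than the minimum JDK fails to compile against that JDK, or fails at runtime with NoSuchMethodError or NoClassDefFoundError if it was built on a newer one.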
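
When an optimization needs one of the newer APIs in the table, one option is a small fallback that uses only older APIs. This is a sketch that assumes a minimum JDK of 8. The class `Jdk8Compat` and its method names are hypothetical, and the fallbacks are not exact drop-in replacements (see the comments).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helpers for a project whose minimum JDK is 8. */
final class Jdk8Compat {
    private Jdk8Compat() {}

    /** Stand-in for List.of(...) (JDK 9). Unlike List.of, this does not reject null elements. */
    @SafeVarargs
    static <T> List<T> listOf(T... items) {
        // Copy first so later writes to the caller's array cannot leak into the list.
        return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(items)));
    }

    /** Stand-in for String.isBlank() (JDK 11): empty, or only whitespace code points. */
    static boolean isBlank(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isWhitespace(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    /** Stand-in for String.repeat(int) (JDK 11). */
    static String repeat(String s, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative: " + count);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(s);
        }
        return sb.toString();
    }
}
```

Compiling with `javac --release` set to the minimum version (available in JDK 9 and later) reports uses of newer APIs at build time, which is usually a cheaper check than reading the diff by hand.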

Correctness Verification

For every KEEP, confirm output equivalence was verified per correctness-verification.md:

Return value deep equality (.equals(), not ==)
Floating-point tolerance (relative epsilon)
Collection ordering semantics (ordered vs unordered comparison)
Mutable parameter state preservation
Exception contract preservation
Side effect preservation
Serialization compatibility (if the object is serialized)
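
Two of the checks in this list, floating-point tolerance and unordered collection comparison, are easy to get subtly wrong. The sketch below shows one way to write them. The class `Equivalence` is hypothetical, and the epsilon is a value the caller must choose; correctness-verification.md remains the authority on what to compare.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helpers for comparing baseline and optimized outputs. */
final class Equivalence {
    private Equivalence() {}

    /** Relative-epsilon comparison: |a - b| <= relEps * max(|a|, |b|). */
    static boolean closeTo(double a, double b, double relEps) {
        if (Double.compare(a, b) == 0) {
            return true; // identical values, including NaN vs NaN and equal infinities
        }
        if (!Double.isFinite(a) || !Double.isFinite(b)) {
            return false; // a lone NaN or infinity never matches a different value
        }
        return Math.abs(a - b) <= relEps * Math.max(Math.abs(a), Math.abs(b));
    }

    /** Unordered comparison that still respects duplicates (multiset equality). */
    static boolean sameElementsAnyOrder(Collection<?> a, Collection<?> b) {
        return counts(a).equals(counts(b));
    }

    private static Map<Object, Integer> counts(Collection<?> items) {
        Map<Object, Integer> counts = new HashMap<>();
        for (Object item : items) {
            counts.merge(item, 1, Integer::sum); // relies on the elements' equals()/hashCode()
        }
        return counts;
    }
}
```

For ordered collections, `List.equals` already compares element by element with `.equals()`; for arrays, `Arrays.equals` or `Arrays.deepEquals` is needed, because an array's own `.equals()` is reference equality.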

Milestone Sanity Check

If milestones were reached, confirm the cumulative JMH improvement was at least 70% of the sum of individual improvements. If not, at least one KEEP is a false positive — re-measure each individual KEEP and revert the non-contributing one(s).
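
The 70% rule is simple arithmetic. A minimal sketch follows, assuming every improvement is expressed in the same unit (for example, percent faster than the original baseline). The class `MilestoneCheck` and the sample figures are hypothetical.

```java
/** Hypothetical helper for the milestone sanity check. */
final class MilestoneCheck {
    /** Threshold taken from the rule above: cumulative >= 70% of the sum of individual gains. */
    static final double MIN_RATIO = 0.70;

    private MilestoneCheck() {}

    static boolean consistent(double cumulativeGain, double... individualGains) {
        double sum = 0.0;
        for (double gain : individualGains) {
            sum += gain;
        }
        return cumulativeGain >= MIN_RATIO * sum;
    }

    public static void main(String[] args) {
        // Made-up gains of 10%, 8% and 12%: the sum is 30, so the threshold is 21.
        System.out.println(consistent(24.0, 10.0, 8.0, 12.0)); // 24 is above the threshold
        System.out.println(consistent(15.0, 10.0, 8.0, 12.0)); // 15 is below: re-measure each KEEP
    }
}
```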

GC Impact Verification

If you claimed GC improvement, verify with JFR GC events (jdk.G1GarbageCollection, jdk.GCPhasePause) or -Xlog:gc*, not just CPU timing. Compare pause distributions (p50, p99, max), not just averages.
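
To see why averages are not enough, consider a sketch that computes percentiles over pause durations. It assumes the pauses were already extracted (from JFR events or GC logs) into an array of milliseconds. The class `PauseStats`, the nearest-rank method, and the sample data are assumptions made for illustration.

```java
import java.util.Arrays;

/** Hypothetical helper: nearest-rank percentiles over GC pause durations. */
final class PauseStats {
    private PauseStats() {}

    /** Smallest sample such that at least p percent of all samples are at or below it. */
    static double percentile(double[] pausesMs, double p) {
        if (pausesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]: " + p);
        }
        double[] sorted = pausesMs.clone(); // do not reorder the caller's array
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    static double mean(double[] pausesMs) {
        return Arrays.stream(pausesMs).average()
                .orElseThrow(() -> new IllegalArgumentException("no samples"));
    }

    public static void main(String[] args) {
        // Made-up pauses in ms: nine short ones and a single long outlier.
        double[] pauses = {2, 3, 3, 4, 5, 5, 6, 7, 8, 120};
        System.out.println("mean = " + mean(pauses));
        System.out.println("p50  = " + percentile(pauses, 50));
        System.out.println("p99  = " + percentile(pauses, 99));
        System.out.println("max  = " + percentile(pauses, 100));
    }
}
```

In this sample the single 120 ms pause dominates p99 and max while the median stays at 5 ms, which is the kind of difference a mean hides.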

Serialization Safety

If you changed collection types (e.g., ArrayList to EnumSet, HashMap to Map.of()), check whether the object is serialized anywhere (Java serialization, Jackson, protobuf). See correctness-verification.md section 7.
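
For plain Java serialization, a quick first check is a round trip through a byte array. The sketch below assumes the value implements `Serializable`; the class `SerializationCheck` is hypothetical. It only shows that the new type survives a same-JVM round trip. It does not prove compatibility with bytes written by the old type, and it says nothing about Jackson or protobuf, which need their own tests.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical helper: serialize a value to bytes and read it back. */
final class SerializationCheck {
    private SerializationCheck() {}

    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            // Wrapped so callers and tests do not need checked-exception handling.
            throw new IllegalStateException("value did not survive serialization", e);
        }
    }
}
```

A test would call `roundTrip` on the object that holds the changed collection and compare the result with `.equals()`.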
