update continuous page and remove pricing

This commit is contained in:
Sarthak Agarwal 2026-05-01 02:39:32 +05:30
parent 8fcd7b0f7b
commit c274133889
3 changed files with 212 additions and 137 deletions


@ -5,6 +5,35 @@
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Continuous Optimization · Codeflash</title>
<link rel="stylesheet" href="styles.css" />
<style>
/* ---- Continuous Optimization page-specific ---- */
/* Integration cards — 2-row grid so descriptions have room */
.integration-grid {
display: grid;
grid-template-columns: repeat(4, 1fr);
gap: 1px;
background: var(--border);
border: 1px solid var(--border);
border-radius: var(--radius);
overflow: hidden;
margin-top: 36px;
}
.int-card { background: var(--bg-card); padding: 24px; }
.int-card .int-name {
font-family: var(--mono); font-size: 12px; font-weight: 600;
color: var(--accent); letter-spacing: 0.06em; margin-bottom: 10px;
}
.int-card h4 { font-size: 15px; font-weight: 600; margin: 0 0 6px; letter-spacing: -0.01em; }
.int-card p { font-size: 13px; color: var(--fg-dim); margin: 0; line-height: 1.55; }
/* Pricing dots fix */
.price-sub-clean { font-size: 13px; color: var(--fg-mute); font-family: var(--mono); margin-bottom: 20px; }
@media (max-width: 960px) {
.integration-grid { grid-template-columns: repeat(2, 1fr); }
}
</style>
</head>
<body>
@ -38,76 +67,85 @@
<a href="team.html">Team</a>
</nav>
<div class="nav-cta">
<a class="btn btn-ghost" href="optimization-engagement.html">Optimization Engagement</a>
<a class="btn btn-primary" href="contact.html">Book a call</a>
</div>
</div>
</header>
<div class="wire-bar">Wireframe · Continuous Optimization</div>
<!-- Hero -->
<section class="hero" style="padding: 88px 0 56px;">
<div class="container">
<div class="section-label">/ Continuous Optimization</div>
<h1>New code ships fast.<br/>Now it ships optimal too.</h1>
<p class="hero-sub">Every pull request your team opens, Codeflash benchmarks the changed code, catches regressions before they merge, and posts a faster rewrite with the numbers to prove it. Your engineers see it in the same PR they're already reviewing. No new workflow.</p>
<div class="hero-ctas">
<a class="btn btn-primary btn-lg" href="contact.html">Talk to us</a>
<a class="btn btn-lg" href="optimization-engagement.html">See Optimization Engagement</a>
</div>
</div>
</section>
<!-- Brad Dwyer quote — anchored early, not buried -->
<section style="padding: 56px 0; border-bottom: 1px solid var(--border);">
<div class="container big-quote">
<blockquote>"After installing Codeflash in the GitHub Pull Request code review stage, it tries to optimize every new code we write. With that, I can be more confident that our engineers are shipping more optimized code every time."</blockquote>
<div class="attr"><strong>Brad Dwyer</strong> · Founder &amp; CTO, Roboflow</div>
</div>
</section>
<!-- Why it matters -->
<section>
<div class="container">
<div class="section-label">The problem it solves</div>
<h2>Performance wins decay. This stops that.</h2>
<p class="lead" style="max-width: 720px;">A one-time optimization engagement cuts your bill. But new code ships every week, and most of it has never been profiled. Within a year, the gains are gone. Continuous Optimization closes that loop — the same agent that found your bottlenecks now watches every PR going forward.</p>
<div class="capabilities" style="margin-top: 40px;">
<div class="capability">
<div class="cap-title">Regressions caught at the source</div>
<div class="cap-body">Spotted in the PR where they're introduced, not three sprints later in a postmortem.</div>
</div>
<div class="capability">
<div class="cap-title">No extra work for your team</div>
<div class="cap-body">The benchmark and rewrite show up as a comment in the PR your engineers are already reviewing.</div>
</div>
<div class="capability">
<div class="cap-title">AI code gets reviewed too</div>
<div class="cap-body">We've found 118 slow functions, some up to 446× slower, across just two AI-written PRs. The agent catches what code review can't.</div>
</div>
<div class="capability">
<div class="cap-title">Savings compound, not decay</div>
<div class="cap-body">Every optimized PR keeps the baseline lower. The bill bends down and stays there.</div>
</div>
</div>
</div>
</section>
<!-- The loop -->
<section>
<div class="container">
<div class="section-label">How it works</div>
<h2>What happens on every PR.</h2>
<div class="steps" style="margin-top: 40px;">
<div class="step">
<div class="step-num">01</div>
<h3>Detect</h3>
<p>The agent identifies which functions changed in the diff and selects representative inputs based on prior execution traces.</p>
</div>
<div class="step">
<div class="step-num">02</div>
<h3>Benchmark</h3>
<p>Runs the old and new version on isolated hardware. The result is only reported if it's statistically significant.</p>
</div>
<div class="step">
<div class="step-num">03</div>
<h3>Rewrite</h3>
<p>If a faster equivalent exists, the agent writes it and verifies correctness against your test suite before surfacing it.</p>
</div>
<div class="step">
<div class="step-num">04</div>
<h3>Comment</h3>
<p>Posts directly on the PR with before/after numbers and a one-click patch. Your engineers decide whether to apply it.</p>
</div>
</div>
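The significance gate in step 02 can be sketched in a few lines. This is an illustrative sketch only, not Codeflash's actual statistical test; the name `significant_speedup` and the noise heuristic are assumptions:

```python
import statistics

def significant_speedup(before, after, min_ratio=1.1):
    """Return the speedup ratio if `after` is credibly faster, else None.

    Illustrative noise gate (not the real test): the median gap must
    exceed the combined spread of both timing samples, and the ratio
    must clear a minimum threshold before anything is reported.
    """
    b = statistics.median(before)
    a = statistics.median(after)
    noise = statistics.pstdev(before) + statistics.pstdev(after)
    if b - a > noise and b / a >= min_ratio:
        return b / a
    return None

# A clean 2x win with tight samples passes; run-to-run jitter does not.
print(significant_speedup([10.0, 10.1, 9.9], [5.0, 5.1, 4.9]))    # 2.0
print(significant_speedup([10.0, 10.1, 9.9], [9.9, 10.0, 10.1]))  # None
```

The point of the gate is asymmetry: a missed borderline win costs little, while a false "faster" comment erodes trust in every future comment.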
</div>
@ -117,7 +155,7 @@
<section>
<div class="container">
<div class="section-label">What your team sees</div>
<h2>A PR comment that pays for itself.</h2>
<div class="agent-mock" style="margin-top: 24px; max-width: 820px;">
<div class="mock-head">
<span class="mock-dot"></span><span class="mock-dot"></span><span class="mock-dot"></span>
@ -140,70 +178,62 @@
</div>
</section>
<!-- Where it runs -->
<section>
<div class="container">
<div class="section-label">Integrations</div>
<h2>Works where your team already works.</h2>
<div class="integration-grid">
<div class="int-card">
<div class="int-name">GitHub</div>
<h4>On every PR, automatically</h4>
<p>Check runs and inline comments on every pull request. You can configure it to block merges on regressions, or keep it advisory.</p>
</div>
<div class="int-card">
<div class="int-name">Claude Code</div>
<h4>Inline while writing</h4>
<p>The plugin surfaces optimization suggestions as your engineers write code, before a PR is even opened.</p>
</div>
<div class="int-card">
<div class="int-name">Cursor</div>
<h4>Inside the editor</h4>
<p>Regressions and rewrites surface without leaving the editor. The feedback loop tightens to the moment of authorship.</p>
</div>
<div class="int-card">
<div class="int-name">Codex</div>
<h4>Same loop, native to Codex</h4>
<p>If your team uses Codex as their primary agent, the Codeflash plugin runs the same benchmarking and rewrite loop inside it.</p>
</div>
</div>
</div>
</section>
<!-- FAQ -->
<section>
<div class="container" style="max-width: 780px;">
<div class="section-label">FAQ</div>
<h2 style="font-size: 28px;">Common questions.</h2>
<div style="margin-top: 32px;">
<details open style="padding: 18px 0; border-bottom: 1px solid var(--border);">
<summary style="cursor: pointer; font-weight: 600; font-size: 16px;">Will this slow down our CI pipeline?</summary>
<p style="color: var(--fg-dim); margin-top: 10px;">No. Benchmarks run on our hardware, out-of-band. Your CI pipeline just reads the result from us. There's no compute overhead on your side.</p>
</details>
<details style="padding: 18px 0; border-bottom: 1px solid var(--border);">
<summary style="cursor: pointer; font-weight: 600; font-size: 16px;">Does it block merges?</summary>
<p style="color: var(--fg-dim); margin-top: 10px;">Only if you configure it to. The default is advisory: it posts the benchmark and patch as a comment, and your engineers decide whether to apply it. You can enable merge-blocking on regressions for critical paths.</p>
</details>
<details style="padding: 18px 0; border-bottom: 1px solid var(--border);">
<summary style="cursor: pointer; font-weight: 600; font-size: 16px;">What languages are supported?</summary>
<p style="color: var(--fg-dim); margin-top: 10px;">Python, Java, JavaScript, TypeScript, Go, and more.</p>
</details>
<details style="padding: 18px 0; border-bottom: 1px solid var(--border);">
<summary style="cursor: pointer; font-weight: 600; font-size: 16px;">Will you use our code to train models?</summary>
<p style="color: var(--fg-dim); margin-top: 10px;">On Pro, never. Not ours, not any third party's. On the free plan your code is public by definition, so we make no training restriction on public code.</p>
</details>
<details style="padding: 18px 0; border-bottom: 1px solid var(--border);">
<summary style="cursor: pointer; font-weight: 600; font-size: 16px;">How is this different from just asking Claude or Cursor to optimize the code?</summary>
<p style="color: var(--fg-dim); margin-top: 10px;">AI coding tools suggest changes based on what they can see in the file. Codeflash runs an actual benchmark before and after, verifies correctness against your test suite, and only surfaces a rewrite if the numbers prove it's faster. The difference is measurement versus intuition.</p>
</details>
</div>
</div>
@ -211,9 +241,12 @@
<div class="cta-band">
<div class="container">
<h2>Ready to keep your bill down for good?</h2>
<p>Continuous Optimization pairs with an Optimization Engagement. The engagement cuts the bill; this keeps it there.</p>
<div class="cta-band-ctas">
<a class="btn btn-primary btn-lg" href="contact.html">Talk to us</a>
<a class="btn btn-lg" href="optimization-engagement.html">See Optimization Engagement →</a>
</div>
</div>
</div>


@ -38,18 +38,18 @@
<a href="team.html">Team</a>
</nav>
<div class="nav-cta">
<a class="btn btn-ghost" href="optimization-engagement.html">Optimization Engagement</a>
<a class="btn btn-primary" href="contact.html">Book a diagnostic</a>
</div>
</div>
</header>
<div class="wire-bar">Wireframe · ML Performance</div>
<!-- Hero -->
<section class="hero" style="padding: 88px 0 56px;">
<div class="container">
<div class="section-label">/ ML Performance</div>
<h1>If you run ML in production,<br/>we've probably optimized your stack.</h1>
<p class="hero-sub">Inference, preprocessing, and memory dominate the ML bill. Codeflash has shipped merged wins across PyTorch, JAX, ONNX, vLLM, Diffusers, YOLO, RF-DETR, SAM3, PaddleOCR, and spaCy. From algorithmic rewrites to container-aware scheduling.</p>
<div class="hero-ctas">
<a class="btn btn-primary btn-lg" href="contact.html">Book a diagnostic</a>
<a class="btn btn-lg" href="case-studies/unstructured.html">See the Unstructured case →</a>
@ -57,20 +57,31 @@
</div>
</section>
<!-- Why ML costs are hard to fix — moved before proof to build empathy first -->
<section>
<div class="container">
<div class="section-label">Why it's hard</div>
<h2>ML costs hide in places profiling doesn't show you.</h2>
<p class="lead" style="max-width: 720px;">Most teams optimize the obvious things — model size, batch size, hardware tier. The real waste is usually elsewhere, and it compounds quietly until the bill forces a conversation.</p>
<div style="margin-top: 40px; max-width: 860px;">
<ul style="list-style: none; padding: 0; margin: 0;">
<li style="padding: 22px 0; border-bottom: 1px solid var(--border); color: var(--fg-dim);">
<strong style="color: var(--fg);">Your worker count is wrong and you don't know it.</strong>
Inside Kubernetes, <code>os.cpu_count()</code> returns the host CPU count, not the pod's. A single OCR service was spawning 4 ONNX workers on a 1-CPU pod — 4× the memory, zero extra throughput. We find this by default.
</li>
<li style="padding: 22px 0; border-bottom: 1px solid var(--border); color: var(--fg-dim);">
<strong style="color: var(--fg);">The model isn't the bottleneck. The preprocessing is.</strong>
In vision and OCR pipelines, the GPU waits on CPU. PIL, PNG decoding, or a redundant resize three layers deep — that's where the latency lives. Fixing the model achieves nothing until you fix the pipeline feeding it.
</li>
<li style="padding: 22px 0; border-bottom: 1px solid var(--border); color: var(--fg-dim);">
<strong style="color: var(--fg);">Memory creep turns into over-provisioning.</strong>
24 MB per request is invisible at low scale. At high scale it causes OOMs, which cause defensive over-provisioning, which causes a cloud bill no one can explain. We track RSS before and after on every PR.
</li>
<li style="padding: 22px 0; color: var(--fg-dim);">
<strong style="color: var(--fg);">Fix one bottleneck and the next one surfaces.</strong>
This is why one-time profiling sessions rarely deliver the full saving. The agent follows the stack down layer by layer until the bill actually bends.
</li>
</ul>
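The `os.cpu_count()` trap described above is easy to guard against. A minimal sketch, assuming Linux; the name `usable_cpu_count` is ours, and quota-based cgroup limits (`cpu.max`) would still need separate handling:

```python
import os

def usable_cpu_count() -> int:
    """Best-effort CPU count for sizing worker pools in a container.

    os.cpu_count() reports the host's CPUs, so a pod pinned to one
    core can still see 64 and spawn 64 workers. The scheduler
    affinity mask reflects cpuset limits; quota-based limits
    (cgroup cpu.max) are not covered by this sketch.
    """
    try:
        return len(os.sched_getaffinity(0))  # Linux only
    except AttributeError:  # macOS / Windows fallback
        return os.cpu_count() or 1

# Size ONNX sessions or worker pools from this, not from cpu_count().
workers = usable_cpu_count()
```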
</div>
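Tracking RSS before and after, as mentioned above, needs only the standard library. A rough Unix-only sketch; note that `ru_maxrss` is peak (not current) RSS, reported in kilobytes on Linux and bytes on macOS:

```python
import resource
import sys

def peak_rss_mb() -> float:
    """Peak resident set size of this process, in MB (approximate)."""
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is kilobytes on Linux but bytes on macOS.
    return rss / (1024 * 1024 if sys.platform == "darwin" else 1024)

before = peak_rss_mb()
blob = bytearray(50 * 1024 * 1024)  # hold ~50 MB so the pages stay resident
after = peak_rss_mb()
# Peak RSS only grows, so the delta bounds the allocation's footprint.
print(f"grew by {after - before:.0f} MB")
```

Per-request creep is the same measurement in a loop: if the delta keeps climbing after warm-up, something is retaining memory.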
</div>
</section>
@ -78,12 +89,12 @@
<!-- Merged results -->
<section>
<div class="container">
<div class="section-label">Proof</div>
<h2>Real projects. Merged PRs. Numbers you can verify.</h2>
<p class="lead">Everything below is upstream, reviewed by maintainers, and running in production.</p>
<div class="agent-pullquote" style="max-width: 820px; margin: 32px 0 40px;">
<p>"Codeflash made our core object detection flow 25% faster. On the same GPU machine, the object detection throughput went up from 80 fps to 100 fps with a corresponding drop in latency from 12.2ms to 9.8ms."</p>
<div class="attr"><strong>Brad Dwyer</strong> · Founder &amp; CTO, Roboflow</div>
</div>
<div class="results-grid">
<div class="result-card">
@ -107,13 +118,13 @@
<div class="result-card">
<div class="customer">Roboflow</div>
<div class="headline">25% faster object detection</div>
<div class="support">80fps → 100fps · 12.2ms → 9.8ms · YOLOv8 on GPU</div>
<div class="link">Read case study →</div>
</div>
<div class="result-card">
<div class="customer">pdfminer.six</div>
<div class="headline">3× speedup</div>
<div class="support">OSS library running in thousands of pipelines</div>
<div class="link">Read case study →</div>
</div>
<div class="result-card">
@ -126,23 +137,50 @@
</div>
</section>
<!-- Stack coverage — now positioned as credibility after proof -->
<section>
<div class="container">
<div class="section-label">Stack coverage</div>
<h2>Frameworks we've shipped wins on.</h2>
<p class="lead">If your stack is listed here, we've shipped merged, production-verified improvements on it.</p>
<div class="capabilities" style="grid-template-columns: repeat(4, 1fr); margin-top: 36px;">
<div class="capability">
<div class="cap-title">PyTorch</div>
<div class="cap-body">Graph rewrites, custom ops, and memory layout improvements that reduce GPU idle time.</div>
</div>
<div class="capability">
<div class="cap-title">JAX</div>
<div class="cap-body">JIT and pmap boundary fixes, trace stability issues that cause silent slowdowns at scale.</div>
</div>
<div class="capability">
<div class="cap-title">ONNX / ONNXRuntime</div>
<div class="cap-body">Worker pool sizing and CPU-aware execution. The most common source of wasted memory in inference services.</div>
</div>
<div class="capability">
<div class="cap-title">vLLM</div>
<div class="cap-body">Token decoding and scheduler hotspots. 13.7× improvement merged upstream in PR #20413.</div>
</div>
<div class="capability">
<div class="cap-title">HF Diffusers</div>
<div class="cap-body">Encoder and decoder speedups. 9× improvement on the WAN encoding path, merged in PR #11665.</div>
</div>
<div class="capability">
<div class="cap-title">YOLO family</div>
<div class="cap-body">YOLOv8 inference throughput on GPU. Roboflow saw 80fps go to 100fps on the same hardware.</div>
</div>
<div class="capability">
<div class="cap-title">RF-DETR / SAM3</div>
<div class="cap-body">End-to-end inference latency on modern vision architectures.</div>
</div>
<div class="capability">
<div class="cap-title">PaddleOCR · spaCy</div>
<div class="cap-body">Preprocessing pipelines and CPU-bound bottlenecks that sit upstream of the model itself.</div>
</div>
</div>
</div>
</section>
<!-- Crag Wolfe quote -->
<section>
<div class="container big-quote">
<blockquote>"A PR comes in, we measure RSS before and after on a running system, and the improvement is 2× or 3×. Real demonstrable progress, not theoretical."</blockquote>
@ -153,8 +191,11 @@
<div class="cta-band">
<div class="container">
<h2>Find the waste in your ML stack.</h2>
<p>20-minute diagnostic. We'll tell you where the cost is hiding — inference, preprocessing, memory, or all three — before you commit to anything.</p>
<div class="cta-band-ctas">
<a class="btn btn-primary btn-lg" href="contact.html">Book a diagnostic</a>
<a class="btn btn-lg" href="optimization-engagement.html">See how engagements work →</a>
</div>
</div>
</div>
@ -175,6 +216,7 @@
</ul></div>
<div class="footer-col"><h5>Proof</h5><ul>
<li><a href="case-studies.html">Case studies</a></li>
<li><a href="case-studies/unstructured.html">Unstructured</a></li>
<li><a href="security.html">Security</a></li>
</ul></div>
<div class="footer-col"><h5>Company</h5><ul>


@ -140,7 +140,7 @@
<div class="container">
<div class="kicker">/ Optimization Engagement</div>
<h1>Cut your infrastructure bill<br/><em>40–90%. Guaranteed.</em></h1>
<p class="sub">Inefficient code quietly drains gross margin, compresses runway, and makes scale more expensive than it needs to be. A Codeflash Optimization Engagement finds that waste across your production system and ships the fixes. Every change created by our agent is crafted &amp; reviewed by our senior performance engineers, proven correct, and delivered as a PR. <strong>ROI guaranteed before you commit.</strong></p>
<div class="hero-ctas">
<a class="btn btn-primary btn-lg" href="contact.html">Book a diagnostic call</a>
<a class="btn btn-lg" href="case-studies/unstructured.html">See the Unstructured result →</a>
@ -238,9 +238,9 @@
<div class="agent-copy">
<h2 style="margin-bottom: 20px;">An autonomous performance engineer that never sleeps.</h2>
<p style="font-size: 17px; color: var(--fg-dim); line-height: 1.6; margin-bottom: 24px;">A human perf engineer works file by file. <strong style="color: var(--fg);">codeflash-agent</strong> has your entire codebase in view at once. It finds six-step flows that can become three-step flows. It sees the O(N²) scan nested two abstractions deep. It catches the 24 MB memory creep hiding inside baseline noise.</p>
<p style="font-size: 17px; color: var(--fg-dim); line-height: 1.6; margin-bottom: 28px;">It runs 24/7, in parallel, in our sandbox. Your engineers don't drive it. They review the output. Your team's job goes from doing the optimization work to approving it.</p>
<div class="agent-pullquote">
<p>"Having Codeflash run the agent makes more sense than asking our engineers, who are focused on product and features, to develop and run that process themselves."</p>
<div class="attr"><strong>Crag Wolfe</strong> · Chief Architect, Unstructured</div>
</div>
</div>
@ -280,7 +280,7 @@
<div class="container">
<div class="section-label">Why Codeflash</div>
<h2>AI agents write new code faster. They don't fix what's already expensive.</h2>
<p class="lead">Claude, Cursor, and Codex operate at the point of authorship. The waste already in production, and the waste being added at speed by those same tools, requires a different approach.</p>
<table class="compare-table">
<thead>
<tr>
@ -292,7 +292,7 @@
<tbody>
<tr>
<td>Cloud bill impact</td>
<td class="col-bad">None. No visibility into production cost</td>
<td class="col-us"><span class="check"></span> 40–90% infra cost reduction, measured</td>
</tr>
<tr>
@ -308,7 +308,7 @@
<tr>
<td>Your team's time cost</td>
<td class="col-bad">Engineers must prompt, review, and validate</td>
<td class="col-us"><span class="check"></span> You review PRs. We do everything else</td>
</tr>
<tr>
<td>ROI before you pay</td>
@ -325,12 +325,12 @@
<div class="container">
<div class="section-label">Deliverables</div>
<h2>What lands on your side of the engagement.</h2>
<p class="lead">Here is exactly what you get at the end of an engagement.</p>
<div class="deliverables">
<div class="deliverable">
<div class="d-icon">01 · Baseline</div>
<h4>A measured starting point</h4>
<p>We reproduce your baseline on your actual workload before touching a line. No guesswork. The improvement is measured from a real number, not a projection.</p>
</div>
<div class="deliverable">
<div class="d-icon">02 · PRs</div>
@ -345,7 +345,7 @@
<div class="deliverable">
<div class="d-icon">04 · Report</div>
<h4>End-of-engagement summary</h4>
<p>Bottlenecks found, methodology, PRs shipped, before/after metrics, and a scope recommendation for the next phase if applicable. The report is yours to keep.</p>
</div>
<div class="deliverable">
<div class="d-icon">05 · Security</div>
@ -369,9 +369,9 @@
<div class="guarantee-block">
<div class="g-label">ROI guarantee</div>
<h3>If the savings don't clearly exceed what you'd pay, we tell you before you commit anything.</h3>
<p>Every engagement starts with a paid diagnostic. In three weeks, we profile your system, identify your top bottlenecks, and quantify the annual dollar opportunity. That diagnostic is the decision point, not a down payment on a foregone conclusion.</p>
<p><strong>If we don't identify at least 5× the diagnostic fee in annualized savings, the diagnostic is on us.</strong> We've never triggered that clause. But it's there because we mean it.</p>
<p style="font-size: 14px; color: var(--fg-mute); font-family: var(--mono);">Commercial structure (scoped fixed fee or shared-upside) is agreed before work begins.</p>
</div>
</div>
</section>
@ -381,7 +381,7 @@
<div class="container">
<div class="section-label">From the field</div>
<h2>What Crag Wolfe said after seven weeks.</h2>
<p class="lead">Chief Architect at Unstructured, after seven weeks.</p>
<div class="quote-trio">
<div class="quote-block">
<blockquote>"We knew we could do better, but we didn't have the bandwidth."</blockquote>
@@ -418,7 +418,7 @@
<div class="after-half muted">
<div class="ah-label">After engagement</div>
<h4>We keep it there.</h4>
<p>The same agent stays on, watching every new PR. Regressions are caught where they're introduced — before they compound. Integrates with GitHub, Claude Code, Cursor, and Codex.</p>
<p>The same agent stays on, watching every new PR. Regressions are caught where they're introduced, before they compound. Works with GitHub, Claude Code, Cursor, and Codex.</p>
<a class="btn" href="continuous-optimization.html">See Continuous Optimization →</a>
</div>
</div>
@@ -429,7 +429,7 @@
<section>
<div class="container" style="max-width: 800px;">
<div class="section-label">FAQ</div>
<h2 style="font-size: 30px;">Short answers to real questions.</h2>
<h2 style="font-size: 30px;">Common questions.</h2>
<div style="margin-top: 32px;">
<details class="faq-item" open>
@@ -439,12 +439,12 @@
<details class="faq-item">
<summary>What do you actually need from us?</summary>
<p>One or more repos, a benchmark (or a profile — if you don't have a benchmark, we can write one), a defined objective, and the availability to review PRs. We don't need production access. We don't need to be onboarded to your infrastructure. Your team's time commitment is reviewing PRs, not driving the work.</p>
<p>One or more repos, a benchmark or a profile (if you don't have a benchmark, we can write one), a defined objective, and time to review PRs. We don't need production access and we don't need to be onboarded to your infrastructure. Your team reviews PRs; we drive everything else.</p>
</details>
<details class="faq-item">
<summary>How is this different from running AI coding tools ourselves?</summary>
<p>AI coding tools optimize within a function, at the point of authorship. codeflash-agent profiles your running system, identifies the real bottlenecks, and rewrites across layers — then every change is benchmarked and correctness-verified before a Codeflash engineer reviews it. The output is a PR with measured proof, not a suggestion.</p>
<p>AI coding tools optimize within a function, at the point of authorship. codeflash-agent profiles your running system, identifies the real bottlenecks, and rewrites across layers. Then every change is benchmarked and correctness-verified before a Codeflash engineer reviews it. The output is a PR with measured proof, not a suggestion.</p>
</details>
<details class="faq-item">
@@ -464,12 +464,12 @@
<details class="faq-item">
<summary>Can you run on-prem or in our VPC?</summary>
<p>Yes. Deployment model — SaaS, your cloud, or on-prem — is agreed during scoping. Same agent, same process.</p>
<p>Yes. Deployment model (SaaS, your cloud, or on-prem) is agreed during scoping. Same agent, same process.</p>
</details>
<details class="faq-item">
<summary>What languages and stacks?</summary>
<p>Python, Java, JavaScript, TypeScript, Go, and more. ML stacks including PyTorch, JAX, vLLM, HF Diffusers, YOLO, and spaCy. If you're not sure, <a href="contact.html">tell us your stack</a> — we'll tell you honestly whether we've worked with it.</p>
<p>Python, Java, JavaScript, TypeScript, Go, and more. ML stacks including PyTorch, JAX, vLLM, HF Diffusers, YOLO, and spaCy. If you're not sure, <a href="contact.html">tell us your stack</a> and we'll give you an honest answer.</p>
</details>
</div>
@@ -480,7 +480,7 @@
<div class="cta-band">
<div class="container">
<h2>Start with a 20-minute diagnostic call.</h2>
<p>We'll profile your system, identify the top bottlenecks, and tell you what the savings opportunity looks like before you commit to anything.</p>
<div class="cta-band-ctas">
<a class="btn btn-primary btn-lg" href="contact.html">Book a call</a>
<a class="btn btn-lg" href="optimize.html">Start a Lightspeed assessment →</a>