Add workload profile explanations to latency benchmark table

The 1p/10p/16p column headers weren't self-explanatory. Added a
"Benchmark Workload Profiles" card above the latency table in the
Detail view explaining that each document tests a distinct workload
shape (table-dense, scanned, mixed), not just different page counts.

Also added an annotation below the table calling out that #1505 has 4x
the impact on the 1-page doc vs. the 16-page doc, letting the data
demonstrate that per-document cost depends on content, not page count.
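
For intuition on why a per-element optimization can matter more on the 1-page doc than on the 16-page doc, here is a minimal toy cost model. It is a sketch only: the `Doc` class, the per-page and per-table costs, the table counts, and the 2x speedup are all hypothetical placeholders, not values from the benchmark or the pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical workload description: page count and table count."""
    name: str
    pages: int
    tables: int


# Placeholder per-unit costs in milliseconds (assumptions, not measurements).
PER_PAGE_MS = 100.0   # page-level work, e.g. render + layout detection
PER_TABLE_MS = 400.0  # element-level work, e.g. table OCR + inference


def latency_ms(doc: Doc, table_speedup: float = 1.0) -> float:
    """Total latency when per-table work is divided by `table_speedup`."""
    return doc.pages * PER_PAGE_MS + doc.tables * PER_TABLE_MS / table_speedup


def relative_saving(doc: Doc, table_speedup: float) -> float:
    """Fraction of baseline latency removed by a per-table optimization."""
    base = latency_ms(doc)
    return (base - latency_ms(doc, table_speedup)) / base


# Table counts below are made up for illustration.
one_page = Doc("1p-tables", pages=1, tables=6)
mixed = Doc("16p-mixed", pages=16, tables=4)

for doc in (one_page, mixed):
    saving = relative_saving(doc, table_speedup=2.0)
    print(f"{doc.name}: {saving:.1%} of baseline latency removed")
```

In this toy model the table-dense single page sees the larger relative saving because almost all of its cost is per-table work, while the mixed document dilutes the same optimization with page-level work that it does not touch.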
Kevin Turcios 2026-04-16 02:27:00 -05:00
parent ddb4cf8258
commit eeebf6eec2

@@ -2334,6 +2334,57 @@ def build_detail_view():
"Latency Optimization Detail",
"Individual PR benchmarks (standalone vs main) and cumulative via FastAPI endpoint.",
),
# ── Workload Profiles ──
card(
[
html.H3(
"Benchmark Workload Profiles",
style={
"fontSize": "16px",
"fontWeight": "700",
"color": ACCENT,
"margin": "0 0 16px",
},
),
html.P(
"Page count is one dimension of workload, but content density "
"and element type are what actually drive compute cost. A 10-page "
"table-heavy PDF can be more expensive than a 100-page native text PDF. "
"These three documents were chosen to isolate different workload shapes, "
"not just different page counts.",
style={
"color": GRAY,
"fontSize": "14px",
"lineHeight": "1.6",
"margin": "0 0 16px",
},
),
html.Div(
[
_method_row(
"1p-tables",
"A single page dense with tables. Despite being 1 page, "
"this is the heaviest per-page workload — each table triggers "
"its own OCR + transformer inference pass. Isolates optimizations "
"that target per-element cost.",
),
_method_row(
"10p-scan",
"10-page scanned document, hi_res strategy. Every page goes through "
"the full pipeline: render → layout detection → OCR. Closest to the "
"real production workload on the FastAPI endpoint.",
),
_method_row(
"16p-mixed",
"16 pages of mixed content: native text, scans, and tables. Not every "
"page hits the heavy path — native text skips OCR entirely. Tests that "
"optimizations improve the heavy path without regressing the light one.",
),
]
),
],
marginBottom="24px",
),
dash_table.DataTable(
columns=[
{"name": "Optimization", "id": "opt"},
@@ -2367,7 +2418,11 @@ def build_detail_view():
html.P(
"Individual contributions overlap (they optimize adjacent stages of the same pipeline), "
"so they don't sum to the cumulative total. Cumulative measured through the real production path: "
"uvicorn -> FastAPI -> POST /general/v0/general with strategy=hi_res.",
"uvicorn -> FastAPI -> POST /general/v0/general with strategy=hi_res. "
"Note how #1505 has 4x the impact on the 1-page doc vs. the 16-page doc — "
"because that single page is table-dense and OCR-heavy. Conversely, #1503 scales "
"with page count because it optimizes a per-page operation (render format). "
"This is why per-document workload depends on content, not page count.",
style={
"color": LIGHT_GRAY,
"fontSize": "12px",