codeflash-agent/codeflash-evals/templates/memory-misdirection/tests/test_analytics.py
Kevin Turcios 37efa524d7 feat: improve skill, eval system, and tessl config
- Optimize codeflash-optimize SKILL.md (review score 17% → 98%, eval 87% → 100%)
  - Fix frontmatter (allowed-tools format, argument-hint under metadata)
  - Lead description with concrete actions, explicit agent launch parameters
- Add multi-run variance detection to eval system (--runs N flag)
  - score.py aggregate command: min/max/avg/stddev per criterion, flaky detection
  - check-regression.sh defaults to 3 runs for reliable regression detection
- Add per-criterion regression tracking to baseline-scores.json (v3)
  - Reports exactly which criteria regressed, not just total score drops
- Rename evals/ → codeflash-evals/ to avoid tessl directory conflicts
- Switch tessl to managed mode, gitignore vendored tiles and symlinks
2026-03-27 11:30:17 -05:00


from analytics.core import generate_transactions, process_transactions


def test_basic():
    raw = generate_transactions(10)
    result = process_transactions(raw)

    assert result["total_processed"] == 10
    assert result["group_count"] > 0
    assert "TRANSACTION ANALYTICS REPORT" in result["report"]

    # Check analytics structure
    for key, stats in result["analytics"].items():
        assert stats["count"] > 0
        assert stats["revenue"] > 0
        assert stats["avg_order"] > 0
        assert stats["min_order"] <= stats["max_order"]


def test_large_batch():
    """Production-scale batch: process_transactions uses too much memory.

    With 50k transactions, peak memory is far higher than the input data size.
    The goal is to reduce memory overhead while preserving correctness.
    """
    raw = generate_transactions(50_000)
    result = process_transactions(raw)

    # Correctness checks
    assert result["total_processed"] == 50_000
    assert result["group_count"] == 50  # 5 regions × 10 categories
    assert "TRANSACTION ANALYTICS REPORT" in result["report"]

    # Verify analytics integrity
    total_count = sum(s["count"] for s in result["analytics"].values())
    assert total_count == 50_000
    total_revenue = sum(s["revenue"] for s in result["analytics"].values())
    assert total_revenue > 0

    for key, stats in result["analytics"].items():
        assert stats["count"] > 0
        assert stats["revenue"] > 0
        assert stats["avg_order"] > 0
        assert stats["min_order"] <= stats["max_order"]