INFO: Evaluating: function=_slugify_model_id file=inference/core/cache/air_gapped.py
INFO:   repo=/workspace/inference test_dir=/tests/codeflash_eval output=/logs/verifier
INFO: Found 2 behavioral, 2 perf test files
INFO: Step 0: Computing coverage on original code...
INFO:   Coverage: 100.0%
INFO: Step 1: Behavioral tests on original code...
INFO:   Running behavior tests (iteration=0, 2 files)...
INFO:   Original: 1050 invocations, 1 loops
INFO: Step 2: Behavioral tests on candidate code...
INFO:   Running behavior tests (iteration=1, 2 files)...
INFO:   Candidate: 1050 invocations, 1 loops
INFO: Step 3: Comparing (return values + mutations + exceptions + stdout)...
INFO:   CORRECT: 1050/1050 passed
INFO: Step 4: Performance benchmarks using perf (2 files)...
INFO:   4a: Benchmarking original...
INFO:   Running performance tests (iteration=10, 2 files)...
INFO:   Original runtime: 1,561,398ns (1.56ms)
INFO:   4b: Benchmarking candidate...
INFO:   Running performance tests (iteration=11, 2 files)...
INFO:   Candidate runtime: 114,498ns (0.11ms)
INFO:   Overall speedup: 13.6369x
INFO:   Wrote reward.json: correct=1.0 speedup=13.6369 passed=1050/1050
