mirror of
https://github.com/codeflash-ai/codeflash-internal.git
synced 2026-05-04 18:25:18 +00:00
Codeflash RL Environment — Batch Validation Report
Summary
| Metric | Count | % |
|---|---|---|
| Total tasks | 106 | 100% |
| Solve passes | 0 | 0% |
| Eval correct (all behavioral tests pass) | 106 | 100% |
| Faster than original (speedup > 1.0) | 105 | 99% |
| All test cases pass | 106 | 100% |
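
The percentage column can be reproduced from the counts. The sketch below assumes the report rounds to the nearest whole percent, which is consistent with 105 of 106 showing as 99% but is not stated anywhere in the report. The helper name `percent_of_total` is hypothetical.

```python
TOTAL_TASKS = 106  # "Total tasks" row of the summary table


def percent_of_total(count: int, total: int) -> str:
    """Format count/total as a whole-number percentage string.

    Rounding to the nearest integer is an assumption; the report
    does not say how its percentages were derived.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return f"{round(100 * count / total)}%"


# Counts copied from the summary table.
summary_counts = {
    "Solve passes": 0,
    "Eval correct (all behavioral tests pass)": 106,
    "Faster than original (speedup > 1.0)": 105,
    "All test cases pass": 106,
}

for metric, count in summary_counts.items():
    print(f"| {metric} | {count} | {percent_of_total(count, TOTAL_TASKS)} |")
```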
Speedup Distribution (correct tasks only)
- Slower (< 1x): 1 task
- 1-1.5x: 52 tasks
- 1.5-2x: 16 tasks
- 2-5x: 14 tasks
- 5-100x: 17 tasks
- >100x: 6 tasks
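
The buckets above can be recomputed from a list of speedup values. This is a minimal sketch, assuming each bucket is a half-open interval `[low, high)`; the report does not say which bucket a boundary value such as exactly 1.5x falls into. The names `BUCKETS`, `bucket_for`, and `speedup_histogram` are hypothetical.

```python
from collections import Counter

# Bucket labels and edges follow the distribution list. Treating each
# bucket as [low, high) is an assumption, so a value of exactly 100.0
# would land in the ">100x" bucket here.
BUCKETS = [
    ("Slower (< 1x)", 0.0, 1.0),
    ("1-1.5x", 1.0, 1.5),
    ("1.5-2x", 1.5, 2.0),
    ("2-5x", 2.0, 5.0),
    ("5-100x", 5.0, 100.0),
    (">100x", 100.0, float("inf")),
]


def bucket_for(speedup: float) -> str:
    """Return the label of the bucket that contains this speedup."""
    for label, low, high in BUCKETS:
        if low <= speedup < high:
            return label
    raise ValueError(f"speedup must be non-negative, got {speedup}")


def speedup_histogram(speedups):
    """Count speedups per bucket, keeping empty buckets at zero."""
    counts = Counter(bucket_for(s) for s in speedups)
    return {label: counts.get(label, 0) for label, _, _ in BUCKETS}


# A few values taken from the task table, one per bucket.
sample = [0.7728, 1.4868, 1.5618, 2.0115, 5.0724, 62109.3293]
print(speedup_histogram(sample))
```

Running `speedup_histogram` over the full Speedup column should give the six counts listed above, provided the boundary assumption matches how the report was generated.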
Successful Tasks (correct=1.0)
| Task | Function | Speedup | Tests | Coverage | Quality | DB Speedup |
|---|---|---|---|---|---|---|
| models-prepare_multi_label_classification_response | prepare_multi_label_classification_response | 62109.3293x | 32/32 | 7.7% | low | 68465.46x |
| introspection-prepare_operators_descriptions | prepare_operators_descriptions | 15264.3773x | 1057/1057 | 35.0% | | 14150.75x |
| decorators-withfixedsizecache-memory_pressure_detected | memory_pressure_detected | 1789.6793x | 132/132 | 37.8% | | 917.09x |
| depth_anything_v3-inferencemodelsdepthanythingv3adapter-predict | predict | 388.9574x | 36/36 | 23.2% | | 391.52x |
| detection_event_log-detectioneventlogblockv1-_evict_oldest_video | _evict_oldest_video | 337.1908x | 170/170 | 46.4% | low | 15.92x |
| camera-_generate_grid_colors | _generate_grid_colors | 283.7794x | 1901/1901 | 9.0% | | 218.30x |
| workflow_caller-_check_workflow_for_circular_references | _check_workflow_for_circular_references | 34.9018x | 41/41 | 31.1% | low | 11.84x |
| semantic_segmentation-blockmanifest-describe_outputs | describe_outputs | 22.3384x | 2539/2539 | 38.9% | high | 21.92x |
| dynamic_blocks-build_traceback_string | build_traceback_string | 20.1609x | 2047/2047 | 16.0% | low | 13.38x |
| bytetrack-bytetrackmanifest-describe_outputs | describe_outputs | 18.8987x | 3033/3033 | 90.2% | low | 17.56x |
| workflow_caller-_describe_outputs_from_spec | _describe_outputs_from_spec | 16.9712x | 25/25 | 23.1% | low | 12.94x |
| event_writer-_extract_detail | _extract_detail | 15.8018x | 43/43 | 31.9% | | 13.46x |
| managers-try_releasing_cuda_memory | try_releasing_cuda_memory | 15.0190x | 1006/1006 | 10.8% | | 1.22x |
| s3-deduct_csv_header | deduct_csv_header | 13.9646x | 54/54 | 38.6% | | 8.90x |
| cache-_slugify_model_id | _slugify_model_id | 13.1907x | 1050/1050 | 100.0% | | 11.21x |
| sort-sortmanifest-describe_outputs | describe_outputs | 12.9542x | 8036/8036 | 89.7% | low | 17.16x |
| dynamic_blocks-create_dynamic_module | create_dynamic_module | 10.7336x | 142/142 | 27.4% | | 12.31x |
| dataset_upload-roboflowdatasetuploadblockv2-run | run | 9.8378x | 13/13 | 57.1% | | 9.14x |
| glm_ocr-blockmanifest-describe_outputs | describe_outputs | 8.2526x | 1035/1035 | 51.9% | medium | 9.33x |
| qwen3_5vl-blockmanifest-describe_outputs | describe_outputs | 7.6162x | 3228/3228 | 48.6% | medium | 7.30x |
| http-with_route_exceptions | with_route_exceptions | 6.3600x | 1297/1297 | 8.1% | low | 6.89x |
| introspection-prepare_operations_descriptions | prepare_operations_descriptions | 6.3557x | 147/147 | 82.5% | high | 6.26x |
| qwen3_5vl-qwen35vlblockv1-run_remotely | run_remotely | 5.0724x | 23/23 | 69.4% | | 5.64x |
| core_steps-load_kinds | load_kinds | 4.7680x | 1153/1153 | 42.0% | | 3.68x |
| depth_anything_v2-inferencemodelsdepthanythingv2adapter-predict | predict | 4.0435x | 38/38 | 60.2% | medium | 5.03x |
| qwen3_5vl-inferencemodelsqwen35vladapter-predict | predict | 3.5654x | 2275/2275 | 70.3% | low | 3.72x |
| core-_prepare_workflow_response_cache_key | _prepare_workflow_response_cache_key | 2.9861x | 7539/7539 | 2.7% | medium | 2.39x |
| event_writer-_detections_to_v2_instance_segmentations | _detections_to_v2_instance_segmentations | 2.7234x | 36/36 | 41.2% | high | 2.18x |
| managers-modelmanager-_dispose_model_lock | _dispose_model_lock | 2.7203x | 2784/2784 | 14.7% | | 3.24x |
| compiler-establish_step_execution_dimensionality | establish_step_execution_dimensionality | 2.6967x | 47/47 | 23.2% | | 2.37x |
| semantic_segmentation-roboflowsemanticsegmentationmodelblockv1-_convert_to_sv_de | _convert_to_sv_detections | 2.6344x | 13/13 | 71.7% | | 2.22x |
| qwen3vl-inferencemodelsqwen3vladapter-map_inference_kwargs | map_inference_kwargs | 2.6323x | 1125/1125 | 26.8% | medium | 2.39x |
| models-baseinference-infer | infer | 2.2840x | 1037/1037 | 2.8% | low | 2.32x |
| clip_comparison-blockmanifest-get_required_cache_artifacts | get_required_cache_artifacts | 2.2470x | 130/130 | 26.6% | | 2.04x |
| text_display-clamp_box | clamp_box | 2.1986x | 1210/1210 | 15.0% | high | 2.80x |
| compiler-verify_compatibility_of_input_data_lineage_with_control_flow_lineage | verify_compatibility_of_input_data_lineage_with_control_flow_lineage | 2.0117x | 39/39 | 26.4% | low | 2.11x |
| introspection-_get_property_name_options | _get_property_name_options | 2.0115x | 1053/1053 | 57.5% | | 1.52x |
| execution_data_manager-executiondatamanager-_register_control_flow_output_for_no | _register_control_flow_output_for_non_simd_step | 1.9572x | 32/32 | 20.2% | high | 2.65x |
| compiler-_collect_unique_control_flow_lineages_with_step_mapping | _collect_unique_control_flow_lineages_with_step_mapping | 1.9380x | 33/33 | 24.3% | | 1.95x |
| mask_area_measurement-maskareameasurementblockv1-run | run | 1.9282x | 39/39 | 93.0% | | 1.65x |
| entities-workflowimagedata-copy_and_replace | copy_and_replace | 1.9097x | 2336/2336 | 72.1% | medium | 2.04x |
| compiler-separate_control_flow_predecessors_from_data_providers | separate_control_flow_predecessors_from_data_providers | 1.8826x | 34/34 | 23.1% | high | 1.87x |
| core-_forcetracerootsampler-get_description | get_description | 1.8642x | 3244/3244 | 1.6% | medium | 2.03x |
| enterprise_blocks-load_enterprise_blocks | load_enterprise_blocks | 1.8464x | 1936/1936 | 32.2% | low | 1.45x |
| event_writer-_build_event_data | _build_event_data | 1.8219x | 4732/4732 | 34.7% | medium | 1.74x |
| cache-get_cached_foundation_models | get_cached_foundation_models | 1.7699x | 32/32 | 34.7% | low | 1.46x |
| compiler-step_definition_allows_control_flow_references | step_definition_allows_control_flow_references | 1.7359x | 27/27 | 22.5% | medium | 1.86x |
| dataset_upload-maybe_register_datapoint_at_roboflow | maybe_register_datapoint_at_roboflow | 1.7163x | 1039/1039 | 55.6% | | 1.47x |
| introspection-retrieve_selectors_from_union_definition | retrieve_selectors_from_union_definition | 1.6884x | 36/36 | 22.2% | medium | 1.98x |
| introspection-_ref_to_def_name | _ref_to_def_name | 1.6153x | 1344/1344 | 27.5% | medium | 1.51x |
| mask_area_measurement-compute_detection_areas | compute_detection_areas | 1.6137x | 24/24 | 83.0% | | 1.46x |
| managers-list_files | list_files | 1.5955x | 99/99 | 8.9% | | 1.66x |
| dynamic_blocks-assembly_custom_python_block | assembly_custom_python_block | 1.5618x | 135/135 | 36.7% | low | 1.61x |
| compiler-is_control_flow_step | is_control_flow_step | 1.4868x | 1830/1830 | 15.3% | medium | 1.34x |
| qwen3_5vl-inferencemodelsqwen35vladapter-map_inference_kwargs | map_inference_kwargs | 1.4798x | 1549/1549 | 64.9% | low | 1.53x |
| core-_url_for_safe_logging | _url_for_safe_logging | 1.4670x | 1055/1055 | 2.8% | | 1.47x |
| usage_tracking-usagecollector-_compute_execution_duration | _compute_execution_duration | 1.4621x | 2017/2017 | 27.5% | | 1.55x |
| execution_data_manager-construct_mask_for_all_inputs_dimensionalities | construct_mask_for_all_inputs_dimensionalities | 1.4471x | 31/31 | 19.0% | | 1.51x |
| execution_data_manager-construct_simd_step_input | construct_simd_step_input | 1.4210x | 26/26 | 28.3% | low | 1.37x |
| qwen3_5vl-qwen35vlblockv1-run | run | 1.4114x | 28/28 | 93.1% | low | 1.69x |
| common-add_inference_keypoints_to_sv_detections | add_inference_keypoints_to_sv_detections | 1.4070x | 30/30 | 4.1% | | 1.56x |
| core-get_workflow_specification | get_workflow_specification | 1.4033x | 1157/1157 | 3.6% | low | 1.56x |
| common-deserialize_image_kind | deserialize_image_kind | 1.3860x | 1506/1506 | 7.4% | | 1.42x |
| cache-is_block_cached | is_block_cached | 1.3696x | 53/53 | 27.9% | high | 1.36x |
| email_notification-format_email_message | format_email_message | 1.3680x | 56/56 | 31.7% | high | 1.35x |
| managers-modelmanager-infer_from_request_sync | infer_from_request_sync | 1.3592x | 3041/3041 | 13.7% | low | 1.46x |
| sequences-sequence_apply | sequence_apply | 1.3493x | 58/58 | 30.2% | high | 1.48x |
| cache-get_task_type_to_block_mapping | get_task_type_to_block_mapping | 1.3274x | 30/30 | 29.6% | low | 1.39x |
| execution_data_manager-filter_to_valid_prefix_chains | filter_to_valid_prefix_chains | 1.3229x | 32/32 | 15.3% | high | 1.32x |
| entities-batch-remove_by_indices | remove_by_indices | 1.3209x | 44/44 | 65.4% | high | 1.26x |
| dataset_upload-is_prediction_registration_forbidden | is_prediction_registration_forbidden | 1.3204x | 2043/2043 | 31.7% | | 1.44x |
| webrtc_worker-videoframeprocessor-_check_termination | _check_termination | 1.3132x | 2029/2029 | 16.1% | | 1.36x |
| cache-_is_model_cached | _is_model_cached | 1.3058x | 45/45 | 27.0% | | 1.24x |
| core-load_cached_workflow_response | load_cached_workflow_response | 1.2964x | 12126/12126 | 2.8% | low | 1.38x |
| workflow_caller-_extract_workflow_caller_ids_from_spec | _extract_workflow_caller_ids_from_spec | 1.2938x | 44/44 | 25.8% | medium | 1.34x |
| workflow_caller-_fetch_workflow_spec_for_validation | _fetch_workflow_spec_for_validation | 1.2808x | 1547/1547 | 23.1% | low | 1.33x |
| cache-is_model_cached | is_model_cached | 1.2649x | 55/55 | 28.7% | medium | 1.22x |
| execution_data_manager-intersect_masks_per_dimension | intersect_masks_per_dimension | 1.2627x | 40/40 | 13.5% | | 1.66x |
| dataset_upload-register_datapoint_at_roboflow | register_datapoint_at_roboflow | 1.2611x | 2037/2037 | 38.6% | medium | 1.32x |
| webrtc_worker-videoframeprocessor-serialize_outputs_sync | serialize_outputs_sync | 1.2545x | 48/48 | 17.9% | | 1.37x |
| anthropic_claude-blockmanifest-get_air_gapped_availability | get_air_gapped_availability | 1.2463x | 2243/2243 | 16.4% | low | 1.45x |
| executor-_run_workflow | _run_workflow | 1.2057x | 130/130 | 21.6% | low | 1.22x |
| http-_build_step_execution_error_response | _build_step_execution_error_response | 1.2001x | 1029/1029 | 1.0% | low | 1.19x |
| detection_event_log-detectioneventlogblockv1-_get_relative_time | _get_relative_time | 1.1992x | 41/41 | 43.0% | | 1.19x |
| dataset_upload-roboflowdatasetuploadblockv1-run | run | 1.1971x | 41/41 | 38.6% | medium | 1.26x |
| managers-rank_for_deletion | rank_for_deletion | 1.1867x | 106/106 | 7.3% | | 1.88x |
| models-inferencemodelsobjectdetectionadapter-postprocess | postprocess | 1.1644x | 33/33 | 8.7% | | 1.23x |
| compiler-get_lineage_derived_from_control_flow | get_lineage_derived_from_control_flow | 1.1586x | 33/33 | 23.8% | low | 1.25x |
| text_display-draw_background_with_alpha | draw_background_with_alpha | 1.1567x | 176/176 | 29.5% | | 1.18x |
| core-record_inference | record_inference | 1.1509x | 3033/3033 | 1.6% | low | 1.22x |
| execution_data_manager-get_masks_intersection_for_dimensions | get_masks_intersection_for_dimensions | 1.1479x | 36/36 | 16.9% | low | 1.23x |
| email_notification-apply_operations_to_message_parameters | apply_operations_to_message_parameters | 1.1431x | 44/44 | 29.5% | low | 1.15x |
| easy_ocr-blockmanifest-get_supported_model_variants | get_supported_model_variants | 1.1429x | 2039/2039 | 57.5% | medium | 1.31x |
| mask_area_measurement-get_detection_area | get_detection_area | 1.1349x | 129/129 | 83.7% | medium | 1.19x |
| dataset_upload-register_datapoint | register_datapoint | 1.1344x | 1138/1138 | 42.5% | low | 1.16x |
| event_writer-_build_image_entry | _build_image_entry | 1.1270x | 1337/1337 | 60.6% | low | 1.10x |
| moondream2-inferencemodelsmoondream2adapter-caption | caption | 1.1264x | 185/185 | 45.1% | medium | 1.11x |
| yolo_world-blockmanifest-get_supported_model_variants | get_supported_model_variants | 1.1187x | 2232/2232 | 50.0% | medium | 1.29x |
| trackers-instancecache-record_instance | record_instance | 1.1115x | 14857/14857 | 17.3% | | 1.14x |
| workflow_caller-workflowcallerblockv1-run | run | 1.1059x | 59/59 | 48.9% | | 1.13x |
| webrtc_worker-default_encoder | default_encoder | 1.0985x | 4071/4071 | 17.0% | | 1.12x |
| cache-_get_block_type_identifier | _get_block_type_identifier | 1.0949x | 34/34 | 26.5% | low | 1.11x |
| workflow_caller-_convert_output_descriptions_to_kinds | _convert_output_descriptions_to_kinds | 1.0865x | 37/37 | 24.6% | medium | 1.19x |
| common-serialise_sv_detections | serialise_sv_detections | 1.0764x | 149/149 | 5.1% | | 1.19x |
| openai-execute_gpt_4v_request | execute_gpt_4v_request | 1.0642x | 37/37 | 25.8% | high | 2.00x |
| notification-blockmanifest-get_air_gapped_availability | get_air_gapped_availability | 0.7728x | 1535/1535 | 43.0% | low | 1.14x |
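
The Speedup and DB Speedup columns mostly agree, but a few tasks differ by a wide margin. The report does not define how either column is measured, so the sketch below only shows one way to read the cell formats (`337.1908x`, `170/170`) and compute how far apart the two figures are. The helper names and the idea of a disagreement ratio are assumptions, not part of the Codeflash tooling.

```python
def parse_speedup(cell: str) -> float:
    """Turn a cell such as '337.1908x' into the float 337.1908."""
    return float(cell.strip().rstrip("x"))


def parse_tests(cell: str) -> tuple[int, int]:
    """Turn a cell such as '170/170' into (passed, total)."""
    passed, total = cell.strip().split("/")
    return int(passed), int(total)


def disagreement(speedup_cell: str, db_speedup_cell: str) -> float:
    """Ratio of the larger speedup figure to the smaller one (>= 1.0).

    A value near 1.0 means the two columns agree; a large value means
    the task may deserve a second look. No threshold is implied here.
    """
    a = parse_speedup(speedup_cell)
    b = parse_speedup(db_speedup_cell)
    return max(a, b) / min(a, b)


# Cells copied from two rows of the task table.
print(parse_tests("170/170"))
print(disagreement("337.1908x", "15.92x"))  # _evict_oldest_video
print(disagreement("1.2001x", "1.19x"))     # _build_step_execution_error_response
```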