
# Scenario: Generated tests fail to compile after instrumentation

## Context

Test generation for a Python function that uses PyTorch produces tests that fail to compile. The function under test is:

```python
import torch

def normalize_tensor(t: torch.Tensor) -> torch.Tensor:
    return (t - t.mean()) / t.std()
```

The testgen request completes successfully: the LLM returns valid test code. After the instrumentation stage, however, the instrumented tests fail to compile with:

```
SyntaxError: invalid syntax
  File "test_normalize_tensor_instrumented.py", line 15
    torch.cuda.synchronize()import torch
                            ^^^^^^
```

The raw (pre-instrumentation) test code compiles fine. The issue only appears after instrumentation.
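
The symptom can be reproduced in isolation. The following is a minimal sketch, not the real instrumentation code: the strings `preamble`, `sync_stmt`, and `following` and the helper `compiles()` are hypothetical stand-ins, chosen only to show how joining two statements without a newline yields the same kind of `SyntaxError`.

```python
# Hypothetical stand-ins for pieces of an instrumented test file; none of
# these names come from the actual codebase.
preamble = "import torch\n"
sync_stmt = "torch.cuda.synchronize()"
following = "import torch\n"  # an import line that follows the injection point

# Joined without a separator: the line reads "torch.cuda.synchronize()import torch".
broken = preamble + sync_stmt + following
# Joined with a newline: each statement sits on its own line.
fixed = preamble + sync_stmt + "\n" + following


def compiles(source: str) -> bool:
    """Return True if the source compiles; the code is never executed."""
    try:
        compile(source, "<instrumented>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles(broken))
print(compiles(fixed))
```

Because `compile()` only parses the source, this check does not need PyTorch installed or a GPU available.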

Server logs:

```
INFO  2026-02-14 14:00:00 testgen: build_prompt completed, prompts non-empty
INFO  2026-02-14 14:00:02 llm: call_llm completed for model gpt-4o
INFO  2026-02-14 14:00:03 postprocessing: add_missing_imports completed
INFO  2026-02-14 14:00:03 instrumentation: detect_frameworks_from_code found ['torch']
INFO  2026-02-14 14:00:03 instrumentation: _create_device_sync_precompute_statements injecting cuda.synchronize()
ERROR 2026-02-14 14:00:03 instrumentation: instrumented tests failed to compile - SyntaxError
```

## Task

Diagnose why the instrumented tests fail to compile. Walk through the test generation pipeline, focusing on the instrumentation stage. Provide:

1. The exact stage where the failure occurs.
2. The specific function responsible for the syntax error.
3. An explanation of what `_create_device_sync_precompute_statements()` does and how it can introduce syntax errors.
4. The file to investigate and what to look for.
5. A recommendation for fixing the instrumentation bug.

## Expected Outputs

- Identification that the failure is at the **instrumentation** stage (Step 7 of the debug-test-generation workflow).
- The file responsible is `core/languages/python/testgen/instrumentation/instrument_new_tests.py`.
- Explanation that `detect_frameworks_from_code()` detected PyTorch, which triggered `_create_device_sync_precompute_statements()` to inject `torch.cuda.synchronize()` calls for GPU timing accuracy.
- The syntax error (`torch.cuda.synchronize()import torch`) indicates the sync statement was injected without a trailing newline, so it was concatenated onto the start of an existing `import torch` line.
- Recommendation: check the injection logic in `_create_device_sync_precompute_statements()` to ensure it adds proper newlines/indentation when inserting sync calls, and add a compilation check after instrumentation to catch such issues before returning.