Improve the language

Saurabh Misra 2025-01-21 19:48:35 -08:00
parent 39d68fa786
commit 8af36cfe48


@@ -3,60 +3,61 @@ sidebar_position: 4
---
# How Codeflash Works
Codeflash follows a "generate and verify" approach. It generates optimizations with LLMs and then verifies rigorously if those optimizations are indeed
faster and if they have the same behavior. The unit of optimization is a function, and codeflash tries to speed up the function, and tries to ensure that the optimizec function still behaves the same way. This way if you merge in the code, it should not break anything, just speed up the execution.
Codeflash follows a "generate and verify" approach to optimize code. It uses LLMs to generate optimizations, then rigorously verifies whether those optimizations are indeed
faster and whether they have the same behavior. The basic unit of optimization is a function: Codeflash tries to speed up the function while ensuring that it still behaves the same way. This way, if you merge the optimized code, it simply runs faster without breaking any functionality.
## Analysis of your code
## Analysis
Codeflash parses your codebase to discover all the functions that exist within it.
It also discovers any existing unit tests in your projects and determines what function they test.
Codeflash then runs the discovered tests to make sure they did not break.
Codeflash scans your codebase to identify all available functions. It locates existing unit tests in your projects and maps which functions they test. When optimizing a function, Codeflash runs these discovered tests to verify nothing has broken.
#### What kind of functions can Codeflash optimize?
Codeflash works well with functions that are self-contained and have few side effects (communicates with an external system, for example sending a network request). Codeflash currently optimizes a group of function - an entrypoint function and other functions that function directly calls.
Codeflash does not support optimizing async functions yet.
Codeflash works best with self-contained functions that have minimal side effects (like communicating with external systems or sending network requests). Codeflash optimizes a group of functions - consisting of an entry point function and any other functions it directly calls.
Currently, Codeflash cannot optimize async functions.
#### Test Discovery
Codeflash currently only runs tests that directly call the function to optimize in their test body. To discover all the tests that might indirectly call the function, you can use the Codeflash Tracer to trace the test suite which will discover all the tests that eventually call a function.
Codeflash currently only runs tests that directly call the target function in their test body. To discover tests that indirectly call the function, you can use the Codeflash Tracer. The Tracer analyzes your test suite and identifies all tests that eventually call a function.
## Optimization Generation
For a code to optimize, Codeflash gathers all the necessary context from the codebase and calls our backend that generates several candidate optimizations. The optimizations are only "candidates" because we are not sure if they are indeed faster or correct. These properties will be verified later.
To optimize code, Codeflash first gathers all necessary context from the codebase. It then calls our backend to generate several candidate optimizations. These are called "candidates" because their speed and correctness haven't been verified yet. Both properties will be verified in later steps.
## Verification of correctness
![Verification](/img/verification.svg)
The goal of correctness verification is to ensure that if the originally written code is replaced by the new code, there are no behavioral changes and the rest of the codebase and the system behaves exactly the same way. That means, it is safe to swap the original code with the new code.
The goal of correctness verification is to ensure that when the original code is replaced by the new code, there are no behavioral changes in the code and the rest of the system. This means the replacement should be completely safe.
Codeflash verifies correctness by calling the function with a large set of inputs and then verifying that the new function behaves exactly the same way as before with those inputs.
To verify correctness, Codeflash calls the function with numerous inputs, confirming that the new function behaves identically to the original.
Codeflash verifies the following behaviors are correct -
- function return values are exactly the same
Codeflash verifies these specific behaviors to be correct -
- function return values match exactly
- inputs to function have been mutated exactly the same way as before
- the same exception type is thrown as before
- exception types remain consistent
Codeflash also evaluates that there is sufficient line coverage of the code under optimization. This provides more confidence over testing.
Additionally, Codeflash checks for sufficient line coverage of the optimized code, increasing confidence in the testing process.
Codeflash also evaluates that there is sufficient line coverage of the code under optimization. This provides more confidence with testing.
We recommend manually reviewing the optimized code, since there might be important input cases that we haven't verified where the behavior could differ.
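The three checks listed above (return values, input mutation, and exception types) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not Codeflash's implementation: the helpers `observe` and `behaves_the_same` and the three example functions are hypothetical, and inputs are assumed to be deep-copyable and comparable with `==`.

```python
import copy


def observe(func, args):
    """Call func on a private copy of args and record what can be observed.

    Returns (return value, arguments after the call, exception type or None).
    """
    args_copy = copy.deepcopy(args)
    try:
        result = func(*args_copy)
        exc_type = None
    except Exception as exc:
        result = None
        exc_type = type(exc)
    return result, args_copy, exc_type


def behaves_the_same(original, candidate, inputs):
    """True if both functions agree on every recorded behavior for all inputs."""
    return all(observe(original, a) == observe(candidate, a) for a in inputs)


def original_smallest(items):
    # Slow in-place sort, then return the first element.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[j] < items[i]:
                items[i], items[j] = items[j], items[i]
    return items[0]


def candidate_ok(items):
    # Still sorts in place and still raises IndexError on an empty list.
    items.sort()
    return items[0]


def candidate_bad(items):
    # Same return value, but no longer mutates the input,
    # and raises ValueError instead of IndexError on an empty list.
    return min(items)


inputs = [([3, 1, 2],), ([],), ([5],)]
ok_result = behaves_the_same(original_smallest, candidate_ok, inputs)
bad_result = behaves_the_same(original_smallest, candidate_bad, inputs)
```

Here `candidate_bad` returns the same value as the original for non-empty lists, yet it is rejected because it leaves the input unsorted and changes the exception type. This is why return values alone are not enough.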
#### Test Generation
Codeflash currently generates 2 types of tests -
- LLM Generated tests - Codeflash generates several regression tests with LLMs which generate test cases that test for the typical cases that function might be called with, edge cases that the function might see and also some large scale inputs to check for their performance.
- Concolic coverage tests - Codeflash incorporates state of the art concolic testing that uses an SMT Solver (a kind of theorem prover) to explore execution paths and look for arguments to functions. The goal is to maximize the coverage of the function under optimization. This creates a test file which is run by Codeflash to verify correctness. We currently only support generating this test for pytest.
Codeflash generates two types of tests:
- LLM Generated tests - Codeflash uses LLMs to create several regression test cases that cover typical function usage, edge cases, and large-scale inputs to verify both correctness and performance.
- Concolic coverage tests - Codeflash uses state-of-the-art concolic testing with an SMT Solver (a theorem prover) to explore execution paths and generate function arguments. This aims to maximize code coverage for the function being optimized. Codeflash runs the resulting test file to verify correctness. Currently, this feature only supports pytest.
## Execution of code
## Code Execution
Codeflash executes the tests that test the function to optimize with either pytest or unittest test-framework. The test runs on your machine, to have access to the Python environment and any other dependencies associated to run the code properly. Moreover, running on your system helps codeflash to measure the performance correctly as the runtime depends on the system it runs on.
Codeflash runs tests for the target function using either the pytest or unittest framework. The tests execute on your machine, which gives them access to your Python environment and any other dependencies needed to run your code properly. Running on your machine also ensures accurate performance measurements since runtime varies by system.
#### Performance benchmarking
Codeflash carefully implements several techniques to measure the performance of the code correctly.
In particular, it runs the code in a loop several times to be able to figure out the best performance with minimum runtime. We compare the performance of the original code vs the optimization to see if the optimization is indeed at least 10% faster before we call it faster. This technique gets rid of most of the variability in runtime measurement, even on noisy CI systems and virtual machines.
The runtime number Codeflash finally reports is the minimum total time it took to run all the test cases.
## Creation of Pull Request
Once an optimization passes all the checks, Codeflash creates a Pull request using the Codeflash GitHub app directly on your repository.
The pull request shows the speedup percentage, an explanation of how the optimization works, the number of tests that were run, the test coverage and the test content themselves.
You can review the new code and merge it in, if it looks good. Feel free to make any modifications to the code if necessary, we won't hold it against you :)
Codeflash implements several techniques to measure code performance accurately. In particular, it runs multiple iterations of the code in a loop to determine the best performance with the minimum runtime. Codeflash compares performance of the original code against the optimization, requiring at least a 10% speed improvement before considering it faster. This approach eliminates most runtime measurement variability, even on noisy CI systems and virtual machines. The final runtime Codeflash reports is the minimum total time it took to run all the test cases.
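The idea of looping and keeping the minimum runtime can be sketched as follows. This is an illustrative sketch, not Codeflash's benchmarking code: the helpers `min_runtime_ns` and `is_faster` are hypothetical, the loop count is arbitrary, and "10% faster" is assumed here to mean a relative speedup of `(original - candidate) / candidate >= 0.10`. The sketch times a single function call, whereas the text above describes the total time across all test cases.

```python
import time


def min_runtime_ns(func, args, loops=50):
    """Run func(*args) `loops` times and return the smallest elapsed time in ns.

    The minimum is used because noise (other processes, scheduling) can only
    make a run slower, never faster.
    """
    best = None
    for _ in range(loops):
        start = time.perf_counter_ns()
        func(*args)
        elapsed = time.perf_counter_ns() - start
        best = elapsed if best is None else min(best, elapsed)
    return best


def is_faster(original_ns, candidate_ns, threshold=0.10):
    """True if the candidate's relative speedup meets the threshold (assumed definition)."""
    if candidate_ns <= 0:
        raise ValueError("candidate_ns must be positive")
    return (original_ns - candidate_ns) / candidate_ns >= threshold


def slow_sum(n):
    total = 0
    for i in range(n):
        total += i
    return total


def fast_sum(n):
    return n * (n - 1) // 2


original_ns = min_runtime_ns(slow_sum, (10_000,))
candidate_ns = min_runtime_ns(fast_sum, (10_000,))
```

With these measurements, `is_faster(original_ns, candidate_ns)` reports whether the candidate clears the assumed threshold on the machine where the code ran.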
## Creating Pull Requests
Once an optimization passes all checks, Codeflash creates a pull request through the Codeflash GitHub app directly in your repository. The pull request includes the new code, the speedup percentage, an explanation of the optimization, test statistics including coverage, and the test content itself. You can review and merge the new code if it meets your standards. Feel free to modify the code as needed; we welcome your improvements!