first version of how codeflash works
parent
9ffc304c3c
commit
39d68fa786
2 changed files with 58 additions and 0 deletions
|
|
@ -3,3 +3,60 @@ sidebar_position: 4
|
|||
---
|
||||
# How Codeflash Works
|
||||
|
||||
Codeflash follows a "generate and verify" approach: it generates optimizations with LLMs, then rigorously verifies that each one is both faster and behaves the same as the original. The unit of optimization is a function. Codeflash tries to speed up the function while ensuring that the optimized version still behaves the same way, so that merging the change should speed up execution without breaking anything.
|
||||
|
||||
|
||||
## Analysis
|
||||
Codeflash parses your codebase to discover all the functions that exist within it.
|
||||
It also discovers any existing unit tests in your project and determines which functions they test.
|
||||
Codeflash then runs the discovered tests to make sure they are not already broken.
|
||||
|
||||
#### What kind of functions can Codeflash optimize?
|
||||
Codeflash works well with functions that are self-contained and have few side effects (a side effect is any interaction with an external system, such as sending a network request). Codeflash currently optimizes a group of functions: an entrypoint function and the other functions that it directly calls.
|
||||
Codeflash does not support optimizing async functions yet.
|
||||
|
||||
#### Test Discovery
|
||||
Codeflash currently only runs tests that directly call the function to optimize in their test body. To also discover tests that call the function indirectly, use the Codeflash Tracer to trace your test suite; the trace discovers every test that eventually calls the function.
|
||||
|
||||
|
||||
## Optimization Generation
|
||||
|
||||
For each piece of code to optimize, Codeflash gathers the necessary context from the codebase and calls our backend, which generates several candidate optimizations. They are only "candidates" because at this point we do not know whether they are actually faster or correct. Both properties are verified later.
|
||||
|
||||
|
||||
## Verification of correctness
|
||||
|
||||

|
||||
|
||||
The goal of correctness verification is to ensure that replacing the original code with the new code causes no behavioral changes, so that the rest of the codebase and the system behave exactly as before. In other words, it must be safe to swap the original code for the new code.
|
||||
|
||||
Codeflash verifies correctness by calling the function with a large set of inputs and checking that, for each input, the new function behaves exactly the same way as the original.
|
||||
|
||||
Codeflash checks that the following behaviors are unchanged:
|
||||
- the function's return values are exactly the same
|
||||
- the inputs to the function are mutated in exactly the same way as before
|
||||
- the same exception type is raised as before
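The three checks above can be sketched in a few lines. This is a simplified illustration, not Codeflash's implementation: the helpers `observe` and `same_behavior` and the example functions are hypothetical, and plain `==` is assumed to be an adequate comparison for the values involved.

```python
import copy


def observe(func, args):
    """Call func on a private copy of args and record what a caller could observe."""
    args = copy.deepcopy(args)  # so each function mutates its own copy
    try:
        outcome = ("returned", func(*args))
    except Exception as exc:
        outcome = ("raised", type(exc))  # compare exception types, not messages
    return outcome, args  # args now reflects any in-place mutation


def same_behavior(original, candidate, inputs):
    """True if both functions return, mutate, and raise identically on every input."""
    return all(observe(original, a) == observe(candidate, a) for a in inputs)


def original(items):
    items[:] = sorted(items)  # sorts the caller's list in place
    return items[0]


def candidate_ok(items):
    items.sort()  # same mutation, same return value, same IndexError on []
    return items[0]


def candidate_bad(items):
    return min(items)  # same return value, but no mutation and a ValueError on []


INPUTS = [([3, 1, 2],), ([],)]
```

Here `same_behavior(original, candidate_ok, INPUTS)` holds, while `candidate_bad` is rejected: it returns the same value for `[3, 1, 2]` but leaves the input unsorted, and it raises a different exception type for an empty list.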
|
||||
|
||||
Codeflash also checks that the tests achieve sufficient line coverage of the code under optimization, which gives more confidence in the verification.
|
||||
|
||||
#### Test Generation
|
||||
Codeflash currently generates two types of tests:
|
||||
|
||||
- LLM-generated tests: Codeflash uses LLMs to generate several regression tests covering the typical cases the function might be called with, edge cases it might encounter, and some large-scale inputs that exercise its performance.
|
||||
- Concolic coverage tests: Codeflash uses state-of-the-art concolic testing, which employs an SMT solver (a kind of theorem prover) to explore execution paths and find function arguments that reach them. The goal is to maximize coverage of the function under optimization. This produces a test file that Codeflash runs to verify correctness. Generating these tests is currently supported only for pytest.
|
||||
|
||||
|
||||
## Execution of code
|
||||
|
||||
Codeflash runs the tests for the function to optimize with either the pytest or the unittest framework. The tests run on your machine so that they have access to your Python environment and any other dependencies the code needs to run properly. Running on your system also helps Codeflash measure performance accurately, since runtime depends on the system the code runs on.
|
||||
|
||||
#### Performance benchmarking
|
||||
Codeflash uses several techniques to measure the performance of the code accurately.
|
||||
In particular, it runs the code in a loop several times and takes the minimum runtime as the best performance. It then compares the original code against the optimization, and only calls the optimization faster if it is at least 10% faster. Taking the minimum removes most of the variability in runtime measurement, even on noisy CI systems and virtual machines.
|
||||
The runtime that Codeflash finally reports is the minimum total time taken to run all the test cases.
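The idea of taking the minimum over repeated loops can be sketched as follows. The function names, the loop count, and the exact speedup formula are assumptions made for illustration; in particular, "10% faster" is read here as `(original - optimized) / optimized >= 0.10`, which may differ from the definition Codeflash uses.

```python
import time


def best_total_runtime(func, test_inputs, loops=5):
    """Run every test input `loops` times; return the minimum total time in ns."""
    best = None
    for _ in range(loops):
        start = time.perf_counter_ns()
        for args in test_inputs:
            func(*args)
        elapsed = time.perf_counter_ns() - start
        # Noise (scheduling, caches, GC) only ever adds time, so the
        # minimum is the least contaminated measurement.
        best = elapsed if best is None else min(best, elapsed)
    return best


def is_faster(original_ns, optimized_ns, threshold=0.10):
    """True when the optimized runtime clears the assumed speedup threshold."""
    return (original_ns - optimized_ns) / optimized_ns >= threshold
```

Under this reading, an optimization measured at 100 ns against an original at 105 ns would not be reported as faster, while one measured at 100 ns against 110 ns would.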
|
||||
|
||||
## Creation of Pull Request
|
||||
Once an optimization passes all the checks, Codeflash creates a pull request directly on your repository using the Codeflash GitHub app.
|
||||
The pull request shows the speedup percentage, an explanation of how the optimization works, the number of tests that were run, the test coverage, and the content of the tests themselves.
|
||||
You can review the new code and merge it if it looks good. Feel free to modify the code if necessary; we won't hold it against you :)
|
||||
1
docs/static/img/verification.svg
vendored
Normal file
File diff suppressed because one or more lines are too long
|
After Width: | Height: | Size: 24 KiB |