Replace all sync test runner calls (run_behavioral_tests,
run_benchmarking_tests, run_line_profile_tests) with their async
counterparts throughout the pipeline. This eliminates the
ThreadPoolExecutor in _baseline.py in favor of asyncio.gather(),
and makes _async_bench.py, _candidate_gen.py, and
_function_optimizer.py fully async. Adds async_run_line_profile_tests
and coverage support to async_run_behavioral_tests in _test_runner.py.