One major limitation of the LLM is that it can't run a profiler on the code, but we can. (This would be a fun thing to do in the future - feed the output of perf to the LLM and say 'now optimize'.)
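Something like this is only a few lines of glue. Here is a minimal sketch, assuming the `openai` Python client, a binary called ./my_program with source in my_program.c, and an illustrative model name and prompt (none of these are mandated by perf or the API):

    import subprocess
    from openai import OpenAI

    # perf stat prints its counter summary to stderr.
    result = subprocess.run(
        ["perf", "stat", "--", "./my_program"],
        capture_output=True, text=True,
    )
    perf_output = result.stderr

    source = open("my_program.c").read()  # hypothetical source file

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model name
        messages=[{
            "role": "user",
            "content": (
                f"Here is a program:\n{source}\n\n"
                f"Here is its `perf stat` output:\n{perf_output}\n\n"
                "Now optimize the hot paths and explain the changes."
            ),
        }],
    )
    print(response.choices[0].message.content)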
This has been a feature of the Scalene Python profiler (https://github.com/plasma-umass/scalene) for some time (at this point, about 1.5 years) - bring your own API key for OpenAI / Azure / Bedrock; it also works with Ollama. Optimizing Python code to use NumPy or other similar native libraries can easily yield improvements of multiple orders of magnitude in real-world settings. We tried it on several of Scalene's success stories from before the LLM integration (see https://github.com/plasma-umass/scalene/issues/58) and found that it often automatically yielded the same or better optimizations - see https://github.com/plasma-umass/scalene/issues/554. (Full disclosure: I am one of the principal designers of Scalene.)

I think that with the new agent capabilities of AI IDEs like Cursor and Cline, you can simply instruct the agent to run a profiler in the shell and make optimizations based on the profiler's output. You can even instruct the agent to make the program run faster than a certain time limit, and it will try to reach that goal iteratively.
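To make the "orders of magnitude" point concrete, this is the general shape of rewrite that moving from pure Python to NumPy gives you (an illustrative example I made up, not one drawn from the issues linked above; Scalene itself is run simply as `scalene your_program.py` after `pip install scalene`):

    import numpy as np

    def pairwise_product_python(a, b):
        # Pure-Python loop: one interpreted iteration (and one boxed float) per element.
        return [x * y for x, y in zip(a, b)]

    def pairwise_product_numpy(a, b):
        # Vectorized: the loop runs in compiled native code inside NumPy.
        return np.asarray(a) * np.asarray(b)

On arrays of a few million elements, the vectorized version is typically far faster; the hot pure-Python loops are exactly what shows up at the top of the profile.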
This same approach can work for eliminating errors in LLM-generated code. We've had good luck doing it with Lattice [1], going from a 5% to a 90% success rate at creating a fully functional application.

I'm drafting a blog post that talks about how, but for now the documentation will have to do.
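The core loop is simple to sketch. This is a generic run-and-repair loop, not how Lattice actually implements it; `generate_code` and `fix_code` are hypothetical stand-ins for whatever LLM calls you use:

    import subprocess, tempfile

    def build_until_it_runs(spec, generate_code, fix_code, max_attempts=5):
        code = generate_code(spec)
        for _ in range(max_attempts):
            with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
                f.write(code)
                path = f.name
            result = subprocess.run(["python", path], capture_output=True, text=True)
            if result.returncode == 0:
                return code  # the program ran cleanly
            # Feed the traceback back to the LLM and try again.
            code = fix_code(code, result.stderr)
        raise RuntimeError("still failing after max_attempts")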
Correction: it’s been a feature for at least two years. To our knowledge, Scalene was the first profiler to incorporate LLMs.
Many LLMs expose tools (function calling) in their APIs. Some time ago I was building an AI agent that had a profiler connected via function calling, and it worked exactly as you describe: first produce the code, then call the profiler, and as a last step optimize the code.
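For anyone who hasn't wired this up before, here is a minimal sketch of one round of such an agent loop, assuming the `openai` client, cProfile as the profiler, and an illustrative model name and tool schema (not any particular agent framework):

    import json, subprocess
    from openai import OpenAI

    def run_profiler(path: str) -> str:
        # Profile a script with cProfile and return the text report.
        result = subprocess.run(
            ["python", "-m", "cProfile", "-s", "cumulative", path],
            capture_output=True, text=True,
        )
        return result.stdout

    tools = [{
        "type": "function",
        "function": {
            "name": "run_profiler",
            "description": "Profile a Python script and return the cProfile report.",
            "parameters": {
                "type": "object",
                "properties": {"path": {"type": "string"}},
                "required": ["path"],
            },
        },
    }]

    client = OpenAI()
    messages = [{"role": "user",
                 "content": "Profile my_script.py and then optimize its hot spots."}]

    # Let the model decide to call the tool, execute it, and hand the
    # profiler report back for the optimization step.
    reply = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
    msg = reply.choices[0].message
    if msg.tool_calls:
        call = msg.tool_calls[0]
        args = json.loads(call.function.arguments)
        report = run_profiler(args["path"])
        messages += [msg, {"role": "tool", "tool_call_id": call.id, "content": report}]
        final = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
        print(final.choices[0].message.content)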