One question I keep asking myself when I see impressive AI examples like this (and it is impressive) is "are we solving the right problems?" Instead of training an AI to produce a tonne of JS to recreate something that's been done before, shouldn't we be engineering better tools that let people precisely craft software without needing AI?
I worry that because we can now instantly produce a bunch of JS to do X, we will be incentivized not to change the underlying tools (because, one, only AIs are using them, and two, AIs won't know how to use the new thing).
I worry this will stall progress.
Yes and no. It's very impressive in that it would take a human with no coding knowledge a very long time to learn how to write it. It's not impressive if you know the anatomy. But it only holds up until you ask the "programmer" to make those stars in the background stop running around wildly, or to make it scroll, or something. Then what? Well, geez, I have no idea how this works, so I'd better ask the AI to do that for me too?
It's not really that different from taking your 2022 car to a shop to have your camless engine adjusted, assuming everything's fine, but not having a clue what they did to it or how to fix it if the engine explodes the next day. You can't even prove it had something to do with what the shop did, or whether they actually did anything at all. They probably don't even know what they did.
It won't stall progress for clever people who actually want to figure things out and know what they're doing. But it will certainly produce lots more garbage.
That said, the game is impressive for something stitched together by an LLM.
AI is the bulldozer for building ever bigger piles of rocks, like the pyramids. Impressive in their way, but unsustainable.
You still need paradigm shifts in architecture to deliver scale and quality from less material, and AI has not made a dent there yet.
An evil timeline is one where future frameworks are only grokkable by AI.
The standard for new frameworks won't be "does this make humans more productive using new concepts?" It will be "can I get an LLM to generate code that uses this framework?"
The world you describe is pretty much how I already feel about law.
Valid point about whether AI generating piles of JavaScript might lock us into outdated tools. Isn't what you're describing, though, a form of advanced prompting? The ideal tool would act as a layer that takes a prompt and builds exactly what's needed to deliver the outcome, without humans or AI wrestling with traditional code. This is similar to what OpenAI initially aimed for with function calling, where the AI writes and executes code in a sandbox to fulfill a prompt, but it was clunky, so they shifted to structured tool calling. The reason we still need program code like JavaScript, Rust, assembly, ... is that current computing architectures are built for it. That's changing, though, with initiatives from NVIDIA and other startups exploring inference as the main computational primitive, which could reduce or remove the need for conventional programming.
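For what it's worth, structured tool calling in the current OpenAI Python SDK looks roughly like this. A minimal sketch: the get_weather tool, the model name, and the prompt are all made up for illustration.

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Instead of writing and executing code, the model returns structured
    # arguments for a tool the caller described up front.
    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
        tools=tools,
    )

    call = resp.choices[0].message.tool_calls[0]
    print(call.function.name, call.function.arguments)  # get_weather {"city": "Oslo"}

The caller then runs the tool itself and feeds the result back, which is exactly the "layer between prompt and outcome" idea, just with the execution kept outside the model.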
I don't see AI inference becoming the main computational primitive. It will always be faster and more efficient on the execution side to just write some code to do a task, than it will be to ask an AI to do the same task. Unless nobody knows how to write code anymore, and everyone just loves paying Google/OpenAI rent to avoid having to think for themselves.
In my experience, AI is quite good at picking up new tools and techniques, in the sense that the models only need to be given some documentation for a new tool or framework to instantly start using it.
Gemini in particular is really good at this.
Agreed. For example, you can give it a sample FastHTML app, and it will review that app and pretty much learn FastHTML (a new Python web framework, roughly 8-9 months old) on the fly, without even being given https://docs.fastht.ml/llms-ctx.txt.
Obviously it will do even better if you give it the full documentation, but it doesn't do badly in general when you provide a sample app from which it can pick up the patterns of the language/framework; a sample like the sketch below is often enough.
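To give a sense of scale, the sample app really can be tiny. A minimal sketch following FastHTML's documented hello-world shape; the route and page text are made up:

    from fasthtml.common import *

    # fast_app() returns the app plus a route decorator
    app, rt = fast_app()

    @rt("/")
    def get():
        # Components like Titled and P render straight to HTML
        return Titled("Hello", P("A tiny page the model can learn the patterns from."))

    serve()  # starts a dev server (uvicorn under the hood)

A dozen lines like these already show the import style, the routing decorator, and the component-based rendering, which is most of what a model needs to imitate.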
That's why Gemini 2.5 Pro was so helpful too: with a 1-million-token context window, you can easily feed it the docs for something and get more accurate output.
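Something like the following is all that takes in practice. A rough sketch using the google-generativeai package; the model name, file path, and prompt are assumptions for illustration:

    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")
    model = genai.GenerativeModel("gemini-2.5-pro")

    # Stuff the framework docs into the (very large) context window,
    # then ask the question against them.
    docs = open("llms-ctx.txt").read()  # e.g. the FastHTML context file
    resp = model.generate_content(
        [docs, "Using the framework documented above, write a todo-list app."]
    )
    print(resp.text)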