> Schwartz's experiment is the most revealing, and not for the reason he thinks. What he demonstrated is that Claude can, with detailed supervision, produce a technically rigorous physics paper. What he actually demonstrated, if you read carefully, is that the supervision is the physics. Claude produced a complete first draft in three days. It looked professional. The equations seemed right. The plots matched expectations. Then Schwartz read it, and it was wrong. Claude had been adjusting parameters to make plots match instead of finding actual errors. It faked results. It invented coefficients. [...] Schwartz caught all of this because he's been doing theoretical physics for decades. He knew what the answer should look like. He knew which cross-checks to demand. [...] If Schwartz had been Bob instead of Schwartz, the paper would have been wrong, and neither of them would have known.
And so the paradox is, the LLMs are only useful† if you're Schwartz, and you can't become Schwartz by using LLMs.
Which means we need people like Alice! We have to make space for people like Alice, and find a way to promote her over Bob, even though Bob may seem to be faster.
The article gestures at this, but I don't think it comes down hard enough. Making space for Alice doesn't seem practical. But we have to find a way, or we're all going to be in deep trouble when the next generation doesn't know how to evaluate what the LLMs produce!
---
† "Useful" in this context means "helps you produce good science that benefits humanity".
Sadly, I don't see how our current social paradigm supports this. There is no history of any sort of long planning like this, or of long-term loyalty (in either direction) between employees and employers, for this sort of journeyman-guild-style training. AI execs are basically racing, hoping we won't need a Schwartz before they are all gone. But what incentives are in place to hire a college grad, have them work without LLMs for a decade, and then give them the tools to accelerate their work?
Some folks need to touch the hot stove before they learn, but eventually they learn.
If AI output remains unreliable, then eventually enough companies will be burned and management will reinstate proper oversight. All while continuing to pat themselves on the back.
Then the social paradigm needs to change. Is everyone just going to roll over and die while AI destroys academia (and possibly a lot more)?
Last September, Tyler Austin Harper published a piece for The Atlantic on how he thinks colleges should respond to AI. What he proposes is radical—but, if you've concluded that AI really is going to destroy everything these institutions stand for, I think you have to at least consider these sorts of measures. https://www.theatlantic.com/culture/archive/2025/09/ai-colle...
>What he proposes is radical
It sounds entirely reasonable and moderate to me.
Well, we are already rolling over and dying (literally) on everything from vaccine denial to climate change. So, yes, we are. Obviously yes.
In the US it is dying off.
Not so in plenty of other countries. Hopefully the US reverses the anti-science trend before it's too late.
These movements are growing in every Western nation, and the trend has been building for decades. It would be nice to see it reverse, but that seems unlikely before calamity.
It's a deliberate process, powered by right-wing and capitalist interests, designed to create a dumber, less educated, and more distracted population. A war as stupid as the one with Iran would not have been possible three decades ago. As ill-advised as the Iraq war was, Bush at least spent months explaining the rationale and building support for it, successfully. Now that's not needed.
I saw interviews with young Americans on spring break and they were so utterly uninformed it was mind-blowing. Their priorities are getting drunk and getting laid while their country bombs a nation “into the stone ages”, according to their president. And it’s not their fault: they are the product of a media environment and education system designed for exactly this outcome.
Article is paywalled, so perhaps you could just summarize his proposal?
> At the type of place where I taught until recently—a small, selective, private liberal-arts college—administrators can go quite far in limiting AI use, if they have the guts to do so. They should commit to a ruthless de-teching not just of classrooms but of their entire institution. Get rid of Wi-Fi and return to Ethernet, which would allow schools greater control over where and when students use digital technologies. To that end, smartphones and laptops should also be banned on campus. If students want to type notes in class or papers in the library, they can use digital typewriters, which have word processing but nothing else. Work and research requiring students to use the internet or a computer can take place in designated labs. [...] Colleges that are especially committed to maintaining this tech-free environment could require students to live on campus, so they can’t use AI tools at home undetected.
You can access the full article at https://archive.is/zSJ13 (I know archive.is is kind of shady, but it works).
> If students want to type notes in class or papers in the library, they can use digital typewriters, which have word processing but nothing else.
Only, replacing the guts of such a machine so that it contains a local LLM is damn easy today. Right now the battery mass required to power the device would be a giveaway, but inference is getting energetically cheaper.
> Colleges that are especially committed to maintaining this tech-free environment could require students to live on campus, so they can’t use AI tools at home undetected.
Just like my on-campus classmates never smoked weed or drank underage, I'm sure.
> There is no history of any sort of long planning
Sure there is. It's the formal education system that produced the college grad.
… between employees and employers.
The proposal that everyone pay for college until they are in their 40s doesn’t seem viable.
Maybe, but there is a trend towards more and longer education: more college graduates, more PhDs, etc.
I think we already know what we need to do: encourage people to do the work themselves, discourage beginners from immediately asking an LLM for help, and re-introduce some kind of oral exam. As the article mentions, banning LLMs is impractical, and what we really need are people who can tell when the LLM is confidently wrong, not people who don't know how to work with an LLM.
I hope it will encourage people to think more about what they get out of the work, what doing the work does for them; I think that's a good thing.
I think we'll get there. We need at least some kind of AI bust first, though. It's impossible to talk sense into people who think AI is about to completely replace engineers, or even those who think that, while it might not replace engineers, it's going to be doing 100% of all coding within a year. Or even that it can do 100% of coding right now.
There are a couple of unfortunate truths in play, all at the same time:
- People with money are trying to build the "perfect" business: SaaS without software-eng headcount. 100% margin. 0 capex. And finally, near-0 opex and R&D cost. Or at least, they're trying to sell the idea of this to anyone who will buy. And unfortunately this is exactly what most investors want to hear, so they believe every word and throw money at it. This of course then extends to many other businesses, not just SaaS, but those have worse margins to start with, so they're less prone to the wildfire.
- People who used to code 15 years ago but don't now see Claude generating very plausible-looking code. Given their job is now "C suite" or "director", they don't perceive any direct personal risk, so the smell test is passed and they're all on board, happily wreaking destruction along the way.
- People who are nominally software engineers but are bad at it are truly elevated 100x by Claude. Unfortunately, if their starting point was close to 0, this isn't saying a lot. And if it was negative, it's now 100x as negative.
- People who are adjacent to software engineering, like PMs, especially if they dabble in coding on the side, suddenly see that they "can code" now.
Now of course, not all capital owners, CTOs, PMs, etc. exhibit this. Probably not even most. But I can already name like 4 examples per category above from people I know. And it's impossible to explain any kind of nuance to any of them right now. There are too many people and articles and blog posts telling them they're absolutely right.
We need a bust cycle. Then maybe we can have a productive discussion of how we can leverage LLMs (and we'll stop calling them "AI"...) to still do the team sport known as software engineering.
Because there are real productivity gains to be had here. Unfortunately, they don't come from replacing everyone with AGI, or from letting people who don't know coding or software engineering build actual working software, and they don't involve just letting Claude Code stochastically generate a startup for you.
> Or even that [AI] can do 100% of coding right now.
I don't actually think the article refutes this. But the AI needs to be in the hands of someone who can review the code (or astrophysics paper), notice and understand issues, and tell the AI what changes to make. Rinse, repeat. It's still probably faster than writing all the code yourself (but that doesn't mean you can fire all your engineers).
The question is, how do you become the person who can effectively review AI code without actually writing code without an AI? I'd argue you basically can't.
My boss decreed the other day that we're all to start maximising our use of agents, and then set an accordingly ambitious deadline for the current project. I explained that, being relatively early in my career, I've been hesitant to use any kind of LLM so I can gain experience myself (to say nothing of other concerns), and yeah, in his words, I've "missed the opportunity".
Interesting; we only have a generic 'use AI' in our goals. Though the generic framing probably says more about the company's belief in this tech than anything else.
AI is an accelerant, not a replacement for skill. At least, not yet.
I built a full-stack app in Python + TypeScript where AI agents process 10k+ near-real-time decisions and executions per day.
I had never done full-stack development and I would not have been able to do it without GitHub Copilot, but I have worked in IT (data) for 15 years, including 6 in leadership. I have built many systems and teams from scratch, set up processes to ensure accuracy and minimize mistakes, and so on.
I have learned a ton about full stack development by asking the coding agent questions about the app, bouncing ideas off of it, planning together, and so on.
So yes, you need to have an idea of what you're doing if you want to build anything bigger than a cheap one-shot throwaway project that sort of works but brings no value and that nobody is actually gonna use.
This is how it is right now, but at the same time AI coding agents have come an incredibly long way since 2022! I do think they will improve, but an agent can't exactly know what you want to build. It's making an educated guess, an approximation of what you're asking it to do. Ask it the same thing twice and you'll get two slightly different results (assuming it's a big one-shot).
This is the fundamental reality of LLMs. It's sort of like walking (where we were before AI), using a car to get places (where we are now), and FSD (that's the future; look how long it has taken compared to the first cars).
> And so the paradox is, the LLMs are only useful† if you're Schwartz, and you can't become Schwartz by using LLMs.
I have gained a lot of benefit using LLMs in conjunction with textbooks for studying. So, I think LLMs could help you become Schwartz.
How do you know you have?
I have been using it to learn Chinese along with other standard resources. My reading comprehension has improved a lot after I started to use LLMs to understand sentence structures and grammar.
Profession (1957) by Isaac Asimov is relevant: https://news.ycombinator.com/item?id=46664195
Why use a tool that generates plausible garbage?
Because I'm skilled enough to use a tool that generates plausible garbage to be more productive at making non-garbage than those who don't use it.
Are you sure you’re more productive?
Doesn't sound like these tools should be used to write scientific papers, for example, and they seem to bamboozle people far more than help them.
Because there is no appreciable difference between the outputs. Most of the work that most of us do isn't important. It's the busywork byproduct of making widgets that most people don't even need. So if your job is already pointless, why not make it easier using LLMs?
Sounds a little sad. I think I’d rather find another job.
I totally agree - the article misses this point in a very conspicuous way. It suggests that Alice and Bob will both graduate at the same level.
What may well happen instead is that Bob publishes two papers. He then outcompetes Alice, thanks to others' insistence on "publish or perish". Alice becomes unemployed and struggles, having been pushed out.
The person who puts the time and effort in doesn't just sit at the same level, and they don't both just find decent employment. Competition happens, and the authentic learning is considered a waste of time, which leads to real and often life-threatening consequences (like being homeless after being unable to find employment).
> And so the paradox is, the LLMs are only useful† if you're Schwartz
Was the LLM even useful for Schwartz, if it produced false output?
Maybe it saved them some time? So far, the studies seem to lean toward the LLM probably not saving them any time.