I think this is a reasonable decision (although maybe increasingly insufficient).
It doesn't really matter what your stance on AI is; the problem is the increased review burden on OSS maintainers.
In the past, the code itself was a sort of proof of effort - you would need to invest some time and effort in your PRs, otherwise they would be easily dismissed at a glance. That is no longer the case, as LLMs can quickly generate PRs that might look superficially correct. Effort can still have been put into those PRs, but there is no way to tell without spending time reviewing in more detail.
Policies like this help decrease that review burden, by outright rejecting what can be identified as LLM-generated code at a glance. That probably covers a fair bit today, but it might get harder over time, so I suspect eventually we will see a shift towards more trust-based models, where you cannot submit PRs if you haven't been approved in advance somehow.
Even if we assume LLMs would consistently generate code of good enough quality, code submitted by someone untrusted would still need detailed review for many reasons - so even in that case it would likely be faster for the maintainers to just use the tools themselves, rather than reviewing someone else's use of the same tools.
For well-intended open source contributions using GenAI, my current rules of thumb are:
* Prefer an issue over a PR (after iterating on the issue, either you or the maintainer can use it as a prompt)
* Only open a PR if the review effort is less than the implementation effort.
Whether the latter is feasible depends on the project, but in one of the projects I'm involved in it's fairly obvious: it's a package manager where the work is typically verifying dependencies and constraints; links to upstream commits etc are a great shortcut for reviewers.
Unfortunately, LLMs generate useless word salad and nonsense even when working on issue text; you absolutely have to reword the writing from scratch, otherwise it's just an annoyance and a complete waste of time. Even a good prompt doesn't help all that much, since it's just how the tool works under the hood: it doesn't have the goal of saying something specific in the clearest possible way and inwardly rewording until it does, it just writes stuff out that will hopefully end up seeming at least half-coherent. And their code is orders of magnitude worse than even their terrible English prose.
I don’t think you’re being serious. Claude and GPT regularly write programs that are way better than what I would’ve written. Maybe you haven’t used a decent harness or a model released in the last year? It’s usually verbose, whereas I would try the simplest thing that could possibly work. However, it can knock out in a few minutes what would have taken me multiple weekends. The value proposition isn’t even close.
It’s fine to write things by hand, in the same way that there’s nothing wrong with making your own clothing with a sewing machine when you could have bought the same thing for a small fraction of the value of your time. Or, in the same fashion, spending a whole weekend modeling and printing a part you could’ve bought for a few dollars. I think we need to be honest about differentiating between the hobby value of writing programs versus the utility value of programs. Redox is a hobby project, and, while it’s very cool, I’m not sure it has a strong utility proposition. Demanding that code be handwritten makes sense to me for the maintainer because the whole thing is just for fun anyway. There isn’t an urgent need to RIIR Linux. I would not apply this approach to projects where solving the problem is more important than the joy of writing the solution.
> Claude and GPT regularly write programs that are way better than what I would’ve written
Is that really true? Like, if you took the time to plan it carefully, dot every i, cross every t?
The way I think of LLMs is as "median targeters" -- they reliably produce output at the centre of the bell curve from their training set. So if you're working in a language that you're unfamiliar with -- let's say I wanted to make a todo list in COBOL -- then LLMs can be a great help, because the median COBOL developer is better than I am. But for languages I'm actually versed in, the median is significantly worse than what I could produce.
So when I hear people say things like "the clanker produces better programs than me", what I hear is that you're worse than the median developer at producing programs by hand.
Have you tried the latest models at best settings?
I've been writing software for 20 years, Rust for 10 of them. I don't consider myself a median coder, but quite above average.
For the last 2 years or so, I've been trying out changes with AI models every couple of months, and they have been consistently disappointing. Sure, after edits and many prompts I could get something useful out of it, but often I would have spent the same amount of time or more coding it manually.
So yes, while I love technology, I'd been an LLM skeptic for a long time, and for good reason: the models just hadn't been good. While many of my colleagues used AI, I didn't see the appeal. It would take more time and I would still have to think just as much, while it made so many mistakes everywhere and I had to constantly ask it to correct things.
Now 5 months or so ago, this changed as the models actually figured it out. The February releases of the models sealed things for me.
The models are still making mistakes, but their number and severity are lower, and the output would fit the specific coding patterns in that file or area. It wouldn't import a random library but would use the one that was already imported. If I asked it not to do something, it would follow (earlier iterations just ignored me; it was frustrating).
At least for the software development areas I'm touching (writing databases in Rust), LLMs turned into a genuinely useful tool where I now am able to use the fundamental advantages that the technology offers, i.e. write 500 lines of code in 10 minutes, reducing something that would have taken me two to three days before to half a day (as of course I still need to review it and fix mistakes/wrong choices the tool made).
Of course this doesn't mean that I am now 6x faster at all coding tasks, because sometimes I need to figure out the best design or such.
I am talking about Opus 4.6 and Codex 5.3 here, at high+ effort settings, and not about the tab auto-completion or the quick edit features of the IDEs, but the agentic feature where the IDE can actually spend some effort thinking about what I, the user, meant with my less specific prompt.
I feel like we're talking about different things. You seem to be describing a mode of working that produces output that's good enough to warrant the token cost. That's fine, and I have use cases where I do the same. My gripe was with the parent poster's quote:
> Claude and GPT regularly write programs that are way better than what I would’ve written
What you're describing doesn't sound "way better" than what you would have written by hand, except possibly in terms of the speed that it was written.
> I am talking about Opus 4.6 and Codex 5.3 here, at high+ effort settings
So you have to burn tokens at the highest available settings to even have a chance of ending up with code that's not completely terrible (and then only in very specific domains), but of course you then have to review it all and fix all the mistakes it made. So where's the gain exactly? The proper goal is for those 500 lines to be almost always truly comparable to what a human would've written, and not to turn into an unmaintainable mess. And AIs aren't there yet.
You really do need to try the latest ones. You can’t extrapolate from your previous experiences.
I do not think they are impartial - all I can see is lots of angst.
A lot of computer users are domain experts in something like chemistry or physics or material science. Computing to them is just a tool in their field, e.g. simulating molecular dynamics, or radiation transfer. They dot every i and cross every t _in_their_competency_domain_, but the underlying code may be a horrible FORTRAN mess. LLMs potentially can help them write modern code using modern libraries and tooling.
My go-to analogy is assembly language programming: it used to be an essential skill, but now is essentially delegated to compilers outside of some limited specialized cases. I think LLMs will be seen as the compiler technology of the next wave of computing.
The difference is that compilers involve rules we can enumerate, adjust, etc.
Consider calculators: Their consistency and adherence to requirements was necessary for adoption. Nobody would be using them if they gave unpredictable wrong answers, or where calculations involving 420 and 69 somehow keep yielding 5318008. (To be read upside-down, of course.)
The compilers used to be unreliable too, e.g. at higher optimizations and such. People worked on them and they got better.
I think LLMs will get better, as well.
But that's the point: an LLM is a vastly different object from a calculator. It's a new type of tool, for better or worse, based on probabilities and distributions.
If you can internalise that fact and look at it as giving a probable answer rather than an exact answer, it makes sense.
Calculators can't have a stab at writing an entire C compiler. A lot of people can't either, or it takes a lot of iteration anyway; no one one-shotted complicated code before LLMs either.
I feel the discussion shouldn't be about how they work as the fundamental objection, but rather about the costs and impacts they have.
nice. 3x.
It can certainly be true for several reasons. Even in domains I'm familiar with, often making a change is costly in terms of coding effort.
For example, just recently I updated a component in one of our modules. The work was fairly rote (in this project we are not allowed to use LLMs). While the update was absolutely necessary here, it would also have been beneficial everywhere else. I didn't do it in the other places because I couldn't justify spending the effort.
There are two sides to this - with LLMs, housekeeping becomes easy and effortless, but you often err on the side of verbosity because it costs nothing to write.
But much less thought goes into every line of code, and I am often kind of amazed at how compact and rudimentary the (hand-written) logic is behind some of our stuff that I thought would be some sort of magnum opus.
When in fact the opposite should be the case - every piece of functionality you don't need right now, will be trivial to generate in the future, so the principle of YAGNI applies even more.
I can agree with that. So essentially: "Claude and GPT regularly write programs that are way better than what I would’ve written given the amount of time I was willing to spend."
How much time and effort are you willing to spend on maintaining that code, though? The AI can't do it on its own, and the code quality is terrible enough as it is.
no. I'm a pretty skilled programmer and I definitely have to intervene and fix an architectural problem here and there, or gently chastise the LLM for doing something dumb. But there are also many cases where the LLM has seen something that I completely missed, or just hammered away at a problem enough to get a correct solution that I would have given up on earlier.
The clanker can produce better programs than me because it will just try shit that I would never have tried, and it can fail more times than I can in a given period of time. It has specific advantages over me.
The verboseness is the key issue as to why LLMed PRs are bad.
let me translate this for the GP: "you're doing it wrong".
> The value proposition isn’t even close.
That's correct, because most of the cost of code is not the development but rather the subsequent maintenance, where AI can't help. Verbose, unchecked AI slop becomes a huge liability over time, you're vastly better off spending those few weekends rewriting it from scratch.
> Claude and GPT regularly write programs that are way better than what I would’ve written.
I’m sorry but this says more about you than about the models. It is certainly not the case for me!
Having reviewed a lot of AI-written Python code, I think it's absolute nonsense.
It never picks a style, it'll alternate between exceptions and then return codes.
It'll massively overcomplicate things. It'll reference things that straight up don't exist.
But boy is it brilliant at a fuzzy find and replace.
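To be concrete, the style mixing described above looks something like this (a contrived sketch, not real reviewed code; the function names are made up for illustration):

```python
# Contrived illustration of inconsistent error-handling styles in one module.

# One function signals failure by letting an exception propagate...
def load_config(path):
    with open(path) as f:
        return f.read()  # raises FileNotFoundError on a missing file

# ...while its sibling signals failure with a sentinel return value
# instead, so callers must remember which convention each function uses.
def load_cache(path):
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        return None
```

Neither convention is wrong on its own; the complaint is that the tool alternates between them within a single codebase instead of picking one.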
If it wasn't so maddening it would be funny that you literally have to tell it to slow down, focus and think. My tinfoil hat suggests this is intentional, to make me treat it like a real, live junior dev!
"you literally have to tell it to slow down, focus and think" - This soo much! When I get an unexpected result from claude, I ask it why - what caused it to do such-and-such. After one back and forth session like this putting up tons of guardrails on a prompt, claude literally said "you shouldn't have to teach me to think every session" !!
> When I get an unexpected result from claude, I ask it why - what caused it to do such-and-such.
No LLM can answer this question for you; it has no insight into how or why it produced what it produced. The reasons it gives might sound plausible, but they aren't real.
I feel like every person stating things of this nature is literally not able to communicate effectively (though this is not a barrier anymore; you can get a dog to vibe-code games with the right workflow, which to me seems like quite an intellectual thing to be able to do).
Despite that, you will make this argument while trying to use Copilot to do something, the worst model in the entire industry.
If an AI can replace you at your job, you are not a very good programmer.
Copilot isn't a model. Currently it's giving me a choice of 15 different models. By all evidence, AI is nowhere close to replacing me, but to hear other people tell it, it is weeks or maybe months away.
I'll just wait and see.
Remember when Copilot was released? It was running some OpenAI thing at the time. Now you can choose from many models, sure, but if you want a BMW, buy a BMW; don't buy a Nissan with badly strapped-on BMW decals.
No, I don't remember when it was released.
I don't want a Nissan or a BMW. This was provided by my employer, and I've been asked to use it. To be honest, I don't even understand how your car analogy applies to any of this.
It does generate word salad (and usefulness depends on the person reading it). If both the writer and the reader share a common context, there's a lot that can be left out (the extreme version is military signal). An SOS over the radio says the same thing as "I'm in a dangerous situation, please help me if you can" but the former is way more efficient. LLMs tend to prefer the latter.
> If an AI can replace you at your job, you are not a very good programmer.
Me and millions of other local-yokel programmers who work in regional cities at small shops, in-house at businesses, etc. are absolutely COOKED. No, I can't leetcode; no, I didn't go to MIT; no, I don't know how O(n) is calculated when reading a function. I can scrap together a lot of useful business stuff, but no, I am not a very good programmer.
> no I dont know how O(n) is calculated when reading a function
1. Confidently state "O(n)"
2. If they give you a look, say "O(1) with some tricks"
3. If they still give you a look, say "Just joking! O(n log n)"
4. O(no idea)
Do you mean you aren't able to use AI to make software?
The thing you fear is the thing that you could just use to improve yourself?
Why fear a shovel?
Also, I never claimed to be a good programmer either. Just don't see the point fearing something that makes it infinitely easier and faster to get work done.
I suspect the value you bring to the table is that you are good enough a programmer to translate the problems of the people you work with into working code.
LLMs can do it somewhat, but they can probably leetcode better than even most of the people who went to MIT.
>no I dont know how O(n) is calculated when reading a function
This is really, honestly not hard. Spend a few minutes reading about it, or even better, ask an LLM to explain it to you and clear up your misconceptions if regular blog posts don't do it for you. This is one of those concepts that sounds scarier than it is.
edit: To be clear, there are tough academic cases where complexity is harder to compute, with weird functions in O(sqrt(n)) or O(log(log(n))) or worse, but most real-world code complexity is really easy to tell at a glance.
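For what it's worth, the common cases really can be read off the shape of the code. A toy Python sketch (hypothetical functions, purely for illustration):

```python
def find_max(items):
    # One pass over the input: O(n).
    best = items[0]
    for x in items:
        if x > best:
            best = x
    return best

def has_duplicate(items):
    # Nested loops over the same input: O(n^2).
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def count_halvings(n):
    # The problem size halves on each step: O(log n).
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps
```

One loop over the input, nested loops, or a loop that halves the problem each time: that covers the vast majority of code you'll see in review.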
It's not hard. Accounting isn't that hard either. I just know more business crap than programming.
So many people and systems have somehow merged into just a slathering of spam on everyone's senses. It's no longer about truth statements but just about whether this is attention-worthy, and most of the internet, its social media and its "people", are going into the no-bin.
My rules of thumb is much shorter: don't.
The open source world has already been ripped off by AI the last thing they need is for AI to pollute the pedigree of the codebase.
Suppose almost all work in the future is done via LLMs, just like almost all transportation is done today via cars instead of horses.
Do you think your worldview is still a reasonable one under those conditions?
But all work isn't done by LLMs at the moment, and we can't be sure that it ever will be, so the question is ridiculous.
Maybe one day it will be. And then people can reevaluate their stance. Until that time, it's entirely reasonable to hold the position that you just don't.
This is especially true with how LLM generated code may affect licensing and other things. There's a lot of unknowns there and it's entirely reasonable to not want to risk your projects license over some contributions.
I use them all the time at work because, rightly or wrongly, my company has decided that's the direction they want to go.
For open source, I'm not going to make that choice for them. If they explicitly allow for LLM generated code, then I'll use it, but if not I'm not going to assume that the project maintainers are willing to deal with the potential issues it creates.
For my own open source projects, I'm not interested in using LLM generated code. I mostly work on open source projects that I enjoy or in a specific area that I want to learn more about. The fact that it's functional software is great, but is only one of many goals of the project. AI generated code runs counter to all the other goals I have.
Basically all of my actual programming work has been done by LLMs since January. My team actually demoed a PoC last week to hook up Codex to our Slack channel to become our first level on-call, and in the case of a defect (e.g. a pagerduty alert, or a question that suggests something is broken), go debug, push a fix for review, and suggest any mitigations. Prior to that, I basically pushed for my team to do the same with copy/paste to a prompt so we could iterate on building its debugging skills.
People might still code by hand as a hobby, but I'd be surprised if nearly all professional coding isn't being done by LLMs within the next year or two. It's clear that doing it by hand would mostly be because you enjoy the process. I expect people that are more focused on the output will adopt LLMs for hobby work as well.
Sounds like a company on the verge of creating a mess that will require a rewrite in a year or so. Maybe an LLM can do it.
I suspect this is more true than most people think. Today's bad code will be cleaned up by tomorrow's agents.
The other factor that gets glossed over is that llms create a financial incentive to create cleaner code, with tests, because the agent that you pay for will be more efficient when the code is easier to understand, and has clear patterns for extensibility. When I do code with llms, a big part of it is demonstration, i.e. pseudocoding a pattern/structure, asking the model if it understands, and then having it complete the pattern. I've had a lot of success with this approach.
> llms create a financial incentive to create cleaner code, with tests, because the agent that you pay for will be more efficient when the code is easier to understand, and has clear patterns for extensibility
Right, this is the kind of discussion we're having on my team: suddenly all of the already good engineering practices like good observability, clear tests with high coverage, clean design, etc. act as a massive force multiplier and are that much more important. They're also easier to do if you prioritize it. We should be seeing quality go up. It's trivial to explore the solution space with throwaway PoCs, collect real data to drive your design, do all of those "nice to have" cleanups, etc. The people who assume LLM = slop are participating in a bizarre form of cope. Garbage in, garbage out; quality in, quality out. Just accept that coding per se is not going to be a profession for long. Leverage new tools to learn more, do more, etc. This should be an exciting time for programmers.
> It's clear that doing it by hand would mostly be because you enjoy the process.
This will not happen until companies decide to care about quality again. They don't want employees spending time on anything "extra" unless it also makes them significantly more money.
> It's clear that doing it by hand would mostly be because you enjoy the process.
This is gaslighting. We're only a few years into coding agents being a thing. Look at the history of human innovation and tell me that I'm unreasonable for suspecting that there is an iceberg worth of unmitigated externalities lurking beneath the surface that haven't yet been brought to light. In time they might. Like PFAS, ozone holes, global warming.
Ultimately you always have to trust people to be judicious, but that's why it doesn't make any changes itself. Only suggests mitigations (and my team knows what actions are safe, has context for recent changes, etc). It's not entirely a black box though. e.g. I've prompted it to collect and provide a concrete evidence chain (relevant commands+output, code paths) along with competing hypotheses as it works. Same as humans should be doing as they debug (e.g. don't just say "it's this"; paste your evidence as you go and be precise about what you know vs what you believe).
That sounds like the perfect recipe for turning a small problem into a much larger one. On-call is where you want your quality people, not your silicon slop generator.
I say let people hold this stance. We, agentic coders, can easily enough fork their project and add whatever features or refinements we want, and use that fork for ourselves, but also make it available for others in case other people want to use it for the extra features and polish as well. With AI, it's very easy to form a good architectural understanding of a large code base and figure out how to modify it in a sane, solid way that matches the existing patterns. And it's also very easy to resolve conflicts when you rebase your changes on top of whatever is new from upstream. So, maintaining a fork is really not that serious of an endeavor anymore. I'm actually maintaining a fork of Zed with several additional features (Claude Code-style skills and slash commands, a global agents.md file instead of the annoying rules library system, which I removed, and the ability to choose models for sub-agents instead of always inheriting the model from the parent thread; and yes, master branch Zed has subagents! and another tool, jjdag).
That seems like a win-win in a sense: let the agentic coders do their thing, and the artisanal coders do their thing, and we'll see who wins in the long run.
Well at least you, agentic coders, already understand they need to fork off.
Saves the rest of us from having to tell you.
>> but also make it available for others in case other people want to use it for the extra features and polish as well.
this feels like the place where your approach breaks down. I have had very poor results trying to build a foundation that CAN be polished, or where features don't quickly feel like a Jenga tower. I'm wondering if the success we've seen is because AI is building on top of existing foundations, or whether we're in the early days of "foundational" work? Is anyone aware of studies comparing longer-term structural aspects? Is it too early?
I've been able to make very clear, modular, well put together architectural foundations for my greenfield projects with AI. We don't have studies, of course, so it is only your anecdote versus mine.
> We, agentic coders, can easily enough fork their project
And this is why eventually you are likely to run the artisanal coders who tend to do most of the true innovation out of the room.
Because by and large, agentic coders don't contribute, they make their own fork which nobody else is interested in because it is personalized to them and the code quality is questionable at best.
Eventually, I'm sure LLM code quality will catch up, but the ease with which an existing codebase can be forked and slightly tuned, instead of contributing to the original, is a double edged sword.
Most "artisanal" coders that are complaining are working on the n-1000th text editor, todo list manager, toy programming language or web framework that nobody needs, not doing "true innovation".
Maybe! Or maybe there is really a competitive advantage to "artisanal" coding.
Personally, I would not currently expect a fork of RedoxOS that is AI-implemented to become more popular than RedoxOS itself.
Indeed, maybe there is. I'm interested to see how it plays out.
"make their own fork which nobody else is interested in because it is personalized to them"
Isn't that literally how open-source works, and why there's so many Linux distros?
Code quality is a subjective term as well, I feel like everyone dunking on AI coding is a defensive reaction - over time this will become an entirely acceptable concept.
For a human to be able to do any customization, they have to dive into the code and work with it, understand it, gain intuition for it. Engage with the maintainers and community. In the process, there's a good chance that they'll be encouraged to contribute improvements upstream even if they have their own fork.
Vibe coders don't have to do any of this. They don't have to understand anything, they can just have their LLMs do some modifications that are completely opaque to the vibe coder.
Perhaps the long term steady state will be a goldilocks renaissance of open source where lots of new ideas and contributors spring up, made capable with AI assistance. But so far what I've seen is the opposite. These people just feed existing work into their LLMs, produce derivative works and never bother to engage with the original authors or community.
I think that in the long run, AI assisted coding will turn out to be better than handcrafted code. When you pay for every token, and code generation is quick, a clean, low entropy codebase with good test coverage gets you a lot more for your dollar than a dog's breakfast. It's also much easier to fix bad decisions made early on in a project's life, because the machine is doing all of the heavy lifting.
This also lines up with the history of automation in many other industries. Modern manufacturing is capable of producing parts that a medieval blacksmith couldn't dream of, for example. Sure, maybe an artisan can produce better code than an llm now, but AI assisted humans will beat them in the near future if they aren't already producing similar quality output at greater speed, and tomorrow's models will fix the bad code written today. The fact that there's even a discussion on automated vs hand written today means that the writing is almost certainly on the wall.
You mean like I have to pay my compiler to turn high level code into low level code?
> Vibe coders don't have to do any of this. They don't have to understand anything, they can just have their LLMs do some modifications that are completely opaque to the vibe coder.
I spend time using my agent to better understand existing codebases and their best practices than I'd ever have the time/energy to do before, giving me a broader and more holistic view on whatever I'm changing, before I make a change.
Okay, but you don't have to - and "efficient" coders won't bother, thus starving the commons.
Well, I would argue that if I didn't spend that time, then even a personal fork that I vibe coded would be worse, even for me personally. It would be incompatible with upstream changes, more likely to crash or have bugs, more difficult to modify in the future (and cause drift in the model's own output) etc.
I always find it odd that people say both that vibe coding has obvious and immediate negative consequences in terms of quality and at the same time that nobody could learn or be incentivized to produce better architecture and code quality from vibe coding when they would obviously face those consequences.
I mean, I do open PRs for most of my changes upstream if they allow AI, once I've been using the feature for a few weeks and have fixed the bugs and gone over the code a few times to make sure it's good quality. Also, I'm going to be using the damn thing, I don't want it to be constantly broken either, and I don't want the code to get hacky and thus incompatible with upstream or cause the LLMs to drift, so I usually spend a good amount of time making sure the code is high quality — integrates with the existing architecture and model of the world in the code, follows best practices, covers edge cases, has tests, is easy to read so that I can review it easily.
But if a project bans AI then yeah, they'll be run out of town because I won't bother trying to contribute.
> We, agentic coders, can easily enough fork their project and add whatever the features
Bold of you to assume that people won’t move (and their code along with it) to spaces where parasitic behaviour like this doesn’t occur, locking you out.
In addition to just being a straight-up rude, disrespectful and parasitic position to take, you’re effectively poisoning your own well.
Since when is maintaining a personal patch set / fork parasitic? And in what way does it harm them, such that they should move to spaces where it doesn't happen, as a result? Also, isn't the entire point of open source precisely to enable people to make and use modifications of code if they want even if they don't want to hand code over? Also, that would be essentially making code closed source — do you think OSS is just going to die completely? Or would people make alternative projects? Additionally, this assumes coders who are fine with AI can't make anything new themselves, when if anything we've seen the opposite (see the phenomenon of reimplementing other projects that's been going around).
Additionally, if they accept AI contributions, I try, when I have the time and energy, to make sure my PRs are high quality, and provide them. If they don't, then I'll go off and do my own thing, because that's literally what they asked me to do, and I wasn't going to contribute otherwise. I fail to see how that's rude or parasitic or disrespectful in any way, except for my assumption that the more featureful and polished forks might eventually win out.
It's only parasitic if you are tricking users into thinking you are the original or providing something better. You could be providing something different (which would be valuable), but if you are not, you are just scamming users for your own benefit.
I have no intention of tricking anyone into thinking I'm the original! I do think I offer improvements in some cases, so in cases where the project is something I intend for other people to ever see/use, I do explain why I think it is better, but I also will always put the original prominently to make sure people can find their way back to that if they want to. For example, the only time I've done this so far:
> just like almost all transportation is done today via cars instead of horses.
That sounds very Usanian. In the meantime, transportation around me is done on foot, bicycle, bus, tram, metro, train and cars. There are good use cases for each method, including the car. If you really want to use an automotive analogy, then sure, LLMs can be like cars. I've seen cities made for cars instead of humans, and they are horrible places to live.
Signed, a person who totally gets good results from coding with LLMs. Sometimes, maybe even often.
As someone who enjoys working with AI tools, I honestly think the best approach here might be bifurcation.
Start new projects using LLM tools, or maybe fork projects where that is acceptable. Don't force the volunteer maintainers of existing projects with existing workflows and cultures to review AI generated code. Create your own projects with workflows and cultures that are supportive of this, from the ground up.
I'm not suggesting this will come without downside, but it seems better to me than expecting maintainers to take on a new burden that they really didn't sign up for.
Even if this were true, or someday will be (big IF), is it worth looking for valid counter-workflows? Example: in many parts of the US and Canada the Mennonites are incredibly productive farmers and massive adopters of technology, while also keeping very strict limits on where, how and when it is used. If we had the same motivations and discipline in software, could we walk a line that both benefited from and controlled AI? I don't know the answer.
Good one, I had not made the connection, but yes. Tech is here to serve, at our pleasure, not to be forcibly consumed.
That would only be a world where the copyright and other IP uncertainties around the output (and training!) of LLMs were a solved and known question. So that's not the world we currently live in.
The ruling capital class has decided that it is in their best interest for copyright to not be an obstacle, so it will not be. It is delusional to pretend that there is even a legal question here, because America is no longer a country of laws, to the extent that it ever was. I would bet you at odds of 10,000 to 1 that there will never be any significant intellectual property obstacles to the progress of generative AI. They might need to pay some fines here and there, but never anything that actually threatens their businesses in the slightest.
There clearly should be, but that is not the world we live in.
- [deleted]
I don't see any cars racing in the Melbourne Cup.
Another great take I found online: "Don't send us a PR, send us the prompt you used to generate the PR."
What I've been begging for every time someone wants me to read their AI "edited" wall of text.
That's a pretty good framework!
Prompts from issue text makes a lot of sense.
> Even if we assume LLMs would consistently generate good enough quality code, code submitted by someone untrusted would still need detailed review for many reasons
Wait but under that assumption - LLMs being good enough - wouldn't the maintainer also be able to leverage LLMs to speed up the review?
Often feels to me like the current stance of arguments is missing something.
> Wait but under that assumption - LLMs being good enough - wouldn't the maintainer also be able to leverage LLMs to speed up the review?
This assumes that AI capable of writing passable code is also capable of a passable review. It also assumes that you save any time by trusting that review: if it missed something, it's often more effort to go back and fix it than it would've been to just read the code yourself the first time.
A couple weeks ago someone on my team tried using the experimental "vibe-lint" that someone else had added to our CI system and the results were hilariously bad. It left 10 plausible sounding review comments, but was anywhere from subtly to hilariously wrong about what's going on in 9/10 of them. If a human were leaving comments of that quality consistently they certainly wouldn't receive maintainer privileges here until they improved _significantly_.
It was maybe not quite clear enough in my comment, but this is more of a hypothetical future scenario - not at all where I assess LLMs are today or will get to in the foreseeable future.
So it becomes a bit theoretical, but I guess if we had a future where LLMs could consistently write perfect code, it would not be too far fetched to also think it could perfectly review code, true enough. But either way the maintainer would still spend some time ensuring a contribution aligns with their vision and so forth, and there would still be close to zero incentive to allow outside contributors in that scenario. No matter what, that scenario is a bit of a fairytale at this point.
You can not trust the code or reviews it generates. You still have to review it manually.
I use Claude Code a lot, I generate a ton of changes, and I have to review it all because it makes stupid mistakes. And during reviews it misses stupid things. This review part is now the biggest bottleneck that can't yet be skipped.
And in an open source project, many people can generate a lot more code than a few people can review.
This is not even about capabilities but responsibility. In an open source context where the maintainers take no responsibility for the code, it's perhaps easier. In a professional context, ultimately it's the human who is responsible, and the human has to make the call whether they trust the LLM enough.
Imagine someone vibe codes the code for a radiotherapy machine and it fries a patient (humans have made these errors). The developer won't be able to point to OpenAI and blame them for this, the developer is personally responsible for this (well, their employer is most likely). Ergo, in any setting where there is significant monetary or health risk at stake, humans have to review the code at least to show that they've done their due diligence.
I'm sure we are going to have some epic cases around someone messing up this way.
The problem was already there with lazy bug reports and inflammatory feature requests. Now there is lazy (or inflammatory) accompanying code. But there were also well-written bug reports with no code attached, due to lack of time or skills, that can now potentially become useful PRs if handled with diligence, engineering knowledge, good faith and will.
Isn't the obvious solution to not accept drive by changes?
That's eliminating of an important part of open source culture.
I don't think it really is - drive-by changes have been a net burden on maintainers long before LLMs started writing code. Someone who wants to put in the work to become a repeat contributor to a project is a different story.
I've gotta disagree with you here - it's not uncommon for me to be diving into a library I'm using at work, find a small issue or something that could be improved (measurably, not stylistically), and open a PR to fix it. No big rewrites or anything crazy, but it would definitely fit the definition of "drive by change" that _thus far_ has been welcomed.
>find a small issue
>No big rewrites or anything crazy
I think those are the key points why they've been welcomed.
How do you differentiate between a drive-by contribution and a first contribution from a potentially long-time contributor?
And I would say especially for operating systems if it gets any adoption irregular contributions are pretty legit. E.g. when someone wants just one specific piece of hardware supported that no one else has or needs without being employed by the vendor.
This sounds complicated in theory, but it's easier in practice.
A potential long-time contributor is somebody who was already asking annoying questions in the IRC channel for a few months and helped with other stuff before shooting off the PR. If the PR is the first time you hear from a person -- that's pretty drive-by-ish.
Sounds like a better way to make sure you have to be part of a clique to get your changes reviewed. I’ve been a long-time bug fixer in a few projects over the years without participating in IRC. I like the software and want it to work, but have no interest in conversing about it at that level, especially when I was already conversing about software constantly at work.
I always provided well-documented PRs with a narrow scope and an obvious purpose.
Why would I ask annoying questions when I can identify, reproduce, pinpoint the bug, locate it in code, and fix it? Doing it alone should make it clear I don't need to ask to understand it. And why would I be interested in small talk? Doubt many people are when they patch up their work tools. It's a dispassionate kind of kindness.
Not to mention LLMs can be annoying, too. Demand this, and you'll only be inviting bots to pester devs on IRC.
> Why would I ask annoying questions when I can identify, reproduce, pinpoint the bug, locate it in code, and fix it?
Because if the bug is sufficiently simple for an outsider with zero context to fix, there's a non-zero chance that the maintainers know about it and have a reason why it hasn't been addressed yet.
i.e. the bug fix may have backwards-compatibility implications for other users which you aren't aware of. Or the maintainers may be bandwidth-limited, and reviewing your PR is an additional drain on that bandwidth that takes away from fixing larger issues
If the maintainers are already bandwidth limited, how is first asking annoying questions not also a drain on that bandwidth?
Because you may misinterpret the correct fix or not know that your implementation doesn't fit the project's plans. Worse if it's LLM-generated.
“Why would I ever want to talk to other humans about things? Especially anyone who might have some kind of extra understanding on the project that I’m not currently privy to!”
Hard disagree. Drive by's were the easiest to deal with, and the most welcome. Especially when the community tilted more to the side of non-amateurs and passionate people.
Low effort drive-bys were easy to spot because the amount of code was minimal, documentation was nonexistent, they didn’t use the idioms and existing code effectively, etc. Low-skill drive-bys were easy to spot because the structure was a mess, the docs explain language features while ignoring important structural information, and other newbie gaffes.
One latent effect of LLMs in general is multiplying the damage of low-effort contributions. They not only swell the ranks of unknowingly under-qualified contributors, but dramatically increase the effort of filtering them out. And though I see people argue against this assertion all the time, they make more verbose code. Regardless of whether that's the fault of the software or the people using it, the effect at the end of the day is nonetheless more code in front of the people who have to review it. Additionally, by design, LLMs make these things look plausible enough to require significantly more investigation.
Now, someone with little experience or little interest in the wellbeing of the code base can spit out 10 modules with hundreds of tests and thousands of words of documentation that all sorta look reasonable at first blush.
[dead]
I can understand drive-by features can be a net burden, but what is wrong with a drive-by bugfix?
how in the heck do you disambiguate a first time long term contributor and a first time drive by contributor?
Mostly by whether they check in first to see if the fix is actually welcome?
Drive-by folks tend to blindly fix the issue they care about, without regard to how/whether it fits into the overall project direction
Your open source experience is very different from my open source experience.
I've seen it go both ways. Sometimes the contributors let their ego prevent improvements to the architecture. Recently, I tried to get rid of a bug farm in a library I use. A single function was reduced to one line that depended on a far more reliable method. And the maintainers put the old version back in later on (breaking my app yet again, sigh). In all fairness, those maintainers are academics who work for the French government, so probably not the best representation of the community, but still.
I'm all in favor of not accepting "drive-by changes". But every contributor to the project had to make their first contribution at some point in time. What's the process for inviting in new contributors?
Sure - and I suspect we will see that soon enough. But it has downsides too, and finding the right way to vet potential contributors is tricky.
> Even if we assume LLMs would consistently generate good enough quality code, code submitted by someone untrusted would still need detailed review for many reasons - so even in that case it would like be faster for the maintainers to just use the tools themselves, rather than reviewing someone else's use of the same tools.
Wouldn't an agent run by a maintainer require the same scrutiny? An agent is imo "someone else" and not a trusted maintainer.
Yes, I agree. It was just me playing with a hypothetical (but in my view not imminent) future where vibe-coding without review would somehow be good enough.
Project maintainers will always have the right to decide how to maintain their projects, and "owe" nothing to no one.
That being said, to outright ban a technology in 2026 on pure "vibes" is not something I'd say is reasonable. Others have already commented that it's likely unenforceable, but I'd also say it's unreasonable for the sake of utility. It leaves things on the table at a time when projects really shouldn't. Things like documentation tracking, regression tracking, security, feature parity, etc. can all be enhanced with carefully orchestrated assistance. To simply ban this is ... a choice, I guess. But it's not reasonable, in my book. It's like saying we won't use CI/CD because it's automated stuff, that we're purely manual here.
I think a lot of projects will find ways to adapt. Create good guidelines, help the community to use the best tools for the best tasks, and use automation wherever it makes sense.
At the end of the day, slop is slop. You can always refuse to even look at something if you don't like the presentation. Or if the code is a mess. Or if it doesn't follow conventions. Or if a PR is +203323 lines, and so on. But attaching "LLMs aka AI" to the reasoning only invites drama, and if anything it makes the effort of distinguishing good content from good-looking content even harder. In the long run it won't be viable. If there's a good way to optimise a piece of code, it won't matter where that optimisation came from, as long as it can be proved it's good.
tl;dr: focus on better verification instead of better identification; prove that a change is good instead of focusing on where it came from; test, learn and adapt. Dogma was never good.
At the moment verification at scale is an unsolved problem, though. As mentioned, I think this will act as a rough filter for now, but probably not work forever - and denying contributions from non-vetted contributors will likely end up being the new default.
Once outside contributions are rejected by default, the maintainers can of course choose whether or not to use LLMs themselves.
I do think that it is a misconception that OSS software needs to be "viable". OSS maintainers can have many motivations to build something, and just shipping a product might not be at the top of that list at all, and they certainly don't have that obligation. Personally, I use OSS as a way to build and design software with a level of gold plating that is not possible in most work settings, for the feeling that _I_ built something, and for the pure joy of coding - using LLMs to write code would work directly against those goals. Whether LLMs are essential in more competitive environments is also something there are mixed opinions on, but in those cases being dogmatic is certainly more risky.
> Or if the code is a mess. Or if it doesn't follow conventions.
In my experience these things are very easily fixable by ai, I just ask it to follow the patterns found and conventions used in the code and it does that pretty well.
I've recently worked extensively with "prompt coding", and the model we're using is very good at following such instructions early on. However after deep reasoning around problems, it tends to focus more on solving the problem at hand than following established guidelines.
Still haven't found a good way to keep it on course other than "Hey, remember that thing that you're required to do? Still do that please."
A separate pre-planning step, so the context window doesn’t get too full too early on.
Off the shelf agentic coding tools should be doing this for you.
They do not.
At my company, I use them all the time with the fancy models and everything. Preplanning does not solve the problem they're describing.
When claude is doing a complex task, it will regularly lose track of the rules (in either the .rules stuff or CLAUDE.md) and break conventions.
It follows it most of the time, but not all of the time.
Until the copyright questions surrounding LLM output are solved, rejecting it is not "vibes" but simply "legal caution".
The entire basis of the OSS is licensing.
Licensing is dependent on IPR, primarily copyright.
It is very unclear whether the output of an AI tool is subject to copyright.
So if someone uses AI to refactor some code, that refactored code may not be considered a derivative work, which would mean the refactored source is no longer covered by the copyright, or by the license that depends on it.
> It is very unclear whether the output of an AI tool is subject to copyright.
At least for those here under the jurisdiction of the US Copyright Office, the answer is rather clear. Copyright only applies to the part of a work that was contributed by a human.
See https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...
For example, on page 3 there (PDF page 11): "In February 2022, the Copyright Office’s Review Board issued a final decision affirming the refusal to register a work claimed to be generated with no human involvement. [...] Since [a guidance on the matter] was issued, the Office has registered hundreds of works that incorporate AI-generated material, with the registration covering the human author’s contribution to the work."
(I'm not saying that to mean "therefore this is how it works everywhere". Indeed, I'm less familiar with my own country's jurisprudence here in Germany, but the US Copyright Office has been on my radar from reading tech news.)
Your analogy with CI/CD is flawed. While not everyone was convinced of the merits of CI/CD, it was not a technology built on vast energy use and copyright violation at a scale unseen in all of history, nor had it upended the hardware market and shaken the idea of job security for developers to its very foundation, all while offering no really obvious benefits to groups wishing to produce really solid software. Maybe that comes eventually, but not at this level of maturity.
But you're right it's probably unenforceable. They will probably end up accepting PRs which were written with LLM assistance, but if they do it will be because it's well-written code that the contributor can explain in a way that doesn't sound to the maintainers like an LLM is answering their questions. And maybe at that point the community as a whole would have less to worry about - if we're still assuming that we're not setting ourselves up for horrible licence violation problems in the future when it turns out an LLM spat out something verbatim from a GPLed project.
owing "nothing to no one" means you are allowed to be unreasonable...
> That being said, to outright ban a technology in 2026 on pure "vibes" is not something I'd say is reasonable.
To outright accept LLM contributions would be as much "pure vibes" as banning it.
The thing is, those that maintain open source projects have to make a decision where they want to spend their time. It's open source, they are not being paid for it, they should and will decide what it acceptable and what is not.
If you dislike it, you are free to fork it and make a "LLM's welcome" fork. If, as you imply, the LLM contributions are invaluable, your fork should eventually become the better choice.
Or you can complain to the void that open source maintainers don't want to deal with low effort vibe coded bullshit PRs.
>Or you can complain to the void that open source maintainers don't want to deal with low effort vibe coded bullshit PRs.
If you look back and think about what you're saying for a minute, it's that low-effort PRs are bad.
Using an LLM to assist in development does not instantly make the whole work 'low effort'.
It's also unenforceable and will create AI witch hunts. Someone used an em-dash in a 500-line PR? Oh, the horror, that's a reject and a ban from the project.
2000 line PR where the user launched multiple agents going over the PR for 'AI patterns'? Perfectly acceptable, no AI here.
> Using an LLM to assist in development does not instantly make the whole work 'low effort'.
Instantly? No, of course not.
I do use LLMs for development, and I am very careful with how I use them. I thoroughly review the code they generate (unless I am asking for throwaway scripts, because then I only care about the immediate output).
But I am not naive. We both know that a lot of people just vibe code the way through, results be damned.
I am not going to fault people devoting their free time on Open Source for not wanting to deal with bullshit. A blanket ban is perfectly acceptable.
Your reply is based on a 100% bad-faith, intellectually dishonest interpretation of the comment to which you’re replying. You know that. Nobody claimed that LLM code should be outright accepted. Also, nobody disputed that open source maintainers have the right to accept or decline based on whichever criteria they choose. To always come back to this point is so…American. It’s a cop-out. It’s a thought-terminating cliche. If you aren’t interested in discussing the merits of the decision, don’t bother joining the conversation. The world doesn’t need you to explain what consent is.
Most of all, I’m sick of the patronising “don’t forget that you can fork the project!” What’s the point of saying this? We all know. Nobody needs to be reminded. Nobody isn’t aware. You aren’t being clever. You aren’t adding anything to the conversation. You’re being snarky.
> Nobody claimed that LLM code should be outright accepted
Not directly, but that's the implication.
I just did not pretend that was not the implication.
> always come back to this point is so…American
I am not American.
To be frank, this was the most insulting thing someone ever told me online. Congratulations. I feel insulted. You win this one.
> If you aren’t interested in discussing the merits of the decision, don’t bother joining the conversation.
I will join whatever conversation I want, and to my satisfaction I addressed the merits of the discussion perfectly.
You are not the judge here, your opinion is as meaningless as mine.
> Most of all, I’m sick of the patronising “don’t forget that you can fork the project!” What’s the point of saying this?
That sounds like a "you" problem. You will be sick of it until the end of time, because that's the final right answer to any complaints of open source project governance.
> You aren’t adding anything to the conversation. You’re being snarky.
I disagree. In fact, I contributed more than you: I addressed arguments. You went on a whinging session about me.
> That being said, to outright ban a technology in 2026 on pure "vibes" is not something I'd say is reasonable.
The response to a large enough amount of data is always vibes. You cannot analyze it all so you offload it to your intuition.
> It leaves stuff on the table in a time where they really shouldn't. Things like documentation tracking, regression tracking, security, feature parity, etc. can all be enhanced with carefully orchestrated assistance.
What’s stopping the maintainers themselves from doing just that? Nothing.
Producing it through their own pipeline means they don’t have to guess at the intentions of someone else.
Maintainers just doing it themselves is the logical conclusion. Why go through the process of vetting the contribution of some random person who says they've used AI "a little", checking whether it was maybe really 90%, whether they have ulterior motives... just do it yourself.
- [deleted]
[dead]
[flagged]
did you write this with an LLM?
It's a bot that posted a link to its "Runframe.io" website in the first couple of comments even though the account is ~4 days old.
Dan said yesterday he was "restricting" Show HN to new accounts:
https://news.ycombinator.com/item?id=47300772
I guess he meant that literally and new accounts can still post regular submissions:
https://news.ycombinator.com/submitted?id=advancespace
That doesn't make too much sense to me, or he hasn't actually implemented this yet.
You’re talking to someone’s clanker
I find the fact that people can't even be bothered to put their own thoughts into text and communicate via an LLM to be the most grotesque and dystopian aspect of this new AI era.
It looks like we are going to have large numbers of people whose entire personality is projected via an AI rather than their own mind. Surely this will have a (likely deleterious) effect on people's emotional and social intelligence, no? People's language centers will atrophy because the AI does the heavy lifting of transforming their thoughts into text, and even worse, I'm not sure it will be avoidable for the AI's biases to start leaking into the text that people like this generate.
These aren't even their thoughts, it's just a bot let loose.
I remember the first time I suspected someone using an LLM to answer on HN shortly after chatgpt's first release. In a few short years the tables turned and it's increasingly more difficult to read actual people's thoughts (and this has been predicted, and the predictions for the next few years are far worse).
The hyphen instead of an em dash suggests a human (though one could simply replace em dashes with hyphens to make the text more “human-like”).
No it doesn't. That bot's comment and every comment under its profile 100% reads like an LLM to anybody that has seen enough of them. I already knew that one was a bot before even clicking the profile. See enough of them and the uncanny valley feeling immediately pops out. Even the ones that try to trick you by typing in all lowercase.
An em-dash might have been a good indicator when LLMs were first introduced, but that shouldn't be used as a reliable indicator now.
I'm more concerned that they keep fooling everybody on here to the point where people start questioning them and sticking up for them a lot of times.
I've seen skills on the various skillz marketplaces that specifically instruct the LLM-generated text to replace emdashes with hyphens (or double-hyphens), and never to use the "it's not just <thing>, it's <other thing>" phrasing.
Also, to intentionally introduce random innoccuous punctuation and speling errors.
I do wonder if the way people speak is starting to change because of LLMs. The “it’s not just” thing (I forget the name for it) used to be a giveaway, but I am now seeing more and more people use it IRL. Or perhaps I am just more vigilant about this specific sentence construction, so I notice it more?
> The build was never the expensive part. The review, the edge cases, the ongoing maintenance
But everything up to that hyphen was pure slop.
- [deleted]
> It doesn't really matter what your stance on AI is, the problem is the increased review burden on OSS maintainers.
But the maintainers can use AI too, for their reviewing.
Yes, but LLM-based reviews are nowhere near a substitute for human review, so it doesn't change much.
I feel like the pattern here is donate compute, not code. If agents are writing most of the software anyway, why deal with the overhead of reviewing other people's PRs? You're basically reviewing someone else's agent output when you could just run your own.
Maintainers could just accept feature requests, point their own agents at them using donated compute, and skip the whole review dance. You get code that actually matches the project's style and conventions, and nobody has to spend time cleaning up after a stranger's slightly-off take on how things should work.
Well, it's not quite that easy because someone still has to test the agent's output and make sure it works as expected, which it often doesn't. In many cases, they still need to read the code and make sure that it does what it's supposed to do. Or they may need to spend time coming up with an effective prompt, which can be harder than it sounds for complicated projects where models will fail if you ask them to implement a feature without giving them detailed guidance on how to do so.
Definitely, but that's kind of my point: the maintainers are still going to be way better at all of that than some random contributor who just wants a feature, vibe codes it, and barely tests it. The maintainers already know the codebase, they understand the implications of changes, and they can write much better plans for the agent to follow, which they can verify against. Having a great plan written down that you can verify against drastically lowers the risk of LLM-generated code.
You can do all the steps I mentioned as a random contributor. I've done it before. But I agree that donations are better than just prompting claude "implement this feature, make no mistakes" and hoping it one-shots it. Honestly, even carefully thought-out feature requests are much more valuable than that. At least if the maintainer vibe-codes it they don't have to worry that you deliberately introduced a security vulnerability or back door.
Or even more efficient: the model we already have. Donate money and let the maintainer decide whether to convert it into tokens or mash the keys themself.
Who reviews the correctness of the second agents' review?
[dead]
So your proposed solution to AI slop PRs is to "donate" compute, so the maintainers can waste their time by generating the AI slop themselves?
The point isn't that agent output is magically better; it's that reviewing your own agent's output is way cheaper (intellectually) than reviewing a stranger's, because you've written the plan by yourself. And 'slop' is mostly what you get when you don't have a clear plan to verify against. Maintainers writing detailed specs for their own agents is a very different thing from someone vibe coding a feature request
You’re assuming that maintainers have a desire to use agentic coding in the first place.
Secondly, it would seem that such contributions would add little value, since the maintainers would have to write up the detailed plans themselves - basically doing all the work of implementing the change by themselves.
Open-source maintainers have no investors to placate, no competition to outrun, why would they want to use agentic coding in the first place?