The commits are revealing.
Look at this one:
> Ask Claude to remove the "backup" encryption key. Clearly it is still important to security-review Claude's code!
> prompt: I noticed you are storing a "backup" of the encryption key as `encryptionKeyJwk`. Doesn't this backup defeat the end-to-end encryption, because the key is available in the grant record without needing any token to unwrap it?
I don’t think a non-expert would even know what this means, let alone spot the issue and direct the model to fix it.
That is how LLMs should be used today: an expert prompts it and checks the code. It still saves a lot of time vs typing everything from scratch. Just the other day I was working on a prototype and let Claude write the code for an auth flow. Everything was good until the last step, where it was just sending the user id as a string alongside the valid token. So if you had a valid token, you could pass in any user id and become that user. Still saved me a lot of time vs doing it from scratch.
At least for me, I'm fairly sure that I'm better at not adding security flaws to my code (which I'm already not perfect at!) than I am at spotting them in code that I didn't write, unfortunately.
They're different mindsets. Some folks are better editors, inspectors, auditors, etc, whereas some are better builders, creators, and drafters.
So what you're saying makes sense. And I'm definitely on the other side of that fence.
When you form a mental model and then write code from it, that's a very lossy transformation. You can write comments and documentation to make it less lossy, but there will be information that is lost to a reviewer, who has to spend great effort to recreate it. If it is unknown how code is supposed to behave, then it becomes physically impossible to verify it for correctness.
This is less a matter of "mindset", but more a general problem of information.
Whether reviewer or creator, if the start conditions / problem is known, both start with the same info.
"code base must do X with Y conditions"
The reviewer is at no disadvantage, other than having to walk through the problem without writing the code themselves.
This is the ideal case, where the produced code is readable and well commented, so its intent is obvious.
The worst case is an intern or LLM having generated some code where the intent is not obvious and them not being able to explain the intent behind it. "How is that even related to the ticket"-style code.
> Still saves a lot of time vs typing everything from scratch.
In my experience, it takes longer to debug/instruct the LLM than to write it from scratch.
Depends on what you're doing. For example when you're writing something like React components and using something like Tailwind for styling, I find the speedup is close to 10X.
Scaffolding works fine, for things that are common, and you already have 100x examples on the web. Once you need something more specific, it falls apart and leads to hours of prompting and debugging for something that takes 30 minutes to write from scratch.
Some basic things it fails at:
* Upgrading the React code-base from Material-UI V4 → V5
* Implementing a simple header navigation dropdown in HTML/CSS that looks decent and is usable (it kept having bugs with hovering, wrong sizes, padding, responsiveness, duplicated code, etc.)
* Changing anything. About half of the time, it keeps saying "I made those changes", but no changes were made (it happens with all of them: Windsurf, Copilot, etc.).
This can’t be stressed enough: it depends on what you’re doing. Developers talking about whether LLMs are useful are just talking past each other unless they say “useful for React” or “useful for Rust.” I mostly write Drupal code, and the JetBrains LLM autocomplete saves me a few keystrokes, maybe. It’s not amazing. My theory is that there just isn’t much boilerplate Drupal code out there to train on: everything possible gets pushed out of code and into configuration + UI. If I were writing React components I’d be having an entirely different experience.
Isn't there some way to speed up with codegen besides using LLMs?
Some may have a better answer, but I often compare with tools like OpenAPI and AsyncAPI generators where HTTP/AMQP/etc code can be generated for servers, clients and extended documentation viewers.
The trade-off here is that you must create the spec file (and customize the template files where needed) that drives the codegen, in exchange for explicit control over deterministic output. So there's more typing, but potentially less cognitive overhead than reviewing a bunch of LLM output.
For this use case I find the explicit codegen UX preferable to inspecting what the LLM decided to do with my human-language prompt, if attempting to have the LLM directly code the library/executable source (as opposed to asking it to create the generator, template or API spec).
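To make that trade-off concrete, here's a rough sketch of the workflow, with the caveat that the spec path, generator choice, and generated identifiers (DefaultApi, widgetsGet) are illustrative assumptions rather than anything from a specific project:

```typescript
// Hypothetical example: generate a typed client from an OpenAPI spec, then
// consume it. Generated names depend entirely on the spec, so treat the
// identifiers below as placeholders.
//
//   npx @openapitools/openapi-generator-cli generate \
//     -i api-spec.yaml -g typescript-fetch -o src/generated
//
// The output is deterministic: reviewing the client largely reduces to
// reviewing the spec and the templates.
import { Configuration, DefaultApi } from "./generated";

const api = new DefaultApi(
  new Configuration({ basePath: "https://api.example.com" })
);

export async function listWidgets(): Promise<void> {
  // Each generated method maps 1:1 to an operation declared in the spec.
  const widgets = await api.widgetsGet();
  console.log(widgets);
}
```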
You can require less code by using a more expressive programming language.
Isn’t this because the LLMs had like a million+ react tutorials/articles/books/repos to train on?
I mean I try to use them for svelte or vue and it still recommends react snippets sometimes.
I have had no issues with LLMs trying to force a language on me. I tried the whole snake-game test with ChatGPT, but instead of using Python I asked it to use the Node.js bindings for raylib, which is rather unusual.
It did it in no time and no complaints.
Generally speaking, "LLMs" that I use are always the latest thinking versions of the flagship models (Grok 3/Gemini 2.5/...). GPT4o (and equivalent) are a mess.
But you're correct, when you use more exotic and/or quite new libraries, the outputs can be of mixed quality. For my current stack (Typescript, Node, Express, React 19, React Router 7, Drizzle and Tailwind 4) both Grok 3 (the paid one with 100k+ context) and Gemini 2.5 are pretty damn good. But I use them for prototyping, i.e. quickly putting together new stuff, for types, refactorings... I would never trust their output verbatim. (YET.) "Build an app that ..." would be a nightmare, but React-like UI code at sufficiently granular level is pretty much the best case scenario for LLMs as your components should be relatively isolated from the rest of the app and not too big anyways.
I put these in the Gemini Pro 2.5 system prompt and it's golden for Svelte.
I do this and it still spits out React snippets regardless, like 40% of the time... I feel like this is fine if you are doing something extremely basic, but once you introduce state or animations, all these systems death-spiral.
Yes, definitely. Act accordingly.
I use https://visualstudio.microsoft.com/services/intellicode/ for my IDE, which learns from your codebase, so it does end up saving me a ton of time after it's learned my patterns and starts suggesting entire classes hooked up to the correct properties in my EF models.
It lets me keep my own style preferences while still getting the benefit of AI code generation. It bridged the barrier I had with code coming from Claude/ChatGPT/etc., where the style preferences were based on the wider internet's standards. This is probably a preference on the level of tabs vs spaces, but ¯\_(ツ)_/¯
> An expert prompts it and checks the code. Still saves a lot of time vs typing everything from scratch.
It's a lie. The marketing one, to be specific, which makes it even worse.
huh?
I really don't agree with the idea that expert time would just be spent typing, and I'd be really surprised if that's the common sentiment around here.
An expert reasons, plans ahead, thinks and reasons a little bit more before even thinking about writing code.
If you are measuring productivity by lines of code per hour then you don't understand what being a dev is.
> I really don't agree with the idea that expert time would just be spent typing, and I'd be really surprised if that's the common sentiment around here.
They didn't suggest that at all, they merely suggested that the component of the expert's work that would otherwise be spent typing can be saved, while the rest of their utility comes from intense scrutiny, problem solving, decision making about what to build and why, and everything else that comes from experience and domain understanding.
It's not just time spent typing. Figuring out what needs to be typed can be both draining and time consuming. It's often (but not always) much easier to review someone else's solution to the problem than it is to solve it from scratch on your own.
Oddly enough security critical flows are likely to be one of the few exceptions because catching subtle reasoning errors that won't trip any unit tests when reviewing code that you didn't write is extremely difficult.
The problem is, building something IS the destination. At least the first 5-10 times. Building and fixing along the way is what builds lasting knowledge for most people.
Time spent typing is statistically 0% of overall time spent in developing/implementing/shipping a feature or product or whatever. There's literally no reason to try to optimize that irrelevant detail.
Yeah, and you still do that now. Let's say you spend 30% of your time coding and the rest planning. Well, now you've got even more time for planning.
> Still saves a lot of time vs typing everything from scratch
No it doesn't. Typing speed is never the bottleneck for an expert.
As an offline database of Google-tier knowledge, LLMs are useful. But current LLM tech is half-baked; we still need:
a) Cheap commodity hardware for running your own models locally. (And by "locally" I mean separate dedicated devices, not something that fights over your desktop's or laptop's resources.)
b) Standard bulletproof ways to fine-tune models on your own data. (Inference is already there mostly with things like llama.cpp, finetuning isn't.)
I've realized I procrastinate less when using an LLM to write code which I know I could write myself.
I've noticed this too.
I remember hearing somewhere that humans have a limited capacity in terms of number of decisions made in a day, and it seems to fit here: If I'm writing the code myself, I have to make several decisions on every line of code, and that's mentally tiring, so I tend to stop and procrastinate frequently.
If an LLM is handling a lot of the details, then I'm just making higher-level decisions, allowing me to make more progress.
Of course this is totally speculation and theories like this tend to be wrong, but it is at least consistent with how I feel.
I have a feeling that it's something that might help today but also something you might pay for later. When you have to maintain or bug fix that same code down the line the fact that you were the one making all those higher-level decisions and thinking through the details gives you an advantage. Just having everything structured and named in ways that make the most sense to you seems like it'd be helpful the next time you have to deal with the code.
While it's often a luxury, I'd much rather work on code I wrote than code somebody else wrote.
Maybe you type faster than me then :) I for sure type slower than Claude code. :)
> No it doesn't. Typing speed is never the bottleneck for an expert
How could that possibly be true!? Seems like it'd be the same as suggesting being constrained to analog writing utensils wouldn't bottleneck the process of publishing a book or research paper. At the very least such a statement implies that people with ADHD can't be experts.
Completely agree with you. I was working on the front-end of an application and I prompted Claude the following: "The endpoint /foo/bar is returning the json below ##json goes here##, show this as cards inside the component FooBaz following the existing design system".
In less than 5 minutes Claude created code that:
- encapsulated the API call
- modeled the API response using TypeScript
- created a reusable and responsive UI component for the card (including a load state)
- included it in the right part of the page
Even if I typed at 200wpm I couldn't produce that much code from such a simple prompt.
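To give a sense of what that looks like, here's a hedged reconstruction of the general shape of the output, not the actual generated code; the JSON shape (BarItem) is an assumption since the prompt elides it with "##json goes here##", and the class names stand in for the project's real design system:

```tsx
// Illustrative sketch only; names and styling are placeholders.
import { useEffect, useState } from "react";

interface BarItem {
  id: string;
  title: string;
  description: string;
}

// Encapsulated, typed API call.
async function fetchBarItems(): Promise<BarItem[]> {
  const res = await fetch("/foo/bar");
  if (!res.ok) throw new Error(`Request failed: ${res.status}`);
  return res.json();
}

// Reusable card list with a load state, rendered inside the existing page.
export function FooBazCards() {
  const [items, setItems] = useState<BarItem[] | null>(null);

  useEffect(() => {
    fetchBarItems().then(setItems).catch(() => setItems([]));
  }, []);

  if (items === null) {
    return <p>Loading…</p>;
  }

  return (
    <div className="card-grid">
      {items.map((item) => (
        <div key={item.id} className="card">
          <h3>{item.title}</h3>
          <p>{item.description}</p>
        </div>
      ))}
    </div>
  );
}
```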
I also had similar experiences/gains refactoring back-end code.
This being said, there are cases in which writing the code yourself is faster than writing a detailed enough prompt, BUT those cases are becoming the exception with each new LLM iteration. I noticed that after the jump from Claude 3.7 to Claude 4, my prompts can be way less technical.
The thing is... does your code end there? Would you put that code in production without a deep analysis of what Claude did?
I'm not who you replied to, but I keep functions small and testable, paired with unit tests covering a healthy mix of happy/sad paths.
Afterwards I make sure the LLM passes all the tests before I spend my time to review the code.
I find this process keeps the iterations count low for review -> prompt -> review.
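As a rough illustration of that workflow (the function, the tests, and the choice of the built-in node:test runner are all my own assumptions, not the parent's actual setup):

```typescript
// Hedged sketch: a small, testable function plus happy/sad path tests that
// the LLM's output must pass before it gets a human review.
import { test } from "node:test";
import assert from "node:assert/strict";

export function parsePositiveInt(input: string): number {
  const n = Number(input);
  if (!Number.isInteger(n) || n <= 0) {
    throw new Error(`expected a positive integer, got "${input}"`);
  }
  return n;
}

test("happy path: parses a positive integer", () => {
  assert.equal(parsePositiveInt("42"), 42);
});

test("sad path: rejects non-numeric input", () => {
  assert.throws(() => parsePositiveInt("forty-two"));
});

test("sad path: rejects zero and negatives", () => {
  assert.throws(() => parsePositiveInt("0"));
  assert.throws(() => parsePositiveInt("-3"));
});
```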
I personally love writing code with an LLM. I’m a sloppy typist but love programming. I find it’s a great burnout prevention.
For context: node.js development/React (a very LLM friendly stack.)
(GP) I wouldn't, but it would get me close enough that I can do the work that's more intellectually stimulating. Sometimes you need the people to do the concrete for a driveway, and sometimes you need to be signing off on the way the concrete was done, perhaps making some tweaks during the early stages.
It seems fair to say that it is ~never the overall bottleneck? Maybe once you figure out what you want, typing speed briefly becomes the bottleneck, but does any expert finish a day thinking "If only I could type twice as fast, I'd have gotten twice as much work done?" That said, I don't think "faster typing" is the only benefit that AI assistance provides.
> How could that possibly be true!?
(I'll assume you're not joking, because your post is ridiculous enough to look like sarcasm.)
The answer is because programmers read code 10 times more (and think about code 100 times more) than they write it.
Yeah, but how fast can you write compared to how fast you think?
How many times have you read a story card and by the time you finished reading it you thought "It's an easy task, should take me 1 hour of work to write the code and tests"?
In my experience, in most of those cases the AI can do the same amount of code writing in under 10 minutes, leaving me the other 50 minutes to review the code, make/ask for any necessary adjustments, and move on to another task.
I don't know anyone who can think faster than they can type (on average); they would have to have an IQ over 150 or something. For mere mortals like myself, reasoning through edge cases and failure conditions and error handling and state invariants takes time. Time that I spend looking at a blinking cursor while the gears spin, or reading code. I've never finished a day where I thought to myself "gosh darn, if only I could type faster this would be done already".
You could be fast if you were coding only the happy path, like a lot of juniors do. Instead of thinking about trivial things like malformed input, library semantics, framework gotchas and what not.
I wasn't joking; it's a bottleneck sometimes, that's it. It's a bottleneck in the same way comfort or any good tool can be, the way a slow computer is a bottleneck. It's silly to suggest that your ability to rapidly use a fundamental tool is never a bottleneck, no matter what other bits need to come into play during the course of your day.
My ability to review and understand the intent behind code isn't the primary bottleneck to me efficiently reviewing code when it's requested of me; the primary bottleneck is being notified at the right time that I have a review request waiting.
If compilers were never a bottleneck, why would we ever try to make them faster? If build tools were never a bottleneck, why would we ever optimize those? These are all just some of the things that can stand between the identification of a problem and producing a solution for it.
Sure! But over half the fun of coding is writing and learning.
> ... Still saves a lot of time vs typing everything from scratch ...
How? The prompts still have to be typed, right? And then the output examined in earnest.
A prompt can be as little as a sentence to write hundreds of lines of code.
Hundreds of lines that you have to carefully read and understand.
Are you not doing that already?
I go line-by-line through the code that I wrote (in my git client) before I stage+commit it.
Depends on what it is doing. An HTML template without JS? Enough to just check that it looks right and works.
You also have to do that with code you write without LLM assistance.
Latest project I've been working on: prompts are a few sentences (and technically I dictate them instead of typing) and the LLM generates a few hundred lines of code.
Not if you don't want to. Speech-to-text is pretty good these days, and even e.g. aider has a /voice command thanks to OpenAI's Whisper.
> Still saves a lot of time vs typing everything from scratch
Probably very language specific. I use a lot of Ruby, typing things takes no time it's so terse. Instead I get to spend 95% of my time pondering my problems (or prompting the LLM)...
With a proper IDE you don't type much even in Java/.NET; it's all autocomplete anyway. "Too verbose" complaints are mostly from Notepad lovers, and from those who never needed to read somebody else's code.
It can create a whole dashboard view in Elixir in a few seconds that is 100 lines long. No way I can type that in the same time.
If you're making a dashboard view your productivity is zero, making it faster just multiplies zero by a bigger number.
Edit: this comment was more a result of me being in a terrible mood than a true claim. Sorry.
In my experience the problem is never creating the dashboard view (there's a million examples of it out there anyway to copy/paste), but making sense of the data. Especially if you're doing anything even remotely novel.
I tend to disagree, but I don't know what my disagreement means for the future of being able to use AI when writing software. This workers-oauth-provider project is 1200 lines of code. An expert should be able to write that on the scale of an hour.
The main value I've gotten out of AI writing software comes from the two extremes, not from the middle ground you present. Vibe coding can be great and seriously productive, but if I have to check it or manually maintain it in nearly any capacity more complicated than changing one string, productivity plummets. Conversely, delegating highly complex, isolated function writing to an AI can also be super productive, because it can (sometimes) showcase intelligence beyond mine and arrive at solutions which would take me 10x longer; but definitionally I am not the right person to check its code output, outside of maybe writing some unit tests for it (a third thing AI tends to be quite good at).
> This workers-oauth-provider project is 1200 lines of code. An expert should be able to write that on the scale of an hour.
Are you being serious here?
Let's do the math.
1200 lines in an hour would be one line every three seconds, with no breaks.
And your figure of 1200 lines is apparently omitting whitespace and comments. The actual code is 2626 lines. Let's say we ignore blank lines, then it's 2251 lines. So one line per ~1.6 seconds.
The best typists type around 2 words per second, so unless the average line of code has only about 3 words on it, a human literally couldn't type that fast -- even if they knew exactly what to type.
Of course, people writing code don't just type non-stop. We spend most of our time thinking. Also time testing and debugging. (The test is 2195 lines BTW, not included in above figures.) Literal typing of code is a tiny fraction of a developer's time.
I'd say your estimate is wrong by at least one, but realistically more likely two orders of magnitude.
"On the scale of an hour" means "within an order of magnitude of one hour", or either "10 minutes to 10 hours" or "0.1 hours to 10 hours" depending on your interpretation, either is fine.
> An expert should be able to write that on the scale of an hour.
An expert in oauth, perhaps. Not your typical expert dev who doesn't specialize in auth but rather in whatever he's using the auth for. Navigating those sorts of standards is extremely time consuming.
Maybe, but also: Cloudflare is one of like fifteen organizations on the planet writing code like this. The vast majority of The Rest Of Us will just consume code like this, which companies like Cloudflare, Auth0, etc write. That tends to be the nature of highly-specialized highly-domain-specific code. Cloudflare employs those mythical Oauth experts you talk about.
That's me. I'm the expert.
On my very most productive days of my entire career I've managed to produce ~1000 lines of code. This library is ~5000 (including comments, tests, and documentation, which you omitted for some reason). I managed to prompt it out of the AI over the course of about five days. But they were five days when I also had a lot of other things going on -- meetings, chats, code reviews, etc. Not my most productive.
So I estimate it would have taken me 2x-5x longer to write this library by hand.
Revealing against what?
If you look at the README it is completely revealed... so I would argue there is nothing to "reveal" in the first place.
> I started this project on a lark, fully expecting the AI to produce terrible code for me to laugh at. And then, uh... the code actually looked pretty good. Not perfect, but I just told the AI to fix things, and it did. I was shocked.
> To emphasize, this is not "vibe coded". Every line was thoroughly reviewed and cross-referenced with relevant RFCs, by security experts with previous experience with those RFCs.
I think OP meant "revealing" as in "enlightening", not as "uncovering something that was hidden intentionally".
> Revealing against what?
Revealing of what it is like working with an LLM in this way.
Revealing the types of critical mistakes LLMs make. In particular someone that didn’t already understand OAuth likely would not have caught this and ended up with a vulnerable system.
If the guy knew how to properly implement OAuth, did he actually save any time by prompting, or did he just prove the point that if you already know all the details of the implementation, you can guide an LLM to do it?
That's the biggest issue I see. In most cases I don't use an LLM, because DIYing it takes less time than prompting/waiting/checking every line.
> did he save any time though
Yes:
> It took me a few days to build the library with AI.
> I estimate it would have taken a few weeks, maybe months to write by hand.
– https://news.ycombinator.com/item?id=44160208
> or just tried to prove a point that if you actually already know all details of impl you can guide llm to do it?
No:
> I was an AI skeptic. I thought LLMs were glorified Markov chain generators that didn't actually understand code and couldn't produce anything novel. I started this project on a lark, fully expecting the AI to produce terrible code for me to laugh at. And then, uh... the code actually looked pretty good. Not perfect, but I just told the AI to fix things, and it did. I was shocked.
— https://github.com/cloudflare/workers-oauth-provider/?tab=re...
> I thought LLMs were glorified Markov chain generators that didn't actually understand code and couldn't produce anything novel.
How novel is a OAuth provider library for cloudflare workers? I wouldn't be surprised if it'd been trained on multiple examples.
I'm not aware of any other OAuth provider libraries for Workers. Plenty of clients, but not providers -- implementing the provider side is not that common, historically. See my other comment:
- [deleted]
Do people save time by learning to write code at 420WPM? By optimising their vi(m) layouts and using languages with lots of fancy operators that make things quicker to write?
Using an LLM to write code you already know how to write is just like using intellisense or any other smart autocomplete, but at a larger scale.
[flagged]
While I think this is a cool (public) experiment by Claude, asking an LLM to write security-sensitive code seems crazy at this point. Ad absurdum: Can you imagine asking Claude to implement new functionality in OpenSSL libs!?
Which is exactly why AI coding assistants work with your expertise rather than replace it. Most people I see fail at AI assisted development are either non-technical people expecting the AI will solve it all, or technical people playing gotcha with the machine rather than collaborating with it.
There is also one quite early in the repo where the dev has to tell Claude to store only the hashes of secrets.
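For readers who haven't seen that pattern, here's a minimal sketch of the idea, assuming Node's crypto module; it is not the actual workers-oauth-provider code:

```typescript
// Hedged illustration of "store only the hash of a secret": if the datastore
// leaks, an attacker gets hashes, not usable tokens.
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// Issue a token: hand the plaintext to the client once, persist only the hash.
export function issueToken(): { plaintext: string; storedHash: string } {
  const plaintext = randomBytes(32).toString("base64url");
  const storedHash = createHash("sha256").update(plaintext).digest("hex");
  return { plaintext, storedHash };
}

// Verify a presented token by hashing it and comparing to the stored hash.
export function verifyToken(presented: string, storedHash: string): boolean {
  const presentedHash = createHash("sha256").update(presented).digest("hex");
  return timingSafeEqual(Buffer.from(presentedHash), Buffer.from(storedHash));
}
```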
Yeah I was disappointed in that one.
I hate to say it, but I have reviewed a lot of human code in my time, and I've definitely caught many humans making similar-magnitude mistakes. :/
I just wanted to say thanks so much publishing this, and especially your comments here - I found them really helpful and insightful. I think it's interesting (though not unexpected) that many of the other commenters' comments here show what a Rorschach test this is. I think that's kind of unfortunate, because your experience clearly showed some of the benefits and limitations/pitfalls of coding like this in an objective manner.
I am curious, did you find the work of reviewing Claude's output more mentally tiring/draining than writing it yourself? Like some other folks mentioned, I generally find reviewing code more mentally tiring than writing it, but I get a lot of personal satisfaction by mentoring junior developers and collaborating with my (human) colleagues (most of them anyway...) Since I don't get that feeling when reviewing AI code, I find it more draining. I'm curious how you felt reviewing this code.
I find reviewing AI code less mentally tiring than reviewing human code.
This was a surprise to me! Until I tried it, I dreaded the idea.
I think it is because of the shorter feedback loop. I look at what the AI writes as it is writing it, and can ask for changes which it applies immediately. Reviewing human code typically has hours or days of round-trip time.
Also with the AI code I can just take over if it's not doing the right thing. Humans don't like it when I start pushing commits directly to their PR.
There's also the fact that the AI I'm prompting is, obviously, working on my priorities, whereas humans are often working on other priorities, but I can't just decline to review someone's code because it's not what I'm personally interested in at that moment.
When things go well, reviewing the AI's work is less draining than writing it myself, because it's basically doing the busy work while I'm still in control of high-level direction and architecture. I like that. But things don't always go well. Sometimes the AI goes in totally the wrong direction, and I have to prompt it too many times to do what I want, in which case it's not saving me time. But again, I can always just cancel the session and start doing it myself... humans don't like it when I tell them to drop a PR and let me do it.
Personally, I don't generally get excited about mentoring and collaborating. I wish I did, and I recognize it's an important part of my job which I have to do either way, but I just don't. I get excited primarily about ideas and architecture and not so much about people.
Thank you so much for your detailed, honest, and insightful response! I've done a bunch of AI-assisted coding to varying degrees of success, but your comment here helped me think about it in new ways so that I can take the most advantage of it.
Again, I think your posting of this is probably the best actual, real world evidence that shows both the pros and cons of AI-assisted coding, dispassionately. Awesome work!
Most interesting aspect of this is it likely learned this pattern from human-written code!
But AIbros will be running around telling everyone that Claude invented OAuth for Cloudflare all on its own and then opensourced it.
- [deleted]
this seems like a true but pointless observation? if you're producing security-sensitive code then experts need to be involved, whether that's me unwisely getting a junior to do something, or receiving a PR from my cat, or using an LLM.
removing expert humans from the loop is the deeply stupid thing the Tech Elite Who Want To Crush Their Own Workforces / former-NFT fanboys keep pushing. just letting an LLM generate code for a human to review, then send out for more review, is really pretty boring and already very effective for simple to medium-hard things.
> …removing expert humans from the loop is the deeply stupid thing the Tech Elite Who Want To Crush Their Own Workforce…
this is completely expected behavior by them. departments with well paid experts will be one of the first they’ll want to cut. in every field. experts cost money.
we’re a long, long, long way off from a bot that can go into random houses and fix under the sink plumbing, or diagnose and then fix an electrical socket. however, those who do most of their work on a computer, they’re pretty close to a point where they can cut these departments.
in every industry in every field, those will be jobs cut first. move fast and break things.
I think it's a critically important observation.
I thought this experience was so helpful as it gave an objective, evidence-based sample on both the pros and cons of AI-assisted coding, where so many of the loudest voices on this topic are so one-sided ("AI is useless" or "developers will be obsolete in a year"). You say "removing expert humans from the loop is the deeply stupid thing the Tech Elite Who Want To Crush Their Own Workforces / former-NFT fanboys keep pushing", but the fact is many people with the power to push AI onto their workers are going to be more receptive to actual data and evidence than developers just complaining that AI is stupid.
It's a Jr Developer whose code you have to check over completely. To some people that is useful. But you're still going to have to train Jr Developers so they can turn into Sr Developers.
I don't like the jr dev analogy. It neither has the same weaknesses nor the same strengths.
It's more like the genius coworker who has an overassertive ego and sometimes shows up drunk, but if you know how to work with them and see past their flaws, they can be a real asset.
I like your analogy too, but it also explains why I find working with AI-assisted coding so mentally tiresome.
It's like with some auto-driving systems: I describe it as having a slightly inebriated teenager at the wheel. I can't just relax and read a book, because then I'd die. So I have to be more mentally alert than if I were just driving myself, because everything could be going smoothly and relaxed, but at any moment the driving system could decide to drive into a tree.
I don't really agree; a junior developer, if they're curious enough, wouldn't just write insecure code. They would do self-study and find out best practices before writing code, including not storing plaintext passwords and the like.
You have clearly only ever worked with the creme de la creme of junior developers.