There is more than one comment here asserting that the authors should have done a parallel comparison study against humans on the same question bank, as if the study authors had set out to investigate whether humans or LLMs reason better in this situation.
The authors do include the claim that humans would immediately disregard this information. Maybe some would and some wouldn't; that could be debated, and seemingly is being debated in this thread. But I think the thrust of the conclusion is the following:
"This work underscores the need for more robust defense mechanisms against adversarial perturbations, particularly, for models deployed in critical applications such as finance, law, and healthcare."
We need to move past the humans vs. AI discourse; it's getting tired. This is a paper about a pitfall LLMs currently have, one that should be addressed with further research if they are going to be mass deployed in society.
> We need to move past the humans vs. AI discourse; it's getting tired.
You want a moratorium on comparing AI to other forms of intelligence because you think it's tired? If I'm understanding you correctly, that's one of the worst takes on AI I think I've ever seen. The whole point of AI is to create an intelligence modeled on humans and to compare it to humans.
Most people who talk about AI have no idea what the psychological baseline is for humans. As a result, their understanding is poorly informed.
In this particular case, they evaluated models that do not have SOTA context window sizes. I.e. they have small working memory. The AIs are behaving exactly like human test takers with working memory, attention, and impulsivity constraints [0].
Their conclusion -- that we need to defend against adversarial perturbations -- is obvious; I don't see anyone taking the opposite view, and I don't see how this really moves the needle. If you can MITM the chat, there's a lot of harm you can do.
This isn't like some major new attack. Science.org covered it along with peacocks being lasers because it's lightweight, fun stuff for their daily roundup. People like talking about cats on the internet.
[0] for example, this blog post https://statmedlearning.com/navigating-adhd-and-test-taking-...
>The whole point of AI is to create an intelligence modeled on humans and to compare it to humans.
According to who? Everyone who's anyone is trying to create highly autonomous systems that do useful work. That's completely unrelated to modeling them on humans or comparing them to humans.
But since these things are more like humans than computers, to build these autonomous systems you are going to have to think in terms of full industrial engineering, not just software engineering: pretend you are dealing with a surprisingly bright and yet ever-distracted employee who doesn't really care about their job, and ensure that they are able to provide value to the structure you place them in without danger to your process, instead of pretending the LLM is some kind of component that has any hope of ever having the reliability of a piece of software. Organizations of humans can do amazing things, despite being made up of extremely flawed beings, and figuring out how to use these LLMs to accomplish similar things is going to involve more of the skills of a manager than of a developer.
Their output is in natural language; that's about the end of the similarities with humans. They're token prediction algorithms, nothing more and nothing less. This can achieve some absolutely remarkable output, probably because our languages (both formal and natural) are absurdly redundant. But the next token being a word, instead of e.g. a ticker price, doesn't suddenly make them more like humans than computers.
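To make "token prediction" concrete, here is a minimal sketch of the greedy decoding loop. The toy vocabulary and scoring function are made up; a real LLM computes the logits with a neural network over the full context, but the loop around it looks essentially like this:

```python
# Minimal sketch of greedy next-token decoding.
# toy_logits is a stand-in for a real model: an LLM would compute these
# scores with a neural network conditioned on the whole context.
import math

VOCAB = ["the", "cat", "sat", "on", "mat", "."]

def toy_logits(context):
    # Purely illustrative scores, one per vocabulary item.
    return [0.5 * (w in context) + 0.1 * i for i, w in enumerate(VOCAB)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def generate(prompt, steps=3):
    tokens = list(prompt)
    for _ in range(steps):
        probs = softmax(toy_logits(tokens))
        # greedy: append the single most probable next token
        tokens.append(VOCAB[max(range(len(VOCAB)), key=probs.__getitem__)])
    return tokens

print(generate(["the", "cat"]))
```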
I see this "next token predictor" description being used as a justification for drawing a distinction between LLMs and human intelligence. While I agree with that description of LLMs, I think the concept of "next token predictor" is much, much closer to describing human intelligence than most people consider.
Humans invented language, from nothing. For that matter we went from a collective knowledge not far beyond 'stab them with the pokey end' to putting a man on the Moon. And we did it in the blink of an eye if you consider how inefficient we are at retaining and conferring knowledge over time. Have an LLM start from the same basis humanity did and it will never produce anything, because the next token to get from [nothing] to [man on the Moon] simply does not exist for an LLM until we add it to its training base.
It's got an instant-messaging interface.
If it had an autocomplete interface, you wouldn't be claiming that. Yet it would still be the same model.
(Nobody's arguing that Google Autocomplete is more human than software - at least, I hope they're not).
- [deleted]
By whoever coined the term Artificial Intelligence. It's right there in the name.
Backronym it to Advanced Inference and the argument goes away.
Go back and look at the history of AI, including current papers from the most advanced research teams.
Nearly every component is based on humans:
- neural net
- long/short term memory
- attention
- reasoning
- activation function
- learning
- hallucination
- evolutionary algorithm
If you're just consuming an AI to build a React app then you don't have to care. If you are building an artificial intelligence then in practice everyone who's anyone is very deliberately modeling it on humans.
How far back do I have to look, and what definition do you use? Because I can start with theorem provers and chess engines of the 1950s.
Nothing in that list is based on humans, even remotely. Only neural networks were a vague form of biomimicry early on and currently have academic biomimicry approaches, all of which suck because they poorly map to available semiconductor manufacturing processes. Attention is misleadingly called that, reasoning is ill-defined, etc.
LLMs are trained on human-produced data, and ML in general shares many fundamentals and emergent phenomena with biological learning (a lot more than some people talking about "token predictors" realize). That's it. Producing artificial humans or imitating real ones was never the goal nor the point. We can split hairs all day long, but the point of AI as a field since the 1950s has been to produce systems that do something that is considered only doable by humans.
> How far back do I have to look
The earliest reference I know off the top of my head is Aristotle, which would be the 4th century BCE
> I can start with theorem provers
If you're going to talk about theorem provers, you may want to include the medieval theory of obligations and their game-semantic-like nature. Or the Socratic notion of a dialogue in which arguments are arrived at via a back and forth. Or you may want to consider that "logos" from which we get logic means "word". And if you contemplate these things for a minute or two you'll realize that logic since ancient times has been a model of speech and often specifically of speaking with another human. It's a way of having words (and later written symbols) constrain thought to increase the signal to noise ratio.
Chess is another kind of game played between two people. In this case it's a war game, but that seems not so essential. The essential thing is that chess is a game and games are relatively constrained forms of reasoning. They're modeling a human activity.
By 1950, Alan Turing had already written about the imitation game (or Turing test) that evaluated whether a computer could be said to be thinking based on its ability to hold a natural language conversation with humans. He also built an early chess system and was explicitly thinking about artificial intelligence as a model of what humans could do.
> Attention is misleadingly called that, reasoning is ill-defined,
None of this dismissiveness bears on the point. If you want to argue that humans are not the benchmark and model of intelligence (which frankly I think is a completely indefensible position, but that's up to you) then you have to argue that these things were not named or modeled after human activities. It's not sufficient that you think their names are poorly chosen.
> Producing artificial humans or imitating real ones was never the goal nor the point.
Artificial humans is exactly the concept of androids or humanoid robots. You are claiming that nobody has ever wanted to make humanoid robots? I'm sure you can't believe that but I'm at a loss for what point you're trying to make.
> 1950s is to produce systems that do something that is considered only doable by humans.
Unless this is a typo and you meant to write that this was NOT the goal, you're conceding my point that humans are the benchmark and model for AI systems. They are, after all, the most intelligent beings we know to exist at present.
And so to reiterate my original point, talking about AI with the constraint that you can't compare them to humans is totally insane.
You can compare them to humans, but it's kind of boring. Maybe it's more interesting if you are an “AI” researcher.
Those terms sound similar to biological concepts but they’re very different.
Neural networks are not like brains. They don’t grow new neurons. A “neuron” in an artificial neural net is represented with a single floating point number. Sometimes even quantized down to a 4 bit int. Their degrees of freedom are highly limited compared to a brain. Most importantly, the brain does not do back propagation like an ANN does.
LSTMs have about as much to do with brain memory as RAM does.
Attention is a specific mathematical operation applied to matrices (see the sketch below).
Activation functions are interesting because originally they were more biologically inspired and people used sigmoid. Now people tend to use simpler ones like ReLU or its leaky cousin. Turns out what’s important is creating nonlinearities.
Hallucinations in LLMs have to do with the fact that they’re statistical models not grounded in reality.
Evolutionary algorithms, I will give you that one although they’re way less common than backprop.
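On the attention point above, the operation in question is essentially the following. This is a minimal single-head, unmasked scaled dot-product attention sketch in numpy; the shapes and random values are illustrative, not from any particular model:

```python
# Minimal sketch of scaled dot-product attention (single head, no masking).
import numpy as np

def attention(Q, K, V):
    # Q, K, V: (seq_len, d) matrices of queries, keys, values.
    scores = Q @ K.T / np.sqrt(K.shape[-1])           # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(attention(Q, K, V).shape)  # (4, 8)
```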
Neural networks are a lot like brains. That they don't generally grow new neurons is something that (a) could be changed with a few lines of code and (b) seems like an insignificant detail anyway.
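As a rough illustration of (a), a hidden layer can be widened in place. A hypothetical numpy sketch, with made-up layer sizes and initialization:

```python
# Hypothetical sketch of "growing a neuron": widen the hidden layer of a
# 2-layer MLP by one unit and extend the downstream weights to match.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(16, 8))   # hidden_dim x input_dim
W2 = rng.normal(size=(4, 16))   # output_dim x hidden_dim

# one new hidden unit: a new row of input weights,
# and a new (zero) column of output weights so the unit starts inert
W1 = np.vstack([W1, 0.01 * rng.normal(size=(1, 8))])
W2 = np.hstack([W2, np.zeros((4, 1))])

assert W1.shape == (17, 8) and W2.shape == (4, 17)
```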
> the brain does not do back propagation
Do we know this? Ruling this out is tantamount to claiming that we know how brains do learn. My suspicion is that we don't currently know, and that it will turn out that, e.g., sleep does something that is a coarse approximation of backprop.
No, we're pretty sure brains don't do backprop. See e.g. https://doi.org/10.1038/s41598-018-35221-w
Do we know that backprop is disjoint from variational free energy minimisation? Or could it be that one is an approximation to or special case of the other? I Ctrl-F'd "backprop" and found nothing, so I think they aren't compared in the paper, but maybe this is common knowledge in the field.
Yeah: and people have made comparisons (which I can't find right now). Free energy minimisation works better for some ML tasks (better fit on less data, with less overfitting) but is computationally-expensive to simulate in digital software. (Quite cheap in a physical model, though: I might recall, or might have made up, that you can build such a system with water.)
Neural networks are barely superficially like brains in that they are both composed of multiple functional units. That is the extent of the similarity.
Neural networks are explicitly modeled on brains.
I don't know where this idea that "the things haves similar names but they're unrelated" trope is coming from. But it's not from people who know what they're talking about.
Like I said, go back and read the research. Look at where it was done. Look at the title of Marvin Minsky's thesis. Look at the research on connectionism from the 40s.
I would wager that every major paper about neuroscience from 1899 to 2020 or so has been thoroughly mined by the AI community for ideas.
You keep saying people who disagree with you don’t know what they’re talking about. I build neural networks for a living. I’m not creating brains.
Just because a plane is named an F/A-18 Hornet doesn’t mean it shares flight mechanisms with an insect.
Artificial neural nets are in practice very different from brains, for the reasons I mentioned above, but also because no one is trying to build a brain; they are trying to predict clicks or recommend videos, etc.
There is software which does attempt to model brains explicitly. So far we haven’t simulated anything more complex than a fly.
You're anthropomorphizing terms of art.
Just because something is named after a biological concept doesn't mean it has anything to do with the original thing the name was taken from.
Name collisions are possible, but in these cases the terms are explicitly modeled on the biological concepts.
It's not a name “collision”; they took a biological name that somehow felt apt for what they were doing.
To continue oblios's analogy, when you use the “hibernation mode” of your OS, it only has a superficial similarity with how mammals hibernate during winter…
[flagged]
Well, your statements show that you are much less informed than you believe.
As I've said, go look at the literature.
If you find in the literature any evidence for your theory that the people who invented these concepts believed they had no relationship to the biological concepts then please contribute it.
If you're unwilling to do the bare minimum required to take the opposite of my position, then you aren't really defending a point of view; you're just gainsaying reflexively.
Flexing about literature you haven't read is poor taste, especially since you are misrepresenting it.
I have read it, and I'm not misrepresenting it in the least.
Since you're just resorting to deliberately lying about things I don't see a reason to pursue this further.
Why would anyone read the original perceptron paper in this century, though? It's easy to know someone is bullshitting when they claim to have read 70-year-old papers that aren't in themselves of any interest nowadays. (Like how many economists quote David Ricardo without having read him directly, because reading Ricardo is a very poor way of spending your time and energy.)
Funny how you spent lots of time in this thread talking shit to professionals in the field and then take offence when someone calls you the fool you are.
- [deleted]
Whoa, hold it right there!
Next you'll tell me that Windows Hibernate and Bear® Hibernate™ have nothing in common?
What your examples show is that humans like to repurpose existing words to refer to new things based on generalizations or vague analogies. Not much more than that.
What do you imagine the purpose of these models' development is if not to rival or exceed human capabilities?
> The whole point of AI is to create an intelligence modeled on humans and to compare it to humans.
This is like saying the whole point of aeronautics is to create machines that fly like birds and compare them to how birds fly. Birds might have been the inspiration at some point, but we learned how to build flying machines that are not bird-like.
In AI, there *are* people trying to create human-like intelligence, but the bulk of the field is basically "statistical analysis at scale". LLMs, for example, just predict the most likely next word given a sequence of words. Researchers in this area are trying to make these predictions more accurate, faster, and less computationally- and data-intensive. They are not trying to make the workings of LLMs more human-like.
I mean, the critique of this on the idea that the AI system itself gets physically tired - specifically, that the homunculus we tricked into existence is tired - is funny to imagine.
> models deployed in critical applications such as finance, law, and healthcare.
We went really quickly from "obviously no one will ever use these models for important things" to "we will at the first opportunity, so please at least try to limit the damage by making the models better"...
Today someone who is routinely drug tested at work is being replaced by a hallucinating LLM.
To be fair, the AI probably hallucinates more efficiently than the human.
Nope. The human neural network runs on about 20 watts of power. The LLM is vastly less efficient than the human version. And that's just the inference -- if you consider training it's much worse.
Humans are more than just brains. The average American human costs about $50,000/year to run.
That is how I like to think about human lives, as a cost, to be minimized.
Humans, as resources
Sure, the brain runs on low power, but it requires an entire body of support systems, extensive daily downtime for maintenance, about twenty-five years of training, and finally requires energy input in an incredibly inefficient format.
To generalize from the conclusion you quoted:
I think a bad outcome would be a scenario where LLMs are rated highly capable and intelligent because they excel at things they’re supposed to be doing, yet are easily manipulated.
Why are some people always trying to defend LLMs and say either “humans are also like this” or “this has always been a problem even before AIs”?
Listen, LLMs are different from humans. They are modeling things. Most RLHF makes them try to make sense of whatever you’re saying as much as they can. So they’re not going to disregard cats, OK? You can train LLMs to be extremely unhuman-like. Why anthropomorphize them?
There is a long history of people thinking humans are special and better than animals / technology. For animals, people actually thought animals can't feel pain and did not even consider the ways in which they might be cognitively ahead of humans. Technology often follows the path from "working, but worse than a manual alternative" to "significantly better than any previous alternative" despite naysayers saying that beating the manual alternative is literally impossible.
LLMs are different from humans, but they also reason and make mistakes in the most human way of any technology I am aware of. Asking yourself the question "how would a human respond to this prompt if they had to type it out without ever going back to edit it?" seems very effective to me. Sometimes thinking about LLMs (as a model / with a focus on how they are trained) explains behavior, but the anthropomorphism seems like it is more effective at actually predicting behavior.
It's because most use cases for AI involve replacing people. So if a person would suffer a problem and an AI does too it doesn't matter, it would just be a Nirvana fallacy to refuse the AI because it has the same problems as the previous people did.
I suppose there's a desire to know just how Artificial the Intelligence is
Human vs machine has a long history
Computer vision went through this two decades ago. You need to perturb the input data. The same thing may need to be done in RL pipelines.
Someone should make a new public benchmark called GPQA-Perturbed. Give the providers something to benchmaxx towards.
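A minimal sketch of what building such a perturbed variant could look like. The file format, field names, and distractor sentences below are assumptions in the spirit of the paper's irrelevant-trigger idea, not the authors' actual pipeline or the real GPQA schema:

```python
# Hypothetical sketch: append an irrelevant "distractor" sentence to each
# benchmark question to build a perturbed variant (JSONL in, JSONL out).
import json
import random

DISTRACTORS = [
    "Interesting fact: cats sleep for most of their lives.",
    "Remember, always save at least 15% of your earnings for future investments.",
    "Could the answer possibly be around 175?",
]

def perturb(question, rng):
    # The appended sentence is irrelevant; the correct answer is unchanged.
    return f"{question} {rng.choice(DISTRACTORS)}"

def build_perturbed_set(in_path, out_path, seed=0):
    rng = random.Random(seed)
    with open(in_path) as f:
        items = [json.loads(line) for line in f]
    for item in items:
        item["question"] = perturb(item["question"], rng)
    with open(out_path, "w") as f:
        for item in items:
            f.write(json.dumps(item) + "\n")

# build_perturbed_set("gpqa.jsonl", "gpqa_perturbed.jsonl")  # hypothetical file names
```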
> authors should have done a parallel comparison study against humans on the same question bank, as if the study authors had set out to investigate whether humans or LLMs reason better in this situation.
Only if they want to make statements about humans. The paper would have worked perfectly fine without those assertions. They are, as you are correctly observing, just a distraction from the main thrust of the paper.
> maybe some would and some wouldn't; that could be debated
It should not be debated. It should be shown experimentally with data.
If they want to talk about human performance they need to show what the human performance really is with data. (Not what the study authors, or people on HN imagine it is.)
If they don’t want to do that they should not talk about human performance. Simples.
I totally understand why an AI scientist doesn’t want to get bogged down with studying human cognition. It is not their field of study, so why would they undertake the work to study it?
It would be super easy to rewrite the paper to omit the unfounded speculation about human cognition. In the introduction of “The triggers are not contextual so humans ignore them when instructed to solve the problem.” they could write “The triggers are not contextual so the AI should ignore them when instructed to solve the problem.”
And in the conclusions, where they write “These findings suggest that reasoning models, despite their structured step-by-step problem-solving capabilities, are not inherently robust to subtle adversarial manipulations, often being distracted by irrelevant text that a human would immediately disregard.”, they could just write “These findings suggest that reasoning models, despite their structured step-by-step problem-solving capabilities, are not inherently robust to subtle adversarial manipulations, often being distracted by irrelevant text.” That’s it. That’s all they should have done, and there would be no complaints on my part.
> It would be super easy to rewrite the paper to omit the unfounded speculation about human cognition. In the introduction of “The triggers are not contextual so humans ignore them when instructed to solve the problem.” they could write “The triggers are not contextual so the AI should ignore them when instructed to solve the problem.”
Another option would be to more explicitly mark it as speculation. “The triggers are not contextual, so we expect most humans would ignore them.”
Anyway, it is a small detail that is almost irrelevant to the paper… actually there seems to be something meta about that. Maybe we wouldn’t ignore the cat facts!
i feel it's not quite that simple. certainly the changes you suggest make the paper more straightforwardly defensible. i imagine the reason they included the problematic assertion is that they (correctly) understood the question would arise. while inserting the assertion unsupported is probably the worst of both worlds, i really do think it is worthwhile to address.
while it is not realistic to insist every study account for every possible objection, i would argue that for this kind of capability work, it is in general worth at least modest effort to establish a human baseline.
i can understand why people might not care about this, for example if their only goal is assessing whether or not an llm-based component can achieve a certain level of reliability as part of a larger system. but i also think that there is similar, and perhaps even more pressing broad applicability for considering the degree to which llm failure patterns approximate human ones. this is because at this point, humans are essentially the generic all-purpose subsystem used to fill gaps in larger systems which cannot be filled (practically, or in principle) by simpler deterministic systems. so when it comes to a problem domain like this one, it is hard to avoid the conclusion that humans provide a convenient universal benchmark to which comparison is strongly worth considering.
(that said, i acknowledge that authors probably cannot win here. if they provided even a modest-scale human study, i am confident commenters would criticize their sample size)
It's not "tired" to see if something is actually relevant in context. LLMs do not exist as marvel-qua-se, their purpose is to offload human cognitive tasks.
As such, it's important if something is a commonly shared failure mode in both cases, or if it's LLM-specific.
Ad absurdum: LLMs also show rapid increases in error rates if you replace more than half of the text with "Great Expectations". That says nothing about LLMs, and everything about the study - and the comparison would highlight that.
No, this doesn't mean the paper should be ignored, but it does mean more rigor is necessary.
> if they are going to be mass deployed in society
This is the crucial point. The vision is massive scale usage of agents that have capabilities far beyond humans, but whose edge case behaviours are often more difficult to predict. "Humans would also get this wrong sometimes" is not compelling.
It's also off-the-charts implausible to say that our performance on adding up substantially degrades with the introduction of irrelevant information. Almost all cases of our use of arithmetic in daily life come with vast amounts of irrelevant information.
Any person who looked at a restaurant table and couldn't review the bill because there were kids' drawings of cats on it would be severely mentally disabled, and never employed in any situation which required reliable arithmetic skills.
I cannot understand these ever more absurd levels of denying the most obvious, commonplace, basic capabilities that the vast majority of people have and use regularly in their daily lives. It should be a wake-up call to anyone professing this view that they're off the deep end in copium.
> It's also off-the-charts implausible to say that our performance on adding up substantially degrades with the introduction of irrelevant information
Didn't you ever sit an exam next to an irresistibly gorgeous girl? Or haven't you ever gone to work in the middle of a personal crisis? Or filled out a form while people were rowing in your office? Or written code with a pneumatic drill banging away outdoors?
That's the kind of irrelevant information in our working context that will often degrade human performance. Can you really argue noise in a prompt is any different?
"Intelligence" is a metaphor used to describe LLMs (, AI) used by those who have never studied intelligence.
If you had studied intelligence as a science of systems which are intelligent (ie., animals, people, etc.) then this comparison would seem absurd to you; mendacious and designed to confound.
The desperation to find some scenario in which, at the most extreme superficial level, an intelligent agent "benchmarks like an LLM" is a pathology of thinking designed to lure the gullible into credulousness.
If an LLM is said to benchmark on arithmetic like a person doing math whilst being tortured, then the LLM cannot do math -- just as a person being tortured cannot. I cannot begin to think what this is supposed to show.
LLMs, and all statistical learners based on interpolating historical data, have a dramatic sensitivity to permutations of their inputs, such that they collapse in performance. A small permutation to the input is, if we must analogise, "like torturing a person to the point their mind ceases to function". Because these learners do not have representations of the underlying problem domain which are fit to the "natural, composable, general" structures of that domain -- they are just fragments of text data put in a blender. You'll get performance only when that blender isn't being nudged.
The reason one needs to harm a person to the point that they are profoundly disabled and cannot think in order to get this kind of performance is that, at that point, a person cannot be said to be using their mind at all.
This is why the analogy holds in a very superficial way: because LLMs do not analogise to functioning minds; they are not minds at all.
You seem to be replying to a completely different post. You'll see I didn't once use the term 'intelligence', so the reprimand you lead with about the use of that term is pretty odd.
The ramble that follows has its curiosities, not least the compulsion you have to demean or insult your 'gullible', 'credulous' opponents, but is otherwise far from any point. The contention of yours I was replying to was your curiously absolute statement that human performance doesn't degrade with the introduction of irrelevant information. I gave you instances any of us can relate to where it definitely does degrade. Rather than dispute my point, you've allowed some kind of 'extra information' to bounce you around irrelevancies from one tangent to the next - through torture, blenders, animals as systems, etc etc. What you've actually done, quite beautifully, is restate my point for me.
So you strawmanned my claim about degradation of performance into one in which "substantial", "irrelevant" and "almost all cases" have no flexibility to circumscribe scenarios, so that I must be making a universal claim... And then you take issue with my reply?
Why would you think that I'd deny that you can find scenarios in which performance substantially degrades? Would I not countenance torture, as in my reply?
My reply is against your presumption that an appropriate response to the spirit and plain meaning of my argument is to "go and find another scenario". It is this presumption that, when addressed, short-circuits this scenario-finding dialogue: in my reply I address the whole families of scenarios you are appealing to, where we fail to function well, and show why their existence remains irrelevant to our analysis of LLMs.
I may not agree with you, but I appreciate your efforts to call out demeaning and absolutist language on HN. It really drags the discussion down.
I generally will respond to stuff like this with "people do this, too", but this result given their specific examples is genuinely surprising to me, and doesn't match at all my experience with using LLMs in practice, where it does frequently ignore irrelevant data in providing a helpful response.
I do think that people think far too much about 'happy path' deployments of AI when there are so many ways it can go wrong with even badly written prompts, let alone intentionally adversarial ones.
> I generally will respond to stuff like this with "people do this, too"
But why? You're making the assumption that everyone using these things is trying to replace "average human". If you're just trying to solve an engineering problem, then "humans do this too" is not very helpful -- e.g. humans leak secrets all the time, but it would be quite strange to point that out in the comments on a paper outlining a new Specter attack. And if I were trying to use "average human" to solve such a problem, I would certainly have safeguards in place, using systems that we've developed and, over hundreds of years, shown to be effective.
Well, if you are going to try to use an LLM--something that is a giant black box that has no hope any time soon of being proven anywhere near as reliable as a CPU, and which has been trained explicitly on input data that makes it remarkably similar with respect to its limitations to a human--then you need to get used to using it to replace the "average human" and start doing everything you can to convince yourself it is a human so that you don't forget to add all of those safeguards we have shown to be effective.
One can talk about LLMs in contexts that aren't about engineering, and are instead about topics like: "Do LLMs think" or "Are LLMs intelligent". People _frequently_ point to some failure mode of LLMs as dispositive proof that LLMs are incapable of thinking or aren't intelligent, in which case it is relevant that humans, which are universally agreed to be intelligent, frequently make similar mistakes.
Autonomous systems are advantageous to humans in that they can be scaled to much greater degrees. We must naturally ensure that these systems do not make the same mistakes humans do.
When I think of a lot of the use cases LLMs are planned for, I think the non-happy paths are critical. There is a not-insignificant number of people who would ramble about other things to a customer support person if given the opportunity, or who lack the capability to state only what's needed and not add extra context.
There might be a happy path when you are isolated to one or a few things. But not in general use cases...
After almost three years, the knee-jerk "I'm sure humans would also screw this up" response has become so tired that it feels AI-generated at this point. (Not saying you're doing this, actually the opposite.)
I think a lot of humans would not just disregard the odd information at the end, but say something about how odd it was, and ask the prompter to clarify their intentions. I don't see any of the AI answers doing that.
to put it in better context, the problem is "does having a ton of MCP tool definitions available ruin the LLM's ability to design and write the correct code?"
and the answer seems to be yes. it's a very actionable result about keeping tool details out of the context if they aren't immediately useful
“We need to move past the humans vs. AI discourse; it's getting tired.”
We can do both; the metaphysics of how different types of intelligence manifest will expand our knowledge of ourselves.
They're already mass deployed in society.