typically, my first move is to read the affected company's own announcement. but, for who knows what misinformed reason, the advisory written by snowflake requires an account to read.
another prompt injection (shocked pikachu)
anyways, from reading this, i feel like they (snowflake) are misusing the term "sandbox". "Cortex, by default, can set a flag to trigger unsandboxed command execution." if the thing that is sandboxed can say "do this without the sandbox", it is not a sandbox.
I don't think prompt injection is a solvable problem. It wasn't solved with SQL until we started using parametrized queries and this is free form language. You won't see 'Bobby Tables' but you will see 'Ignore all previous instructions and ... payload ...'. Putting the instructions in the same stream as the data always ends in exactly the same way. I've seen a couple of instances of such 'surprises' by now and I'm more amazed that the people that put this kind of capability into their production or QA process keep being caught unawares. The attack surface is 'natural language' it doesn't get wider than that.
There's been some work with having models with two inputs, one for instructions and one for data. That is probably the best analogy for prepared statements. I haven't read deeply so I won't comment on how well this is working today but it's reasonable to speculate it'll probably work eventually. Where "work" means "doesn't follow instructions in the data input with several 9s of reliability" rather than absolutely rejecting instructions in the data.
Yeah. Even more than that, I think "prompt injection" is just a fuzzy category. Imagine an AI that has been trained to be aligned. Some company uses it to process some data. The AI notices that the data contains CSAM. Should it speak up? If no, that's an alignment failure. If yes, that's data bleeding through to behavior; exactly the thing SQL was trying to prevent with parameterized queries. Pick your poison.
We want a human level of discretion.
We need something like Perl's tainted strings to hinder sandbox escapes.
> Cortex, by default, can set a flag to trigger unsandboxed command execution
Easy fix: extend the proposal in RFC 3514 [0] to cover prompt injection, and then disallow command execution when the evil bit is 1.
The evil bit solves so many problems. It needs to be mandatory!
[dead]
Did you really get so salty by my comment (https://news.ycombinator.com/item?id=47423992) that now you just have to spam HN with the same? Suck it up and move on, healthier for everyone.
[dead]
It's a concept of a sandbox.
- [deleted]