If anything is going to put capabilities into the programmer ecosystem, I think it's this problem.
The neat thing about this particular problem is that you can do some really coarse things and get immediate benefit. Capabilities in their original form, and perhaps their truest form, carry down the call stack, so that in the most sophisticated implementations code can say things like "restrict everything I call so that it can only append to files in this specific subtree". But you could do something much coarser with libraries and just say things like "these libraries cannot access the network", and get big wins from some simple assertions. If you're a library for turning jpegs into their pixels, you don't need network access, and with only a bit more work you shouldn't even need filesystem access (get passed open files if you need them, but no ability to spontaneously create files).
This would not be a complete solution, or perhaps even an adequate solution, but it would be a great bang-for-the-buck solution, and a great way to step into that world and immediately get benefits without requiring the entire world to immediately rewrite everything to use granular capabilities everywhere.
Capabilities are the only way.
It is insane to me that in 2025 there is no easy way for me to run a program that, say, "can't touch the filesystem or network". As you say, even a few simple, very coarse-grained categories of capabilities would be sufficient for 95% of cases.
systemd-run inherits afaik all the extensive sandboxing features from systemd
https://www.freedesktop.org/software/systemd/man/latest/syst...
https://www.freedesktop.org/software/systemd/man/latest/syst...
sure, the command line gets a bit verbose, but nothing that an alias or small wrapper couldn't solve
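for instance, something like this (a sketch - the -p properties are real systemd.exec(5) settings, but the exact invocation varies; in user mode the namespace-based options typically also need PrivateUsers=yes):

    # run a one-off command with no network and a read-only home
    systemd-run --user --pty \
        -p PrivateNetwork=yes \
        -p PrivateUsers=yes \
        -p ProtectHome=read-only \
        -p ProtectSystem=strict \
        ./some-untrusted-tool

wrap that in a shell function and "no network for this command" becomes a one-liner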
the big problem is that modern operating systems have huge surface area and applications tend to expect all sorts of things, so figuring out what you need to allow is often non-trivial
> say, "can't touch the filesystem or network"
Well, OpenBSD has pledge(2) and unveil(2), both of which are very easy to use.
https://man.openbsd.org/pledge.2 https://man.openbsd.org/unveil.2
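A minimal sketch of how a program opts itself in (untested, but these are the documented signatures):

    #include <err.h>
    #include <unistd.h>

    int
    main(void)
    {
        /* Expose only one directory, read-only; everything else vanishes. */
        if (unveil("/home/me/photos", "r") == -1)
            err(1, "unveil");
        if (unveil(NULL, NULL) == -1)   /* lock the unveil list */
            err(1, "unveil");

        /* From here on: stdio plus read-only file access. No sockets,
         * no exec, no writes. A violation kills the process. */
        if (pledge("stdio rpath", NULL) == -1)
            err(1, "pledge");

        /* ... actual work ... */
        return 0;
    }

The catch is that the program has to pledge itself; it's a tool for cooperative authors, not for confining arbitrary binaries.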
I'm curious what all you want to run that you don't want to access the filesystem? Or the network?
Like, I get it for a few things. But it is a short path to wanting access to files I have created for other things. Pictures and videos being obvious files that I likely want access to from applications that didn't create them.
Similarly, it is a short hop from "I don't want any other application to be able to control my browser" to "except for this accessibility application, and this password management application, and ..." As we push more and more to happen in the browser, it should be little surprise that we need more and more things to interact with said browser.
I think you misunderstood. Among the coarse-grained capabilities I mentioned would be "access to folder X and its subfolders" (read or write).
But to answer your question there are, eg, tons of programming packages in any language that I want purely for their computational abilities, and I know this for certain when using them. In fact for the vast majority of GUI programs I use, or programming packages I use, I know exactly what kind of permissions they need, and yet I cannot easily restrict them in those ways.
It is specifically running applications that always trips me up here. As a user/operator of the computer, I have in the past been bitten by applications being too locked down to be useful. I /think/ OSes have gotten better about surfacing it when they restrict an application. But sandboxing by default, specifically, has been a source of terrible application behavior for me in the past. It is a lot like using a shadow-banned account, where everything looks correct but nothing is actually showing up. Very confusing.
Now, I think your point on restricting the libraries that are imported to a program makes a ton of sense. I'm not entirely clear where I would want the "breaker box" of what an application is allowed to do to be located, but it is crazy how much just importing some library will do in many programs.
I think this sort of stuff implies a switch to new APIs that understand that the app is inside a sandbox, instead of "just trying to do things". For example, XDG Portals for opening/saving documents, instead of open(2) syscall.
Well, you are of course free to give applications free rein if you want. But you should at least be able to say, "No, desktop calculator I just downloaded, you can't do anything but compute and draw things in your application window".
More broadly, creating a good UI around granting capabilities is non-trivial. But that's a separate problem from simply not being able to make even the most basic kinds of restrictions that you want in most cases.
I think the ideas in Qubes OS (https://www.qubes-os.org/) are reasonable to implement given today's applications and the need for backwards compatibility.
Unfortunately, performance is what suffers, and Moore's law hasn't kept up enough for a VM-based OS to be usable by regular laypeople.
Totally fair. I just don't know of that many (any?) "desktop calculator" applications that people download. I expect people are far more likely to be downloading and running social applications than isolated things.
Mostly fair. It would be good if we could say, "on site foo.com, flag any request to not-foo.whatever that happens." I remember the last time I saw just how many third-party network accesses happen on far too many sites. It was sobering.
Oh, but they do! There used to be a boatload of malware on Android disguised as common convenience apps, famously flashlight apps/widgets.
As a random example, see this one ( https://www.welivesecurity.com/2017/04/19/turn-light-give-pa... ) which is a banking trojan cosplaying as a flashlight widget.
Now there is a more or less sophisticated permission system which users then bypass by still accepting any prompt if you promise them anything shiny...
Apologies, I had dropped offline.
I actually am less against these ideas on the phone. Quite the contrary: I largely agree that more effort needs to go into letting people control those.
I am also, sadly, skeptical that this works there. I've seen family members who are all too eager to just click "ok" on whatever an app says it needs. :(
> Totally fair. I just don't know of that many (any?) "desktop calculator" applications that people download.
Quite a few apps fall into this category: single-player games, photo editors, word processors, video players, pdf editors ...
It seems very reasonable to restrict these applications from accessing the internet.
Gaming, I'm willing to largely get behind as something that should be more locked down. Networked games, of course, are a thing. Single player games should be a lot more isolated, though.
Any sort of editing software, though, gets tough. That is precisely the area where I have had bad experiences in the past. I would try to edit raw photos and export them somewhere I could draw on or publish them from. Using a shadow-banned application is the only way I know how to describe how that felt.
Anything, really. I tried to print from a sandboxed application the other day, and it turns out it can't be done - or at least it can't be done by a "common user". As an educated person who has used unixes of all sorts, I could probably figure it out, but it isn't something I should have to figure out in 2025. (It was a pain in 1998, but it worked better than the snap sandboxes of today, despite snap being pushed by an organization that claims ease of use is important.)
Right, but this is largely to my point? I said in another thread that sandboxing often feels like being shadow banned on your own computer.
I get wanting "safe" computers. I'm not clear that we can technically define what legally "safe" means, though. :(
Now, I grant, we can probably get further than I would spitball based on some bad interactions in the past.
> Right, but this is largely to my point? I said in another thread that sandboxing often feels like being shadow banned on your own computer.
> I get wanting "safe" computers. I'm not clear that we can technically define what legally "safe" means, though. :(
You are currently using a web browser. When you go to ycombinator, the site cannot read the contents of your email in the next tab. This isn't a shadow ban on your own machine, it's just a reasonable restriction.
Imagine you just installed a new web browser (or pdf reader, tax software, video game, ...). It should not be able to read and send all the pictures in your camera roll to a third party.
> Imagine you just installed a new web browser (or pdf reader, tax software, video game, ...). It should not be able to read and send all the pictures in your camera roll to a third party.
But I use my web browser to upload my photos to the cloud, so it absolutely should.
(I do somewhat agree with the general point, but I find it very funny that your very first example would break my workflow, and I do think that highlights the problem with trying to sandbox general-purpose programs)
Cell phones show this can be done: you can pick individual files or sets of files using the system file picker, and that one file (and only that file!) is opened for the browser.
If it needs more, there is always an "access all photos" permission, and "access all files" too... but this is explicit and requires a user prompt. And the last part is very important - if a freshly installed browser requires full file access without explanation, it is likely spyware, so uninstall it and leave a bad review.
How about “curl” and “wget” shouldn’t have free rein to read/upload and/or overwrite every damn file owned by my user?
Why does “ping” need to have file system access?
Moving out of the world of "applications" into shell commands, we're gonna need a new shell that understands that `wget -O myfile https://example.com` needs to be handed a capability to write data, or we need to change our habits a lot and always shuffle everything over pipes or such. In either scenario, if you want that level of granularity, I don't think UNIX will survive as we remember it.
(More likely path for now: start a new sandbox, run things in it, put result files in an "outbox", quit sandbox, consume files from outbox. Also not very convenient with current tools.)
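(Something like this with bubblewrap, for instance - a rough sketch, and the exact binds, /lib symlinks, and cert paths vary by distro:

    mkdir -p outbox
    bwrap --ro-bind /usr /usr \
          --symlink usr/bin /bin --symlink usr/lib /lib --symlink usr/lib64 /lib64 \
          --ro-bind /etc/resolv.conf /etc/resolv.conf \
          --ro-bind /etc/ssl /etc/ssl \
          --proc /proc --dev /dev --tmpfs /tmp \
          --unshare-all --share-net \
          --bind "$PWD/outbox" /outbox --chdir /outbox \
          wget https://example.com/file

Network is allowed, but the only writable spot that survives is ./outbox. It works; it is also exactly the kind of incantation nobody types twice without a wrapper.)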
Most things you run in a pipeline don't need access to the filesystem or the network.
Something dangerous like ffmpeg would be better if the codecs were running without access to files or the network, although you'd need a not fully sandboxed process to load the media in the first place.
Many things do need file access, but could work well with an already opened fd, rather than having to open things themselves (although forcing that results in terrible UX).
Of course, filesystem access gets tricky because of dynamic loading, but let's pretend that away for now.
ping(8) has no particular access to the filesystem, and can only do inet and stdio. At least on OpenBSD. I have a modified version of vi(1) that cannot write outside of /tmp or ~/tmp, nor access the internet, nor can it run programs. Other text editors could easily access ~/.ssh keys and cloud them. Whoops? sshd(8) and other daemons use privsep so that the likely to be exploited bits have no particular access to the system, only pipes off to other portions of the OpenSSH complex.
Maybe if rsync were better designed exploits could be better contained; alas, there was a recent whoopsiedoodle—an error, as Dijkstra would call them—and rsync can read from and write to a lot of files, do internet things, execute whatever programs. A great gift to attackers.
It may help if the tool does one thing and one thing well (e.g. the unix model, as opposed to the "I can't believe it's not bloat!"™ model common elsewhere) as then you can restrict, say, ping to only what it needs to do, and if some dark patterner wants to shove ads, metrics, and tracking into ls(1) how about a big fat Greek "no" for those network requests. It may also help if the tool is designed (like, say, OpenSSH) to be well partitioned, and not (like, say, rsync) to need the entire unix kitchen.
Image libraries have had quite a few CVE or whoopsiedoodles over the years, so there could be good arguments made to not allow those portions of the code access to the network and filesystem. Or how about a big heap of slow and expensive formal verification… what's that, someone with crap security stole all your market share? Oh, well. Maybe some other decade.
A non-zero number of people feel that "active content" e.g. the modern web is one of the worst security missteps made in the last few decades. At least flash was gotten rid of. So many CVE.
P.S. web browsers have always sucked at text editing, so this was typed up in vi yielding a file for w3m to read. No, w3m can't do much of anything besides internet and access a few narrow bits of the filesystem. So, for me, web browsers are very much in the "don't want to access the filesystem" category. I can also see arguments for them not having (direct) access to the network, to avoid mixing the "parse the bodge that is HTML and pray there are no exploits" with the "has access to the network" bits of the code, but I've been too lazy to write that as a replacement for w3m.
Simple example: Third party SW in a corporate context. Maybe you want to extend some permissions to some internal sites/parts of the FS, but fundamentally, there's limited trust.
This is an odd one. At face value, I want to agree. At the same time, if you don't trust the operator of the computer with access to data, why are we also worried about programs they run? If you don't trust them with access, then just don't give them access?
I'm open to the idea that some people are locked down such that they can't install things. And, that makes a lot of sense. You can have a relationship that is basically, "I trust them with access to data running this closed set of applications." Managing system configurations makes a ton of sense.
But, as soon as you fully trust a group with system management, you start getting into odd worlds where you want to allow them full access but want to stop unauthorized use. And we don't have a way to distinguish use from access for most data.
Trusting the user does not transitively extend to the software they use. You might be OK with them e.g. looking at company financials, but you'd really like to be sure that e.g. the syntax highlighter they use doesn't go and exfil that data. You still want them to be able to use the syntax highlighter. (Yes, it's an obviously made-up example.)
You _can_ fully vet apps, each and every one. Or you can choose a zero-trust approach and only vet the apps where it's necessary to extend trust.
What about Deno? https://docs.deno.com/runtime/fundamentals/security/#key-pri...
The key requirement to solve this problem is being able to ensure that third-party libraries get a subset of the permissions of the code calling them. E.g., my photo editor might need read and write access to my photo folder, but the third-party code that parses jpegs to get their tags needs only read access, and shouldn't have the ability to encrypt my photos and make ransom demands.
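Concretely, the subsetting can be plain old interface attenuation (a hedged Java sketch with made-up types):

    interface ReadableDir { byte[] read(String name); }
    interface WritableDir extends ReadableDir { void write(String name, byte[] data); }

    class TagParser {
        // The third-party jpeg code is typed to receive read access only:
        // it cannot write, delete, or encrypt anything it was handed.
        static String titleTag(ReadableDir photos, String file) {
            byte[] jpeg = photos.read(file);
            return "..."; // parse the tag out of the bytes (elided)
        }
    }

Of course this only means something if the language is capability-safe, i.e. the parser can't just ignore its arguments and open files through an ambient API the way any Java library can today.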
Deno took a step in a good direction, but it was an opportunity to go much further and meaningfully address the problem, so I was a bit disappointed that it just controlled restrictions at the process level.
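For reference, Deno's model is default-deny with permissions granted per process as flags at invocation (the script name here is hypothetical):

    # no flags: the script gets prompted/denied on any file or network access
    deno run edit_photos.ts

    # grant exactly one folder; network stays off
    deno run --allow-read=./photos --allow-write=./photos edit_photos.ts

    # what everyone types once the prompts get annoying
    deno run --allow-all edit_photos.ts

There is no way to say "the jpeg library gets read-only, the rest of my code gets read-write" - the grant covers the whole process.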
Kind of two different things being addressed here. The article is talking about doing this at the granularity of preventing imported library code from having the same capabilities as the caller, which requires support from the language runtime, but the comment being responded to was saying there is no way in 2025 to run a program and keep it from accessing the network or the filesystem.
That is simply not true. There are many ways to do that, which have already been mentioned: SELinux, seccomp profiles, AppArmor, Linux containers (whether OCI, bubblewrap, snap, AppImage, or systemd-run), pledge, and jails.
These are different concerns. One is software developers wanting to code into their programs upper limits on what imported dependencies can do. That is poorly supported and mostly not possible outside of research systems. The other is end users and system administrators setting limits on what resources running processes can access and what system calls they can make. That is widely supported, with hundreds of ways to do it, and the main reason it is perceived as complicated is that software usually assumes it can do anything, doesn't tell you what it needs, and trying to figure it out as an end user is an endless game of whack-a-mole with broken installs.
Deno controls access at the process level, so it's better than nothing but it doesn't really help with this specific problem. Also it delegates setting the permissions up to the user, and we know that in practice everyone is just going to --allow-all.
OpenBSD has had “pledge” for quite a while. I think it’s a good idea, I wish it was supported by Linux because, as you note, a few basic patterns could help immensely.
Linux ended up with seccomp and landlock.
Can do this with Qubes OS by running in a non-networked qube
Pledge
Doing this at the program level is implemented in Linux by SELinux, which defines mandatory access controls (aka limitations on capabilities). It was difficult to get the default policies right while keeping a distro functioning smoothly with them enabled, but it is enabled by default in Fedora.
https://en.m.wikipedia.org/wiki/Security-Enhanced_Linux
To enable this at the programming level would require an enforcement mechanism at the level of a language VM or OS. It would require more overhead to enforce at that level, but the safety benefits within a language may be worth it.
bubblewrap?
Nothing in the container space would qualify as "easy" for me. I am talking about native OS features. Also linux only (afaik)...
Or even, for that matter, some sane capabilities for browser extensions.
SELinux!
Half jokes aside, I'd love to be able to have a decent mobile style permissions experience on browser extensions and desktop apps.
> in 2025 no easy way for me to run a program that, say, "can't touch the filesystem or network".
Qubes OS exists for more than 10 years already. My daily driver, can't recommend it enough.
Windows pro also has a sandbox that can disable filesystem and network access, but they wasted the opportunity by allowing only one sandbox process at a time.
That is almost the exact backwards way to talk about capabilities. It is not about "restricting" access, it is about "granting" access.
"These libraries can not access the network." No. "These libraries have not been given access to the network (and by default none are given access)."
From an implementation perspective, this is just passing in access rights as "local" resources instead of using "global" resources. For instance, it is self-evident that other code cannot use your B-Tree local variable if you did not pass a reference to it to any called function (assuming no arbitrary pointer casts). You just do the same with these "resources": pass things to functions instead of relying on globals. The only difficulty is making these actions/resources "passable", which is trivial at the language level, and "fine-grained/divisible" to avoid over-granting.
Clearly these things are dual and you can easily model them either way, and indeed, should think about them both ways.
Just to say, but you are describing an effect type system. (Or, to the "types are static" people, possibly a dynamic effect system.)
Capabilities taken literally are more of a network thing (it's how you prove you have access to a computer that doesn't trust you). On a language, you don't need the capabilities themselves.
You can do a lot with capabilities at the language level, if the language supports it. It doesn't require effect types: it's extremely boring from a type system perspective, which is an advantage.
Imagine these changes to a language, making it "capability safe":
- There is a `Network` object, and it's the only way to access the network. Likewise, there's a `Filesystem` object, and it's the only way to access the file system.
- Code cannot construct a `Network` or `Filesystem` object. Like, you just can't, there's no constructor.
- The `main()` function is passed a `Network` object and a `Filesystem` object.
Consider the consequences of this. The log4j vulnerability involved a logging library doing network access. In a capability safe language, this could only happen if you passed a `Network` object to the logger, either in its constructor or in one of its methods. So the vulnerability is still possible. But! Now you can't be surprised that your logger was accessing the network because you gave it the network. And a logger asking for network access is sufficiently sketchy that log4j almost certainly would have made network access optional for the few users that wanted that feature, which would have prevented the vulnerability for everyone else!
I talked about there being a single monolithic `Filesystem` object. That is what would be passed into `main()`, but you should also be able to make finer-grained capabilities out of it. For example, you should be able to use the `Filesystem` object to construct an object that has read/write access to a directory and its subdirectories. So if a function takes one of these as an argument, you know it's not mucking around outside of the directory you gave it. Roughly, the whole scheme could look like the sketch below.
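Here it is in Java-flavored pseudocode - a sketch of the hypothetical capability-safe runtime described above, not any real API:

    // Only the runtime can construct these; user code has no other way in.
    final class Network {
        Network() {}  // package-private constructor, no public one
        void send(String host, int port, byte[] data) { /* runtime-provided */ }
    }

    interface Dir {
        byte[] read(String name);
        void write(String name, byte[] data);
        Dir subdir(String name);  // attenuation: still confined to this subtree
    }

    final class Filesystem {
        Filesystem() {}
        Dir openDir(String path) { return null; /* runtime-provided */ }
    }

    final class Logger {
        private final Dir logDir;
        Logger(Dir logDir) { this.logDir = logDir; }  // note: no Network parameter
        void log(String msg) { logDir.write("app.log", msg.getBytes()); }
    }

    class App {
        // The hypothetical runtime invokes this as the entry point.
        static void main(Network net, Filesystem fs) {
            Dir logs = fs.openDir("/var/myapp").subdir("logs");
            Logger logger = new Logger(logs);
            // logger was never handed `net` or `fs`, so a log4j-style
            // phone-home is impossible by construction.
            logger.log("started");
        }
    }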
Capability safety within a language is stronger than the sort of capability safety you can get at the process level or at the level of multiple machines. But we want those too!
In theory Rust where you don't allow unsafe can do that. (Reality is not perfect: https://github.com/rust-lang/rust/issues/25860)
Theseus OS is a research project that created a single-address-space kernel that loads dynamic libraries written in safe-only Rust as "untrusted entities", compiled with a trusted compiler that forbids unsafe. If they have no access to unsafe, and you don't give them functions to link to that would hand them these a-capability-if-you-squint objects, they're supposedly sandboxed.
this was the original approach for Java, and the main idea behind being able to ship untrusted apps directly to the user and run them in the browser
tomcat also gave different privilege levels to different parts of the classpath
nice idea, but didn't seem to work in practice
That's what I'm calling capabilities at the application level (where you say "this .jar can't access the network"), rather than at the language level (where you say "this function wasn't given access to the network as an argument"). I don't think any widely known language has tried capabilities at the language level. (The not-widely-known language that has is E.)
The real power will come when you can mix them. In one direction, that looks like `main()` not getting passed a `Network` object because the ".jar" wasn't granted network access. In the other direction, that looks like the system C library you invoked failing to access the network because when you invoked it from Java you failed to provide a `Network` object, so it was automatically sandboxed to not be able to access the network.
Have you looked at "Emily" language at all? Same person or two, their original paper isn't reachable at their linked URL, but is on the Wayback Machine. "How Emily Tamed the Caml" - https://web.archive.org/web/20221231184748/https://www.hpl.h...
Emily took the approach of restricting a subset of OCaml, which they were running on Windows machines. No idea how tough it would be to get it running on a modern version.
Also for OCaml, the MirageOS unikernel is neat - develop on Linux etc., then recompile configured as a virtual machine with only the drivers (network, filesystem) needed by that one app. - https://mirage.io/
> That's what I'm calling capabilities at the application level (where you say "this .jar can't access the network"), rather than at the language level (where you say "this function wasn't given access to the network as an argument")
it wasn't, it was done at the type and classloader level
have a read about SecurityManager
> this could only happen if you passed a `Network` object to the logger
Yeah, this is not only an effects type system, it's a static effects type system.
No it's not. I'd guess you're imagining a type signature like this one?

    void effect:Network my_function_to_connect_to_server() { ... }

I mean a type signature like this one:

    void my_function_to_connect_to_server(Network network) { ... }

Where there's absolutely nothing special about the Network type, except that (i) it has no constructor, and (ii) nothing in the Java language lets you talk to a network without it. All of this is expressible with Java's type system today.
> void my_function_to_connect_to_server(Network network)
See how there's the Network right there on the type?
That's an ordinary argument type. Like:

    void my_function_to_connect_to_server(
        Network network,
        int port,
        String server_name
    )

It's not categorically different than the other arguments, and doesn't require an effect type system.
At this point I'm not sure if you're misunderstanding what I'm proposing, or if you don't know what an effect system is. If you care to communicate, you'll need to say something more substantial. Walk me through what you're trying to say.
Make it generic, with some kind of Resource interface of which Network is one of the implementations¹, and you get a dynamic effect system.
But that thing you wrote is a static effect system. Having a singleton that you need to carry with the type doesn't make it not part of the type system.
1 - Or actually use Resource as a class. But Java wouldn't do this.
I think you're talking about what power a type system needs to say that this function doesn't have undesirable side effects, while Justin is saying if you can't just "import std.net" and access the network by side effects (and can't make raw syscalls either, or poke around in raw memory), and nothing hands you a Network value, a plain old boring type system is enough.
Thank you for the clarity.
> But that thing you wrote is a static effect system.
The thing I wrote is expressible in Java's type system as it is today. So you're saying that Java has a static effect system?
> Capabilities taken literally are more of a network thing (it's how you prove you have access to a computer that doesn't trust you). On a language, you don't need the capabilities themselves.
You may be thinking of the term in a different context. In this context, they are a general security concept and definitely apply to more than the network, including languages:
https://en.wikipedia.org/wiki/Capability-based_security
http://habitatchronicles.com/2017/05/what-are-capabilities/ (this is a great article)
etc...
> On a language, you don't need the capabilities themselves.
Why not? Sometimes I want that. Just today I was looking at some code and realized it was being used by groups that shouldn't be using it. The code needs to be there and is great for the correct users, but someone is going around our architecture and we didn't notice.
You are replying to the wrong post. I was quoting the GP.
Effect typing can achieve capability security statically, but I've also frequently seen capabilities used to describe dynamic capability systems via various methods involving passing capabilities down as reified objects of some sort.
This of course also depends on some related static guarantees that a function can't access ambient capabilities not passed directly in its arguments, but this is a much simpler static guarantee than effect typing and doesn't require specific analysis over normal parameter types.
Capabilities are more general than that. They have been around for a long time; one academic design implemented them in hardware in 1970[1]. They can be used in as fine-grained a way as "you are allowed to access this array in memory and nothing else" - which is the sort of thing that needs to be built into languages to be maximally useful.
I disagree about your last point. I looked at a product design once which allowed one to associate a profile with each third-party library. Accesses across library boundaries were implemented using classic call gates from segmented architectures, which could change memory visibility and syscall filtering.
so while I agree that language integration is really useful, I think you can get a lot out of appropriate support in the runtime, most notably the library loader.
Yes, that's why I said "to be maximally useful" instead of "to be useful"