> Deterministic simulation. Another lightweight method widely used at AWS is deterministic simulation testing, in which a distributed system is executed on a single-threaded simulator with control over all sources of randomness, such as thread scheduling, timing, and message delivery order. Tests are then written for particular failure or success scenarios, such as the failure of a participant at a particular stage in a distributed protocol. The nondeterminism in the system is controlled by the test framework, allowing developers to specify orderings they believe are interesting (such as ones that have caused bugs in the past). The scheduler in the testing framework can also be extended for fuzzing of orderings or exploring all possible orderings to be tested.
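For anyone who hasn't seen the idea in the quote before, here is a minimal sketch of it in Python. It is not how AWS or any real framework does it. The `Simulator` class, `run_scenario`, and the toy "replica" node are hypothetical names invented for illustration. The only point is that message delivery order comes from one seeded RNG, so the same seed replays the same run.

```python
import random


class Simulator:
    """Single-threaded simulator: all nondeterminism comes from one seeded RNG."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.in_flight = []  # messages sent but not yet delivered
        self.trace = []      # delivery order, kept for replay and debugging
        self.nodes = {}

    def add_node(self, name, handler):
        self.nodes[name] = handler

    def send(self, src, dst, payload):
        self.in_flight.append((src, dst, payload))

    def run(self):
        while self.in_flight:
            # The simulator, not the OS or the network, picks the next delivery.
            index = self.rng.randrange(len(self.in_flight))
            src, dst, payload = self.in_flight.pop(index)
            self.trace.append((src, dst, payload))
            self.nodes[dst](self, src, payload)
        return self.trace


def run_scenario(seed):
    """Hypothetical scenario: a client sends five writes to one replica."""
    sim = Simulator(seed)
    log = []

    def replica(sim, src, payload):
        log.append(payload)  # the replica applies writes in arrival order

    sim.add_node("replica", replica)
    for i in range(5):
        sim.send("client", "replica", f"write-{i}")
    sim.run()
    return log
```

Calling `run_scenario(seed)` twice with the same seed gives the same delivery order, and looping over seeds is a crude form of the "fuzzing of orderings" the quote mentions. Swapping the RNG for a hand-written schedule would give you the "specify orderings they believe are interesting" part.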
Any good open source libraries that do this that are language agnostic? Seems doable - spin up a container with some tools within it. Said tools require some middleware to know when a test is going to be run; when the test is run, the tools basically make certain things (networking, storage, etc.) "deterministic" in the context of the test run.
This is more-or-less what Antithesis does, but I haven't seen anything open source yet.
You could, of course, write your tests well, such that you can stub out I/O, but that's work and not everyone will write their tests well (you should do this anyway, but it's nicer imo if this determinism sits a layer above the application).
as a slight sidebar - I'm not really bullish on AI, but I think testing is one of the things where AI will hopefully shine, because the feedback loop during prompting can be driven by your actual application requirements, such that the test implementation (driven by AI), requirements (driven by you as the prompt) and "world" (driven by the actual code being tested) can hopefully help drive all three to some theoretical ideal. if AI gives us anything, I'm hoping it can make software a more rigorous discipline by making formal verification more doable.
There have historically been two giant adoption challenges for DST.
(1) Previously, you had to build your entire system around one of the simulation frameworks (and then not take any dependencies).
(2) It’s way too easy to fool yourself with weak search/input generation, which makes all your tests look green when actually you aren’t testing anything nontrivial.
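A toy illustration of point (2), as a sketch under made-up assumptions: two hypothetical workers, "A" and "B", each do a non-atomic read-then-write increment of a shared counter. The helper names `run_schedule` and `all_schedules` are invented here. A test that only ever runs the sequential ordering stays green, while enumerating every interleaving shows that most of them lose an update.

```python
from itertools import permutations


def run_schedule(schedule):
    """Run two non-atomic increments of a shared counter in the given step order."""
    shared = {"counter": 0}
    local = {"A": None, "B": None}
    for worker, step in schedule:
        if step == "read":
            local[worker] = shared["counter"]
        else:  # "write": store the stale-or-fresh local value plus one
            shared["counter"] = local[worker] + 1
    return shared["counter"]


def all_schedules():
    """Yield every interleaving in which each worker reads before it writes."""
    steps = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]
    for order in permutations(steps):
        if all(order.index((w, "read")) < order.index((w, "write")) for w in "AB"):
            yield order


# The one ordering a naive test would exercise: A finishes before B starts.
happy_path = (("A", "read"), ("A", "write"), ("B", "read"), ("B", "write"))

# Interleavings where the final counter is not 2, i.e. an update was lost.
failing = [s for s in all_schedules() if run_schedule(s) != 2]
```

Here `run_schedule(happy_path)` returns 2, but `failing` is non-empty: any schedule where both reads happen before either write ends with the counter at 1. Exhaustive enumeration only works for tiny cases like this one. Real systems need smarter search, which is where the difficulty lies.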
As you say, Antithesis is trying to solve both of these problems, but they are very challenging.
I don’t know of anybody else who has a reliable way of retrofitting determinism onto arbitrary software. Facebook’s Hermit project tried to do this with a deterministic Linux userspace, but it has been abandoned. (We actually tried the same thing before we wrote our hypervisor, but found it didn’t work well.)
A deterministic computer is a generically useful technology primitive beyond just testing. I’m sure somebody else will create one someday, or we will open-source ours.
Trust me, I love FDB, but that's not the same thing. The FDB team IIRC had to write their own programming language to do this. It's not an agnostic layer above the application.
The problem with coupled tooling is that no one will use it. That's what is cool about Antithesis. If they're able to complete their goal, that's basically what will be achieved.
I guess you meant to say "only the people working on the software coupled to the tooling will use it". It's not just FDB & Amazon that are using something like this, and it is a ridiculously powerful type of tool for debugging distributed systems.
ah, yes.
Fair point. I was thinking about Antithesis, but it's not open source (yet?). Turns out I also didn't read your comment well enough. Back to lurking I go.
I suspect you can relatively easily obtain a completely deterministic machine by running QEMU in 100% emulation mode in one thread. But what you are after is controlled deterministic execution, and that's far harder. That is, making multiple processes follow a specific dance that triggers an interesting condition must be very involved when seen from a level as low as the CPU and OS scheduler. Hence a language-agnostic setup is hard to achieve, and especially hard to make do your bidding. It may drown you in irrelevant details.
I once built a much, much simpler thing that let me run multiple JVM threads in a particular kind of lockstep, by stubbing and controlling I/O operations and the advance of the system time. With that, I could run several asynchronously connected components in particular interaction patterns, including not just different thread activation orders but also I/O failures, retries, etc. It was manageable, and it helped uncover a couple of nasty bugs before the code ever ran in prod.
But that was only possible because I went with drastic simplifications, controlling not the whole system but only particular synchronization points. It won't detect a generic data race where explicit synchronization was simply forgotten.
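To make "stubbing and controlling I/O operations and the advance of the system time" concrete, here is a rough Python sketch of the general pattern. It is not the JVM tool described above. `FakeClock`, `FlakyStore`, and `write_with_retry` are hypothetical names, and the exponential backoff policy is an assumption chosen only to have something to test.

```python
class FakeClock:
    """Simulated time: sleeping just advances a counter, so tests run instantly."""

    def __init__(self):
        self.now = 0.0

    def sleep(self, seconds):
        self.now += seconds


class FlakyStore:
    """Stubbed I/O that fails on the (1-based) call numbers listed in fail_on."""

    def __init__(self, fail_on):
        self.fail_on = set(fail_on)
        self.calls = 0
        self.data = []

    def write(self, value):
        self.calls += 1
        if self.calls in self.fail_on:
            raise IOError(f"injected failure on call {self.calls}")
        self.data.append(value)


def write_with_retry(store, clock, value, attempts=3, backoff=1.0):
    """Code under test: retry a write with exponential backoff on the given clock."""
    for attempt in range(attempts):
        try:
            store.write(value)
            return True
        except IOError:
            clock.sleep(backoff * 2 ** attempt)
    return False
```

With `FlakyStore(fail_on=[1, 2])` the third attempt succeeds and the fake clock ends at 3.0 simulated seconds, with no real waiting. Because the failures are injected at chosen call numbers, the exact same failure pattern can be replayed every time. As the comment says, this only controls the points you chose to stub; it says nothing about races elsewhere.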
I'm sure I heard that something like this existed for the JVM ages ago (like 15 years). I don't remember the details so it might not be quite the same, but a colleague was telling me about some tech which would test your concurrency by automatically selecting bad scheduling orders.
https://rr-project.org/ for languages that can be debugged by gdb.
+1 for rr. Bonus feature is you can also time-travel debug! It's spoiled me forever...
There was a talk from Joe Armstrong about using property testing to test Dropbox.