interesting article - I wonder why TLA+ isn't more popular. seems useful, but I guess it's hard to iterate fast with it.
Mike Dodds' article "What Works (and Doesn't) Selling Formal Methods" offers some insight. Two points stood out: (i) many potential applications of formal methods in business don't produce an attractive cost-benefit outcome relative to mainstream software development approaches, especially when it comes to producing enough benefit for a small investment of time and resources. (ii) an anecdote: a business that has to produce software meeting certain validation/quality targets (say, achieved today with a baseline of heavy manual QA and no formal methods) gets no business benefit from investing in more validation; it just needs enough validation to clear the compliance bar or rank slightly ahead of the competition. But if formal methods could help it reach the same level of validation at lower cost than its baseline development and QA processes, it might have much more appetite to invest in them.
https://www.galois.com/articles/what-works-and-doesnt-sellin...
corresponding HN thread from earlier this week: https://news.ycombinator.com/item?id=44131324
I think it’s just fairly rare that formal verification is valuable to a business.
Obviously all software benefits from correctness — but we all know that software certainly doesn’t need to be bug-free to make money!
And if the benefits of formal verification don't outweigh the (non-trivial!) costs, and critically, outweigh them in a way that's legible to business people, then people won't put in the time and effort.
I think the main answer was given by another comment: for most projects, correctness usually isn't worth that much (i.e., a bug isn't that expensive for a company producing a piece of software). It also isn't in the software culture (yet?). Today people will be shocked if you don't have a version control system and a CI pipeline. Few people had one 20 years ago. Also, people are often reluctant to learn a new paradigm (think functional programming).
Having done multiple TLA verification projects myself, here are some technical reasons:
1. It's actually surprisingly hard to write a good specification, i.e., to precisely and formally state what you want. Often it's not even practical: the specification gets so complex that you end up nearly recreating the implementation. Kolmogorov complexity is also a thing for specifications, I suppose ;)
2. TLA in particular is mostly useful for concurrent/distributed systems.
3. Model checking (the method used to verify the TLA models) hits limits pretty quickly. E.g., you may only be able to check your system for 2 threads, but forget about 3 threads (see the sketch after this list).
4. The TLA tooling is very clunky by modern standards.
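To make point 3 concrete, here's a minimal sketch of what an explicit-state model checker does under the hood: enumerate every reachable interleaving. Everything here (the toy non-atomic counter, the function names) is invented for illustration; it's not TLC or any real tool. The point is just how fast the reachable state count grows as you add threads:

    from collections import deque

    # Toy model: each thread does a non-atomic increment (read, then write).
    # A state is (shared counter, per-thread program counters, per-thread local copies).
    def initial(n_threads):
        return (0, ("read",) * n_threads, (None,) * n_threads)

    def successors(state, n_threads):
        counter, pcs, locs = state
        for t in range(n_threads):
            if pcs[t] == "read":     # thread t reads the shared counter
                yield (counter,
                       pcs[:t] + ("write",) + pcs[t+1:],
                       locs[:t] + (counter,) + locs[t+1:])
            elif pcs[t] == "write":  # thread t writes its (possibly stale) copy + 1
                yield (locs[t] + 1,
                       pcs[:t] + ("done",) + pcs[t+1:],
                       locs)

    def explore(n_threads):
        """Breadth-first enumeration of all reachable states."""
        seen = {initial(n_threads)}
        frontier = deque(seen)
        while frontier:
            state = frontier.popleft()
            for nxt in successors(state, n_threads):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return seen

    for n in range(1, 5):
        states = explore(n)
        # Check the obvious invariant: all threads done => counter == n.
        # The lost-update race makes this fail for n >= 2.
        buggy = any(all(pc == "done" for pc in s[1]) and s[0] < n
                    for s in states)
        print(n, "threads:", len(states), "states; lost update reachable:", buggy)

Real model checkers are much smarter about symmetry reduction and state compression, but the underlying combinatorics is the same, which is why "fine for 2 threads, hopeless for 3" is such a common experience.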
I think for this reason TLA+ is—or should be—good for designing systems: it keeps your design as simple as possible :) and it asks you the difficult questions. Sort of a logical rubber duck.
And the best part is, when you implement the spec, it works! At least by design.
It is quite a different task to implement models of existing systems in fine detail. But it can be done, e.g. https://arxiv.org/abs/2409.14301 . That group combined fine- and coarse-grained models in their checking, which let them focus the checking effort on just the parts of the system they were interested in.
It's the complete opposite of asking an LLM to make an app, which is all the rage.
I recently took some training on TLA+. I loved the concepts of it. It seems amazingly powerful for prototyping and proving out algorithms/architectures. A big concern of mine was testing models vs. implementation and managing the drift. I talked to the instructor a bit about it and he actually pointed to what Mongo was doing as described in the article. There wasn't good off-the-shelf tooling available to make something similar work for us at our scale without a major amount of work.
Model-based property testing, on the other hand, gets us a lot of the benefits but drops in more easily. There are great proptest frameworks available in the language we use, and the team is already experienced with them. TLA+ is much more exhaustive and powerful, but for the exact project I was looking at, the additional cost couldn't be justified by the additional benefit.
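For anyone who hasn't seen model-based property testing: you keep a simplified model next to the real implementation and let the framework drive both through random operation sequences, checking they agree. A minimal sketch with Python's Hypothesis (the BoundedQueue class and its capacity are made up for illustration):

    from collections import deque
    from hypothesis import strategies as st
    from hypothesis.stateful import RuleBasedStateMachine, rule, invariant

    class BoundedQueue:
        """The 'real' implementation under test (hypothetical example)."""
        def __init__(self, capacity):
            self.capacity = capacity
            self._items = deque()
        def push(self, x):
            if len(self._items) < self.capacity:
                self._items.append(x)
                return True
            return False
        def pop(self):
            return self._items.popleft() if self._items else None
        def __len__(self):
            return len(self._items)

    class QueueMachine(RuleBasedStateMachine):
        def __init__(self):
            super().__init__()
            self.real = BoundedQueue(capacity=4)
            self.model = []  # the "spec": a plain list

        @rule(x=st.integers())
        def push(self, x):
            accepted = self.real.push(x)
            if len(self.model) < 4:
                assert accepted
                self.model.append(x)
            else:
                assert not accepted

        @rule()
        def pop(self):
            expected = self.model.pop(0) if self.model else None
            assert self.real.pop() == expected

        @invariant()
        def sizes_agree(self):
            assert len(self.real) == len(self.model)

    TestQueue = QueueMachine.TestCase  # run with pytest

Unlike a TLA+ spec, this directly exercises the shipped code, but it only samples the state space rather than exhausting it.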
I believe we still have some TLA+ work moving forward inside the company but for my product I couldn't justify it.
Honestly... Just like with writing extensive unit tests, you will then end up basically writing the program twice. Which is not going to appeal to the "move as fast as possible" crowd.
At least the unit tests can validate the actual implementation.
I mean, you only validate the parts that the test writer thought to verify. It will always remain non-exhaustive and full of holes.
Which is still more than what the TLA+ theoretical model can offer.
This comment thread is on an article about how Mongo applies TLA+ models to an implementation. They talk about how they use the model to generate test cases for the application. There are fair criticisms of formal modeling, but that they can't test the implementation isn't one.
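For reference, the mechanism is roughly: have the model checker emit traces (sequences of operations plus the expected state after each) and replay every trace against the real system. A sketch of the replay side (the trace format, apply_op, and visible_state are invented for illustration, not Mongo's actual tooling):

    import json

    def replay_trace(trace_path, system):
        """Replay a model-generated trace against a real implementation,
        checking the implementation's visible state after every step."""
        with open(trace_path) as f:
            # assumed format: [{"op": ..., "args": {...}, "expected_state": ...}, ...]
            trace = json.load(f)
        for i, step in enumerate(trace):
            system.apply_op(step["op"], **step["args"])
            actual = system.visible_state()
            assert actual == step["expected_state"], (
                f"divergence at step {i}: model expected "
                f"{step['expected_state']!r}, implementation has {actual!r}"
            )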
As I mentioned in another comment, I ended up not going the TLA+ route. It isn't because TLA+ can't offer this; it's because it's a heavy, opinionated investment. That's a good trade-off for some systems, like databases, but not for everything.
The criticism is of TLA+ in particular, versus other formal verification tools that generate code directly from the model, without the additional effort of building tooling or manually verifying that the model and the actual code have the same semantics.
Because you can't ship a TLA+ model and the chances of the model matching the implementation are nil. So you spend a lot of time proving your approach could work flawlessly in theory if you don't fuck up the implementation.
For most problems this is an assumed certainty.