Some people here seem to be taking this post literally, as if the author (Dan Abramov) were proposing a new format called Progressive JSON. He isn't.
This is more of a post explaining the idea behind React Server Components, where component trees are represented as JavaScript objects and then streamed over the wire in a format similar to the one in the blog post (with similar features, though AFAIK the exact format is bundler/framework specific).
This allows React to leave holes (representing loading states) in the tree, display fallback states on first load, and then fill in the loaded component tree once the server can actually provide the data. That means you can show the fallback spinner and the skeleton much faster, with more fine-grained loading.
(This comment is probably wrong in various ways if you get pedantic, but I think I got the main idea right.)
Yup! To be fair, I also don't mind if people take the described ideas and do something else with them. I wanted to describe RSC's take on data serialization without it seeming too React-specific, because the ideas are actually more general. I'd love it if more ideas I saw in RSC made it to other technologies.
Hi Dan! Really interesting post.
Do you think a new data serialization format built around easier generation/parsing, one that also happens to be streamable because it's line-based like JSONL, could be useful for some?
I don’t know! I think it depends on whether you’re running into any of these problems and have levers to fix them. RSC was specifically designed for that so I was trying to explain its design choices. If you’re building a serializer then I think it’s worth thinking about the format’s characteristics.
Awesome, thanks! I do keep running into these issues, but the levers, as you say, make it harder to implement.
As of right now, I could only replace JSON tool calling for LLMs on something I fully control, like vLLM. The big labs are probably happy to overcharge 20-30% extra tokens for each tool call, so they wouldn't really be interested in replacing JSON any time soon.
It also feels like battling a giant, since JSON is already a standard. Maybe there's a place for it in really specialized workflows where those savings make the difference (not only money: you also gain a 20-30% larger token window if you don't waste it on quotes and braces and whatnot).
Thanks for replying!
I've used React in the past to build some applications and components. Not familiar with RSC.
What immediately comes to mind is using a uniform recursive tree instead, where each node has the same fields. In a funny way that would mimic the DOM if you squint. Each node would encode its type, id, name, value, parent_id, and order, for example. The engine in front can then generically put everything into the right place.
I don't know whether that is feasible here. Just a thought. I've used similar structures in data driven react (and other) applications.
It's also efficient to encode in memory, because you can put it into a flat, compact array. And it fits nicely into SQL databases as well.
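The uniform-node idea above can be sketched roughly like this (the field names and `buildTree` helper are illustrative, not an existing API); since every node has the same shape, the engine in front can assemble any payload generically:

```javascript
// Flat array of uniform nodes: same fields for every node, tree structure
// encoded via parentId/order. (Names are made up for illustration.)
const nodes = [
  { id: 1, parentId: null, type: "div",  name: "root",  value: null,  order: 0 },
  { id: 2, parentId: 1,    type: "span", name: "title", value: "Hi",  order: 0 },
  { id: 3, parentId: 1,    type: "span", name: "body",  value: "...", order: 1 },
];

// Generic assembly: place each node under its parent without knowing
// anything about the specific payload shape.
function buildTree(flat) {
  const byId = new Map(flat.map((n) => [n.id, { ...n, children: [] }]));
  let root = null;
  for (const node of byId.values()) {
    if (node.parentId === null) root = node;
    else byId.get(node.parentId).children.push(node);
  }
  for (const node of byId.values()) {
    node.children.sort((a, b) => a.order - b.order);
  }
  return root;
}
```

Nodes can arrive in any order (as long as you buffer orphans or send parents first), which is what makes this shape friendly to streaming and to flat SQL rows.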
GraphQL has similar notions, e.g. @defer and @stream.
Am I the only person that dislikes progressive loading? Especially if it involves content jumping around.
And the most annoying antipattern is showing empty state UI during loading phase.
Right, that's why the emphasis is on intentionally designed loading states in this section: https://overreacted.io/progressive-json/#streaming-data-vs-s...
Quoting the article:
> You don’t actually want the page to jump arbitrarily as the data streams in. For example, maybe you never want to show the page without the post’s content. This is why React doesn’t display “holes” for pending Promises. Instead, it displays the closest declarative loading state, indicated by <Suspense>.
> In the above example, there are no <Suspense> boundaries in the tree. This means that, although React will receive the data as a stream, it will not actually display a “jumping” page to the user. It will wait for the entire page to be ready. However, you can opt into a progressively revealed loading state by wrapping a part of the UI tree into <Suspense>. This doesn’t change how the data is sent (it’s still as “streaming” as possible), but it changes when React reveals it to the user.
[…]
> In other words, the stages in which the UI gets revealed are decoupled from how the data arrives. The data is streamed as it becomes available, but we only want to reveal things to the user according to intentionally designed loading states.
Smalltalk UIs used to run on a single CPU thread. Any action from the user would freeze the whole UI while it was working, but the upside is that it was very predictable and bug-free. That's helpful, since Smalltalk is OOP.
Since React is functional programming, it works well with parallelization, so there is room for experiments.
> Especially if it involves content jumping around.
I remember this from the early days of Android: you'd search for something and, in the time it took you to click, the list of results had changed, so you clicked on something else. It still happens with ads on some websites, maybe intentionally?
> And the most annoying antipattern is showing empty state UI during loading phase.
Some low-quality software even shows "There are no results for your search" when the search hasn't even started or completed.
> Smalltalk UIs used to work with only one CPU thread. Any action from the user would freeze the whole UI while it was working …
If that happened, maybe a programmer messed up the green threads!
"The Smalltalk-80 system provides support for multiple independent processes with three classes named Process, ProcessorScheduler, and Semaphore. "
p. 251, "Smalltalk-80: The Language and its Implementation"
https://rmod-files.lille.inria.fr/FreeBooks/BlueBook/Blueboo...
You might be interested in the "remote data" pattern (for lack of a better name)
https://www.haskellpreneur.com/articles/slaying-a-ui-antipat...
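For reference, a minimal sketch of that pattern (the constructor and state names here are my own, not taken from the article): model all four request states explicitly, so the UI can never show an empty-results message while a request is still in flight:

```javascript
// "Remote data" as a tagged union: the request lifecycle is one of four
// explicit states instead of a bare null/boolean. (Names are illustrative.)
const RemoteData = {
  notAsked: () => ({ state: "notAsked" }),
  loading: () => ({ state: "loading" }),
  failure: (error) => ({ state: "failure", error }),
  success: (value) => ({ state: "success", value }),
};

// Rendering has to handle every state, so "No results" is only reachable
// after the request actually succeeded with an empty list.
function render(remote) {
  switch (remote.state) {
    case "notAsked": return "Type to search";
    case "loading":  return "Searching...";
    case "failure":  return `Error: ${remote.error}`;
    case "success":
      return remote.value.length ? remote.value.join(", ") : "No results";
  }
}
```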
The alternative is to stare at a blank page without any indication that something is happening.
It’s better than moving the link or button as I’m clicking it.
I'm sure that isn't the only alternative.
Or, you could use caches and other optimizations to serve content fast.
Ember did something like this, but it made writing Ajax endpoints a giant pain in the ass.
It's been so long since I used Ember that I've forgotten the terms, but essentially they rearranged the tree structure so that some of the children were at the end of the file. I believe it was meant to handle DAGs more efficiently, but I may have hallucinated that recollection.
But if you're using a SAX-style streaming parser, you can start making progress on painting, and perhaps on follow-up requests, while the initial data is still loading.
Of course, in a single-threaded VM you can snatch Defeat from the jaws of Victory if you bollocks up the order of operations, whether through direct mistakes or code evolution over time.
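The rearrangement described above (children serialized after their parents and referenced by id) can be sketched roughly like this; the `$ref` convention and row shape are made up for illustration, not Ember's actual wire format. Because shared children are sent once and referenced from multiple places, this also handles DAGs:

```javascript
// Each row is one node; {"$ref": n} points at a row that may arrive later
// in the stream. (Row shape and "$ref" key are illustrative.)
const rows = [
  { id: 1, value: { title: "Post", comments: { $ref: 2 } } },
  { id: 2, value: ["first!", { $ref: 3 }] },
  { id: 3, value: "shared footer" },
];

// Resolve references once all rows are in. Assumes row 1 is the root.
function resolve(rows) {
  const byId = new Map(rows.map((r) => [r.id, r.value]));
  const walk = (v) => {
    if (Array.isArray(v)) return v.map(walk);
    if (v && typeof v === "object") {
      if ("$ref" in v) return walk(byId.get(v.$ref));
      return Object.fromEntries(
        Object.entries(v).map(([k, x]) => [k, walk(x)])
      );
    }
    return v;
  };
  return walk(byId.get(1));
}
```

A streaming client could instead resolve each `$ref` as its row arrives, painting the parts of the tree that are already complete.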
I already use streaming partial JSON responses (progressive JSON) with AI tool calls in production.
It’s become a thing, even beyond RSCs, and has many practical uses if you stare at the client and server long enough.
Can you offer some detail into why you find this approach useful?
From an outsider's perspective: if you're sending JSON documents so big that parsing them takes long enough for reordering the content to have a measurable impact on performance, it sounds an awful lot like you're batching too much data, when you should be progressively fetching child resources in separate requests, or even implementing some sort of pagination.
LLM generation is slow. A progressive display of the progressive JSON is mandatory.
how do you do that exactly?
One way is to eagerly call JSON.parse on fragments as they come in. If you also split on JSON semantic boundaries like quotes, closing braces, and closing brackets, you can detect valid objects and start processing them while the stream continues.
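A rough sketch of that boundary-splitting idea (illustrative, not production-grade; it assumes the stream is a JSON array of objects and ignores top-level primitives):

```javascript
// Track brace depth and string state, and hand each complete top-level
// object to a callback as soon as its closing brace arrives, while the
// stream is still going.
function makeObjectExtractor(onObject) {
  let buf = "";
  let depth = 0;
  let inString = false;
  let escaped = false;
  return function feed(chunk) {
    for (const ch of chunk) {
      if (inString) {
        // Braces inside strings must not affect depth tracking.
        if (escaped) escaped = false;
        else if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
      } else if (ch === '"') {
        inString = true;
      } else if (ch === "{") {
        depth++;
      } else if (ch === "}") {
        depth--;
      }
      if (depth > 0 || (!inString && ch === "}")) buf += ch;
      if (depth === 0 && !inString && ch === "}" && buf) {
        onObject(JSON.parse(buf)); // object is balanced, parse eagerly
        buf = "";
      }
    }
  };
}
```

Chunks can be split at arbitrary points (mid-string, mid-object) and each object is still delivered exactly once, as soon as it is complete.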
Interesting approach! Thanks for sharing.
Not the original commenter, but I've done this too with Pydantic AI (actually, the library does it for you). See "Streaming Structured Output" here: https://ai.pydantic.dev/output/#streaming-structured-output
Thanks, yes! I'm aware of structured outputs; llama.cpp also has great support via GBNF, for several languages beyond JSON.
I've been trying to create Go/Rust ones, but it's way harder than just JSON due to all the context/state they carry over.