Some people here seem to be taking this post literally, as if the author (Dan Abramov) were proposing a new format called Progressive JSON. He isn't.
This is more of a post explaining the idea behind React Server Components, where component trees are represented as JavaScript objects and then streamed over the wire in a format similar to the one in the blog post (with similar features, though AFAIK the exact format is bundler/framework specific).
This allows React to leave holes (representing loading states) in the tree, display fallback states on first load, and then fill in the loaded component tree once the server can actually provide the data. That means you can show the fallback spinner and the skeleton much faster, with more fine-grained loading.
(This comment is probably wrong in various ways if you get pedantic, but I think I got the main idea right.)
Yup! To be fair, I also don't mind if people take the described ideas and do something else with them. I wanted to describe RSC's take on data serialization without it seeming too React-specific, because the ideas are actually more general. I'd love it if more ideas I saw in RSC made it to other technologies.
Hi Dan! Really interesting post.
Do you think a new data serialization format built around easier generation/parsing, one that also happens to be streamable because it's line-based like JSONL, could be useful for some?
I don’t know! I think it depends on whether you’re running into any of these problems and have levers to fix them. RSC was specifically designed for that so I was trying to explain its design choices. If you’re building a serializer then I think it’s worth thinking about the format’s characteristics.
Awesome, thanks! I do keep running into these issues, but the levers, as you say, make it harder to implement.
As of right now, I could only replace JSON tool calling for LLMs on something I fully control, like vLLM. The big labs are probably happy to overcharge 20-30% extra tokens for each tool call, so they wouldn't really be interested in replacing JSON any time soon.
It also feels like battling a giant, since JSON is already a standard. Maybe there's a place for it in really specialized workflows where those savings make the difference (not only money: you also gain a 20-30% larger token window if you don't waste it on quotes and braces and whatnot).
Thanks for replying!
I've used React in the past to build some applications and components. Not familiar with RSC.
What immediately comes to mind is using a uniform recursive tree instead, where each node has the same fields. In a funny way that would mimic the DOM if you squint. Each node would encode its type, id, name, value, parent_id, and order, for example. The engine in front can then generically put everything into the right place.
I don't know whether that is feasible here. Just a thought. I've used similar structures in data driven react (and other) applications.
It's also efficient to encode in memory, because you can put it into a flat, compact array. And it fits nicely into SQL databases as well.
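The uniform-node idea above can be sketched roughly like this (the field names and `buildTree` helper are illustrative, not an existing API); since every node has the same shape, the engine in front can assemble any payload generically:

```javascript
// Flat array of uniform nodes: same fields for every node, tree structure
// encoded via parentId/order. (Names are made up for illustration.)
const nodes = [
  { id: 1, parentId: null, type: "div",  name: "root",  value: null,  order: 0 },
  { id: 2, parentId: 1,    type: "span", name: "title", value: "Hi",  order: 0 },
  { id: 3, parentId: 1,    type: "span", name: "body",  value: "...", order: 1 },
];

// Generic assembly: place each node under its parent without knowing
// anything about the specific payload shape.
function buildTree(flat) {
  const byId = new Map(flat.map((n) => [n.id, { ...n, children: [] }]));
  let root = null;
  for (const node of byId.values()) {
    if (node.parentId === null) root = node;
    else byId.get(node.parentId).children.push(node);
  }
  for (const node of byId.values()) {
    node.children.sort((a, b) => a.order - b.order);
  }
  return root;
}
```

Nodes can arrive in any order (as long as you buffer orphans or send parents first), which is what makes this shape friendly to streaming and to flat SQL rows.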
GraphQL has similar notions, e.g. @defer and @stream.
Am I the only person that dislikes progressive loading? Especially if it involves content jumping around.
And the most annoying antipattern is showing empty state UI during loading phase.
Right, that's why the emphasis is on intentionally designed loading states in this section: https://overreacted.io/progressive-json/#streaming-data-vs-s...
Quoting the article:
> You don’t actually want the page to jump arbitrarily as the data streams in. For example, maybe you never want to show the page without the post’s content. This is why React doesn’t display “holes” for pending Promises. Instead, it displays the closest declarative loading state, indicated by <Suspense>.
> In the above example, there are no <Suspense> boundaries in the tree. This means that, although React will receive the data as a stream, it will not actually display a “jumping” page to the user. It will wait for the entire page to be ready. However, you can opt into a progressively revealed loading state by wrapping a part of the UI tree into <Suspense>. This doesn’t change how the data is sent (it’s still as “streaming” as possible), but it changes when React reveals it to the user.
[…]
> In other words, the stages in which the UI gets revealed are decoupled from how the data arrives. The data is streamed as it becomes available, but we only want to reveal things to the user according to intentionally designed loading states.
Smalltalk UIs used to run on a single CPU thread. Any action from the user would freeze the whole UI while it was working, but the upside is that it was very predictable and bug-free. That's helpful, since Smalltalk is OOP.
Since React is functional programming, it works well with parallelization, so there is room for experiments.
> Especially if it involves content jumping around.
I remember this from the early days of Android: you'd search for something and, in the time it took you to click, the list of results had changed, so you clicked on something else. It still happens with ads on some websites, maybe intentionally?
> And the most annoying antipattern is showing empty state UI during loading phase.
Some low-quality software even shows "There are no results for your search" when the search hasn't even started or completed.
> Smalltalk UIs used to work with only one CPU thread. Any action from the user would freeze the whole UI while it was working …
If that happened, maybe a programmer messed up the green threads!
"The Smalltalk-80 system provides support for multiple independent processes with three classes named Process, ProcessorScheduler, and Semaphore. "
p. 251, "Smalltalk-80: The Language and its Implementation"
https://rmod-files.lille.inria.fr/FreeBooks/BlueBook/Blueboo...
You might be interested in the "remote data" pattern (for lack of a better name)
https://www.haskellpreneur.com/articles/slaying-a-ui-antipat...
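For reference, a minimal sketch of that pattern (the constructor and state names here are my own, not taken from the article): model all four request states explicitly, so the UI can never show an empty-results message while a request is still in flight:

```javascript
// "Remote data" as a tagged union: the request lifecycle is one of four
// explicit states instead of a bare null/boolean. (Names are illustrative.)
const RemoteData = {
  notAsked: () => ({ state: "notAsked" }),
  loading: () => ({ state: "loading" }),
  failure: (error) => ({ state: "failure", error }),
  success: (value) => ({ state: "success", value }),
};

// Rendering has to handle every state, so "No results" is only reachable
// after the request actually succeeded with an empty list.
function render(remote) {
  switch (remote.state) {
    case "notAsked": return "Type to search";
    case "loading":  return "Searching...";
    case "failure":  return `Error: ${remote.error}`;
    case "success":
      return remote.value.length ? remote.value.join(", ") : "No results";
  }
}
```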
The alternative is to stare at a blank page without any indication that something is happening.
It’s better than moving the link or button as I’m clicking it.
I'm sure that isn't the only alternative.
Or, you could use caches and other optimizations to serve content fast.
Ember did something like this, but it made writing Ajax endpoints a giant pain in the ass.
It's been so long since I used Ember that I've forgotten the terms, but essentially they rearranged the tree structure so that some of the children were at the end of the file. I believe it was meant to handle DAGs more efficiently, but I may have hallucinated that recollection.
But if you're using a SAX-style streaming parser, you can start making progress on painting, and perhaps on follow-up requests, while the initial data is still loading.
Of course, in a single-threaded VM you can snatch Defeat from the jaws of Victory if you bollocks up the order of operations, whether through direct mistakes or code evolution over time.
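The rearrangement described above (children serialized after their parents and referenced by id) can be sketched roughly like this; the `$ref` convention and row shape are made up for illustration, not Ember's actual wire format. Because shared children are sent once and referenced from multiple places, this also handles DAGs:

```javascript
// Each row is one node; {"$ref": n} points at a row that may arrive later
// in the stream. (Row shape and "$ref" key are illustrative.)
const rows = [
  { id: 1, value: { title: "Post", comments: { $ref: 2 } } },
  { id: 2, value: ["first!", { $ref: 3 }] },
  { id: 3, value: "shared footer" },
];

// Resolve references once all rows are in. Assumes row 1 is the root.
function resolve(rows) {
  const byId = new Map(rows.map((r) => [r.id, r.value]));
  const walk = (v) => {
    if (Array.isArray(v)) return v.map(walk);
    if (v && typeof v === "object") {
      if ("$ref" in v) return walk(byId.get(v.$ref));
      return Object.fromEntries(
        Object.entries(v).map(([k, x]) => [k, walk(x)])
      );
    }
    return v;
  };
  return walk(byId.get(1));
}
```

A streaming client could instead resolve each `$ref` as its row arrives, painting the parts of the tree that are already complete.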
I already use streaming partial JSON responses (progressive JSON) with AI tool calls in production.
It’s become a thing, even beyond RSCs, and has many practical uses if you stare at the client and server long enough.
Can you offer some detail into why you find this approach useful?
From an outsider's perspective: if you're sending JSON documents so big that parsing them takes long enough for reordering the content to have a measurable impact on performance, it sounds an awful lot like you're batching too much data, when you should be progressively fetching child resources in separate requests, or even implementing some sort of pagination.
LLM generation is slow. A progressive display of the progressive JSON is mandatory.
how do you do that exactly?
One way is to eagerly call JSON.parse on fragments as they come in. If you also split on JSON semantic boundaries like quotes, closing braces, and closing brackets, you can detect valid objects and start processing them while the stream continues.
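A rough sketch of that boundary-splitting idea (illustrative, not production-grade; it assumes the stream is a JSON array of objects and ignores top-level primitives):

```javascript
// Track brace depth and string state, and hand each complete top-level
// object to a callback as soon as its closing brace arrives, while the
// stream is still going.
function makeObjectExtractor(onObject) {
  let buf = "";
  let depth = 0;
  let inString = false;
  let escaped = false;
  return function feed(chunk) {
    for (const ch of chunk) {
      if (inString) {
        // Braces inside strings must not affect depth tracking.
        if (escaped) escaped = false;
        else if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
      } else if (ch === '"') {
        inString = true;
      } else if (ch === "{") {
        depth++;
      } else if (ch === "}") {
        depth--;
      }
      if (depth > 0 || (!inString && ch === "}")) buf += ch;
      if (depth === 0 && !inString && ch === "}" && buf) {
        onObject(JSON.parse(buf)); // object is balanced, parse eagerly
        buf = "";
      }
    }
  };
}
```

Chunks can be split at arbitrary points (mid-string, mid-object) and each object is still delivered exactly once, as soon as it is complete.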
Interesting approach! Thanks for sharing.
Not the original commenter, but I've done this too with Pydantic AI (actually, the library does it for you). See "Streaming Structured Output" here: https://ai.pydantic.dev/output/#streaming-structured-output
Thanks, yes! I'm aware of structured outputs; llama.cpp also has great support via GBNF, for several languages beyond JSON.
I've been trying to create Go/Rust ones, but it's way harder than just JSON due to all the context/state they carry over.