Cute but why? Human-based rankings rarely align and for good reason. In ranking, you are reducing multiple quantitative and qualitative attributes (and their combinations) to a single dimension. You will lose information.
To illustrate further, I picked “electric guitars”. The top two were obvious and boring, and the rest was a weird hodgepodge. Significantly, no consideration is given to whether the person requesting the ranking likes to play jazz or metal or country, has small hands, requires active electronics, likes trems, or whatever. So it’s a fine exercise in showing LLMs doing a thing, but it adds little or no value over just doing a web search. Or, more appropriately, having a conversation with an experienced guitar player about what I want in a guitar.
We absolutely do lose information here; that's a great point. The goal for us wasn't necessarily to surface the best ranking; it was to learn how LLMs produce a given ranking and what sources they pull in.
The nugget of real interest here (personally speaking) is in those citations: what is the new meta for products getting ranked or referred by LLMs?