The feature matrix says cbz/zip doesn't have random page access, but it definitely does. Zip also supports appending more files without too much overhead.
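To make the random access and append points concrete, here is a minimal sketch using Python's standard zipfile module. The helper names (build_cbz, read_page, append_page) and the page names are made up for illustration, and an in-memory buffer stands in for a real .cbz file.

```python
import io
import zipfile


def build_cbz(pages):
    """Return an in-memory zip archive holding the given {name: bytes} pages."""
    buf = io.BytesIO()
    # ZIP_STORED: no compression, which is typical for already-compressed JPEGs.
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in pages.items():
            zf.writestr(name, data)
    return buf


def read_page(buf, name):
    """Read one member by name.

    The central directory at the end of the archive gives the member's
    offset, so the other members are not read or decompressed.
    """
    with zipfile.ZipFile(buf, "r") as zf:
        return zf.read(name)


def append_page(buf, name, data):
    """Append a member in place.

    Mode "a" writes the new entry where the old central directory started
    and then writes a fresh central directory; existing member data is
    left untouched.
    """
    with zipfile.ZipFile(buf, "a", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr(name, data)
```

Usage would look like read_page(buf, "002.jpg") to pull a single page, or append_page(buf, "003.jpg", data) to add one without rewriting the pages already stored.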
Certainly there's a complexity argument to be made, because you don't actually need compression just to hold a bundle of files. But these days zip just works.
The perf measurement charts also make no sense. What exactly are they measuring?
Edit:
This reddit post seems to go into more depth on performance: old.reddit.com/r/selfhosted/comments/1qi64pr/comment/o0pqaeo/
Zip also has per-asset checksums, contrary to the comparison table.
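As a rough illustration of the per-asset checksum point: every zip entry carries a CRC-32 of its uncompressed data, and Python's zipfile exposes it. The function names and the demo archive below are hypothetical; only the zipfile and zlib calls are real.

```python
import io
import zipfile
import zlib


def make_demo_archive(pages):
    """Build a small in-memory archive from {name: bytes} (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in pages.items():
            zf.writestr(name, data)
    return buf


def entry_checksums(archive):
    """Map each member name to the CRC-32 recorded for it in the archive."""
    with zipfile.ZipFile(archive) as zf:
        return {info.filename: info.CRC for info in zf.infolist()}


def first_corrupt_entry(archive):
    """Re-read every member and compare against its stored CRC-32.

    Returns the name of the first member that fails, or None if all pass.
    """
    with zipfile.ZipFile(archive) as zf:
        return zf.testzip()
```

The stored value is a plain CRC-32, so it matches zlib.crc32 of the page bytes. It detects accidental corruption, not tampering.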
And what's the point of aligning the files to be "DirectStorage-ready" if they're going to be JPEGs, a format that, as far as I know, DirectStorage doesn't understand?
And the author says it's a problem that "Metadata isn't native to CBZ, you have to use a ComicInfo.xml file.", but... that's not a problem at all?
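For what it's worth, pulling metadata out of a ComicInfo.xml inside a CBZ is only a few lines. This is a sketch, assuming the file sits at the archive root under exactly that name and holds flat child elements such as Title and Series; read_comic_info is a made-up helper name.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_comic_info(archive):
    """Return ComicInfo.xml's top-level elements as a dict.

    Returns an empty dict when the archive has no ComicInfo.xml.
    Assumes the file lives at the archive root (an assumption, not a rule
    enforced by the zip format).
    """
    with zipfile.ZipFile(archive) as zf:
        try:
            raw = zf.read("ComicInfo.xml")
        except KeyError:  # zipfile raises KeyError for a missing member
            return {}
    root = ET.fromstring(raw)
    return {child.tag: (child.text or "") for child in root}


def make_demo_cbz(xml_bytes=None):
    """Build an in-memory CBZ with one page and optional metadata (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("001.jpg", b"not a real jpeg")
        if xml_bytes is not None:
            zf.writestr("ComicInfo.xml", xml_bytes)
    return buf
```

Only the one small member is read; the page images are never touched.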
The whole thing makes no sense.
It makes no sense because it's some degree of AI slop: https://reddit.com/r/selfhosted/comments/1qi64pr/i_got_into_...
Note that when asked point-blank how much AI he used in his erroneous microbenchmarking, he doesn't quite say that he didn't use AI: https://reddit.com/r/selfhosted/comments/1qi64pr/i_got_into_...
Which explains all of it.
Kudos to /u/teraflop, for having infinitely more patience with this than I would.
That whole subreddit has unfortunately become inundated with AI slop.
It used to be a decent resource to learn about what services people were self hosting. But now, many posts are variations of, “I’ve made this huge complicated app in an afternoon please install it on your server”. I’ve even seen a vibe-coded password manager posted there.
Reputable alternatives to the software posted there exist a huge amount of the time. Not to mention audited alternatives in the case of password managers, or even just actively maintained alternatives.
Three days ago the rules changed so that vibe-coded stuff is only allowed on Fridays.
https://old.reddit.com/r/selfhosted/comments/1qfp2t0/mod_ann...
I'm a moderator for a decently large programming subreddit, and I'd estimate that about half the project submissions are now obvious slop. You get a very good nose for sniffing that stuff out after a while, though it can be frustrating when you can't really convince other people beyond going "trust me, it's slop".
Bullshit asymmetry by way of impulsive LLM slop strikes again.
Every new readme, announcement post, and codebase is tailored to achieve maximum bloviation.
No substance, no credibility———just vibes.
If you read the reddit thread, it was coded by hand and then only bug-checked with AI.
It was benchmarked with AI. Benchmarks being the main reason for this thing existing...
After reading the reddit comments, it looks like a primary problem is that the author doesn't (didn't?) understand how to benchmark it correctly. Like comparing the time to mmap() a file with the time to actually read the same file. Not at all the same thing.
For example: https://old.reddit.com/r/selfhosted/comments/1qi64pr/i_got_i...
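To show why the mmap() comparison is misleading, here is a minimal sketch of the three things one could time. It is not the author's benchmark; the function names are hypothetical. Creating a mapping mostly sets up page tables, so it can return before any file data has been read, while read() has to copy every byte.

```python
import mmap
import os
import tempfile
import time


def make_sample_file(size):
    """Write `size` bytes to a temp file and return its path (demo helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"x" * size)
    return path


def time_mmap_only(path):
    """Time creating the mapping alone; the data need not have been read yet."""
    with open(path, "rb") as f:
        start = time.perf_counter()
        m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        elapsed = time.perf_counter() - start
        m.close()
    return elapsed


def time_mmap_and_touch(path):
    """Time mapping plus copying every byte out, which forces the pages in."""
    with open(path, "rb") as f:
        start = time.perf_counter()
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            data = m[:]
        elapsed = time.perf_counter() - start
    return elapsed, len(data)


def time_read(path):
    """Time a plain read() of the whole file."""
    with open(path, "rb") as f:
        start = time.perf_counter()
        data = f.read()
        elapsed = time.perf_counter() - start
    return elapsed, len(data)
```

A fair comparison is time_mmap_and_touch against time_read, not time_mmap_only against time_read. Even then, a freshly written file is probably still in the OS page cache, so a single run like this says little about cold-disk behavior.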
I mean, it's open source, so people can create a benchmark and independently verify whether the AI was wrong, and then have the results passed on to the author.
I haven't read the reddit thread or anything, but if the author coded it by hand or is passionate about this project, he will probably understand what we are talking about.
But I don't believe it's such a big deal to have a benchmark be written by AI though, no?
> I mean, it's open source, so people can create a benchmark and independently verify whether the AI was wrong, and then have the results passed on to the author.
Thank you for volunteering. I look forward to your results.
> Thank you for volunteering. I look forward to your results.
Sure, can you wait a few weeks though? I know nothing about benchmarking, so I'm going to learn it first, and I have a few tests to prepare for IRL.
I do feel like someone else more passionate about the project should try to pick the benchmarking though.
I don't mind benchmarking it, but I only know tools like hyper for benchmarks. I have played with my fair share of zip archives and their random access retrieval, but I feel like even that would vary from source to source.
There are some experienced people in here who are really good at what they do. I just wanted to say that if someone's interested, already has the domain-specific knowledge to benchmark, and enjoys it in the first place, then the benchmark having been written by AI shouldn't be much of a problem in comparison.
Why would someone spend their time checking someone else's AI slop when that person couldn't even be bothered to write the basic checks that prove their project was worthwhile?