I like the pelican I got out of deepseek-v4-flash more than the one I got from deepseek-v4-pro.
Flash: https://gist.github.com/simonw/4a7a9e75b666a58a0cf81495acddf...
Pro: https://gist.github.com/simonw/9e8dfed68933ab752c9cf27a03250...
Both generated using OpenRouter.
For comparison, here's what I got from DeepSeek 3.2 back in December: https://simonwillison.net/2025/Dec/1/deepseek-v32/
And DeepSeek 3.1 in August: https://simonwillison.net/2025/Aug/22/deepseek-31/
And DeepSeek v3-0324 in March last year: https://simonwillison.net/2025/Mar/24/deepseek/
DeepSeek pelicans are the angriest pelicans I’ve seen so far.
they're just late for work.
No way. The Pro pelican is fatter, has a customized front fork, and the sun is shining! He’s definitely living the best life.
yeah. look at these 4 feathers (?) on his bum too.
a lot of dumplings
What was your prompt for the image? Apologies if this should be obvious.
>Generate an SVG of a pelican riding a bicycle
at the top of the linked pages.
The Flash one is pretty impressive. Might be my favorite so far in the pelican-riding-a-bicycle series
Being a bicycle geometry nerd I always look at the bicycle first.
Let me tell you how much the Pro one sucks... It looks like a failed Pedersen[1]. The rear wheel intersects with the bottom bracket, so it wouldn't even roll. Or rather, this bike couldn't exist.
The Flash one looks surprisingly correct, with some wild fork offset and the slackest of seat tubes. It's got some lowrider[2] aspirations. The seat post has a different angle than the seat tube, so good luck lowering that.
This is an excellent comment. Thanks for this - I've only ever thought about whether the frame is the right shape, I never thought about how different illustrations might map to different bicycle categories.
Some other reactions:
I wonder which model will try some more common spoke lacing patterns. Right now there seems to be a preference for radial lacing, which is not super common (but simple to draw). The Flash and Pro ones both use 16-spoke rims, which actually exist[1] but are not super common. The Pro model fails badly at the spoke pattern.
[1] https://cicli-berlinetta.com/product/campagnolo-shamal-16-sp...
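To illustrate why radial lacing is the easy thing to draw, here is a minimal sketch of a 16-spoke radial wheel in SVG. The function name, coordinates, and radii are made up for illustration and are not taken from either model's output.

```python
import math


def radial_spokes_svg(cx, cy, hub_r, rim_r, count=16):
    """Return SVG <line> elements for a radially laced wheel.

    Hypothetical helper: each spoke leaves the hub and reaches the rim
    at the same angle, so one angle per spoke is all that is needed.
    """
    lines = []
    for i in range(count):
        angle = 2 * math.pi * i / count
        # Hub end and rim end share the same angle (no crossing).
        x1 = cx + hub_r * math.cos(angle)
        y1 = cy + hub_r * math.sin(angle)
        x2 = cx + rim_r * math.cos(angle)
        y2 = cy + rim_r * math.sin(angle)
        lines.append(
            f'<line x1="{x1:.1f}" y1="{y1:.1f}" '
            f'x2="{x2:.1f}" y2="{y2:.1f}" stroke="black"/>'
        )
    return lines


if __name__ == "__main__":
    spokes = "\n".join(radial_spokes_svg(100, 100, 5, 80))
    print(
        '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 200">\n'
        '<circle cx="100" cy="100" r="80" fill="none" stroke="black"/>\n'
        f"{spokes}\n</svg>"
    )
```

A crossed pattern would need the hub end and the rim end of each spoke at different angles, alternating direction, which is more bookkeeping and probably part of why it shows up less often in these drawings.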
The Pedersen looks like someone failed the "draw a bicycle" test and decided to adjust the universe.
Where is the GPT 5.5 Pelican?
Why are they so angry?
I really like the pro version. The pelican is so cute.
[flagged]
It's just Simon Willison (the person you are replying to) who always makes a pelican, as his personal flippant benchmark. It's not that deep.
No benchmark will be perfect, especially if it's public, but it's a fun experiment to visually see how these models get better and better.
Why is it so wrong?
Thanks for the "scientific air" remark, that gave me a genuine LOL.
I think the pelican on a bike is known widely enough that it ceases to be useful as a benchmark. There is even a pelican briefly appearing in the promo video of GPT-5, if I'm not mistaken: https://openai.com/gpt-5/. So the companies are apparently aware of it.