A decade of "the best smartphone camera competitions" by mkbhd has clearly highlighted what is happening here.
1: In A/B testing, nearly everyone, including pixel peepers, prefers a more vibrant photo.
2: The traditional perspective of "a photo should look as close as possible to what my eyes see if I drop the viewfinder" is increasingly uncommon; almost nobody pursues it in the digital age.
3: Phone companies know the above, and basically all of them engage in varying degrees of "crank vibrance until people start to look like clowns, then apply a skin correction so you can keep the rest mega vibrant" (roughly the pipeline sketched below), with an extra dash of "if culturally accepted by the primary audience, add additional face filtering to improve how people look, including airbrushing and thinning of the face".
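For the curious, point 3 boils down to something like this minimal sketch (assuming OpenCV; the skin-hue range and boost factors are numbers I made up for illustration, not any vendor's actual tuning):

    import cv2
    import numpy as np

    def punchy(bgr):
        """Crank saturation, but back off where the pixel looks like skin."""
        hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV).astype(np.float32)
        h, s, v = cv2.split(hsv)
        # Crude "skin" mask: OpenCV hues near red/orange (H runs 0..179).
        skin = ((h < 25) | (h > 165)) & (s > 30) & (s < 180)
        # Mega vibrance everywhere, gentle vibrance on faces.
        s = np.clip(s * np.where(skin, 1.1, 1.6), 0, 255)
        out = cv2.merge([h, s, v]).astype(np.uint8)
        return cv2.cvtColor(out, cv2.COLOR_HSV2BGR)

Real pipelines do this with face detection and far more sophisticated tone mapping, but the shape of the trick is the same.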
This is rightfully compared to the loudness wars, and I think the comparison is accurate. It really became a race to the bottom once we collectively decided that "accurate" photos were not interesting and that we want the "best" photos.
I fully agree with your observations, and would add that the irony of such a pursuit by phone makers is that serious hobbyist/amateur/professional photographers and videographers understand that cameras are inherently inaccurate: what we're really capturing is an interpretation of what we're seeing, through imperfect glass, coatings, and sensor media, to form an artistic creation. Sure, cameras can be used for accuracy, but those models and lenses are often expensive and aimed at specific industries.
We enjoy the imperfections of cameras because they let us create art. Smartphone makers take advantage of that by, as you put it, cranking things to eleven to manipulate psychology rather than investing in more accurate platforms that require skill. The ease is the point, but ease rarely creates lasting art that the creator is genuinely proud of or whose merit others appreciate.
I don't spend too much time thinking about cameras or lenses, but this kind of conversation makes me wonder: when I take photos of receipts or street signs, or just text in general, is it possible that at some point the computational photography makes a mistake and changes the text? Or am I being paranoid?
Worse, Xerox scanners specifically meant for digitizing documents have been changing text for a long time. The compression algorithm they used (I think even in the default settings) sometimes replaced e.g. a 6 with an 8, and similar things. See: https://www.youtube.com/watch?v=7FeqF1-Z1g0 (German, but there should be English news articles from back then somewhere as well).
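From what I remember of the writeup, the mechanism was JBIG2's "pattern matching & substitution" mode: the encoder deduplicates glyph bitmaps by similarity, so a noisy 6 can end up stored as a pointer to an 8. A toy sketch of the idea (the threshold is made up; this is not the actual codec):

    import numpy as np

    def compress_glyphs(glyphs, max_diff=0.12):
        """glyphs: same-shaped boolean arrays (one bitmap per character)."""
        dictionary, indices = [], []
        for g in glyphs:
            for i, d in enumerate(dictionary):
                if np.mean(g != d) <= max_diff:  # "close enough": reuse symbol
                    indices.append(i)
                    break
            else:
                dictionary.append(g)             # genuinely new symbol
                indices.append(len(dictionary) - 1)
        return dictionary, indices  # decoder just draws dictionary[i] for each i

The output looks perfectly crisp, which is exactly why nobody noticed for years: there's no blur to warn you the digit was swapped.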
That's not really "computational photography" in any meaningful sense; it's closer to "digital processing". It's not impossible for such glitches to occur with modern smartphone cameras, but it's implausible. I don't think there has ever been a confirmed instance of such a gaffe happening. Meanwhile, a few years ago a photo with a misplaced leaf made the rounds, and people complained that it was caused by computational photography, but it turned out the photo was accurate. The leaf was actually there.
My point was that you don't have to take a photo of a receipt to run into this issue; actual machines specifically built to digitize receipts and other documents have already made this kind of mistake.
No idea if this can happen with what modern smartphone cameras do to photos. If "AI" is involved, then I would expect such issues to be possible, because these models are at bottom random generators, just like how LLMs hallucinate stuff all the time. Other "enhancement" approaches might not produce issues like this.
iPhones can definitely garble text, although it's not clear whether they can substitute one piece of text for another. It seems possible but unlikely (in a purely statistical sense).
https://www.reddit.com/r/iphone/comments/1m5zsj7/ai_photo_ga...
https://www.reddit.com/r/iphone/comments/1jbcl1l/iphone_16_p...
https://www.reddit.com/r/iphone/comments/17bxcm8/iphone_15_n...
> is it possible that at some point the computational photography makes a mistake and changes text?
Yes it is. I've seen that happen in real-time with the built-in camera viewfinder (not even taking a photo) on my mid-range Samsung phone, when I zoomed in on a sign.
It only changed one letter, and it was more like a strange optical warping from one letter to a different one when I pointed the camera at a particular sign in a shop, but it was very surprising to see.
Xerox scanners/photocopiers had this problem.
It was the compression format, not the scanner, right? The same would have happened if you stored the file in that format (with the same quality settings, etc.) on a computer or smartphone.
Not that that helps anyone who's affected, but that situation is more like having a hypothetical .aip file, an "AI Photo" storage format that invents details when you zoom in, rather than a sensor (pipeline) issue.
No, they exhibited it in plain instant single-copy copying mode.
Oh wtf! I had Ctrl+F'd the article for "cop" (to catch "copy", "copies", and such) to quickly check this, but didn't see that. Then I guess I don't remember the root cause of this issue.
It's definitely a possibility if there's a point where LLM-based OCR is applied.
See https://www.runpulse.com/blog/why-llms-suck-at-ocr and its related HN discussion https://news.ycombinator.com/item?id=42966958
As with almost everything LLMs do, you don't need an LLM to make these mistakes.
LLM-based OCR and speech transcription do come with a failure mode that is different from what you see in pre-LLM solutions. When the source data is hard to understand, LLMs try to fill the gap with something that makes sense given the surrounding context.
Pre-LLM approaches handle unintelligible source data differently. You'll more commonly see nonsense output for the unintelligible bits. In some cases the tool might be capable of recognizing low confidence and returning an error or other indicator of a possible miss.
IMO, that's a feature. The LLM approach makes up something that looks right but may not actually match the source data. These errors are far harder to detect and more likely to make it past human review.
The LLM approach does mean that you can often get a more "complete" output from a low quality data source vs pre-LLM approaches. And sometimes it might even be correct! But it will get it wrong other times.
Another failure mode I've experienced with LLM-based voice transcription that I didn't have pre-LLM: running down the wrong fork in the road. Sometimes the LLM approaches will get a word or two wrong (words with similar phonetics or multiple meanings, that kind of thing) and then continue down the path this mistaken context has created, outputting additional words that do not align with the source data at all.
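To make the pre-LLM confidence signal concrete, here's a minimal sketch using Tesseract via pytesseract (the filename and threshold are placeholders I picked for illustration):

    import pytesseract
    from PIL import Image

    # image_to_data returns per-word text plus a confidence score.
    data = pytesseract.image_to_data(
        Image.open("receipt.png"), output_type=pytesseract.Output.DICT
    )
    for word, conf in zip(data["text"], data["conf"]):
        if word.strip() and float(conf) < 60:  # arbitrary cutoff
            print(f"low confidence ({conf}): {word!r} -> flag for human review")

An LLM-based pipeline typically gives you no equivalent per-word score; the made-up word and the correctly read word come out looking equally confident.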
Having uploaded my share of receipts to Concur, there are two checks and balances: if you still have the original, you can correct the OCR'd value, and Concur will recognize both line items and totals and whine if they don't match.
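That second check is easy to picture; a minimal sketch (the numbers and the function are mine, not Concur's):

    from decimal import Decimal

    def totals_match(line_items, stated_total):
        """Flag receipts where OCR'd line items don't sum to the OCR'd total."""
        return sum(line_items, Decimal("0")) == stated_total

    # One mis-read digit in a line item and the check trips:
    print(totals_match([Decimal("12.50"), Decimal("3.99")], Decimal("16.49")))  # True
    print(totals_match([Decimal("12.50"), Decimal("8.99")], Decimal("16.49")))  # False

It won't catch a mistake that corrupts a line item and the total consistently, but it catches the common single-field misreads.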
> We enjoy the imperfections of cameras because they let us create art
For something as widespread as photography I'm not sure you can define a "we". Even pro photographers often have a hard time relating to each other's workflows because they're so different based on what they're shooting.
The folks taking pictures of paintings for preservation are going to be lighting, exposing, and editing very differently from the folks shooting weddings, who will be shooting differently from the folks doing architecture or real estate shots. If you've ever studied under a photographer or studied in school, you'll learn this pretty quickly.
There's a point to be made here that an iPhone is more opinionated than a camera, but in my experience most pro photographers edit their shots, even if it's just a bulk application of exposure correction and an appropriate color profile. In that way a smartphone shot may have the composition of the shooter but not the color and processing choices the shooter might want. But one can argue that fixed-lens compacts shooting JPEG are often similarly opinionated. The difference of opinion is one of degree, not absolutes.
As an aside, this appeal to a collective form of absolute values in photography bothers me. It seems to me to be a way to frame the conversation emotionally, to create an "us vs them" dynamic. But the reality of professional photography is that there are very few absolute values in photography except the physical existence of the exposure triangle.
There's no such thing as an "accurate photograph". I don't think we can even agree on whether two humans perceive the same picture the same way.
I do think the average person today should learn the basics of photography in school, simply because of how much our daily lives are influenced by images and the visual language of others. I'd love to see additions to civics and social science classes that discuss the effects of focal length, saturation, and depth of field on composition. But I don't think that yearning for an "accurate photo" is the way.
> "the best smartphone camera competitions" by mkbhd
Also, in normal phone reviews, they always put pictures from different phones next to each other so that people can form their own opinion on what they prefer. But how is the reader to know what it really looked like? The reviewer should compare the shots against what they actually saw, and the mood they felt in the moment, and give a verdict on which camera captured that.
Of course nicer colors look nicer, but that's not the camera's job: I can turn that up myself if I want it. For that to work well, the camera needs to know what's there in the first place.
Eyeing the raw results from the pro capture mode vs. the automagic results of my five-year-old €300 phone, it does an amazing job of removing sensor noise and improving lighting in ways that I usually can't replicate short of using a tripod and a whole lot of image stacking. The only exception is extreme contrast, such as a full moon in a dark sky, or rays of direct sunlight (at sunrise) on half of a rolling hill while the other half is still in complete shadow. Then the only solution is to take two pictures, one where you can see the dark bit and one where you can see the bright bit, and stitch them together.
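For anyone wanting to DIY that last trick, OpenCV's Mertens exposure fusion does the stitching from a bracketed pair; a minimal sketch (filenames are placeholders, and the frames need to be aligned, e.g. shot from a tripod):

    import cv2

    dark = cv2.imread("exposed_for_highlights.jpg")  # moon / sunlit half OK
    bright = cv2.imread("exposed_for_shadows.jpg")   # dark half OK

    # Mertens fusion blends the well-exposed regions of each frame,
    # with no full HDR pipeline or tone-mapping step required.
    fused = cv2.createMergeMertens().process([dark, bright])  # float, ~0..1
    cv2.imwrite("fused.jpg", (fused * 255).clip(0, 255).astype("uint8"))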
Yes, and before photography existed, people expected painters to prioritize a flattering appearance instead of realism when commissioning portraits, too. And landscape painters used more vivid colors than in real life to convey a mood. But now that it's regular people preferring the same, it's suddenly bad.
It's not bad in the sense of taste, but it's bad in the sense that phone cameras lack the option.
After taking a photo I adjust brightness, contrast, etc. to make the picture on the screen match what I see in front of me. Sometimes this really brings the mood into the shot.
This is also why I get much better results on a phone than on any fancy camera with a smaller or different display: the phone's display closely or exactly matches what those who view the image will see.
It's exactly the reverse for me, with the potential for professional-grade photos, which I get roughly 25% of the time with a camera. Some sessions are close to 100% pro quality.
Zooming and cropping are horrible on phones, so I carry a Lumix with 8x zoom, smaller than a phone, everywhere. And on every car trip, a very old Coolpix with 42x zoom (24-1000mm effective), small enough to fit in a small shoulder bag. I used the Coolpix on Monday to get close-up photos of two county police being helicoptered on a long line to a drug bust before anyone knew what was going down.
Phone photographers (I am one) look for scenes within the phone's limits. I used the Coolpix for over 10 years, capturing everything without limit.
For real fun, take photos in Yosemite with an iPad. It looks like a view camera, with the very large display. Add a western hat and a beard, and impersonate Ansel Adams. :-)
> compared to the loudness wars
I would like to compare it to "cinema mode" on my television.
I sometimes turn on cinema mode, but although the colors have more subtlety, nuance, and accuracy... the dim picture just doesn't hold up as well as you'd think against a much brighter one.
sigh.
That said, it's a little annoying that the apple camera app doesn't capture raw out-of-the-box.
You can do raw out of the box - you just need to enable it in Settings.
whoa! you are right! where did that come from?
EDIT: wait, seems to be only some phones.
As someone into photography as a hobby, I don't get why we invest in smartphone cameras or why people care. It all looks like the same trash.
If you want a photo to reminisce over, sure, use a smartphone; in that case anything short of 1800s camera quality will do the job great. If you want to make a photo that might actually look good, then do yourself a favour and get a cheap dedicated camera.
The big difference is in low-light conditions, where a 10-year-old phone or cheap camera will give you 90% noise, and a new-ish phone or quality camera will actually be pretty good.
The "Beginner Photographer" samples in the article look the best to me, out of all the samples. Is that not supposed to happen?
Yeah, TFA's point is that a basic/inexpensive camera in the hands of an unskilled user can produce higher quality than an equivalent iPhone camera shot. In my opinion, judging from the example shots used, this is definitely the case. Camera-phone distortion is pretty bad (you have to stand back further from your subject to offset this, use a higher-res setting, and crop in), and the processing has gotten out of hand in recent years, to the point where it starts making photos look worse and worse.
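The stand-back-and-crop trick is just geometry; a back-of-envelope sketch (all the numbers here are illustrative, not measured from any particular phone):

    base_focal_mm = 26    # typical phone main camera, full-frame equivalent
    crop = 2.0            # keep the middle 50% of the frame (linear)
    distance_m = 0.6      # close-up portrait distance

    print(f"cropped field of view ~ {base_focal_mm * crop:.0f} mm equivalent")
    print(f"same framing from {distance_m * crop:.1f} m instead of {distance_m} m")

    # Perspective "big nose" effect: nose-to-ear depth (~0.12 m) relative to
    # subject distance. Doubling the distance roughly halves the effect.
    for d in (distance_m, distance_m * crop):
        print(f"at {d:.1f} m the nose is {0.12 / d:.0%} closer than the ears")

Standing back and cropping trades resolution for less facial distortion, which is part of why portrait modes often prefer a longer lens when one is available.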
Ah, I see now that I read more closely. Man, the difference is really stark!