It's possible to save RAW files (mostly) unprocessed with iPhones, either via built-in functionality (Pros) or via apps like Halide.
But otherwise, the aggressiveness of the de-noising in the native JPG/HEIF images is really unfortunate if you want to look at the images on a screen larger than the phone's. The amount of detail lost (other than in areas like people's faces, where the phone knows to specialise) can be very considerable.
I'd really like a way to dial that aggressiveness down a fair bit, even at the cost of more noise/grain and larger file sizes (since the extra noise compresses less well).
Another thing is the amount of lens flare you can get when shooting into the sun for sunsets/sunrises, or at other large, bright light sources.
With very small lens elements, it's understandable from a physics perspective that suppressing reflections and inter-reflections over such a small surface area is very difficult (even with special coatings to reduce the Fresnel reflection ratios). But if you care about image quality and want to look at images on a screen larger than the phone that took them, larger-format cameras still have some benefit despite the inconvenience of their extra size and weight (looks at 5D Mk IV on shelf).
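For a rough sense of scale, here's a back-of-the-envelope sketch (Python; the refractive indices, coating reflectance, and element count are illustrative assumptions, not figures for any real phone lens) of those Fresnel reflection ratios and how they stack up across a multi-element lens:

    # Fresnel reflectance at normal incidence for an air/glass boundary,
    # and how per-surface reflections accumulate across a lens assembly.
    def fresnel_normal_incidence(n1: float, n2: float) -> float:
        """Fraction of light reflected at a boundary, at normal incidence."""
        return ((n1 - n2) / (n1 + n2)) ** 2

    n_air, n_glass = 1.0, 1.5                              # assumed indices
    r_uncoated = fresnel_normal_incidence(n_air, n_glass)  # ~0.04, i.e. ~4%
    r_coated = 0.005                                       # assumed value for a decent AR coating

    surfaces = 12  # e.g. a 6-element lens has ~12 air/glass surfaces (assumption)
    for r in (r_uncoated, r_coated):
        transmitted = (1 - r) ** surfaces
        # Anything reflected at least once can bounce around inside the
        # barrel and reappear as flare/ghosting.
        stray = 1 - transmitted
        print(f"per-surface R = {r:.2%}, light touched by reflections = {stray:.1%}")

Even with good coatings, several percent of the light ends up reflected somewhere inside the lens, which is plenty to produce visible ghosts around something as bright as the sun.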
I wish there was a middle ground between what Android/Pixel camera saves as raw, and the in-camera JPEG. Sometimes I have a few quibbles with the JPEG and what I'd like to do is edit the raw file, but starting from something close to the JPEG. Unfortunately what you get as a starting point from raw is hideous, and it's never clear how to begin. I don't think I've ever got an acceptable result trying to edit raw photos from my Pixel.
For Android, you can sort of get some of this with Snapseed. I occasionally use it, and it's "ok". I'm more frustrated by the fact that my preferred RAW editor (DxO) doesn't handle Android's DNG files. For me, at least, editing raw images on a phone screen is just not tolerable.
In other words, you want either your camera app to select the initial tweaks so that you can continue in an external editor (not going to happen; RAW editing software is incompatible by design), or your editing software to select initial tweaks that "look good" (that depends on your software). In RAW mode, Google Camera's output is photometrically correct, even if it stacks multiple frames or denoises them. That's the only approach that makes sense; any other RAW camera app, or an actual dedicated camera, does it the same way.
It's strange that in the age of AI, denoisers are still so bad. It's basically impossible to photograph falling snow in winter because the denoiser will remove 90% of the snowflakes. Machine learning models are already used for denoising ray-traced graphics with substantially improved results, so why aren't cameras using ML denoisers yet, at least for still images? Or do they perhaps already use them, and the quality is still bad for unknown reasons?
Are we still talking about smartphone cameras? If yes, apps already rely heavily on much more advanced computational photography than your average photo editor can do, including but not limited to ML denoisers. The problem is that such apps are typically optimized for the "average case" and are as automated as possible, so they either remove snow, rain, and haze intentionally, or lose small moving particles as a result of stacking. That said, snow and rain are usually possible to capture in apps that attempt to determine the scene type or have specific modes.
(As someone who worked closely with pathtracing renderers and de-noisers, I think I can answer this :) )
It's mostly because de-noisers for ray tracing/path tracing in the VFX/CG space almost always rely on extra outputs/AOVs, things like 'albedo' (diffuse reflectance), normals/world position, etc., to help guide them.
So they can often 'cheat' a bit and know where the edges of things are (because, say, the object ID AOV changes; minus pixel filtering, which complicates things a bit).
They can also 'cheat' in other ways, by mixing back in some of the diffuse texture detail from the 'albedo' AOV channel that the denoiser might otherwise have removed.
Cameras don't really have anything like that to guide them, so they have to guess. And they often seem to use very primitive methods like bilateral filters (or at least things which look very similar) to do it, which doesn't work very well.
Portrait modes on phones can use depth sensors a bit to help, if the phone has them, but for things like hair strands that doesn't really work, and the depth data is mostly useful for the fake depth-of-field blurring.
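To make the "guided by AOVs" point concrete, here's a toy sketch (Python/NumPy, illustrative parameters) of a joint/cross bilateral filter: the spatial weights are the usual Gaussian, but the edge-stopping weights come from a clean guide such as an albedo or normal pass rather than from the noisy pixels themselves. A camera has no such clean guide, so the best it can do with this family of filters is use the noisy image as its own guide, i.e. a plain bilateral filter.

    import numpy as np

    def joint_bilateral(noisy, guide, radius=3, sigma_spatial=2.0, sigma_guide=0.1):
        """Joint/cross bilateral filter.
        noisy, guide: 2-D float arrays of the same shape.
        guide is a clean auxiliary channel (e.g. an albedo or normal AOV);
        edges in the guide stop the blur, so boundaries survive even when
        the noisy input is too noisy to find them reliably."""
        h, w = noisy.shape
        pad = radius
        noisy_p = np.pad(noisy, pad, mode="reflect")
        guide_p = np.pad(guide, pad, mode="reflect")

        ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
        spatial = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma_spatial ** 2))

        out = np.zeros_like(noisy)
        weights = np.zeros_like(noisy)
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                shifted_noisy = noisy_p[pad + dy:pad + dy + h, pad + dx:pad + dx + w]
                shifted_guide = guide_p[pad + dy:pad + dy + h, pad + dx:pad + dx + w]
                # Range weight computed from the guide, not from the noisy image.
                range_w = np.exp(-((shifted_guide - guide) ** 2) / (2 * sigma_guide ** 2))
                w_total = spatial[dy + radius, dx + radius] * range_w
                out += w_total * shifted_noisy
                weights += w_total
        return out / weights

    # With no AOVs available, a camera is stuck with something like:
    #   denoised = joint_bilateral(noisy, noisy)
    # i.e. the plain bilateral filter with the smeary look described above.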
Yeah, but surely ML models would at least work better than analytic algorithms. After all, when looking at a noisy picture our brain is pretty good at distinguishing detail from noise, so it's not clear to me why an ML model couldn't reach denoising performance similar to the human brain, even if it doesn't match the "cheating" denoisers used in ray tracing.
It would probably help you to compare what you can do on a phone vs what you can do with desktop software (Lightroom/Photoshop, DxO, Topaz, Capture One, etc). The ML denoising in those tools is generally quite good, with the exception of challenging transitional areas (e.g. hair, foliage).
Fwiw, Topaz -- which I have a license for but essentially never use -- has pretty incredible denoising & upsizing features (for both photo & video), but to get the optimal quality output you offload the processing to their cloud infra (and buy credits from them to pay for it). It's roughly the equivalent of a SWE using a local LLM that's "good enough" vs a frontier model that's SOTA but requires a consumption-based subscription.
Lens for image quality, and sensor size and density for resolution, but we hit pretty hard limits on those a long time ago. Software on top of that has been the major differentiator for quite some time: exposure stacking and intelligent detail control produce more improvement for less investment than a super-complex lens assembly or an exotic sensor, though that brings its own risks. https://techcrunch.com/2018/10/22/the-future-of-photography-...
Not to say there is no movement on the other fronts. Glass was pushing for a crazy anamorphic lens and a far larger sensor that would have been a serious improvement, but I don't know if it went anywhere: https://techcrunch.com/2022/03/22/glass-rethinks-the-smartph...
I wish "real" camera companies were more aggressive about offering computational photography post-processing, at least as an option. I've gotten spoiled by an iPhone. Despite my Sony and Fuji having huge wonderful lenses, I am pretty disappointed when using a dedicated camera and interior lighting leads to slight blur, or a cloudy day produces washed-out skies.
To expand on the HDR example: There’s this interesting lecture series about computational photography by Marc Levoy, who worked on the earlier Pixel cameras: https://youtube.com/watch?v=y7HrM-fk_Rc
From what I remember, the core thesis is “take a lot of pictures and take the best parts”, which works for a surprising number of cases.
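As a toy illustration of that idea (not the actual Pixel/HDR+ pipeline, which aligns and merges raw frames before tone mapping), here's a minimal Python sketch of "keep the best parts": blend a bracketed burst, weighting each pixel by how well exposed it is, in the spirit of exposure fusion. Frame alignment is assumed to have already happened.

    import numpy as np

    def naive_exposure_fusion(frames):
        """Blend a list of aligned HxW (or HxWx3) float images in [0, 1],
        weighting each pixel by 'well-exposedness': values near mid-grey get
        a high weight, blown-out or crushed values a low one, so each region
        of the result comes mostly from the frames that exposed it best."""
        acc = np.zeros_like(np.asarray(frames[0], dtype=np.float64))
        wsum = np.zeros_like(acc)
        for f in frames:
            f = np.asarray(f, dtype=np.float64)
            w = np.exp(-((f - 0.5) ** 2) / (2 * 0.2 ** 2)) + 1e-6  # Gaussian around mid-grey
            acc += w * f
            wsum += w
        return acc / wsum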
The part he didn't mention is interpolation at the low, "specs are mere suggestions" end of things. I have a backup Android phone, a true "brand X" type of thing, vanilla Android, bought at a garage sale. Nice enough phone, but it claims a 40MP camera. The merest glance at a picture taken with it shows it has an ordinary-for-its-time 13MP camera in it, and the pictures are interpolated up to 40MP.
Hopefully the camera doesn't upscale and then downscale again if told to save at its actual native-ish resolution.
> in the darkness, your camera will need to use a longer shutter speed
the alternative, which many smartphone cameras do now, is to capture a burst of many photos at a short shutter speed and then combine them in software. For static things, this is equivalent to a longer shutter speed (with the additional advantage of not blowing out the highlights), and for moving things, we can filter in software to avoid smearing them out.
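A minimal sketch of that mechanism (Python/NumPy; frames are assumed to be already aligned, and the rejection threshold is illustrative rather than tuned; real pipelines such as HDR+ do tile-based alignment and robust merging in the raw domain):

    import numpy as np

    def merge_burst(frames, reject_sigma=3.0):
        """Sigma-clipped average of an aligned burst of short exposures.
        Static areas: averaging N frames cuts noise roughly by sqrt(N), much
        like one N-times-longer exposure, without clipping the highlights.
        Moving areas: per-pixel samples that disagree strongly with the
        median across the burst are dropped from the average so that moving
        subjects don't smear."""
        stack = np.stack([np.asarray(f, dtype=np.float64) for f in frames])  # N,H,W
        ref = np.median(stack, axis=0)
        spread = np.std(stack, axis=0) + 1e-6
        keep = np.abs(stack - ref) < reject_sigma * spread  # per-frame, per-pixel mask
        kept = keep.sum(axis=0)
        merged = np.where(keep, stack, 0.0).sum(axis=0) / np.maximum(kept, 1)
        # Where everything was rejected, fall back to the per-pixel median.
        return np.where(kept == 0, ref, merged)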
The main issue is that when you blow the image up, the details in the highlights and shadows don't hold up; you need to study chroma subsampling to understand this. Sensor size is still important, but phones are getting closer.
It's not the chroma subsampling, it's the aggressive de-noising removing the detail (noise is technically 'detail' you don't normally want).
4:2:0/4:1:1/4:2:2 subsampling is the least of the problems. If it were just that, you'd largely just get compression artifacts around red/blue things, like you often see on streaming / TV news text banners at the bottom of the screen, i.e. things like stop signs, etc.
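To illustrate why subsampled chroma mostly shows up as fringing around saturated red/blue edges rather than as lost fine detail, here's a toy 4:2:0-style round trip in Python (BT.601-style coefficients, crude box down/upsampling; a real codec is more careful): luma stays at full resolution and only the colour channels get blurred.

    import numpy as np

    def simulate_chroma_subsampling(rgb):
        """Round-trip an HxWx3 float image in [0, 1] (H, W even) through
        full-resolution luma plus half-resolution chroma. Fine luma detail
        is untouched; only colour transitions blur, which is why the visible
        damage clusters around saturated edges like red text or stop signs."""
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        y = 0.299 * r + 0.587 * g + 0.114 * b
        cb = (b - y) * 0.564
        cr = (r - y) * 0.713

        def half_then_double(c):
            # 2x2 box-average down, nearest-neighbour back up (crude on purpose).
            small = c.reshape(c.shape[0] // 2, 2, c.shape[1] // 2, 2).mean(axis=(1, 3))
            return np.repeat(np.repeat(small, 2, axis=0), 2, axis=1)

        cb, cr = half_then_double(cb), half_then_double(cr)
        r2 = y + 1.403 * cr
        b2 = y + 1.773 * cb
        g2 = (y - 0.299 * r2 - 0.114 * b2) / 0.587
        return np.clip(np.stack([r2, g2, b2], axis=-1), 0.0, 1.0)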
All that really matters is that it exists. If you really care about the quality of your camera, you're going to want to get a dedicated camera. For everyone else (i.e. basically everybody except photographers), literally any phone camera is as good as another.
To counter the unnatural look of noise reduction I often add a film grain effect.