
What’s the tech behind a Pixel 2 camera? Is Google trying to make a point here?

Smartphone cameras have come a long way, especially in recent times, as the camera has become the USP of flagship smartphones. Dual cameras are a thing now, although they first appeared as far back as 2011. They came into the limelight, however, when Apple introduced the iPhone 7 Plus, and the results were spectacular. It was a big step in mobile photography and image processing, and given Apple's credibility, the smartphone industry embraced the whole idea of dual lenses. This drove industry research towards 'better dual cameras' rather than just 'better cameras'. So from this point onwards, we discuss whether dual lenses are really the way ahead.

Why dual camera?

Well, the logic isn't all that hard to get. Dual lenses mean better light reception, a wider field of view, sharper images and overall better camera performance. But what made them a hit was portrait mode, aka the bokeh effect: blurring the background based on its depth, or distance from the foreground, something which until then was only seen in DSLRs. The subject is much more in focus, and the image decluttered. This was worth the extra bucks; people gladly paid for the added lens. But then the Google Pixel 2 changed all of it. For those who aren't already aware, the Pixel 2 has been rated the best smartphone camera yet by DxOMark and by virtually every other reviewer. But wait, does it have a dual camera?

The secret ingredient

Much to everyone's amazement, the Pixel 2 produced much better portrait-mode images, and better photos in general, all with a single lens. So here we will take a peek into Google's technology, keeping things as 'non-technical' as possible while still doing justice to Google's top-notch image processing. The Pixel 2 uses 'segmentation' and a 'stereo algorithm' to achieve the results it does. More on those two later.

The whole process happens in three broad steps.

The first is capturing an extremely sharp image. For this, the Pixel 2 uses HDR+, Google's computational photography technique. It captures a burst of underexposed images (underexposed so the bright parts don't blow out) and then averages them to reduce noise, preserve the natural contrast and sharpen the details. Although this sounds easy, it requires software trained on a million images and circumstances in order to prevent 'ghost' artifacts if the user's hand is not steady or if objects in the scene are moving.
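The averaging at the heart of this step can be sketched in a few lines of NumPy. This is only a toy illustration of the idea, not Google's HDR+ pipeline: it assumes the burst frames are already aligned (the hard, learned part that handles ghosting), and it uses a simple gamma curve as a stand-in for real tone mapping.

```python
import numpy as np

def merge_burst(burst):
    """Average a burst of aligned, underexposed frames to cut noise.

    `burst` is a list of HxWx3 float arrays in [0, 1]. Averaging N
    frames shrinks random noise by roughly 1/sqrt(N); the subsequent
    tone curve brightens the deliberately dark exposure.
    """
    stack = np.stack(burst, axis=0)
    merged = stack.mean(axis=0)
    # Gamma-style brightening as a crude stand-in for HDR+ tone mapping.
    return np.clip(merged, 0.0, 1.0) ** 0.5

# Example: four noisy, underexposed frames of the same mid-grey scene.
rng = np.random.default_rng(0)
frames = [0.25 + 0.05 * rng.standard_normal((4, 4, 3)) for _ in range(4)]
result = merge_burst(frames)
```

The per-frame noise (standard deviation 0.05 here) is roughly halved by averaging four frames, which is exactly why HDR+ can afford to underexpose each shot.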


Segmentation

This is the next stage, and it's a tricky one. The software has to decide which pixels belong to the foreground and which to the background. What software, you ask? Well, Google has been quite generous here and has revealed that it's a neural network written in TensorFlow, an open-source software library for machine intelligence. This network is once again trained on a million images to identify people and other objects such as hats, ice cream cones, sunglasses, handbags and jewelry. Google wasn't very clear on the intricate details, obviously, but they did say that the network uses skip connections: it works in several stages, beginning with low-level features such as edges and moving on to high-level features such as body parts. It then masks the foreground in white and everything else in black. For the front-facing camera, this is where the process ends: the black part is blurred and the photo is generated. The rear camera, however, doesn't stop here.

A segmented image released by Google, obtained from the Pixel 2.
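The front-camera version of portrait mode described above, where the mask alone decides what gets blurred, can be sketched as follows. The mask is simply an input here; on the Pixel 2 it would come from the TensorFlow network, and the box blur is a crude stand-in for a proper bokeh kernel.

```python
import numpy as np

def box_blur(channel, k=5):
    """Crude box blur built from shifted copies (edges wrap; fine for a demo)."""
    r = k // 2
    acc = np.zeros_like(channel)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            acc += np.roll(np.roll(channel, dy, axis=0), dx, axis=1)
    return acc / (k * k)

def composite_portrait(image, mask):
    """Keep the foreground (mask == 1) sharp and blur everything else.

    `image` is an HxWx3 float array, `mask` an HxW array of {0, 1} —
    white foreground, black background, as in Google's released mask.
    """
    blurred = np.stack([box_blur(image[..., c]) for c in range(3)], axis=-1)
    m = mask[..., None].astype(float)
    return m * image + (1.0 - m) * blurred

rng = np.random.default_rng(1)
image = rng.random((8, 8, 3))
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0              # pretend the network found a subject here
out = composite_portrait(image, mask)
```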

Phase-detect Auto-focus aka PDAF

This is the most important step. To improve the results, the pixels are depth-mapped: pixels farther from the in-focus plane are blurred more than those nearer to it. To achieve this depth mapping, each pixel is 'split into two'. The images seen by the two halves are only slightly different, but different enough to produce a depth map by computing the 'stereo', the way a stereoscope does: it takes photos of the same object from slightly different angles and combines them to create a sense of depth. This has been used for a long time in astrophotography, and it is the same thing dual lenses achieve on smartphones; it is also quite similar to what our eyes do. Google, however, found a way to do all this with a single lens, by splitting each pixel. But since the difference between the two images is slight, the depth mapping can be inaccurate at times, especially in low light. To overcome this, the Pixel 2 again takes a burst of images and averages them, much like HDR+.

Images taken from the rear camera of Pixel 2 for depth mapping. Click to notice the slight difference in the animated gif.
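A toy version of the stereo computation makes the idea concrete: slide one half-pixel view past the other and, tile by tile, find the horizontal shift (disparity) that lines them up best. Disparity grows with distance from the in-focus plane, so it serves as a proxy for depth. This is plain block matching, not Google's sub-pixel, noise-averaged algorithm.

```python
import numpy as np

def disparity_map(left, right, max_shift=4, tile=8):
    """Per-tile disparity between the two 'half pixel' views.

    `left` and `right` are HxW grayscale arrays. For each tile, test
    every horizontal shift in [-max_shift, max_shift] and keep the one
    with the smallest sum-of-squared-differences error.
    """
    h, w = left.shape
    disp = np.zeros((h // tile, w // tile))
    for i in range(0, h - tile + 1, tile):
        for j in range(0, w - tile + 1, tile):
            patch = left[i:i + tile, j:j + tile]
            best, best_err = 0, np.inf
            for d in range(-max_shift, max_shift + 1):
                shifted = np.roll(right, d, axis=1)[i:i + tile, j:j + tile]
                err = np.sum((patch - shifted) ** 2)
                if err < best_err:
                    best, best_err = d, err
            disp[i // tile, j // tile] = best
    return disp

# Example: the 'right' view is the 'left' view shifted by 2 pixels,
# so every tile should report a disparity of -2.
rng = np.random.default_rng(2)
left = rng.random((16, 16))
right = np.roll(left, 2, axis=1)
dmap = disparity_map(left, right)
```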

After all this, the segmentation mask is combined with the depth map through an intricate process which Google calls its 'secret sauce', and the result is spectacular.

Image taken from the rear camera of Google Pixel 2.
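Google has not published how the mask and depth map are merged, but a plausible toy version is a weighted blend: the mask pins the subject sharp, while the depth map scales how strongly everything else is blurred. Everything below is a hypothetical sketch, not the actual pipeline.

```python
import numpy as np

def depth_aware_composite(image, blurred, mask, depth):
    """Blend a sharp image with a pre-blurred copy.

    `mask` (HxW, {0, 1}) keeps the segmented foreground fully sharp;
    `depth` (HxW, 0 at the in-focus plane) scales how much of the
    blurred copy shows through elsewhere.
    """
    w = np.abs(depth).astype(float)
    w = w / max(float(w.max()), 1e-8)   # normalise to [0, 1]
    w = w * (1.0 - mask)                # foreground stays fully sharp
    w = w[..., None]
    return (1.0 - w) * image + w * blurred

rng = np.random.default_rng(3)
image = rng.random((4, 4, 3))
blurred = np.full_like(image, 0.5)      # stand-in for a blurred copy
depth = np.ones((4, 4))
depth[1, 1] = 0.0                       # this pixel sits on the focal plane
mask = np.zeros((4, 4))
mask[0, 0] = 1.0                        # this pixel is segmented foreground
out = depth_aware_composite(image, blurred, mask, depth)
```

The masked pixel and the in-focus pixel come through untouched, while distant background pixels take the blurred value, which is the behaviour the article describes.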

The verdict

Google has certainly shown the world how it's done efficiently. Apple has long been accused of not investing enough in research, and the results here make that hard to dispute. Companies like Samsung could have done better than simply following in Apple's footsteps. Increasing the number of lenses is certainly the easy solution, but how far are we willing to go? Rumors that the next Huawei flagship will ship with three lenses are doing the rounds. How long until it starts looking awkward, more like a camera and less like a smartphone? Dual lenses certainly add extra bucks to the price, at a time when budget smartphones are the real deal. Opinions may differ, but one question the Google Pixel 2 definitely raises is: shouldn't the camera, too, be smart?
