Apple creates AI model that can turn 2D photos into 3D images: Here’s how it will work


Apple has developed a new AI model, SHARP, capable of transforming a single 2D image into a realistic 3D environment in under 1 second. The proposed AI model aims to retain real-world distances and scaling in the generated images.

The Cupertino-based tech giant detailed the model in a study titled “Sharp Monocular View Synthesis in Less Than a Second,” which describes how SHARP was trained to recreate 3D scenes from 2D images. The model’s main strength is its ability to generate 3D imagery at high speed and with high accuracy while preserving spatial consistency, so reconstructed scenes keep the same real-world proportions as the originals.

SHARP showcases Apple’s research efforts in AI-based image processing, spatial computing, and related areas. The ability to rapidly produce 3D imagery from 2D pictures holds promise in almost every field, from photography to augmented reality. The study also includes examples illustrating how well the model translates ordinary photographs into photorealistic three-dimensional representations.

In a blog post, Apple researchers shared the study and wrote: “We present SHARP, an approach to photorealistic view synthesis from a single image. Given a single photograph, SHARP regresses the parameters of a 3D Gaussian representation of the depicted scene. This is done in less than a second on a standard GPU via a single feedforward pass through a neural network. The 3D Gaussian representation produced by SHARP can then be rendered in real time, yielding high-resolution photorealistic images for nearby views. The representation is metric, with absolute scale, supporting metric camera movements. Experimental results demonstrate that SHARP delivers robust zero-shot generalization across datasets. It sets a new state of the art on multiple datasets, reducing LPIPS by 25–34% and DISTS by 21–43% versus the best prior model, while lowering the synthesis time by three orders of magnitude.”

In simple terms, the model creates a 3D version of the scene that can be viewed from different angles close to that of the original photo.

How Apple’s new 3D image-making AI model works

A 3D Gaussian is essentially a small, fuzzy spot of colour and light placed in space. Put millions of these spots together and they can recreate a 3D scene that looks real from a particular viewing angle. Most methods that use this technique need dozens or even hundreds of photos of the same scene taken from different positions.
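As a rough illustration (not Apple's implementation), a single Gaussian "spot" can be modelled with a position, size, colour, and opacity; blending many of them gives a colour for any point in space. The isotropic simplification and all names below are assumptions for clarity — real Gaussian-splatting systems use full anisotropic covariances and alpha-composite the spots along camera rays.

```python
import math
from dataclasses import dataclass

@dataclass
class Gaussian3D:
    """One fuzzy spot of colour: an isotropic 3D Gaussian."""
    mu: tuple        # centre position (x, y, z)
    sigma: float     # spatial extent (standard deviation)
    color: tuple     # RGB, each channel in [0, 1]
    opacity: float   # peak opacity in [0, 1]

    def weight(self, point):
        """Contribution of this Gaussian at a 3D point (falls off with distance)."""
        d2 = sum((p - m) ** 2 for p, m in zip(point, self.mu))
        return self.opacity * math.exp(-0.5 * d2 / self.sigma ** 2)

def blend(gaussians, point):
    """Weighted-average colour of overlapping Gaussians at a point."""
    total_w = 0.0
    rgb = [0.0, 0.0, 0.0]
    for g in gaussians:
        w = g.weight(point)
        total_w += w
        for i in range(3):
            rgb[i] += w * g.color[i]
    if total_w == 0.0:
        return (0.0, 0.0, 0.0)
    return tuple(c / total_w for c in rgb)
```

With millions of such spots fitted to a scene, evaluating the blend from a new camera position yields the rendered image.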

Apple's SHARP model, however, can reconstruct a complete 3D scene from a single photo using its neural network. To make this work, Apple trained SHARP on large amounts of synthetic and real-world imagery, teaching it to recognise common patterns of depth and shape across many different scenes. Given a new photo, SHARP estimates how far away objects are, refines this estimate using what it has learned, and then determines where millions of 3D spots should go and what they should look like, all in a single pass.
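The geometric core of this process — turning a pixel plus an estimated metric depth into a 3D position — can be sketched with a standard pinhole camera model. This is only an illustration of the placement step: SHARP regresses all Gaussian parameters with a neural network, whereas this sketch assumes a depth map is already given, and the function names and parameters are hypothetical.

```python
def unproject(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth into a 3D point in
    camera space, using pinhole intrinsics (focal lengths fx, fy and
    principal point cx, cy)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

def image_to_splats(image, depth_map, fx, fy, cx, cy):
    """Place one 'spot' per pixel at its estimated metric 3D position.
    `image` is a grid of RGB tuples; `depth_map` a grid of depths in metres."""
    splats = []
    for v, row in enumerate(image):
        for u, color in enumerate(row):
            pos = unproject(u, v, depth_map[v][u], fx, fy, cx, cy)
            splats.append({"position": pos, "color": color})
    return splats
```

Because the depths are metric, the resulting positions carry absolute scale — which is what lets the representation support metric camera movements, as the paper describes.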

This allows SHARP to rebuild a believable 3D scene without requiring multiple images or slow per-scene processing. However, there's a catch: SHARP works well only for viewing angles close to that of the original photo. You can't move too far from the original position, because the model doesn't invent parts of the scene it hasn't seen. This limitation is what lets Apple keep the model fast enough to produce results in under a second while maintaining realistic output.

You can try Apple's new AI model via GitHub. Some users have also shared their own test results on the microblogging site X (formerly Twitter).
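The viewing-range limitation described above amounts to a simple constraint: a requested viewpoint is only trustworthy if it stays near the original camera position. A minimal sketch of such a check, with a purely illustrative threshold (the paper does not publish a hard limit):

```python
import math

def within_supported_view(original_pos, requested_pos, max_offset=0.5):
    """Return True if the requested camera position stays close enough
    to the original photo's viewpoint for the reconstruction to remain
    reliable. The 0.5 m default is a hypothetical threshold."""
    return math.dist(original_pos, requested_pos) <= max_offset
```

A renderer built on such a model might use a check like this to clamp camera motion to the region where the reconstruction stays photorealistic.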
