Netflix unveils ‘VOID’, an AI model that can change a movie plot



Netflix has unveiled an AI model that can alter a scene in existing video. Video editing with AI is nothing new, but Netflix’s model does not just remove objects from footage: it works out what should happen next without them, essentially altering the scene's causality.

Most object-removal technology leaves a gap where something used to be, or fills it with a plausible-looking background. VOID goes a step further: after erasing an object from a scene, it calculates how the surrounding environment should realistically respond to its absence, and adjusts the behaviour of other elements in the frame accordingly.

The model is a vision-language system designed to edit video footage with a level of contextual awareness that sets it apart from several existing tools.

For example, imagine this scenario: a movie director has just wrapped up a multi-million dollar action sequence in which the star drives straight into a fiery, head-on collision. The cars explode, debris scatters, and the scene is a wrap. But suddenly, there is a request to change the scene: the character survives and drives off into the sunset instead. Normally, this would mean costly reshoots or millions spent on computer-generated graphics.

How Netflix VOID works

Netflix’s VOID can alter the scene by digitally erasing the collision, the second vehicle, the smoke, and the flying debris. It will then generate new, realistic footage of pristine pavement, making it look as though the main character simply drove down an empty, peaceful road.

The AI’s ‘magic’ lies in its grasp of real-world physics. Consider a different example: if a video shows a person jumping into a pool and splashing water onto the deck, VOID can completely erase the person.

It will then alter the water and the wet concrete to match, leaving behind a perfectly still, undisturbed pool as if nobody had ever jumped in.

“To train the model, we generate a new paired dataset of counterfactual object removals using Kubric and HUMOTO, where removing an object requires altering downstream physical interactions. During inference, a vision-language model identifies regions of the scene affected by the removed object. These regions are then used to guide a video diffusion model that generates physically consistent counterfactual outcomes,” Netflix said.

“Experiments on both synthetic and real data show that our approach better preserves consistent scene dynamics after object removal compared to prior video object removal methods,” the company added.

Netflix isn’t keeping the tool locked behind studio doors. It has made the VOID model available to the public on the AI platform Hugging Face, allowing anyone to download and use it.

While there are already several video-altering tools on the market, such as Runway, DiffuEraser, and ProPainter, the Netflix team claims VOID is vastly superior, as per The Register.
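For readers who want a feel for the two-stage idea in Netflix's statement (first flag the regions affected by the removed object, then regenerate only those regions), here is a toy sketch in Python. It is not Netflix's code and does not use the VOID model: the function names, the three labels, and the tiny hand-written masks are all hypothetical, chosen only to show how an object mask and an "affected region" mask could be merged into a single guide for a generative model.

```python
# Toy illustration only: NOT Netflix's code. All names here are hypothetical.
KEEP, REMOVE, AFFECTED = 0, 1, 2


def build_edit_map(object_mask, affected_mask):
    """Combine two binary masks (lists of rows of 0/1) into one label map.

    object_mask   marks pixels of the object to erase.
    affected_mask marks pixels whose content depends on that object
                  (e.g. a splash), as a vision-language model might flag.
    """
    edit_map = []
    for obj_row, aff_row in zip(object_mask, affected_mask):
        row = []
        for obj, aff in zip(obj_row, aff_row):
            if obj:
                row.append(REMOVE)      # the object itself takes priority
            elif aff:
                row.append(AFFECTED)    # side effects of the object
            else:
                row.append(KEEP)        # untouched background
        edit_map.append(row)
    return edit_map


def pixels_to_regenerate(edit_map):
    """Count pixels a generative model would need to re-synthesise."""
    return sum(label != KEEP for row in edit_map for label in row)


# One 3x4 frame: the swimmer occupies two pixels, the splash three more.
swimmer = [[0, 1, 0, 0],
           [0, 1, 0, 0],
           [0, 0, 0, 0]]
splash = [[0, 0, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 1, 0]]

edit_map = build_edit_map(swimmer, splash)
for row in edit_map:
    print(row)
print("pixels to regenerate:", pixels_to_regenerate(edit_map))
```

In this sketch, everything labelled KEEP would be left as filmed, while the REMOVE and AFFECTED pixels are what a video diffusion model would be asked to fill in. A conventional object remover would only consider the REMOVE pixels, which is why the splash would otherwise remain in the frame.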
