NeRF: the challenge of editing the content of neural radiance fields

Earlier this year, NVIDIA advanced Neural Radiance Fields (NeRF) research with InstantNeRF, apparently capable of generating explorable neural scenes in mere seconds – from a technique that, when it emerged in 2020, often took hours or even days to train.

NVIDIA’s InstantNeRF delivers impressive results fast. Source: https://www.youtube.com/watch?v=DJ2hcC1orc4

Although this type of interpolation produces a static scene, NeRF is also capable of representing motion, and supports basic “copy-and-paste” editing, where individual NeRFs can either be assembled into composite scenes or inserted into existing ones.

Nested NeRFs, featured in 2021 research from Shanghai Tech University and DGene Digital Technology. Source: https://www.youtube.com/watch?v=Wp4HfOwFGP4

However, if you’re looking to step into a computed NeRF and actually change something going on inside it (the same way you might change elements in a traditional CGI scene), the rapid pace of interest in the sector has so far produced very few solutions, and none that even begin to match the capabilities of CGI workflows.

Although geometry estimation is essential to creating a NeRF scene, the end result comprises fairly “locked” values. While some progress is being made on changing texture values in NeRF, the actual objects in a NeRF scene are not parametric meshes that can be edited and manipulated, but rather brittle, frozen point clouds.

In this scenario, a person rendered in a NeRF is essentially a statue (or a series of statues, in video NeRFs); the shadows they cast on themselves and other objects are baked-in textures, rather than flexible calculations based on light sources; and the ability to edit NeRF content is limited to the choices made by the photographer capturing the sparse source photos from which the NeRF is generated. Parameters such as shadows and pose cannot be modified in any creative sense.

NeRF-Editing

A new academic research collaboration between China and the UK takes up this challenge with NeRF-Editing, where CGI-style proxy meshes are extracted from a NeRF, deformed at will by the user, and the deformations fed back into the NeRF’s neural computations:

NeRF puppets with NeRF editing, as the deformations calculated from the footage are applied to equivalent points inside a NeRF representation. Source: http://geometrylearning.com/NeRFEditing/

The method adapts NeuS, a 2021 US/China reconstruction technique that extracts a signed distance function (SDF, a much older method of volumetric representation) capable of capturing the geometry represented inside the NeRF.
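
As a purely illustrative sketch (not the authors’ code), extracting an editable triangle mesh from a learned SDF typically means sampling the SDF on a dense grid and running marching cubes over its zero level set; the sdf_fn interface and grid resolution below are assumptions made for the example.

import numpy as np
from skimage import measure  # provides a marching cubes implementation

def extract_mesh(sdf_fn, resolution=256, bound=1.0):
    """Sample a learned SDF on a regular grid and extract its zero level set.

    sdf_fn: callable mapping (N, 3) points to (N,) signed distances
            (e.g. a trained NeuS-style network) -- a hypothetical interface.
    """
    # Build a dense grid of query points inside the bounding cube.
    xs = np.linspace(-bound, bound, resolution)
    grid = np.stack(np.meshgrid(xs, xs, xs, indexing="ij"), axis=-1)
    points = grid.reshape(-1, 3)

    # Query the SDF in chunks to keep memory bounded.
    sdf_values = np.concatenate(
        [sdf_fn(chunk) for chunk in np.array_split(points, 64)]
    ).reshape(resolution, resolution, resolution)

    # The zero level set of the SDF is the surface the user will sculpt.
    verts, faces, normals, _ = measure.marching_cubes(sdf_values, level=0.0)

    # Rescale vertices from grid indices back to world coordinates.
    verts = verts / (resolution - 1) * 2.0 * bound - bound
    return verts, faces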

This SDF object becomes the user’s basis for sculpting, with warping and molding capabilities provided by the venerable As-Rigid-As-Possible (ARAP) technique.
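
ARAP works by finding, for each vertex, the rotation that best explains how its one-ring neighbourhood has moved, and then solving for vertex positions consistent with those rotations. Below is a minimal sketch of the per-vertex rotation-fitting (“local”) step, using uniform weights for brevity; the function and variable names are illustrative, not taken from the paper’s code.

import numpy as np

def fit_local_rotations(rest_verts, deformed_verts, neighbors):
    """ARAP 'local' step: best-fit rotation per vertex via SVD.

    rest_verts, deformed_verts: (V, 3) arrays of original and current positions.
    neighbors: list of index lists, one per vertex (one-ring neighbourhoods).
    Uniform weights are used here for brevity; cotangent weights are standard.
    """
    rotations = np.empty((len(rest_verts), 3, 3))
    for i, nbrs in enumerate(neighbors):
        # Edge vectors around vertex i, before and after deformation.
        P = rest_verts[nbrs] - rest_verts[i]           # (k, 3)
        Q = deformed_verts[nbrs] - deformed_verts[i]   # (k, 3)
        # Covariance of rest edges against deformed edges.
        S = P.T @ Q
        U, _, Vt = np.linalg.svd(S)
        R = Vt.T @ U.T
        # Guard against reflections (det = -1).
        if np.linalg.det(R) < 0:
            Vt[-1] *= -1
            R = Vt.T @ U.T
        rotations[i] = R
    return rotations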

ARAP allows users to deform the extracted SDF mesh, although other methods, such as skeleton-based and cage-based approaches (i.e. NURBs), also work well. Source: https://arxiv.org/pdf/2205.04978.pdf

With the deformations applied, it is necessary to translate this information from the vector domain to the native RGB/pixel level of NeRF, which is a rather longer journey.

The triangular vertices of the mesh that the user deformed are first transferred to a tetrahedral mesh, which forms an envelope around the user’s mesh. A discrete deformation field is extracted from this additional mesh, and finally a NeRF-compatible continuous deformation field is obtained that can be fed back into the neural radiance field, reflecting the user’s changes and modifications and directly affecting the rays cast within the target NeRF.
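
In practice, bending a cast ray amounts to taking each sample point along it, finding the deformed tetrahedron that contains it, and using barycentric coordinates to map the point back into the original (rest) space, where the trained NeRF is queried as usual. The sketch below assumes hypothetical helpers find_containing_tet, barycentric_coords and canonical_nerf; it illustrates the idea rather than reproducing the authors’ implementation.

import numpy as np

def bend_and_query(ray_origin, ray_dir, t_samples,
                   deformed_tets, rest_verts, deformed_verts,
                   find_containing_tet, barycentric_coords, canonical_nerf):
    """Warp ray samples from deformed space back to rest space, then query the NeRF.

    deformed_tets: (T, 4) vertex indices of the tetrahedral proxy mesh.
    rest_verts / deformed_verts: (V, 3) vertex positions before / after editing.
    The three callables are hypothetical helpers, not the paper's API.
    """
    colors_and_densities = []
    for t in t_samples:
        x = ray_origin + t * ray_dir               # sample point in deformed space
        tet_id = find_containing_tet(x, deformed_tets, deformed_verts)
        if tet_id is None:
            # Outside the proxy mesh: the point is assumed undeformed.
            colors_and_densities.append(canonical_nerf(x))
            continue
        idx = deformed_tets[tet_id]
        # Barycentric coordinates of x inside the *deformed* tetrahedron...
        w = barycentric_coords(x, deformed_verts[idx])   # (4,)
        # ...carry the point back to the corresponding rest-space location.
        x_rest = w @ rest_verts[idx]
        colors_and_densities.append(canonical_nerf(x_rest))
    return colors_and_densities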

Objects distorted and animated by the new method.

The paper states:

“After transferring the surface deformation to the tetrahedral mesh, we can obtain the discrete deformation field of the ‘effective space’. We now use these discrete deformations to bend the cast rays. To generate an image of the deformed radiance field, we shoot rays into the space containing the deformed tetrahedral mesh.”

The paper is titled NeRF-Editing: Geometry Editing of Neural Radiance Fields, and comes from researchers at three Chinese universities and institutions, together with a researcher from the School of Computer Science & Informatics at Cardiff University and two further researchers from the Alibaba Group.

Limitations

As mentioned earlier, the transformed geometry will not “update” any linked aspects of the NeRF that have not been modified, nor will it reflect secondary consequences of the deformed element, such as shadows. The researchers give an example where the shadows on a human figure in a NeRF remain unchanged, even though the deformation should change the lighting:

From the paper: we see that the horizontal shadow on the figure’s arm stays in place even when the arm is moved up.

Experiments

The authors observe that there is currently no comparable method for direct intervention in the NeRF geometry. Therefore, the experiments conducted for the research were more exploratory than comparative.

The researchers demonstrated NeRF-Editing on a number of public datasets, including characters from Mixamo, and the now-iconic Lego bulldozer and chair from the original NeRF implementation. They also experimented on a real horse statue captured from the FVS dataset, as well as on their own original captures.

The head of a bowed horse.

For future work, the authors intend to develop their system within the just-in-time (JIT) compiled machine learning framework Jittor.

First published May 16, 2022.
