US20260087726
2026-03-26
Physics
G06T15/08
The patent application describes a system and method for reconstructing, segmenting, and simulating 3D scenes using neural representations. A computing system processes video data with depth maps to create a 3D representation of a scene. This involves using Gaussian splat representations to model objects and segment them within the 3D space. The system updates the 3D model dynamically, allowing for real-time interaction and display of objects.
Traditional methods for 3D scene reconstruction, such as mesh-based techniques, struggle with accuracy and efficiency, especially in real-time applications. Neural representations such as neural radiance fields (NeRFs) and Gaussian splats offer detailed reconstructions but lack explicit surface definitions, which complicates the simulation of physical interactions. These limitations reduce the realism of simulations in augmented reality (AR) and virtual reality (VR) environments, where precise models are necessary for effective interaction.
The proposed system uses depth maps and video data to generate detailed 3D representations of scenes. Segmentation models isolate objects for manipulation, and volumetric densification techniques enhance the fidelity of these representations. The system performs operations such as inpainting to correct inaccuracies and updates volumetric data with new inputs, which supports realistic simulation of both rigid and elastic objects.
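As an illustration of how a depth map can contribute 3D geometry, the following minimal sketch back-projects one depth frame into world-space points that could seed a scene representation. It assumes a pinhole camera model; the function name `backproject_depth`, the intrinsics matrix `K`, and the pose matrix `world_from_cam` are hypothetical names chosen for this example and are not taken from the application.

```python
import numpy as np

def backproject_depth(depth, K, world_from_cam=None):
    """Lift valid depth pixels to 3D points (assumed pinhole model).

    depth: (H, W) array of depths along the camera z axis, 0 where invalid.
    K: (3, 3) intrinsics with focal lengths K[0,0], K[1,1] and principal
       point K[0,2], K[1,2].
    world_from_cam: optional (4, 4) rigid transform; identity if omitted.
    """
    if world_from_cam is None:
        world_from_cam = np.eye(4)
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]          # pixel row (v) and column (u) indices
    valid = depth > 0                  # skip pixels with no depth reading
    z = depth[valid]
    x = (u[valid] - K[0, 2]) * z / K[0, 0]
    y = (v[valid] - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z], axis=1)
    rot, trans = world_from_cam[:3, :3], world_from_cam[:3, 3]
    return pts_cam @ rot.T + trans     # camera frame -> world frame
```

Points from several frames could be merged and used to initialize splat positions, although the application may use a different initialization.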
The system employs various simulation techniques, including rigidity and elasticity simulations. These involve applying physics models to transform objects, determining energy functions, and minimizing them to simulate realistic physical states. This approach allows for detailed modeling of object interactions and deformations, enhancing the realism of the simulated environment.
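The energy-minimization idea can be sketched with a simple stand-in model. The example below assumes a mass-spring system with a quadratic spring energy and plain gradient descent; the application does not state which energy function or solver it uses, so the model, the function names, and the parameters `k`, `step`, and `iters` are all assumptions made for illustration.

```python
import numpy as np

def spring_energy(x, edges, rest, k=1.0):
    """Elastic energy 0.5 * k * sum((length - rest)^2) over all springs."""
    d = x[edges[:, 0]] - x[edges[:, 1]]
    length = np.linalg.norm(d, axis=1)
    return 0.5 * k * np.sum((length - rest) ** 2)

def spring_gradient(x, edges, rest, k=1.0):
    """Analytic gradient of spring_energy with respect to vertex positions."""
    d = x[edges[:, 0]] - x[edges[:, 1]]
    length = np.linalg.norm(d, axis=1)
    coef = k * (length - rest) / np.maximum(length, 1e-12)
    grad = np.zeros_like(x)
    np.add.at(grad, edges[:, 0], coef[:, None] * d)
    np.add.at(grad, edges[:, 1], -coef[:, None] * d)
    return grad

def minimize_energy(x0, edges, rest, pinned, k=1.0, step=0.1, iters=500):
    """Gradient descent on the spring energy; pinned vertices stay fixed."""
    x = np.asarray(x0, dtype=float).copy()
    free = np.ones(len(x), dtype=bool)
    free[pinned] = False
    for _ in range(iters):
        grad = spring_gradient(x, edges, rest, k)
        x[free] -= step * grad[free]
    return x

# Three vertices on a line, two unit-length springs, both ends pinned.
# The middle vertex starts off-center and relaxes toward the rest state.
x0 = np.array([[0.0, 0.0], [1.4, 0.0], [2.0, 0.0]])
edges = np.array([[0, 1], [1, 2]])
rest = np.array([1.0, 1.0])
x_relaxed = minimize_energy(x0, edges, rest, pinned=[0, 2])
```

In this toy setup the relaxed state has zero spring energy. A production simulator would more likely use an implicit or Newton-type solver and a continuum elasticity model rather than fixed-step gradient descent.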
The method also supports densification, in which objects are sampled to create voxelized volumes. These volumes are updated based on occupancy states and depth maps, allowing for accurate interior modeling. The system also refines the poses of video sources to align 3D representations with 2D frames, keeping the reconstruction consistent and accurate.
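One way to picture the occupancy update is depth-based space carving, sketched below under stated assumptions: a pinhole camera, a regular voxel grid over an object's bounding box, and a rule that marks a voxel empty when it lies in front of the observed surface. The function names `voxel_centers` and `carve_occupancy` and the `margin` parameter are hypothetical; the application may define its occupancy states differently.

```python
import numpy as np

def voxel_centers(lo, hi, res):
    """Centers of a res x res x res grid spanning the box [lo, hi]."""
    axes = [
        np.linspace(lo[i], hi[i], res, endpoint=False) + (hi[i] - lo[i]) / (2 * res)
        for i in range(3)
    ]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    return grid.reshape(-1, 3)

def carve_occupancy(centers, depth, K, cam_from_world, margin=0.01):
    """Occupancy flags for one depth view (True = still possibly occupied).

    A voxel is carved (set to False) when it projects to a pixel with a
    valid depth reading and lies closer to the camera than that reading.
    """
    n = centers.shape[0]
    homo = np.hstack([centers, np.ones((n, 1))])
    cam = (cam_from_world @ homo.T).T[:, :3]
    z = cam[:, 2]
    in_front = z > 1e-9
    z_safe = np.where(in_front, z, 1.0)        # avoid division by zero
    uvw = (K @ cam.T).T
    u = np.round(uvw[:, 0] / z_safe).astype(int)
    v = np.round(uvw[:, 1] / z_safe).astype(int)
    h, w = depth.shape
    visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    idx = np.nonzero(visible)[0]
    observed = depth[v[idx], u[idx]]
    free = (observed > 0) & (z[idx] < observed - margin)
    occupied = np.ones(n, dtype=bool)
    occupied[idx[free]] = False
    return occupied
```

Combining the flags from several views with a logical AND leaves the voxels that no view saw through, which approximates the object's interior.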