Invention Title:

METHOD AND APPARATUS FOR GENERATING 3D SCENE BASED ON LARGE LANGUAGE MODEL, ELECTRONIC DEVICE, AND STORAGE MEDIUM

Publication number:

US20250124651

Publication date:

2025-04-17

Section:

Physics

Class:

G06T17/00

Inventors:

Lei WANG Beijing, China

Shiyan Li Beijing, China

Xiaodong Zhang Beijing, China

Yichen Li Beijing, China

Assignee:

BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD. Beijing, China

Applicant:

BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD. Beijing, China

Smart overview of the Invention

The disclosure describes a method and apparatus for generating three-dimensional (3D) scenes with a large language model (LLM), in the fields of artificial intelligence, 3D modeling, and LLM technology. Description information of a desired 3D scene is processed to extract label information. The label information is then used to generate a query prompt, and the LLM uses that prompt to retrieve the set of target assets needed for the scene. These assets include the objects, materials, and attributes required to construct the scene.

Field and Background

In recent years, the creation of 3D scenes has become integral to industries such as gaming, animation, and digital human development. Traditionally, designers build these scenes by hand in rendering engines, a process that is time-consuming and error-prone. The proposed method aims to improve the efficiency and accuracy of 3D scene generation by drawing on advances in AI and language models.

Methodology

The method processes user-provided description information of a target 3D scene to identify its key label information. From this label information, a query prompt is generated for the LLM. The LLM performs the query described by the prompt and returns a target asset set: the models, materials, and attributes that match the requirements of the description. The 3D scene is then constructed from the assets in this set.
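The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation in the disclosure: the label vocabulary, the asset catalog, the prompt wording, and every function name here are assumptions, and the LLM is replaced by a stub that filters an in-memory catalog so the example runs without any external service.

```python
from __future__ import annotations
from dataclasses import dataclass

# Hypothetical label vocabulary; the source does not say how labels are extracted.
KNOWN_LABELS = {"forest", "cabin", "wooden", "night"}


@dataclass(frozen=True)
class Asset:
    name: str
    kind: str        # "model", "material", or "attribute"
    tags: frozenset  # labels this asset is associated with


# Toy asset library standing in for whatever store the LLM would query.
CATALOG = [
    Asset("pine_tree", "model", frozenset({"forest"})),
    Asset("log_cabin", "model", frozenset({"cabin"})),
    Asset("oak_planks", "material", frozenset({"wooden"})),
    Asset("moonlight", "attribute", frozenset({"night"})),
    Asset("sand_dune", "model", frozenset({"desert"})),
]


def extract_labels(description: str) -> list[str]:
    """Step 1: derive label information (simple keyword matching as a stand-in)."""
    words = description.lower().replace(",", " ").replace(".", " ").split()
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(w for w in words if w in KNOWN_LABELS))


def build_query_prompt(labels: list[str]) -> str:
    """Step 2: turn the labels into a query prompt (wording is an assumption)."""
    return "Query the asset library for assets matching these labels: " + ", ".join(labels)


def stub_llm_query(prompt: str, catalog: list[Asset]) -> list[Asset]:
    """Step 3: stand-in for the LLM; reads the labels back out of the prompt
    and returns every catalog asset that shares at least one of them."""
    labels = {part.strip() for part in prompt.split(":", 1)[1].split(",")}
    return [asset for asset in catalog if asset.tags & labels]


def construct_scene(assets: list[Asset]) -> dict[str, list[str]]:
    """Step 4: 'construct' the scene, here just grouping asset names by kind."""
    scene: dict[str, list[str]] = {}
    for asset in assets:
        scene.setdefault(asset.kind, []).append(asset.name)
    return scene


labels = extract_labels("A wooden cabin in a forest at night.")
prompt = build_query_prompt(labels)
target_assets = stub_llm_query(prompt, CATALOG)
scene = construct_scene(target_assets)
print(scene)
```

In a real system the stub would be replaced by a call to an actual LLM, and scene construction would hand the assets to a rendering engine; neither interface is specified in this summary.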

Implementation

The system includes an electronic device with at least one processor and a memory storing executable instructions. When executed, these instructions cause the processor to perform the described method: processing the description information, generating the query prompt, acquiring the target asset set, and constructing the final 3D scene from those assets.
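One way to picture this arrangement is a small orchestrator that holds the four steps and runs them in order. The sketch below is an assumption made for illustration: the class name `SceneGenerationDevice`, its parameters, and the toy step functions are hypothetical and do not come from the disclosure.

```python
from typing import Any, Callable, List


class SceneGenerationDevice:
    """Hypothetical orchestrator: holds four step functions (the 'stored
    instructions') and executes them in the order the method describes."""

    def __init__(
        self,
        process_description: Callable[[str], List[str]],
        generate_prompt: Callable[[List[str]], str],
        acquire_assets: Callable[[str], List[Any]],
        construct_scene: Callable[[List[Any]], Any],
    ) -> None:
        self.process_description = process_description
        self.generate_prompt = generate_prompt
        self.acquire_assets = acquire_assets
        self.construct_scene = construct_scene

    def run(self, description: str) -> Any:
        labels = self.process_description(description)  # description -> labels
        prompt = self.generate_prompt(labels)           # labels -> query prompt
        assets = self.acquire_assets(prompt)            # prompt -> target asset set
        return self.construct_scene(assets)             # assets -> 3D scene


# Toy step functions, used only to show the data flow between steps.
PREFIX = "find: "
device = SceneGenerationDevice(
    process_description=lambda text: text.lower().split(),
    generate_prompt=lambda labels: PREFIX + ",".join(labels),
    acquire_assets=lambda prompt: prompt[len(PREFIX):].split(","),
    construct_scene=lambda assets: {"scene_assets": assets},
)
result = device.run("Forest cabin")
```

Passing the steps in as callables keeps the orchestration separate from any particular LLM or rendering engine, so each step can be swapped out independently.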

Applications and Benefits

This approach simplifies the creation of detailed 3D scenes by automating asset selection and placement. Because the assets are chosen from label information derived from the user's description, fewer of the manual adjustments typically required when scene attributes are uncertain should be needed. The automation is intended both to streamline production and to improve the precision of asset placement and the realism of the scene, addressing common challenges in traditional 3D modeling workflows.