Invention Title:

IMAGE EDITING WITH A SELECTED MACHINE-LEARNING MODEL

Publication number:

US20260045010

Publication date:

2026-02-12

Section:

Physics

Class:

G06T11/60

Inventors:

Bryan Feldman 🇺🇸 Mountain View, CA, United States

Yael Pritch Knaan 🇺🇸 Mountain View, CA, United States

Yaron BRODSKY 🇺🇸 Mountain View, CA, United States

Alex Rav ACHA 🇺🇸 Mountain View, CA, United States

Shlomo FRUCHTER 🇺🇸 Mountain View, CA, United States

Qinghao CHU 🇺🇸 Mountain View, CA, United States

Matan COHEN 🇺🇸 Mountain View, CA, United States

Andrey VOYNOV 🇺🇸 Mountain View, CA, United States

Assignee:

Google LLC 🇺🇸 Mountain View, CA, United States

Applicant:

Google LLC 🇺🇸 Mountain View, CA, United States

Smart overview of the Invention

A computer-implemented method edits images using machine-learning models. A user provides an image and a prompt requesting a change to it. Based on this prompt, a machine-learning model is selected from a set of available models. The prompt and image are then provided as input to a large language model (LLM), which outputs a rewritten prompt. The rewritten prompt is provided to the selected model, which generates an output image that reflects the user's request.
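The order of these steps can be illustrated with a minimal Python sketch. It is not the patented implementation: the function and type names (`edit_image`, `EditResult`, `choose_model`, `rewrite_prompt`) are hypothetical, images are represented as raw bytes, and the LLM and the editing models are replaced by trivial stubs so that the control flow can be run end to end.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Image = bytes  # assumption: a placeholder image type for this sketch
ModelFn = Callable[[str, Image], Image]


@dataclass
class EditResult:
    model_name: str
    rewritten_prompt: str
    output_image: Image


def edit_image(
    image: Image,
    prompt: str,
    choose_model: Callable[[str], str],
    rewrite_prompt: Callable[[str, Image], str],
    models: Dict[str, ModelFn],
) -> EditResult:
    """Run the three steps in the order the overview describes."""
    model_name = choose_model(prompt)           # 1. select a model from the prompt
    rewritten = rewrite_prompt(prompt, image)   # 2. LLM rewrites the prompt
    output = models[model_name](rewritten, image)  # 3. selected model edits the image
    return EditResult(model_name, rewritten, output)


# Hypothetical stand-ins; a real system would call an LLM and image models.
def stub_choose_model(prompt: str) -> str:
    return "replace" if "replace" in prompt.lower() else "restyle"


def stub_rewrite_prompt(prompt: str, image: Image) -> str:
    return f"Edit request: {prompt.strip()}"


def make_stub_model(tag: str) -> ModelFn:
    def run(rewritten_prompt: str, image: Image) -> Image:
        # Marks the image bytes so the chosen model is visible in the output.
        return image + f"|{tag}".encode()
    return run


STUB_MODELS: Dict[str, ModelFn] = {
    "replace": make_stub_model("replace"),
    "restyle": make_stub_model("restyle"),
}
```

Passing the selector, rewriter, and models as arguments keeps the sketch independent of any particular LLM or image-model API.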

Machine-Learning Model Selection

The method employs a variety of machine-learning models, each tailored to different editing tasks. Models include structure-preserving, shape-preserving, and non-structure/non-shape preserving types. The choice of model depends on the rewritten prompt's requirements. For instance, a structure-preserving model is chosen if the task involves modifying objects while maintaining their structure. Conversely, a non-structure/non-shape preserving model is used when objects need replacement.
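A simple routing rule of this kind might look like the following Python sketch. The keyword lists are purely illustrative assumptions. In particular, the summary does not say when a shape-preserving model is chosen, so mapping colour and texture edits to it is a guess made for the sake of the example. A real system would likely rely on the LLM or a classifier rather than substring matching.

```python
from enum import Enum


class ModelKind(Enum):
    STRUCTURE_PRESERVING = "structure-preserving"
    SHAPE_PRESERVING = "shape-preserving"
    NON_PRESERVING = "non-structure/non-shape preserving"


# Assumption: hypothetical keywords, not taken from the patent.
REPLACE_WORDS = ("replace", "swap")
SHAPE_WORDS = ("color", "colour", "texture")


def select_model(rewritten_prompt: str) -> ModelKind:
    """Pick a model type from the rewritten prompt's wording."""
    text = rewritten_prompt.lower()
    if any(word in text for word in REPLACE_WORDS):
        # Objects need replacement, so nothing has to be preserved.
        return ModelKind.NON_PRESERVING
    if any(word in text for word in SHAPE_WORDS):
        # Assumed case: the object's outline stays, its appearance changes.
        return ModelKind.SHAPE_PRESERVING
    # Default: modify objects while maintaining their structure.
    return ModelKind.STRUCTURE_PRESERVING
```

For example, `select_model("Replace the dog with a cat")` returns `ModelKind.NON_PRESERVING`, while a prompt with none of the keywords falls through to the structure-preserving default.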

User Interaction and Input

User input further refines the editing process by identifying specific objects or regions within the image for modification. This input influences the rewritten prompt, ensuring precise alterations. Additionally, users can access a user interface that offers presets for common editing tasks, such as object removal, background replacement, or color change, streamlining the editing experience.

System Architecture and Functionality

The system comprises one or more processors and computer-readable media storing instructions that, when executed by the processors, perform the described operations. These operations include receiving images and prompts, selecting appropriate models, and generating output images. The system also supports iterative editing: a user can refine a result by submitting subsequent prompts, each applied to the previous output image.
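Iterative editing can be pictured as a loop in which each output image becomes the input to the next prompt. The Python sketch below assumes a hypothetical `edit_once` callable that stands in for a full single-prompt edit. The stub used here only appends text to a byte string so that the chaining is visible.

```python
from typing import Callable, Iterable, List

Image = bytes  # assumption: a placeholder image type for this sketch


def iterative_edit(
    image: Image,
    prompts: Iterable[str],
    edit_once: Callable[[Image, str], Image],
) -> List[Image]:
    """Apply prompts in order; return the original plus every intermediate result."""
    history = [image]
    for prompt in prompts:
        image = edit_once(image, prompt)  # previous output feeds the next edit
        history.append(image)
    return history


# Hypothetical stand-in for a real single-prompt edit.
def stub_edit_once(image: Image, prompt: str) -> Image:
    return image + b"+" + prompt.encode()
```

Keeping the full history, rather than only the final image, would also let a user step back to an earlier result. That design choice belongs to this sketch and is not something the summary describes.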

Applications and Advancements

This method builds on advances in generative AI to let users perform complex image edits with minimal technical expertise. By combining natural-language prompts with task-specific models, the system simplifies tasks that traditionally required manual work, such as altering textures or replacing objects, making digital image editing more accessible.