Invention Title:

ARTIFICIAL INTELLIGENCE-BASED IMAGE GENERATION METHOD AND APPARATUS, ELECTRONIC DEVICE, COMPUTER-READABLE STORAGE MEDIUM, AND COMPUTER PROGRAM PRODUCT

Publication number:

US20250315988

Publication date:

2025-10-09

Section:

Physics

Class:

G06T11/00

Inventors:

Ping LUO 🇨🇳 Shenzhen, China

Zhouxia WANG 🇨🇳 Shenzhen, China

Xintao WANG 🇨🇳 Shenzhen, China

Ying SHAN 🇨🇳 Shenzhen, China

Zhongang QI 🇨🇳 Shenzhen, China

Liangbin XIE 🇨🇳 Shenzhen, China

Assignee:

TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED 🇨🇳 Shenzhen, China

Applicant:

TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED 🇨🇳 Shenzhen, China

Smart overview of the Invention

The application describes an AI-based image generation method and apparatus for efficiently creating images that satisfy both content and style requirements. The method obtains a content text and style images, encodes the text into a content text code, extracts a style code from the style images, and performs reverse diffusion processing with a dual cross-attention mechanism that takes both codes as input. The result is a target image that matches the content text and the target style.

Technical Field

This disclosure relates to artificial intelligence (AI) technology, specifically to image generation methods and apparatuses. It draws on natural language processing (NLP), machine learning, and deep learning. The method is intended to address the limitations of existing style transfer solutions by generating images that follow the requested content and style more closely.

Background

AI technologies enable machines to perform perception, reasoning, and decision-making tasks. Style transfer technologies have traditionally been limited to modifying an existing image, and the resulting effects are coarse-grained, consisting mainly of changes in color. The disclosed method aims to overcome these limitations by generating images that meet specific content and style requirements more efficiently.

Embodiments

The disclosed embodiments include a method performed by an electronic device that has a memory and a processor. The method obtains a content text and style images, encodes the content text into a content text code, extracts a style code from the style images, and applies reverse diffusion processing using a dual cross-attention mechanism. The output is a target image that matches both the content text and the target style.
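The dual cross-attention step can be illustrated with a minimal sketch. The summary does not say how the two attention branches are combined, so the weighted sum below, the `style_weight` parameter, all function names, and the tensor shapes are assumptions made for illustration only. The sketch shows latent image features attending separately to a content text code and to a style code, with the two results fused into one feature map.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context, w_q, w_k, w_v):
    """Scaled dot-product attention: latent queries attend to context tokens."""
    q = matmul(queries, w_q)   # (n_latent, d_head)
    k = matmul(context, w_k)   # (n_ctx, d_head)
    v = matmul(context, w_v)   # (n_ctx, d_head)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose to (d_head, n_ctx)
    weights = [softmax([s / scale for s in row]) for row in matmul(q, k_t)]
    return matmul(weights, v)  # (n_latent, d_head)


def dual_cross_attention(latent, content_code, style_code, params, style_weight=1.0):
    """Run one attention branch per condition and fuse the two outputs."""
    content_out = cross_attention(latent, content_code, *params["content"])
    style_out = cross_attention(latent, style_code, *params["style"])
    # ASSUMPTION: the branches are fused by a weighted sum; the source does
    # not specify the fusion rule, and style_weight is a hypothetical knob.
    return [[c + style_weight * s for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(content_out, style_out)]


def random_matrix(rows, cols, rng):
    """Random stand-in for learned weights or encoder outputs."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_latent, d_ctx, d_head = 8, 6, 4
latent = random_matrix(16, d_latent, rng)      # 16 latent positions
content_code = random_matrix(5, d_ctx, rng)    # 5 text tokens
style_code = random_matrix(3, d_ctx, rng)      # 3 style tokens
params = {
    branch: (random_matrix(d_latent, d_head, rng),
             random_matrix(d_ctx, d_head, rng),
             random_matrix(d_ctx, d_head, rng))
    for branch in ("content", "style")
}
fused = dual_cross_attention(latent, content_code, style_code, params)
print(len(fused), len(fused[0]))  # 16 4
```

In a real diffusion model, a layer of this kind would sit inside the denoising network and run at every reverse diffusion step. Setting `style_weight` to 0 in this sketch reduces it to ordinary text-conditioned cross-attention.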

Implementation

Modules: The apparatus includes modules for obtaining the content text and style images, encoding the content text, extracting the style code, and performing reverse diffusion processing.
Device Components: An electronic device with memory and processor implements the method by executing stored instructions.
Storage Medium: A non-transitory computer-readable medium stores instructions for executing the AI-based image generation method.
Benefits: The integration of content text code and style code in reverse diffusion processing enhances efficiency in generating target images that satisfy both content and style criteria.
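
The module structure listed above can be sketched as a small composition of four callables. This is only an illustration of the data flow: the class name `ImageGenerationApparatus`, its field names, and every stand-in function below are hypothetical, and the `reverse_diffusion` placeholder does not perform real diffusion. In practice each stand-in would be replaced by a trained model.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Code = List[float]
Image = List[List[float]]


@dataclass
class ImageGenerationApparatus:
    """Hypothetical wiring of the four modules named in the summary."""
    obtain: Callable[[], Tuple[str, Sequence[Image]]]
    encode_text: Callable[[str], Code]
    extract_style: Callable[[Sequence[Image]], Code]
    reverse_diffusion: Callable[[Code, Code], Image]

    def generate(self) -> Image:
        # 1. Obtain the content text and the style images.
        content_text, style_images = self.obtain()
        # 2. Encode the content text into a content text code.
        content_code = self.encode_text(content_text)
        # 3. Extract a style code from the style images.
        style_code = self.extract_style(style_images)
        # 4. Generate the target image from both codes.
        return self.reverse_diffusion(content_code, style_code)


def mean_pixel(images: Sequence[Image]) -> Code:
    """Toy style extractor: one number, the mean of all pixel values."""
    pixels = [p for image in images for row in image for p in row]
    return [sum(pixels) / len(pixels)]


# Stand-in modules, chosen only so that the example runs end to end.
apparatus = ImageGenerationApparatus(
    obtain=lambda: ("a cat on a sofa", [[[0.2, 0.4], [0.6, 0.8]]]),
    encode_text=lambda text: [len(text) / 100.0, text.count(" ") / 10.0],
    extract_style=mean_pixel,
    # Placeholder only: combines the codes arithmetically, no denoising.
    reverse_diffusion=lambda content, style: [[c + s for s in style] for c in content],
)

target_image = apparatus.generate()
print(len(target_image), len(target_image[0]))  # 2 1
```

Keeping the modules as separate callables mirrors the apparatus description, where each module has one responsibility and the reverse diffusion module is the only one that consumes both codes.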