Invention Title:

Adding Voice or Chat User Interface to Graphical User Interface (GUI)-Based Virtualized Applications and Desktops Using Large Language and Large Action Models

Publication number:

US20260037286

Publication date:

2026-02-05

Section:

Physics

Class:

G06F9/452

Inventors:

Mukund Ingale 🇺🇸 Pompano Beach, FL, United States

Hubert Divoux 🇺🇸 Parkland, FL, United States

Applicant:

Citrix Systems, Inc. 🇺🇸 Fort Lauderdale, FL, United States

Smart overview of the Invention

The described methods and systems aim to enhance remote desktop interfaces by integrating voice or chat user interfaces with graphical user interface (GUI)-based virtualized applications and desktops. This integration utilizes large language models (LLMs) and large action models (LAMs) to interpret user requests and execute corresponding actions within remote desktop applications. The system provides a more accessible and flexible user experience, especially on devices with limited display capabilities or for users who prefer hands-free interaction.

Technical Approach

A computing system trains a LAM on historical or live interaction data from remote desktop sessions, which enables the LAM to execute textual actions within a remote desktop application. When a user makes a declarative request via voice or chat, the LLM interprets the request and matches it to a specific action, engaging the user in a conversation where needed to obtain any missing parameters. The LAM then executes the action, and the result is communicated back to the user through the same voice or chat interface.
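
The request flow described above (match a request to an action, ask for missing parameters, execute, report back) can be sketched roughly as follows. This is a minimal illustration, not the method claimed in the publication: the `Action` class, `match_action`, `handle_request`, and the expense-report example are all hypothetical names, the keyword overlap is only a stand-in for the LLM, and the `run` callable is only a stand-in for the LAM driving the remote application.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Action:
    """Hypothetical description of one action the LAM can perform."""
    name: str
    keywords: set[str]                    # used by the toy matcher below
    required_params: list[str]            # parameters the action needs
    run: Callable[[dict[str, str]], str]  # stand-in for LAM execution


def match_action(request: str, catalog: list[Action]) -> Optional[Action]:
    """Stand-in for the LLM: choose the action with the most keyword overlap."""
    words = set(request.lower().split())
    best = max(catalog, key=lambda a: len(a.keywords & words), default=None)
    if best is None or not (best.keywords & words):
        return None
    return best


def handle_request(
    request: str,
    catalog: list[Action],
    known_params: dict[str, str],
    ask_user: Callable[[str], str],
) -> str:
    """Match the request, collect missing parameters, then execute."""
    action = match_action(request, catalog)
    if action is None:
        return "Sorry, I could not match that request to a known action."
    params = dict(known_params)
    for name in action.required_params:
        if name not in params:
            # Conversational step: ask only for what is still missing.
            params[name] = ask_user(f"What is the {name}?")
    return action.run(params)


# Hypothetical catalog with two actions in a remote desktop application.
catalog = [
    Action(
        name="create_expense_report",
        keywords={"expense", "report"},
        required_params=["amount", "date"],
        run=lambda p: f"Created expense report for {p['amount']} on {p['date']}.",
    ),
    Action(
        name="book_time_off",
        keywords={"vacation", "leave"},
        required_params=["start", "end"],
        run=lambda p: f"Booked time off from {p['start']} to {p['end']}.",
    ),
]

reply = handle_request(
    "File an expense report",
    catalog,
    known_params={"amount": "42 USD"},
    ask_user=lambda question: "2026-02-05",  # canned answer for the sketch
)
print(reply)
```

In a real deployment the `ask_user` callback would route the question through the same voice or chat channel the request arrived on, so the follow-up and the final result stay in one conversation.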

System Components

The system comprises several key components: a virtualization system for desktops/applications, a large language model, and a large action model. The remote desktop client application supports voice user interfaces (VUIs) and chat user interfaces (CUIs), allowing users to make high-level requests in a conversational manner. This approach differs from traditional imperative command styles, enhancing accessibility and usability across various devices, including those without conventional displays.
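
The contrast between declarative requests and imperative commands can be illustrated with a small sketch. Everything here is an assumption made for illustration: `GuiStep`, `plan_steps`, the step template, and the UI element labels are hypothetical, and the hard-coded template merely stands in for what a trained LAM would derive from interaction data.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GuiStep:
    """One low-level GUI operation (hypothetical representation)."""
    kind: str       # "click" or "type"
    target: str     # label of a hypothetical UI element
    text: str = ""  # text to enter, for "type" steps


# Imperative style: every GUI step is spelled out explicitly.
imperative_steps = [
    GuiStep("click", "New Report"),
    GuiStep("type", "Amount", "42"),
    GuiStep("click", "Submit"),
]

# Hard-coded template standing in for what a LAM would learn.
STEP_TEMPLATES = {
    "create_expense_report": [
        ("click", "New Report", ""),
        ("type", "Amount", "{amount}"),
        ("click", "Submit", ""),
    ],
}


def plan_steps(goal: str, params: dict[str, str]) -> list[GuiStep]:
    """Declarative style: expand a stated goal into concrete GUI steps."""
    return [
        GuiStep(kind, target, text.format(**params))
        for kind, target, text in STEP_TEMPLATES[goal]
    ]


# The user states only the goal and its parameters.
declarative_steps = plan_steps("create_expense_report", {"amount": "42"})
print(declarative_steps == imperative_steps)
```

Both paths end in the same GUI operations; the difference is that in the declarative case the user never has to see or name the individual steps, which is what makes display-less devices workable.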

Implementation and Benefits

The system requires no additional coding, so it works as a no-code solution for existing virtualized applications and desktops. By adding voice and chat interaction, it opens new usage scenarios and extends the functionality of legacy systems. It supports a range of client devices, from small form-factor devices such as smart glasses to standard devices such as smartphones and tablets.

Conclusion

The approach adds voice and chat interaction to GUI-based virtualized applications and desktops by pairing an LLM, which interprets user requests, with a LAM, which executes the corresponding actions. Because users state high-level declarative requests rather than individual GUI commands, they can work with remote desktop applications regardless of device form factor, including on devices with limited display capabilities or when hands-free interaction is preferred.