US-20260127384-A1 - ARTIFICIAL INTELLIGENCE-POWERED PERSONAL COMPUTER MANAGEMENT SYSTEM AND METHODS

Abstract

Computer systems and methods of use, including a computer system comprising a processor and a memory storing a plurality of predefined function blocks, an operating system, a language model, and a user interface application. Each of the predefined function blocks, when executed by the processor, causes the processor to interact with the operating system. The user interface application, when executed by the processor executing the operating system, causes the processor to: receive a user request in natural language to perform an operation; select, by the language model, a subset of the predefined function blocks based on the user request; and execute the subset of the predefined function blocks to perform the operation. The operation includes interactions with the operating system. Each of the predefined function blocks included in the subset corresponds to at least one of the interactions included in the operation.
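The receive, select, and execute flow summarized in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the three function blocks, the `FUNCTION_BLOCKS` registry, and the `KEYWORDS` table are invented for illustration, and the keyword lookup in `select_blocks` is only a stand-in for the language model, whose selection method is not described in this section.

```python
import os
import platform
from typing import Callable, Dict, List

# Hypothetical predefined function blocks: fixed, pre-written code that
# interacts with the operating system. Names are illustrative only.
def get_os_name() -> str:
    return platform.system()

def get_working_directory() -> str:
    return os.getcwd()

def count_directory_entries() -> int:
    return len(os.listdir(os.getcwd()))

# Registry of the predefined function blocks, keyed by name.
FUNCTION_BLOCKS: Dict[str, Callable[[], object]] = {
    "get_os_name": get_os_name,
    "get_working_directory": get_working_directory,
    "count_directory_entries": count_directory_entries,
}

# ASSUMPTION: a phrase table stands in for the language model's selection.
KEYWORDS = {
    "get_os_name": ("operating system",),
    "get_working_directory": ("working directory", "current folder"),
    "count_directory_entries": ("how many files",),
}

def select_blocks(user_request: str) -> List[str]:
    """Return the names of the blocks whose phrases appear in the request."""
    text = user_request.lower()
    return [name for name, phrases in KEYWORDS.items()
            if any(phrase in text for phrase in phrases)]

def execute_blocks(names: List[str]) -> Dict[str, object]:
    """Run only registered blocks; nothing outside the registry is executed."""
    results: Dict[str, object] = {}
    for name in names:
        if name not in FUNCTION_BLOCKS:
            raise KeyError(f"unknown function block: {name}")
        results[name] = FUNCTION_BLOCKS[name]()
    return results

if __name__ == "__main__":
    request = "What operating system is this and how many files are here?"
    selected = select_blocks(request)
    print(selected, execute_blocks(selected))
```

In this sketch the selection step can only return names from a fixed registry, so the code that reaches the operating system is always pre-written rather than generated at request time.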

Inventors

Arash AHMADI
Sarah Safura Sharif
Yaser Michael Banad

Assignees

THE BOARD OF REGENTS OF THE UNIVERSITY OF OKLAHOMA

Dates

Publication Date: 20260507
Application Date: 20251104

Claims (20)

1. A computer system, comprising: a processor; and a memory comprising a non-transitory processor-readable medium storing a plurality of predefined function blocks, an operating system, a language model, and a user interface application, each of the plurality of predefined function blocks comprising predetermined processor-executable instructions that, when executed by the processor, cause the processor to interact with the operating system, the user interface application comprising user processor-executable instructions that, when executed by the processor executing the operating system, cause the processor to: receive a user request in natural language to perform an operation, the operation including one or more interactions with the operating system; select, by the language model, a subset of the plurality of predefined function blocks based on the user request, each of the plurality of predefined function blocks included in the subset corresponding to at least one of the one or more interactions included in the operation; and execute each of the plurality of predefined function blocks in the subset to perform the operation.
2. The computer system of claim 1, wherein the user processor-executable instructions, when executed by the processor, further cause the processor to: subsequent to selecting the subset of the plurality of predefined function blocks based on the user request, generate, by the language model, an orchestration script including one or more processor-executable instructions that, when executed by the processor, cause the processor to perform the operation using the subset of the plurality of predefined function blocks to execute each of the one or more interactions with the operating system; and wherein executing the subset of the plurality of predefined function blocks to perform the operation is further defined as executing the orchestration script to perform the operation.
3. The computer system of claim 2, wherein the orchestration script is written in Python code.
4. The computer system of claim 2, wherein the language model is a first language model, the memory further stores a second language model, the step of selecting the subset of the plurality of predefined function blocks based on the user request is further defined as selecting, by the first language model, the subset of the plurality of predefined function blocks based on the user request, and the step of generating the orchestration script based on the user request is further defined as generating, by the second language model, the orchestration script based on the user request, the second language model including more parameters than the first language model.
5. The computer system of claim 1, wherein executing the subset of the plurality of predefined function blocks to perform the operation is further defined as executing the subset of the plurality of predefined function blocks in a restricted execution environment to perform the operation.
6. The computer system of claim 5, wherein the restricted execution environment is one of a restricted Python execution environment and a restricted Docker execution environment.
7. The computer system of claim 1, wherein the user request comprises text data.
8. The computer system of claim 1, wherein the user request comprises one of speech data and gesture data, and the user processor-executable instructions, when executed by the processor, further cause the processor to, subsequent to receiving the user request, convert the one of the speech data and the gesture data of the user request into text data.
9. The computer system of claim 1, wherein the memory further stores a plurality of risk level identifiers, each particular one of the plurality of risk level identifiers corresponding to a particular one of the plurality of predefined function blocks and indicating a predetermined risk level of the particular one of the plurality of predefined function blocks.
10. The computer system of claim 1, wherein the memory is restricted from being modified by the language model.
11. A computer system, comprising: a host device comprising a host processor and a host memory comprising a host non-transitory processor-readable medium storing a plurality of predefined function blocks, each of the plurality of predefined function blocks comprising predetermined processor-executable instructions that, when executed by a processor executing an operating system, cause the host processor to interact with the operating system; and a user device comprising a user processor and a user memory comprising a user non-transitory processor-readable medium storing the operating system, a language model, and a user interface application comprising user processor-executable instructions that, when executed by the user processor executing the operating system, cause the user processor to: receive a user request in natural language to perform an operation, the operation including one or more interactions with the operating system; select, by the language model, a subset of the plurality of predefined function blocks based on the user request, each of the plurality of predefined function blocks included in the subset corresponding to at least one of the one or more interactions included in the operation; and execute the subset of the plurality of predefined function blocks to perform the operation.
12. The computer system of claim 11, wherein the user processor-executable instructions, when executed by the user processor, further cause the user processor to: subsequent to selecting the subset of the plurality of predefined function blocks based on the user request, generate, by the language model, an orchestration script including one or more processor-executable instructions that, when executed by the user processor, cause the user processor to perform the operation using the subset of the plurality of predefined function blocks to execute each of the one or more interactions with the operating system; and wherein executing the subset of the plurality of predefined function blocks to perform the operation is further defined as executing the orchestration script to perform the operation.
13. The computer system of claim 12, wherein the orchestration script is written in Python code.
14. The computer system of claim 12, wherein the language model is a first language model, the user memory further stores a second language model, the step of selecting the subset of the plurality of predefined function blocks based on the user request is further defined as selecting, by the first language model, the subset of the plurality of predefined function blocks based on the user request, and the step of generating the orchestration script based on the user request is further defined as generating, by the second language model, the orchestration script based on the user request, the second language model including more parameters than the first language model.
15. The computer system of claim 11, wherein executing the subset of the plurality of predefined function blocks to perform the operation is further defined as executing the subset of the plurality of predefined function blocks in a restricted execution environment to perform the operation.
16. The computer system of claim 15, wherein the restricted execution environment is one of a restricted Python execution environment and a restricted Docker execution environment.
17. The computer system of claim 11, wherein the user request comprises text data.
18. The computer system of claim 11, wherein the user request comprises one of speech data and gesture data, and the user processor-executable instructions, when executed by the user processor, further cause the user processor to, subsequent to receiving the user request, convert the one of the speech data and the gesture data of the user request into text data.
19. The computer system of claim 11, wherein the host memory further stores a plurality of risk level identifiers, each particular one of the plurality of risk level identifiers corresponding to a particular one of the plurality of predefined function blocks and indicating a predetermined risk level of the particular one of the plurality of predefined function blocks.
20. The computer system of claim 11, wherein the host memory is restricted from being modified by the language model.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of the provisional patent application identified by U.S. Ser. No. 63/716,950, filed Nov. 6, 2024, and the provisional patent application identified by U.S. Ser. No. 63/736,972, filed Dec. 20, 2024, the entire contents of each of which are hereby expressly incorporated herein by reference.

GOVERNMENT SUPPORT

Not Applicable

BACKGROUND

Natural language computing interfaces have emerged as a significant advancement in human-computer interaction, allowing users to manage their personal computers (PCs) using natural (i.e., human-readable) language. This technology holds the potential to enhance productivity and accessibility by simplifying time-consuming and labor-intensive PC tasks into computer-recognizable (i.e., processor-readable) instructions.

Current embodiments of these interfaces for PC management primarily operate through software solutions that integrate large language models (LLMs) with operating system controls. The technological foundation that enabled these interfaces stems from the introduction of the Transformer architecture, as described in the publication by Vaswani, A., et al., “Attention Is All You Need” (2017). Notable examples in the art include Open Interpreter and OpenAI's ChatGPT. However, these existing solutions exhibit significant limitations in their security architecture and execution control mechanisms.

A primary deficiency in the current art is the lack of robust security measures governing the execution of machine-generated code. Existing solutions typically implement direct execution pathways for code generated by artificial intelligence (AI) models without incorporating adequate validation protocols or execution safeguards. This architectural approach creates potential vulnerabilities in system security and reliability.

For example, AI models may be trained to generate seemingly safe code that contains hidden vulnerabilities that may be triggered under specific conditions. One such vulnerability was demonstrated in the publication by Hubinger, E., et al., “Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training” (2024), wherein a model was trained to insert exploitable code only when prompted with a specific year, a deceptive behavior that persisted even after safety training. Further, there is a risk that jailbreaks could potentially enable unauthorized access and execution due to insufficient safeguards. Such unauthorized execution has been demonstrated by third parties who exploited these vulnerabilities in systems like Claude Computer Use to execute catastrophic commands, including the deletion of root directories in a Linux environment. In Claude Computer Use's technical report, for example, unintended actions were documented during system demonstrations, including accidentally stopping screen recordings and unexpectedly browsing unrelated content.

Alternative approaches in the art have attempted to address these limitations through dedicated hardware embodiments. Specifically, devices such as the Rabbit R1 and Humane AI Pin represent attempts to instantiate natural language computing interfaces in standalone form factors. However, these hardware-based solutions have encountered substantial obstacles to widespread adoption, primarily due to two factors: (1) restricted functional capabilities compared to software-based alternatives; and (2) prohibitive device costs that limit market accessibility.

These deficiencies in the current art demonstrate the need for improved systems and methods for implementing secure, controlled natural language interfaces for PC management.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate one or more embodiments described herein and, together with the description, explain these embodiments. The drawings are not intended to be drawn to scale, and certain features and certain views of the figures may be shown exaggerated, to scale or in schematic in the interest of clarity and conciseness. Not every component may be labeled in every drawing. Like reference numerals in the figures may represent and refer to the same or similar element or function. In the drawings:

FIG. 1 is a process flow diagram of an exemplary embodiment of a method of providing artificial intelligence-enabled natural language interaction with an operating system in accordance with the prior art;

FIG. 2 is a block diagram of an exemplary embodiment of a computer system constructed in accordance with the present disclosure;

FIG. 3 is a block diagram of an exemplary embodiment of a first user device shown in FIG. 2;

FIG. 4 is a block diagram of an exemplary embodiment of a host device shown in FIG. 2;

FIG. 5 is a process flow diagram of another exemplary embodiment of a method of providing artificial intelligence-enabled natural language interaction with an operating system in accordance wi