US-20260127366-A1 - COMPUTER IMPLEMENTED METHODS FOR THE AUTOMATED ANALYSIS OR USE OF DATA, INCLUDING USE OF A LARGE LANGUAGE MODEL
Abstract
There is provided a method of improving the operation of a generative AI large language model (LLM)-based data processing system, by operating the LLM-based system in conjunction with a non-LLM data processing system; and in which (a) the LLM-based system sends a continuation as an input to the non-LLM system, and (b) the non-LLM system (i) uses symbolic representations to perform non-statistical reasoning on the input from the LLM-based system and (ii) generates a reasoned prompt or other context.
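The loop described in the abstract can be sketched in code. This is an illustrative sketch only, not the patented implementation; all names are hypothetical. An LLM continuation is parsed into a symbolic representation, a separate non-LLM component performs non-statistical reasoning over it (here, forward chaining over hand-written rules), and the derived facts are rendered as a reasoned prompt to feed back as context.

```python
# Hypothetical sketch of the abstract's flow. Each "X is Y." sentence in the
# LLM continuation becomes a symbolic fact tuple; a rule engine derives new
# facts by forward chaining (purely symbolic, no statistics); the result is
# rendered as a "reasoned prompt". All names here are illustrative.

def parse_continuation(text):
    # Toy symbolic representation: "X is Y." sentences become (X, Y) tuples.
    facts = set()
    for sentence in text.split("."):
        words = sentence.strip().split()
        if len(words) == 3 and words[1] == "is":
            facts.add((words[0], words[2]))
    return facts

RULES = {  # if (x, a) holds, then (x, RULES[a]) also holds
    "dog": "mammal",
    "mammal": "animal",
}

def reason(facts):
    # Forward chaining to a fixed point: non-statistical reasoning.
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(derived):
            if y in RULES and (x, RULES[y]) not in derived:
                derived.add((x, RULES[y]))
                changed = True
    return derived

def reasoned_prompt(facts):
    # Package the derived facts as context for the next LLM call.
    lines = sorted(f"{x} is {y}" for x, y in facts)
    return "Known facts:\n" + "\n".join(lines)
```

For example, the continuation "Rex is dog." would yield the derived facts that Rex is a mammal and an animal, which the reasoned prompt then states explicitly.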
Inventors
- William Tunstall-Pedoe
- Robert Heywood
- Seth Warren
- Paul Benn
- Duncan Reynolds
- Ayush Shah
- Luci Krnic
- Ziyi Zhu
Assignees
- UNLIKELY ARTIFICIAL INTELLIGENCE LIMITED
Dates
- Publication Date: 2026-05-07
- Application Date: 2025-11-21
- Priority Date: 2022-02-22
Claims (20)
- 1. A computer implemented method of improving the accuracy or reliability of an AI system including an LLM (large language model)-based system, in which the LLM-based system is a deep learning model capable of processing natural language and the AI system is capable of generating a sequence of reasoning steps; and in which: (i) the AI system passes one or more reasoning passages, explanations, history or results, in a structured, machine-readable representation distinct from natural language, to a policy or moderation service, separate from the LLM-based system; and (ii) the policy or moderation service uses the reasoning passages, explanations, history or results to refuse, redact or rephrase LLM outputs.
- 2. The computer implemented method of claim 1, in which the policy or moderation service validates the machine-readable reasoning passages, explanations, history or results against a declared schema.
- 3. The method of claim 1, in which the policy or moderation service deterministically parses the machine-readable reasoning passages, explanations, history or results.
- 4. The method of claim 1, in which the policy or moderation service returns a structured decision to refuse, redact, or rephrase portions of a response.
- 5. The method of claim 1, in which the AI system withholds enabling or permitting display of content unless the policy or moderation service's decision permits display.
- 6. The method of claim 1, in which the policy or moderation service classifies the input according to policy categories and severity.
- 7. The method of claim 1, in which the policy or moderation service records its decision, policy version, and inputs as structured audit entries.
- 8. The method of claim 1, in which the policy or moderation service produces a human-interpretable explanation summarizing the policy decision.
- 9. The method of claim 1, in which the AI system injects a structured correction from the policy or moderation service as augmented context for the LLM-based system.
- 10. The method of claim 1, in which multiple policy or moderation services run in parallel and a merging service combines their outputs.
- 11. The method of claim 1, in which the policy or moderation service is selected based on capability metadata relative to an inference task.
- 12. The method of claim 1, in which the policy or moderation service references computation units to simulate or measure risk prior to decision.
- 13. The method of claim 1, in which the policy or moderation service retries a decision after repairing a malformed structured input and re-validates the repaired input.
- 14. The method of claim 1, in which the policy or moderation service caches prior decisions and reuses them across sessions for identical inputs.
- 15. The method of claim 1, in which the AI system re-prompts the LLM-based system using the policy or moderation service's structured decision as augmented context.
- 16. The method of claim 1, in which the policy or moderation service ranks candidate outputs using trust and uncertainty labels.
- 17. The method of claim 1, in which the policy or moderation service filters evidence used for decision using tenets.
- 18. The method of claim 1, in which the policy or moderation service's inputs or outputs identify provenance and source identifiers.
- 19. The method of claim 1, in which the policy or moderation service tests policy effects in a non-blocking mode and records outcomes for audit.
- 20. The method of claim 1, in which the policy or moderation service's policy change triggers re-evaluation of prior decisions.
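The core flow of claims 1-8 can be sketched as follows. This is a minimal illustrative sketch, not the patented implementation; the schema, categories, severity scale, and all identifiers are hypothetical. A non-LLM policy service validates a structured reasoning record against a declared schema (claim 2), parses it deterministically (claim 3), classifies it by policy category and severity (claim 6), returns a structured decision to refuse, redact, or rephrase (claim 4) with a human-interpretable explanation (claim 8), and records a structured audit entry including the policy version (claim 7).

```python
# Hypothetical sketch of the claimed policy/moderation service. All names,
# categories, and thresholds are illustrative assumptions, not from the spec.
from dataclasses import dataclass

REASONING_SCHEMA = {"statement", "category", "confidence"}  # declared schema (claim 2)
POLICY_VERSION = "2024-01"

@dataclass
class Decision:
    action: str       # "allow" | "refuse" | "redact" | "rephrase" (claim 4)
    category: str     # policy category (claim 6)
    severity: int     # 0 (none) .. 3 (severe) (claim 6)
    explanation: str  # human-interpretable summary (claim 8)

class PolicyService:
    """Non-LLM service that deterministically parses structured reasoning
    records and returns a structured moderation decision (claims 1, 3)."""

    REFUSE_CATEGORIES = {"self_harm": 3, "weapons": 3}
    REDACT_CATEGORIES = {"personal_data": 2}

    def __init__(self):
        self.audit_log = []  # structured audit entries (claim 7)

    def decide(self, record: dict) -> Decision:
        # Validate the machine-readable input against the declared schema.
        if not REASONING_SCHEMA <= record.keys():
            decision = Decision("refuse", "malformed_input", 3,
                                "Input failed schema validation; display withheld.")
        elif record["category"] in self.REFUSE_CATEGORIES:
            decision = Decision("refuse", record["category"],
                                self.REFUSE_CATEGORIES[record["category"]],
                                "Category is refused under current policy.")
        elif record["category"] in self.REDACT_CATEGORIES:
            decision = Decision("redact", record["category"],
                                self.REDACT_CATEGORIES[record["category"]],
                                "Sensitive spans must be redacted before display.")
        elif record["confidence"] < 0.5:
            decision = Decision("rephrase", "low_confidence", 1,
                                "Low-confidence claim should be hedged on re-prompt.")
        else:
            decision = Decision("allow", "none", 0, "No policy concerns found.")
        # Record decision, policy version, and inputs for audit (claim 7).
        self.audit_log.append({"policy_version": POLICY_VERSION,
                               "input": record, "decision": decision})
        return decision
```

In this sketch the AI system would withhold display unless the returned action is "allow" (claim 5), and could feed a "rephrase" decision back to the LLM as augmented context (claims 9 and 15).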
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This is a continuation of U.S. application Ser. No. 19/364,721, filed on Oct. 21, 2025, which is a continuation of U.S. application Ser. No. 18/914,717, filed on Oct. 14, 2024, which is a continuation of U.S. application Ser. No. 18/648,788, filed on Apr. 29, 2024, now U.S. Pat. No. 12,164,868, issued Dec. 10, 2024, which is a continuation of U.S. application Ser. No. 18/301,615, filed on Apr. 17, 2023, now U.S. Pat. No. 11,989,507, issued May 21, 2024, which is a continuation of International Application No. PCT/GB2023/050405, filed on Feb. 22, 2023, which claims priority to GB Application No. GB2202347.7, filed on Feb. 22, 2022; GB Application No. GB2219268.6, filed on Dec. 20, 2022; GB Application No. GB2300624.0, filed on Jan. 16, 2023; and GB Application No. GB2302085.2, filed on Feb. 14, 2023, and is a continuation-in-part of U.S. application Ser. No. 18/001,368, filed on Dec. 9, 2022, which is the US national stage of International Application No. PCT/GB2021/052196, filed on Aug. 24, 2021, the entire contents of each of which are fully incorporated herein by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The field of the invention relates to computer implemented methods for the automated analysis or use of data, including use of a large language model (LLM), and to related computer implemented methods and systems.
2. Technical Background
Natural language (NL) is language evolved for humans, such as the English language. Although significant advances have been made in computers' ability to process natural language, computers are still not able to deeply understand the meaning of natural language and use that meaning internally. For this reason, most computer applications use structured data to store the information they need for processing, for example a relational database, which involves designing the schema, populating the database and writing code to process the fields in the database.
Use of structured data can work well if the application has limited requirements for the type of data required. However, some applications naturally require an extremely broad, heterogeneous collection of data to work well. This means that the schema required would be enormous, making building and coding such an application impractical. We refer to such applications herein as HUB applications (Heterogeneous and Unreasonably Broad).
Examples of HUB applications include an application for managing a person's general health data, where there are thousands of tests, thousands of medical conditions and thousands of symptoms. A related application is a nutrition tracking application, where there are many thousands of substances and foods that can be ingested, each with different metabolic effects on the body. Another example is an application to match the résumés of potential candidates with a job specification: in principle, such an application would need structured data to represent every skill that might be of value to any role, every type of experience and every type of previous job. Accounting is another application where vast heterogeneous data would be valuable: the perfect accounting application would represent every type of contract and every type of service.
In practice, some of these applications, where they exist, work with a limited schema that does not cover the full range of their ideal properties. Health applications, for example, typically work this way, ignoring the many types of data they do not cover, and so end up narrow, limited to only certain verticals within health. Applications may also use natural language, or augment a limited schema with natural language, as with current résumé matching applications, which might represent a few key skills in structured form but otherwise rely largely on keyword searching or statistical natural language processing (NLP) techniques applied to written résumés.
In the case of accounting, transactions are represented with limited structured data: debits and credits on virtual ledgers with natural language names. The meaning of those natural language names, and thus what the transactions represent, is generally opaque to the application. Virtual ledgers often group different types of transaction together but fail to represent semantic differences which may be important.
There is no exact threshold at which an application becomes a HUB application, but the difficulty of building an application with a hand-created schema grows more than linearly with the number of tables, since managing the tables, as well as the code that maintains them, becomes increasingly difficult. These issues could be addressed if there existed a language, or way of representing data, that computers could fully process and understand but that also had an extremely broad scope. In conventional Artificial Intelligence (AI), statistical Machine Learning (ML), particularly Deep Learning (DL), has been widely used. This has pr