US-12621337-B2 - Amplification of formal method and fuzz testing to enable scalable assurance for communication system

US12621337B2US 12621337 B2US12621337 B2US 12621337B2US-12621337-B2

Abstract

Methods for more secure mobile network communications are disclosed. Specifically, details involving natural language processing (NLP) based auto formal modeling of protocols and specifications with large language models (NLP) are provided. Methods for formal and fuzzing amplification for fuzz testing to detect vulnerabilities are also disclosed. Furthermore, solutions are provided to identified vulnerabilities in existing 5G infrastructures. Also disclosed is a digital twin fuzzing framework.

Inventors

Ying Wang
Jingda Yang

Assignees

THE TRUSTEES OF THE STEVENS INSTITUTE OF TECHNOLOGY

Dates

Publication Date: 20260505
Application Date: 20240405

Claims (20)

1 . A method for validating communication systems, comprising steps of: obtaining input data comprising protocol descriptions; transforming said protocol descriptions into dependency graphs and formal models using a large language model, said large language model being integrated with a transformer model; resolving ambiguities in said input data; determining design intent in said input data; obtaining iterative feedback from a HyFuzz platform, said HyFuzz platform being adapted to refine capture of intentions and resolve ambiguities in said input data by providing said input data to said HyFuzz platform, wherein said HyFuzz platform is configured to supply said design intent to said large language model; and producing output.
2 . The method of claim 1 , further comprising a step of establishing dependency relationships through cross-attention mechanisms and/or self-attention mechanisms.
3 . The method of claim 1 , further comprising a step of quantifying dependency relationships.
4 . The method of claim 3 , wherein said dependency relationships are quantified without human involvement.
5 . The method of claim 1 , further comprising a step of creating a list of exploits from said input data.
6 . The method of claim 5 , further comprising a step of conducting fuzz testing on said list of exploits.
7 . The method of claim 1 , wherein said large language model is Cross Attention Learning.
8 . The method of claim 1 , wherein said large language model performs concurrent processing of said protocol descriptions.
9 . The method of claim 1 , wherein said dependency graphs are scalable cross-session dependency graphs, supporting hierarchy of formal analysis, thereby facilitating revelation of in-depth relationships embedded within said protocol descriptions.
10 . The method of claim 1 , wherein said HyFuzz platform is trained on labeled identifiers and formal properties.
11 . The method of claim 1 , wherein said output bypasses token count barriers.
12 . A method for validating communication systems, comprising steps of: obtaining input data comprising protocol descriptions; transforming said protocol descriptions into dependency graphs and formal models using a large language model, said large language model being integrated with a transformer model: resolving ambiguities in said input data; determining design intent in said input data; obtaining iterative feedback from an experimental platform by providing said input data to said experimental platform, wherein said experimental platform is configured to supply said design intent to said large language model; and producing output, wherein said output bypasses token count barriers.
13 . The method of claim 12 , further comprising a step of establishing dependency relationships through cross-attention mechanisms and/or self-attention mechanisms.
14 . The method of claim 13 , further comprising a step of quantifying said dependency relationships.
15 . The method of claim 14 , wherein said dependency relationships are quantified without human involvement.
16 . The method of claim 12 , further comprising a step of creating a list of exploits from said input data.
17 . The method of claim 16 , further comprising a step of conducting fuzz testing on said list of exploits.
18 . The method of claim 12 , wherein said large language model performs concurrent processing of said protocol descriptions.
19 . The method of claim 12 , wherein said dependency graphs are scalable cross-session dependency graphs, supporting hierarchy of formal analysis, thereby facilitating revelation of in-depth relationships embedded within said protocol descriptions.
20 . The method of claim 12 , wherein the experimental platform is trained on labeled identifiers and formal properties.

Description

CROSS-REFERENCE TO RELATED APPLICATION This application claims the benefit of and priority to U.S. Provisional Patent Application Ser. No. 63/457,557 filed Apr. 6, 2023, the entire disclosure of which is incorporated herein by reference. STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH This invention was made with government support under Grant No. D22AP00144 awarded by the Department of Defense (DoD) and DARPA. The U.S. government has certain rights in the invention. FIELD OF THE INVENTION The present invention relates to a broad range of systems that require the reliability of communications, and, more specifically, to the automatic assurance for large scale computing and information systems. BACKGROUND OF THE INVENTION Existing machine learning-based vulnerability detection in 5G lacks transparency and adds uncertainty. Formal based detection is limited in scalability and lacks a synthesized knowledge of unintended emergent behaviors and causal chain effects. The adoption of open-source stacks and distributed access systems are of rapidly increasing importance to the research infrastructure and offer unprecedented interdisciplinary benefits. However, the openness in resources and technologies also presents an attack surface of unprecedented size due to the in-transparency and complexity of design specification and stack implementations. Discovering and mitigating vulnerabilities and unintended emergent behaviors in a specific research project requires combinations of automated reasoning techniques in design and allowing testing techniques in implementation stacks. Enabling the solution at the infrastructure level can benefit research projects, while an agile approach could be taken to tailor the solution to fit the specific research area. 5G of wireless technology represents a complete transformation of telecommunication networks supporting massive amounts of devices connected through 5G and empowers a vast array of new and enhanced critical infrastructure services. Motivated by general trends such as network deperimetrization and 5G systems' strong dependency on software define networking and virtualization, 5G broke down the multidomain orchestration process into the main functions relevant to a multi-provider multi-domain environment of discovery, bilateral negotiation, provisioning and assurance stages with their corresponding multi-domain reference points. Verticals in 5G and next generation infrastructure create a diverse and intricate environment including software, hardware, configurations, instruments, data, users, and various stakeholders. With system complexity and its lack of security emphasis by domain scientists, the formed ecosystem requires a comprehensive evaluation and validation for improved research and transitional CI security postures. State-of-the-art security research in large-scale 5G challenges has primarily focused on either specific system aspects/partitions or particular applications. However, the lack of systematic implementations with adaptive strategies and/or assessment of the security risks in both 5G specifications and implementations can prevent an attack surface to potential threats. Detecting unintended emergent behavior in the software stacks requires strenuous effort because most of them fall into the stochastic domain, unlike deterministic behaviors that can be detected via formal methods. In the 5G-OPS application, the adopted Open Radio Access Network (O-RAN) characterized by machine learning algorithms adds significant performance improvement but introduces more uncertainty and less transparency to 5G communications. This uncertainty poses a significant challenge to traditional vulnerability detection methods, as they may not be able to effectively identify vulnerabilities arising from unexpected inputs or behaviors resulting from machine learning algorithms. Therefore, an efficient, systematic, and comprehensive vulnerability and unintended emergent behaviors detection scheme is essential to ensure the security and robustness of critical infrastructures. Among vulnerability detection approaches, formal methods and fuzz testing has been to be efficient applying in crucial components of a system or critical infrastructure, especially in communication protocol vulnerability detection. Some registration and access control protocols, including Authentication and Key Agreement (AKA), Radio Resource Control (RRC), etc. have been applied formal methods in various frameworks. Particularly, in 5G protocols based on the 5G security design, necessary lemmas are verified, including helping lemmas, sanity-check lemmas, and the lemmas that check the relevant security properties against the 5G protocols. Several existing formal analysis frameworks include Tamarin, LTE inspector and 5G reasoner that can determine precisely which security guarantees are met in 5G protocols by applying formal methods and automated verification in the symbolic model. Previously, a protocol-based