US-12625973-B1 - CPE prediction using banner-prompted LLM

Abstract

Prediction of CPEs using banners greatly improves computer functioning. Many web services have an unknown common platform enumeration (CPE). When the CPE is unknown, a computer system is unable to obtain cybersecurity flaws and software fixes for a software product or web service. A CPE, though, is predicted by banner-prompting a large language model using a web service banner. Once the CPE is predicted, vulnerabilities may be identified.

Inventors

  • SHAEFER DREW
  • Michael Avraham Brautbar

Assignees

  • CROWDSTRIKE, INC.

Dates

Publication Date: 2026-05-12
Application Date: 2025-02-10

Claims (20)

  1. A method executed by a computer system that predicts a common platform enumeration (CPE) product, comprising: banner-grabbing a web service banner; predicting the CPE product by banner-prompting a large language model (LLM) trained using semantic relationships learned from web service banners; and classifying the CPE product by a classifier trained using the web service banners.
  2. The method of claim 1, further comprising training the LLM using the web service banners.
  3. The method of claim 1, further comprising training the LLM using a CPE data.
  4. The method of claim 1, wherein the banner-prompting further comprises using a banner attribute associated with the web service banner.
  5. The method of claim 1, further comprising recursively performing the banner-prompting of the LLM.
  6. The method of claim 1, further comprising filtering a CPE data using the CPE product.
  7. The method of claim 1, further comprising determining a vendor associated with the CPE product.
  8. At least one computer system that predicts a common platform enumeration (CPE) product, comprising: at least one central processing unit; and at least one memory device storing instructions that, when executed by the at least one central processing unit, perform operations, the operations comprising: banner-grabbing a web service banner; predicting the CPE product by banner-prompting a large language model (LLM) trained using semantic relationships learned from web service banners; and classifying the CPE product by a classifier trained using the web service banners.
  9. The at least one computer system of claim 8, wherein the operations further comprise labeling the CPE product by prompting the LLM as a labeling function.
  10. The at least one computer system of claim 8, wherein the operations further comprise generating a concatenated web service banner by concatenating the web service banners.
  11. The at least one computer system of claim 10, wherein the operations further comprise generating banner tokens representing the web service banners using a machine learning model trained with the semantic relationships learned from the web service banners.
  12. The at least one computer system of claim 11, wherein the operations further comprise generating a banner embedding using a machine learning model trained with the semantic relationships learned from the web service banners.
  13. The at least one computer system of claim 11, wherein the operations further comprise banner-grabbing the web service banners from the Internet.
  14. The at least one computer system of claim 8, wherein the operations further comprise training the LLM using a CPE data.
  15. The at least one computer system of claim 8, wherein the operations further comprise determining a vendor associated with the CPE product.
  16. A non-transitory memory device storing instructions that, when executed by at least one central processing unit, perform operations that predict common platform enumeration (CPE) products, the operations comprising: banner-grabbing web service banners; generating a banner sample by sampling the web service banners; predicting first CPE products by banner-prompting a large language model (LLM) trained using the web service banners; generating filtered CPE products by filtering the banner sample according to the first CPE products predicted by the banner-prompting of the LLM trained using the web service banners; generating a final CPE prediction of the CPE products by re-prompting the LLM using the banner sample and the filtered CPE products generated by the filtering of the banner sample; and classifying the web service banners by a classifier trained using the CPE products.
  17. The non-transitory memory device of claim 16, wherein the operations further comprise generating a concatenated web service banner by concatenating the web service banners.
  18. The non-transitory memory device of claim 17, wherein the operations further comprise generating banner tokens representing the web service banners by tokenizing the concatenated web service banner.
  19. The non-transitory memory device of claim 18, wherein the operations further comprise generating a banner token embedding representing the banner tokens.
  20. The non-transitory memory device of claim 19, wherein the operations further comprise determining a vendor associated with the CPE products.
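
Illustrative sketch (not part of the claims): the sample, prompt, filter, and re-prompt sequence recited in claim 16 might be arranged as shown below. Everything here is an assumption made for illustration. The names `sample_banners`, `first_pass`, `filter_products`, `final_pass`, and `predict_product` are hypothetical, the prompt wording is invented, the "filtering" step is read as narrowing a catalog of known CPE products to those named in the first pass, and `toy_llm` is a keyword lookup standing in for a trained model. The classifier step of the claim is omitted.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical interface: any callable that maps a prompt string to answer text.
LLM = Callable[[str], str]


def sample_banners(banners: Sequence[str], k: int, seed: int = 0) -> List[str]:
    """Draw a reproducible sample of at most k banners."""
    rng = random.Random(seed)
    return rng.sample(list(banners), min(k, len(banners)))


def first_pass(llm: LLM, sample: Sequence[str]) -> List[str]:
    """Prompt once per banner and collect the distinct predicted products."""
    seen: List[str] = []
    for banner in sample:
        answer = llm(f"Name the CPE product for this banner:\n{banner}").strip().lower()
        if answer and answer not in seen:
            seen.append(answer)
    return seen


def filter_products(catalog: Sequence[str], predicted: Sequence[str]) -> List[str]:
    """Keep catalog products whose name contains any first-pass prediction."""
    return [p for p in catalog if any(q in p for q in predicted)]


def final_pass(llm: LLM, sample: Sequence[str], shortlist: Sequence[str]) -> str:
    """Re-prompt with the banner sample plus the filtered candidate products."""
    prompt = (
        "Banners:\n" + "\n".join(sample)
        + "\nCandidate CPE products:\n" + "\n".join(shortlist)
        + "\nAnswer with exactly one candidate."
    )
    return llm(prompt).strip().lower()


def predict_product(llm: LLM, banners: Sequence[str], catalog: Sequence[str], k: int = 3) -> str:
    """Run the two-pass flow: sample, prompt, filter, re-prompt."""
    sample = sample_banners(banners, k)
    shortlist = filter_products(catalog, first_pass(llm, sample))
    return final_pass(llm, sample, shortlist)


def toy_llm(prompt: str) -> str:
    """Stand-in for a real model: simple keyword rules, no learning involved."""
    if "Candidate CPE products:" in prompt:
        lines = prompt.splitlines()
        # Second pass: echo the first candidate listed in the prompt.
        return lines[lines.index("Candidate CPE products:") + 1]
    return "nginx" if "nginx" in prompt.lower() else "unknown"


if __name__ == "__main__":
    banners = ["Server: nginx/1.18.0", "Server: nginx/1.18.0 (Ubuntu)"]
    catalog = ["apache_http_server", "nginx", "openssh"]
    print(predict_product(toy_llm, banners, catalog))
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client; a real deployment would replace `toy_llm` with a call to whatever model the service uses.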

Description

BACKGROUND

The subject matter described herein generally relates to electrical communications and to computer security and, more particularly, the subject matter relates to computer vulnerability analysis.

Many computers are exposed to cybersecurity threats. It seems every day there is another cybersecurity hack that steals account passwords, business data, and personal information. Large computer networks, in particular, are especially vulnerable to cybersecurity threats. Large computer networks may have hundreds or even thousands of computers, so it's increasingly difficult to monitor such large numbers of computers. Many of these computers may unknowingly connect to the Internet and/or run outdated software, so these computers are especially vulnerable to cybersecurity threats.

SUMMARY

Accurate prediction of common platform enumeration (CPE) helps resolve cybersecurity vulnerabilities. Many software products and web services have an unknown CPE. The CPE identifies known cybersecurity vulnerabilities and software fixes. When the CPE is unknown, however, the cybersecurity vulnerabilities remain unresolved and computer functioning is jeopardized. A CPE prediction service, though, identifies which CPEs should be matched to their corresponding software products and web services. The CPE prediction service grabs web service banners and predicts the CPEs by banner-prompting a large language model (or “LLM”). The CPE prediction service identifies a CPE that matches or belongs to a software product or web service, based on the web service banners. The CPE prediction service thus elegantly and quickly matches a CPE to its corresponding software product or web service. Once the CPE is known, its cybersecurity vulnerabilities may be fixed and computer functioning is improved.
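
As a minimal sketch of the inputs and outputs the Summary refers to, the following example pulls a banner out of a raw HTTP response, builds a prompt from it, and formats a predicted vendor/product/version as a CPE name. It assumes the banner is the HTTP `Server` header and that the target format is the CPE 2.3 formatted string (`cpe:2.3:` followed by eleven colon-separated fields); the source states neither. The names `server_banner`, `build_prompt`, and `CpeName` are hypothetical, the prompt wording is invented, and the escaping of special characters that the CPE naming specification requires is left out for brevity.

```python
from dataclasses import dataclass


def server_banner(raw_response: str) -> str:
    """Return the Server header value from a raw HTTP response, or '' if absent."""
    for line in raw_response.split("\r\n")[1:]:  # skip the status line
        if line == "":  # a blank line ends the header section
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "server":
            return value.strip()
    return ""


def build_prompt(banner: str) -> str:
    """Assemble an illustrative prompt asking a model to identify the software."""
    return (
        "Identify the software vendor, product, and version for this "
        "web service banner.\n"
        f"Banner: {banner}\n"
        "Answer as vendor,product,version."
    )


@dataclass(frozen=True)
class CpeName:
    """A predicted vendor/product/version, rendered as a CPE 2.3 formatted string."""
    vendor: str
    product: str
    version: str = "*"  # "*" means any value
    part: str = "a"     # "a" application, "o" operating system, "h" hardware

    def to_cpe23(self) -> str:
        # part, vendor, product, version, then seven unspecified fields:
        # update, edition, language, sw_edition, target_sw, target_hw, other.
        fields = [self.part, self.vendor, self.product, self.version] + ["*"] * 7
        return "cpe:2.3:" + ":".join(fields)


if __name__ == "__main__":
    raw = "HTTP/1.1 200 OK\r\nServer: nginx/1.18.0\r\nContent-Type: text/html\r\n\r\n<html>"
    banner = server_banner(raw)
    print(build_prompt(banner))
    # In a real service the model's answer would supply these three values.
    print(CpeName("nginx", "nginx", "1.18.0").to_cpe23())
```

The model call itself is deliberately absent: the sketch only shows what would go into the prompt and what shape the prediction would take once it comes back.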
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

The features, aspects, and advantages of common platform enumeration (or CPE) prediction using a banner-prompted LLM are understood when the following Detailed Description is read with reference to the accompanying drawings, wherein:

FIGS. 1-3 illustrate some examples of predicting CPE-to-banner matches;
FIGS. 4-9 illustrate some examples of a common platform enumeration (or CPE) prediction service;
FIG. 10 illustrates more examples of CPE prediction using a banner-prompted large language model (or LLM);
FIGS. 11-13, 14A-14B, and 15A-15B illustrate more examples of CPE prediction using the banner-prompted LLM;
FIGS. 16-17 illustrate some examples of vulnerability identification;
FIG. 18 illustrates some examples of banner grabbing;
FIG. 19 illustrates a detailed example of the service architecture;
FIGS. 20-21 illustrate examples of supervised classification;
FIGS. 22-25 illustrate examples of data transformations and feature engineering;
FIG. 26 illustrates more examples of improved computer functioning;
FIG. 27 illustrates examples of cybersecurity notifications;
FIGS. 28-30 illustrate examples of methods or operations that predict common platform enumeration (CPE) products; and
FIG. 31 illustrates a more detailed example of the operating environment.

DETAILED DESCRIPTION

Old and outdated software is especially vulnerable to cybersecurity threats. As we all know, nearly every day there is another cybersecurity hack that steals account passwords, business data, and personal information. Many of these cybersecurity hacks can be traced back to old and outdated software. People and companies simply fail to update their computer software with the latest fixes. Indeed, some companies are still using years-old or even decades-old software that is easily exploited by hackers. Some examples relate to predicting when computers need software updates.
A common platform enumeration (or CPE) prediction service simply, quickly, and elegantly predicts when a computer needs a software update. The CPE prediction service, in particular, identifies computers that are unknowingly connected to the public Internet. These unknown, Internet-facing computers are blind spots to users and to IT administrators. These unknown, Internet-facing computers may thus be riddled with vulnerable software. The CPE prediction service, however, identifies a computer that connects to the public Internet. The CPE prediction service then also predicts one or more software vendors, products, and versions that are installed on the computer. Once the CPE prediction service predicts what software is installed on the computer, the CPE prediction service may then quickly and easily determine whether the software is out of date. The CPE prediction service, for example, may use the predicted software vendor/product/version to look up the known vulnerabilities, patches, and other updates. The CPE prediction service may thus alert consumers and companies that they have an Internet-exposed computer running outdated software that is vulnerable to cybersecurity attacks. The CPE prediction service will now be described more fully hereinafter with