US-20260127292-A1 - Systems and Methods for Accurate Assessment of Application Vulnerabilities

Abstract

Techniques for accurately assessing the vulnerability status of a collection of software applications are disclosed herein. An example computer-implemented method includes computing a metric indicative of time-to-vulnerability-remediation for each software application in the collection of software applications. The method further includes classifying each software application as one of a predefined set of classifications using at least the computed metrics indicative of time-to-vulnerability-remediation. The method also predicts at least one future classification for each software application in the collection of software applications. The method also outputs at least one data object indicative of, for each software application in the collection of software applications, the software application, the classification of the software application, and the predicted classification(s) of the software application.

Inventors

  • Ziqian Huang
  • Daniel Tabor
  • Anne Owen Jackson
  • Jinxia Yao

Assignees

  • OPTUM, INC.

Dates

Publication Date: 2026-05-07
Application Date: 2024-11-07

Claims (20)

  1. A method comprising: computing, by one or more processors, a metric indicative of time-to-vulnerability-remediation for a first software application; classifying, by the one or more processors, vulnerability remediation associated with the first software application as a respective initial classification from a predetermined set of classifications using at least the metric; predicting, by the one or more processors, a first sequence comprising one or more predicted classifications, from the predetermined set of classifications, that the first software application is predicted to be classified as at a set of discrete future times; and generating, by the one or more processors, a data object that identifies the first software application and indicates at least one of the respective initial classification or the first sequence.
  2. The method of claim 1, further comprising: computing, by the one or more processors, an updated metric indicative of time-to-vulnerability-remediation for the first software application; re-classifying, by the one or more processors, vulnerability remediation associated with the first software application as an updated respective classification from the predetermined set of classifications using at least the updated metric; predicting, by the one or more processors, an updated first sequence comprising one or more predicted classifications, from the predetermined set of classifications, that the first software application is predicted to be classified as at a set of discrete future times; and generating, by the one or more processors, an updated data object that identifies the first software application and indicates at least one of the updated respective classification or the updated first sequence.
  3. The method of claim 1, wherein the first software application is an element of a set of software applications.
  4. The method of claim 1, wherein predicting the first sequence comprises: determining frequencies at which one or more software applications have transitioned from one classification before re-classification to a second classification after re-classification during a time window; generating a transition (probability) matrix based on the determined frequencies; determining a Markov chain based on the transition (probability) matrix; and generating the first sequence based on the Markov chain.
  5. The method of claim 1, wherein the first sequence further comprises one or more matrices that are computed using a Markov chain and are indicative of one or more probabilities that the first software application will be classified as one or more classifications from the predetermined set of classifications at a discrete set of times.
  6. The method of claim 5, further comprising: determining to schedule the first software application for uninstallation or decommissioning based at least in part on at least one of: determining the Markov chain has a steady state, determining a frequency with which one or more classifications from the predetermined set of classifications appear in the first sequence, or determining that a first probability of a set of probabilities associated with the steady state meets or exceeds a threshold.
  7. The method of claim 1, wherein computing the metric comprises determining a median time-to-vulnerability-remediation.
  8. The method of claim 7, wherein computing the median time-to-vulnerability-remediation comprises using a survival analysis technique, the survival analysis technique comprising at least one of a Kaplan-Meier estimator or a proportional hazards model.
  9. The method of claim 8, wherein the survival analysis technique is the Kaplan-Meier estimator and the predetermined set of classifications comprises at least one of: a first classification that is indicative of applications where a Kaplan-Meier median can be computed for a set of durations of vulnerabilities associated with the application and detected or received within an enrollment window; a second classification that is indicative of applications where a Kaplan-Meier median cannot be computed for the set of durations and vulnerabilities were detected or received within the enrollment window; a third classification that is indicative of applications where a Kaplan-Meier median cannot be computed for the set of durations, no vulnerabilities were detected or received within the enrollment window, and vulnerabilities were detected or received after the enrollment window closed; a fourth classification that is indicative of applications where a Kaplan-Meier median cannot be computed for the set of durations, no new vulnerabilities were detected or received within the enrollment window, no vulnerabilities were detected or received after the enrollment window closed, and vulnerabilities were detected or received prior to the enrollment window opening; or a fifth classification that is indicative of applications where a Kaplan-Meier median cannot be computed for the set of durations, no vulnerabilities were detected or received prior to the enrollment window opening, no vulnerabilities were detected or received within the enrollment window, and no vulnerabilities were detected or received after the enrollment window closed.
  10. A system comprising: one or more processors; and one or more memories storing processor-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: computing a metric indicative of time-to-vulnerability-remediation for a first software application; classifying vulnerability remediation associated with the first software application as a respective initial classification from a predetermined set of classifications using at least the metric; predicting a first sequence comprising one or more predicted classifications, from the predetermined set of classifications, that the first software application is predicted to be classified as at a set of discrete future times; and generating a data object that identifies the first software application and indicates at least one of the respective initial classification or the first sequence.
  11. The system of claim 10, wherein the processor-executable instructions further cause the one or more processors to perform operations comprising: computing an updated metric indicative of time-to-vulnerability-remediation for the first software application; re-classifying vulnerability remediation associated with the first software application as an updated respective classification from the predetermined set of classifications using at least the updated metric; predicting an updated first sequence comprising one or more predicted classifications, from the predetermined set of classifications, that the first software application is predicted to be classified as at a set of discrete future times; and generating an updated data object that identifies the first software application and is indicative of at least one of the updated respective classification or the updated first sequence.
  12. The system of claim 10, wherein the processor-executable instructions cause the one or more processors to predict the first sequence at least in part by: determining frequencies at which software applications transition from one classification before re-classification to a second classification after re-classification; generating a transition (probability) matrix based on the determined frequencies; determining a Markov chain based on the transition (probability) matrix; and generating the first sequence based on the Markov chain.
  13. The system of claim 12, wherein the processor-executable instructions cause the one or more processors to, when the Markov chain has a steady state, generate the updated data object to be indicative of the steady state.
  14. The system of claim 10, wherein the metric indicative of time-to-vulnerability-remediation comprises a median time-to-vulnerability-remediation.
  15. The system of claim 14, wherein the processor-executable instructions cause the one or more processors to compute the median time-to-vulnerability-remediation using a survival analysis technique, the survival analysis technique comprising at least one of a Kaplan-Meier estimator or a proportional hazards model.
  16. The system of claim 15, wherein the survival analysis technique is the Kaplan-Meier estimator and the predetermined set of classifications comprises at least one of: a first classification that is indicative of applications where a Kaplan-Meier median can be computed for a set of durations of vulnerabilities associated with the application and detected or received within an enrollment window; a second classification that is indicative of applications where a Kaplan-Meier median cannot be computed for the set of durations and vulnerabilities were detected or received within the enrollment window; a third classification that is indicative of applications where the Kaplan-Meier median cannot be computed for the set of durations and no vulnerabilities were detected or received within the enrollment window and vulnerabilities were detected or received after the enrollment window closed; a fourth classification that is indicative of applications where a Kaplan-Meier median cannot be computed for the set of durations, no vulnerabilities were detected or received within the enrollment window, no vulnerabilities were detected or received after the enrollment window closed, and vulnerabilities were detected or received prior to the enrollment window opening; or a fifth classification that is indicative of applications where a Kaplan-Meier median cannot be computed for the set of durations, no vulnerabilities were detected or received prior to the enrollment window opening, no vulnerabilities were detected or received within the enrollment window, and no vulnerabilities were detected or received after the enrollment window closed.
  17. One or more non-transitory computer-readable storage media storing processor-executable instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising: computing a metric indicative of time-to-vulnerability-remediation for a first software application; classifying vulnerability remediation associated with the first software application as a respective initial classification from a predetermined set of classifications using at least the metric; predicting a first sequence comprising one or more predicted classifications, from the predetermined set of classifications, that the first software application is predicted to be classified as at a set of discrete future times; and generating a data object that identifies the first software application and indicates at least one of the respective initial classification or the first sequence.
  18. The one or more non-transitory computer-readable storage media of claim 17, wherein the processor-executable instructions further cause the one or more processors to perform operations comprising: computing an updated metric indicative of time-to-vulnerability-remediation for a first software application; re-classifying vulnerability remediation associated with the first software application as an updated respective classification from the predetermined set of classifications using at least the updated metric; predicting an updated first sequence comprising one or more predicted classifications, from the predetermined set of classifications, that the first software application is predicted to be classified as at a set of discrete future times; and generating an updated data object that identifies the first software application and indicates at least one of the updated respective classification or the updated first sequence.
  19. The one or more non-transitory computer-readable storage media of claim 17, wherein the processor-executable instructions cause the one or more processors to predict the first sequence at least in part by: determining frequencies at which software applications transition from one classification before re-classification to a second classification after re-classification; generating a transition (probability) matrix based on the determined frequencies; determining a Markov chain based on the transition (probability) matrix; and generating the updated first sequence based on the Markov chain.
  20. The one or more non-transitory computer-readable storage media of claim 17, wherein the processor-executable instructions cause the one or more processors to compute the metric indicative of time-to-vulnerability-remediation at least in part by using at least one of a Kaplan-Meier estimator or a proportional hazards model.
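
Claims 4, 12, and 19 recite counting re-classification transitions, building a transition (probability) matrix from those frequencies, and deriving predicted classifications from a Markov chain. Claims 6 and 13 refer to the chain's steady state. The following is a minimal Python sketch of one way those steps could be realized; it is not the applicants' implementation. The function names, the class labels, the choice to report the most probable class at each step, the self-loop used for classes with no observed transitions, and the power-iteration approach to the steady state are all assumptions made for illustration.

```python
from collections import Counter


def transition_matrix(transitions, classes):
    """Row-stochastic matrix built from observed (before, after) class pairs."""
    counts = Counter(transitions)
    matrix = []
    for src in classes:
        row_total = sum(counts[(src, dst)] for dst in classes)
        if row_total == 0:
            # Assumption: a class with no observed transitions keeps its label.
            row = [1.0 if dst == src else 0.0 for dst in classes]
        else:
            row = [counts[(src, dst)] / row_total for dst in classes]
        matrix.append(row)
    return matrix


def step(dist, matrix):
    """Advance a probability distribution over classes by one time step."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def predict_sequence(initial, matrix, classes, horizon):
    """Most probable class, plus the full distribution, at each future step."""
    dist = [1.0 if c == initial else 0.0 for c in classes]
    sequence = []
    for _ in range(horizon):
        dist = step(dist, matrix)
        best = max(range(len(classes)), key=lambda j: dist[j])
        sequence.append((classes[best], dist))
    return sequence


def steady_state(matrix, tol=1e-12, max_iter=10_000):
    """Power iteration from a uniform start; None if it does not converge."""
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = step(dist, matrix)
        if max(abs(a - b) for a, b in zip(dist, nxt)) < tol:
            return nxt
        dist = nxt
    return None


if __name__ == "__main__":
    classes = ["A", "B"]  # hypothetical labels, not taken from the claims
    observed = [("A", "A"), ("A", "B"), ("B", "A"),
                ("B", "B"), ("B", "B"), ("B", "B")]
    P = transition_matrix(observed, classes)
    data_object = {
        "application": "example-app",  # hypothetical application name
        "initial_classification": "A",
        "predicted_sequence": [c for c, _ in predict_sequence("A", P, classes, 3)],
        "steady_state": steady_state(P),
    }
    print(data_object)
```

Under these assumptions, a decision rule such as the threshold test in claim 6 could compare one entry of the `steady_state` result against a configured threshold.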

Description

TECHNICAL FIELD

The present disclosure generally relates to techniques for assessing the vulnerability status of a collection of software applications.

BACKGROUND

Prioritizing maintenance for software applications based on known vulnerabilities is a well-established problem. Classical approaches to prioritization rely on pre-computed metrics. Examples of such approaches include ranking the software applications according to their time-to-vulnerability-remediation or their adherence to service level objectives. Those applications that perform the worst with respect to these metrics are typically considered to be the highest priority candidates for maintenance or additional resources. However, these approaches neglect contextual information about the software applications.

BRIEF DESCRIPTION OF THE DRAWINGS

The Figures described below depict preferred embodiments for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the systems and methods illustrated herein may be employed without departing from the principles of the disclosure described herein.

FIG. 1 depicts an example computing system in which various techniques of the present disclosure can be implemented.

FIG. 2A depicts an example classification and prediction architecture in accordance with various embodiments described herein.

FIG. 2B depicts an example classification and prediction architecture similar to that of FIG. 2A but incorporating re-classifications in accordance with various embodiments described herein.

FIG. 3 depicts an example architecture for predicting a sequence of one or more predicted classifications and the steady state of a Markov chain based on a transition (probability) matrix, in accordance with various embodiments described herein.

FIG. 4 depicts an example configuration of output data objects in accordance with various embodiments described herein.

FIG. 5 depicts a flow diagram representing an example computer-implemented method, in accordance with various embodiments described herein.

DETAILED DESCRIPTION

Broadly, the present disclosure relates to techniques for quantifying and predicting risk associated with software vulnerabilities for a set of software applications. In some embodiments, the resultant data structure includes, for a first software application of the set of software applications, the name of the software application, an initial classification of the software application, and/or a sequence of one or more predicted classifications. A system classifies the first software application based at least in part on a metric indicative of a time-to-vulnerability-remediation of the software application. In some examples, the metric indicative of time-to-vulnerability-remediation may comprise a median time-to-vulnerability-remediation of a set of vulnerabilities associated with the first software application.

In some embodiments, the system generates, as the set of classifications to associate with the first software application, a subset of classifications from a predetermined set of classifications based at least in part on a particular context in which the present techniques are deployed. For example, the system may generate the predetermined set of classifications specifically for developing an automated prioritization scheme or to provide a comprehensive overview of the vulnerability status of the set of software applications to a human reviewer. In some examples, the automated prioritization scheme may be used to determine a timeline for decommissioning a software application, decommission a software application or otherwise prevent or pause its execution if a classification or time-to-vulnerability-remediation satisfies a criterion, transmit a software application or portion thereof to a computing device associated with vulnerability remediation, and/or the like.
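
The median time-to-vulnerability-remediation described above, together with the Kaplan-Meier estimator and the five classifications named in claims 8, 9, 15, and 16, can be illustrated with a short sketch. The code below is an assumption-laden illustration rather than the disclosed implementation. The names `km_median` and `classify`, the input layout (one duration and one remediated flag per vulnerability), the string labels `"first"` through `"fifth"`, and the convention that the median is the earliest time at which the survival estimate is at or below 0.5 are all hypothetical choices. Vulnerabilities still open when observation ends are treated as right-censored.

```python
from fractions import Fraction


def km_median(durations, remediated):
    """Kaplan-Meier median of time-to-remediation, or None if undefined.

    durations[i]  : time vulnerability i stayed open (until fix or end of observation)
    remediated[i] : True if it was fixed, False if still open (right-censored)
    """
    at_risk = len(durations)
    survival = Fraction(1)  # exact arithmetic avoids float ties at 0.5
    for t in sorted(set(durations)):
        events = sum(1 for d, r in zip(durations, remediated) if d == t and r)
        leaving = sum(1 for d in durations if d == t)
        if events:
            survival *= Fraction(at_risk - events, at_risk)
            if survival <= Fraction(1, 2):
                return t
        at_risk -= leaving
    return None  # the survival curve never reached 0.5


def classify(median, in_window, after_window, before_window):
    """Map an application to one of the five classifications in claims 9 and 16.

    median        : result of km_median for vulnerabilities in the enrollment window
    in_window     : vulnerabilities were detected within the enrollment window
    after_window  : vulnerabilities were detected after the window closed
    before_window : vulnerabilities were detected before the window opened
    """
    if median is not None:
        return "first"
    if in_window:
        return "second"
    if after_window:
        return "third"
    if before_window:
        return "fourth"
    return "fifth"


if __name__ == "__main__":
    # Hypothetical durations in days; the third vulnerability is still open.
    median = km_median([12, 30, 45, 60], [True, True, False, True])
    print(median, classify(median, True, False, False))
```

When too many vulnerabilities remain open, the survival estimate never falls to one half and `km_median` returns `None`, which is the situation the second through fifth classifications distinguish.
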
Alternatively or additionally, the automated prioritization scheme may rank a set of software applications according to their need for additional maintenance and assign resources accordingly. The system may additionally or alternatively predict a sequence of (one or more) future classifications for a first software application. The sequence of classifications predicted for a given software application generally indicates how the metrics associated with the time-to-vulnerability-remediation of the software application are predicted to change over time and may be associated with respective likelihoods (i.e., posterior probabilities) of future occurrence.

In some instances, metrics indicative of a software application's time-to-vulnerability-remediation may be misleading with respect to application prioritization when an organization considers those metrics in isolation. For example, a newly deployed, mission-critical software application may have many vulnerabilities but, because the application is newly deployed, the application may have an undefined or very low mean-time-to-v