US-12619724-B2 - Scanning of partial downloads

Abstract

By way of example, a method includes, responsive to a user request to download, from the internet, a downloadable file with executable content, downloading a portion of the downloadable file, wherein the downloadable file is not executable with the portion; after downloading the portion of the downloadable file, scanning the portion of the downloadable file for malware characteristics to classify the downloadable file; and completing downloading the downloadable file only after determining, based on the scanning of the portion of the downloadable file, that the downloadable file is not malware.
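
As an illustration only, the control flow summarized in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names `gated_download`, `fetch_range`, `scan_partial`, and `portion_percent` are hypothetical, the fetcher and scanner are supplied by the caller, and the default of 80 percent is an arbitrary value chosen for the example.

```python
from typing import Callable, Optional


def gated_download(
    total_size: int,
    fetch_range: Callable[[int, int], bytes],
    scan_partial: Callable[[bytes], bool],
    portion_percent: int = 80,
) -> Optional[bytes]:
    """Fetch a leading portion, scan it, and finish only if the scan is clean.

    fetch_range(start, end) is a hypothetical callable returning bytes
    [start, end) of the file; scan_partial(portion) is a hypothetical
    classifier returning True when the portion looks like malware.
    Returns the full file bytes, or None if the download was terminated.
    """
    if not 0 < portion_percent < 100:
        raise ValueError("portion_percent must be strictly between 0 and 100")

    # Integer arithmetic avoids floating-point rounding in the cutoff.
    cutoff = total_size * portion_percent // 100

    # Step 1: download only the leading portion; the file is incomplete here.
    head = fetch_range(0, cutoff)

    # Step 2: classify the file based on the partial content alone.
    if scan_partial(head):
        # Step 3a: flagged as malware, so the remainder is never requested.
        return None

    # Step 3b: scan was clean, so the download is completed.
    tail = fetch_range(cutoff, total_size)
    return head + tail
```

In this sketch the remainder of the file is requested only after the scan returns a clean result, so a flagged file never exists in complete form on the endpoint.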

Inventors

  • Abhishek Tripathi
  • Mayur Arvind Bhole
  • Nithya Nadig Shikarpur
  • Tirumaleswar Reddy Konda
  • Mayank Bhatnagar

Assignees

  • MCAFEE, LLC

Dates

Publication Date
20260505
Application Date
20231219

Claims (20)

  1. A computer-implemented method, comprising: responsive to a user request to download, from the internet, a downloadable file with executable content, downloading a portion of the downloadable file, wherein the downloadable file is not executable with the portion; after downloading the portion of the downloadable file, scanning the portion of the downloadable file for malware characteristics to classify the downloadable file; and completing downloading the downloadable file only after determining, based on the scanning of the portion of the downloadable file, that the downloadable file is not malware.
  2. The method of claim 1, wherein the portion of the downloadable file is greater than 70% of the downloadable file and less than 95% of the downloadable file.
  3. The method of claim 1, wherein scanning the portion of the downloadable file comprises using a machine learning (ML) model to analyze the portion of the downloadable file.
  4. The method of claim 3, wherein the ML model comprises a computer vision model.
  5. The method of claim 3, further comprising converting the portion of the downloadable file to a binary image format for computer vision analysis.
  6. The method of claim 3, further comprising training the ML model on partial file images.
  7. The method of claim 3, wherein the ML model is a computer vision model, and further comprising training the computer vision model on both partial file images and full file images.
  8. The method of claim 3, wherein the ML model is a convolutional neural network (CNN).
  9. The method of claim 3, further comprising pre-training the ML model and providing the ML model as a pre-trained model to an endpoint computing device.
  10. The method of claim 1, further comprising classifying the portion of the downloadable file as malware, and terminating the request to download without completing download of the downloadable file.
  11. One or more tangible, nontransitory computer-readable storage media having stored thereon executable instructions to instruct a processor to: responsive to a user request to download, from the internet, a downloadable file with executable content, download a portion of the downloadable file, wherein the downloadable file is not executable with the portion; after downloading the portion of the downloadable file, scan the portion of the downloadable file for malware characteristics to classify the downloadable file; and complete downloading the downloadable file only after determining, based on the scanning of the portion of the downloadable file, that the downloadable file is not malware.
  12. The one or more tangible, nontransitory computer-readable storage media of claim 11, wherein the portion of the downloadable file is greater than 70% of the downloadable file and less than 95% of the downloadable file.
  13. The one or more tangible, nontransitory computer-readable storage media of claim 11, wherein scanning the portion of the downloadable file comprises using a machine learning (ML) model to analyze the portion of the downloadable file.
  14. The one or more tangible, nontransitory computer-readable storage media of claim 13, wherein the instructions are further to provide the ML model as a pre-trained model to an endpoint computing device.
  15. The one or more tangible, nontransitory computer-readable storage media of claim 11, wherein the instructions are further to classify the portion of the downloadable file as malware, and terminate the request to download without completing download of the downloadable file.
  16. The one or more tangible, nontransitory computer-readable storage media of claim 11, wherein the instructions are further to provide a browser extension.
  17. An endpoint computing apparatus, comprising: a hardware platform comprising a processor circuit and a memory; a web browser; and instruction encoded within the memory to instruct the processor circuit to: responsive to a user request to download, within the web browser, from the internet, a downloadable file with executable content, download a portion of the downloadable file, wherein the downloadable file is not executable with the portion; after downloading the portion of the downloadable file, scan the portion of the downloadable file for malware characteristics to classify the downloadable file; and complete downloading the downloadable file only after determining, based on the scanning of the portion of the downloadable file, that the downloadable file is not malware.
  18. The endpoint computing apparatus of claim 17, wherein the portion of the downloadable file is greater than 70% of the downloadable file and less than 95% of the downloadable file.
  19. The endpoint computing apparatus of claim 17, wherein scanning the portion of the downloadable file comprises using a machine learning (ML) model to analyze the portion of the downloadable file.
  20. The endpoint computing apparatus of claim 17, wherein the instructions are further to provide an extension for the web browser.
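
Claims 2 and 5 recite a portion size between 70% and 95% of the file and a conversion of that portion to an image for computer vision analysis. The following is a minimal sketch of what such preprocessing could look like, not the method the claims require. The one-byte-per-pixel grayscale mapping, the fixed row width of 256, the zero padding of the last row, and the function names `portion_in_claimed_range` and `bytes_to_grayscale_rows` are all assumptions made for illustration.

```python
from typing import List


def portion_in_claimed_range(portion_len: int, total_len: int) -> bool:
    """Return True if the portion is more than 70% and less than 95% of the file.

    Integer cross-multiplication is used so the bounds are exact.
    """
    if total_len <= 0:
        raise ValueError("total_len must be positive")
    return 70 * total_len < 100 * portion_len < 95 * total_len


def bytes_to_grayscale_rows(portion: bytes, width: int = 256) -> List[List[int]]:
    """Lay the bytes out as rows of grayscale pixel values (0 to 255).

    Assumption for this sketch: each byte becomes one pixel, rows have a
    fixed width, and the final row is padded with zeros.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows: List[List[int]] = []
    for start in range(0, len(portion), width):
        row = list(portion[start:start + width])  # bytes iterate as ints
        row.extend([0] * (width - len(row)))      # pad the last, short row
        rows.append(row)
    return rows
```

The resulting rows could be handed to whatever image library or model input pipeline an implementation uses; that step is left out here because the claims do not name one.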

Description

CROSS REFERENCE TO RELATED APPLICATION

This application is a continuation of U.S. application Ser. No. 17/168,934, titled “SCANNING OF PARTIAL DOWNLOADS,” filed Feb. 5, 2021, the entire contents of which are hereby incorporated by reference in their entirety.

FIELD OF THE SPECIFICATION

This application relates in general to computer security, and more particularly, though not exclusively, to providing a system and method for scanning of partial downloads.

BACKGROUND

Modern computing ecosystems often include “always on” broadband internet connections. These connections leave computing devices exposed to the internet, and the devices may be vulnerable to attack.

BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure is best understood from the following detailed description when read with the accompanying FIGURES. It is emphasized that, in accordance with the standard practice in the industry, various features are not necessarily drawn to scale, and are used for illustration purposes only. Where a scale is shown, explicitly or implicitly, it provides only one illustrative example. In other embodiments, the dimensions of the various features may be arbitrarily increased or reduced for clarity of discussion. Furthermore, the various block diagrams illustrated herein disclose only one illustrative arrangement of logical elements. Those elements may be rearranged in different configurations, and elements shown in one block may, in appropriate circumstances, be moved to a different block or configuration.

FIG. 1 is a block diagram of selected elements of a security ecosystem.
FIG. 2 is a block diagram of a sample analysis system.
FIG. 3 is a block diagram of a client device.
FIG. 4 is a block diagram of a server.
FIG. 5 is a block diagram illustrating a deep learning model architecture.
FIG. 6 is a flowchart of a method of performing analysis on a download.
FIG. 7 is a flowchart of an additional method.
FIG. 8 is a block diagram of selected elements of a hardware platform.
FIG. 9 is a block diagram of selected elements of a system-on-a-chip (SoC).
FIG. 10 is a block diagram of selected elements of a processor.
FIG. 11 is a block diagram of selected elements of a network function virtualization (NFV) infrastructure.
FIG. 12 is a block diagram of selected elements of a containerization infrastructure.
FIG. 13 illustrates machine learning according to a “textbook” problem with real-world applications.
FIG. 14 is a flowchart of a method that may be used to train a neural network.
FIG. 15 is a flowchart of a method of using a neural network to classify an object.
FIG. 16 is a block diagram illustrating selected elements of an analyzer engine.

SUMMARY

By way of example, a method includes, responsive to a user request to download, from the internet, a downloadable file with executable content, downloading a portion of the downloadable file, wherein the downloadable file is not executable with the portion; after downloading the portion of the downloadable file, scanning the portion of the downloadable file for malware characteristics to classify the downloadable file; and completing downloading the downloadable file only after determining, based on the scanning of the portion of the downloadable file, that the downloadable file is not malware.

EMBODIMENTS OF THE DISCLOSURE

The following disclosure provides many different embodiments, or examples, for implementing different features of the present disclosure. Specific examples of components and arrangements are described below to simplify the present disclosure. These are, of course, merely examples and are not intended to be limiting. Further, the present disclosure may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. Different embodiments may have different advantages, and no particular advantage is necessarily required of any embodiment.

Machine learning (ML) has proved in recent years to be a powerful tool for malware detection. ML-based malware detection may in some cases use a combination of features based on, for example, static analysis (such as checksum/hash comparisons, signature matching with earlier fingerprints, and analyzing code and/or code flows) and dynamic analysis (e.g., executing the sample in a safe, virtualized environment that resembles the expected vulnerable conditions). This can be used to try to understand the behavioral aspects of a potentially malicious sample. Typically, these approaches require a complete download of the file to disassemble and/or execute before analysis. Furthermore, it is sometimes cumbersome to create features based on static and dynamic analysis of the file.

However, recent research has explored detecting malware files based on an image generated from the file binary. In these cases, the file is converted to a binary, such as a g