US-12625948-B2 - Using machine learning to detect QRLjacking to prevent multichannel phishing on applications or IOT devices

Abstract

Aspects of the disclosure relate to using multiple machine learning models and a sand-box environment to detect malicious uniform resource locators (URLs) and login pages stored in a QR code. An application on a computing device will augment QR code data stored in a QR code and the computing device sends the QR code data and computing device metadata to a deep learning computing platform. The QR code data may comprise a URL and communication protocol information. The deep learning computing platform will generate, by multiple machine learning models and a sand-box environment, a sand-box score using the URL, a user-based score using the computing device metadata, and a connections score using the communication protocol information. The deep learning platform then determines whether the computing device should reject and delete the QR code data based on whether the scores are below a pre-determined threshold.

Inventors

Elvis Nyamwange
Erik Dahl
Brian Jacobson
Pratap Dande
Hari Vuppala
Rahul Yaksh
Rahul Phadnis
Amer Ali
Sailesh Vezzu

Assignees

BANK OF AMERICA CORPORATION

Dates

Publication Date: 2026-05-12
Application Date: 2023-07-13

Claims (20)

1 . A system comprising: a deep learning computing platform for detecting malicious URLs associated with a quick response (QR) code, comprising: at least one processor; and memory storing computer-readable instructions that, when executed by the at least one processor, cause the deep learning computing platform to: receive, by a sand-box controlled module comprising a sand-box environment and a website prediction model, and based on a scan of a QR code, QR code data from a computing device associated with an application, wherein the QR code data comprises a uniform resource locator (URL) associated with a login page and communication protocol information, wherein the application is configured to generate and send, based on the scan of the QR code, a restriction flag to a system root of the computing device that causes the computing device to restrict access while determining whether the URL is malicious; receive, from the computing device, computing device metadata, wherein the computing device metadata comprises a location of the computing device, a timestamp, and a device IP address; generate a website score, by the website prediction model, using the URL; execute the URL in the sand-box environment and generate a sand-box score based on a comparison of the website score to a threshold and whether the executed URL matches a known rejected URL stored in a sand-box training database, wherein the sand-box score is determined to be: a first value when the website score is below the threshold and the executed URL does not match a known rejected URL stored in the sand-box training database or the website score is above the threshold and the URL matches a known rejected URL stored in the sand-box training database; and a second value when the website score is above the threshold and the URL does not match a known rejected URL stored in the sand-box training database; send the sand-box score, by the sand-box controlled module, to an approval unit; send the communication protocol information to a connections prediction model; generate a connections score, by the connections prediction model, using the communication protocol information; send the connections score to the approval unit; send the computing device metadata to a user-based prediction model; generate a user-based score, by the user-based prediction model, using the computing device metadata; send the user-based score to the approval unit; determine, by the approval unit, that at least one of the sand-box score, connections score, or user-based score, or combination thereof, is below a pre-determined threshold; generate, by the approval unit, a rejected flag; and send the rejected flag to the computing device, wherein the rejected flag causes the computing device to delete the QR code data before causing the restriction flag to be removed so that the access to the computing device is no longer restricted.
2 . The system of claim 1 , wherein the memory of the deep learning computing platform stores additional computer-readable instructions that, when executed by the at least one processor, cause the deep learning computing platform to: train and develop the website prediction model based on the sand-box training database, wherein the sand-box training database stores a plurality of known accepted and rejected URLs; train and develop the user-based prediction model based on a user-based training database, wherein the user-based training database stores a plurality of known computing device metadata; and train and develop the connections prediction model based on a connections database, wherein the connections database stores a plurality of known communication protocol information.
3 . The system of claim 1 , wherein the memory of the deep learning computing platform stores additional computer-readable instructions that, when executed by the at least one processor, cause the deep learning computing platform to: send the communication protocol information to a connections training database; update and add the communication protocol information to the connections training database; train the connections prediction model based on the connections training database, wherein the connections prediction model is re-developed; send the computing device metadata to a user-based training database; update and add the computing device metadata to the user-based training database; train the user-based prediction model based on the user-based training database, wherein the user-based prediction model is re-developed; send the URL to a sand-box training database; update and add the URL to the sand-box training database; and train the website prediction model based on the sand-box training database, wherein the website prediction model is re-developed.
4 . The system of claim 1 , wherein the deep learning computing platform further comprises a camera, wherein the camera is configured to scan the QR code, wherein the application is configured to process the QR code to augment the QR code data, and wherein the memory comprises computer-readable instructions that, when executed by the at least one processor, cause the deep learning computing platform to: receive, from the computing device, the augmented QR code data.
5 . The system of claim 1 , wherein the memory of the deep learning computing platform stores additional computer-readable instructions that, when executed by the at least one processor, cause the deep learning computing platform to: if the sand-box controlled module is not able to generate a sand-box score, then add the QR code data and computing device metadata to a review queue for manual review.
6 . The system of claim 1 , wherein the communication protocol information comprises an IP packet header that comprises at least a source IP address and a destination IP address.
7 . The system of claim 1 , wherein the communication protocol information comprises secure sockets layer (SSL) or transport layer security (TLS) certificates.
8 . The system of claim 1 , wherein the rejected flag causes the computing device to generate a message on a graphical user interface (GUI) of the computing device that indicates the QR code data is rejected and potentially malicious.
9 . The system of claim 1 , wherein the restriction flag causes the computing device to restrict access to any code or installation that may run on the computing device.
10 . The system of claim 1 , wherein the computing device is a smart phone, tablet, smart watch, or mobile device.
11 . A method for detecting malevolent QR codes used to phish personal information from a user of a computing device, comprising: at a deep learning computing platform comprising at least one processor and memory: receiving, by a sand-box controlled module comprising a sand-box environment and a website prediction model, and based on a scan of a QR code, QR code data from the computing device associated with an application, wherein the QR code data comprises a uniform resource locator (URL) associated with a login page and communication protocol information; receiving, from the computing device, computing device metadata; generating and sending, by the application, a restriction flag to a system root of the computing device that causes the computing device to restrict access while determining whether the URL is malicious; generating a website score, by the website prediction model, using the URL; executing the URL in the sand-box environment; generating a sand-box score based on a comparison of the website score to a threshold and whether the executed URL matches with known rejected URLs in a sand-box training database, wherein the sand-box score is determined to be: a first value when the website score is below the threshold and the executed URL does not match a known rejected URL stored in the sand-box training database or the website score is above the threshold and the URL matches a known rejected URL stored in the sand-box training database; and a second value when the website score is above the threshold and the URL does not match a known rejected URL stored in the sand-box training database; sending the sand-box score, by the sand-box controlled module, to an approval unit; sending the communication protocol information to a connections prediction model; generating a connections score, by the connections prediction model, using the communication protocol information; sending the connections score to the approval unit; sending the computing device metadata to a user-based prediction model; generating a user-based score, by the user-based prediction model, using the computing device metadata; sending the user-based score to the approval unit; determining, by the approval unit, that at least one of the sand-box score, connections score, or user-based score, or combination thereof, is below a pre-determined threshold; generating, by the approval unit, a rejected flag; and sending the rejected flag to the computing device, wherein the rejected flag causes the computing device to delete the QR code data before causing the restriction flag to be removed so that the access to the computing device is no longer restricted.
12 . The method of claim 11 , further comprising: training and developing the website prediction model based on the sand-box training database, wherein the sand-box training database stores a plurality of known accepted and rejected URLs; training and developing the user-based prediction model based on a user-based training database, wherein the user-based training database comprises a plurality of known computing device metadata; and training and developing the connections prediction model based on a connections database, wherein the connections database stores a plurality of known communication protocol information.
13 . The method of claim 11 , further comprising: sending the communication protocol information to a connections training database; updating and adding the communication protocol information to the connections training database; training the connections prediction model based on the connections training database, wherein the connections prediction model is re-developed; sending the computing device metadata to a user-based training database; updating and adding the computing device metadata to the user-based training database; training the user-based prediction model based on the user-based training database, wherein the user-based prediction model is re-developed; sending the URL to a sand-box training database; updating and adding the URL to the sand-box training database; and training the website prediction model based on the sand-box training database, wherein the website prediction model is re-developed.
14 . The method of claim 11 , wherein the computing device further comprises a camera, wherein the camera is configured to scan the QR code, wherein the application is configured to process the QR code to augment the QR code data, wherein the QR code data is sent to the deep learning computing platform, and wherein the deep learning computing platform receives the augmented QR code data.
15 . The method of claim 11 , wherein the restriction flag causes the computing device to restrict access to any code or installation that may run on the computing device.
16 . The method of claim 11 , wherein the rejected flag causes the computing device to generate a message on a graphical user interface (GUI) of the computing device that indicates the QR code data is rejected and potentially malicious.
17 . The method of claim 11 , wherein the communication protocol information comprises an IP packet header that comprises at least a source IP address and a destination IP address.
18 . The method of claim 11 , wherein the communication protocol information comprises secure sockets layer (SSL) or transport layer security (TLS) certificates.
19 . One or more non-transitory computer-readable media storing instructions that, when executed by a deep learning computing platform comprising at least one processor, and memory, cause the deep learning computing platform to: receive, by a sand-box controlled module comprising a sand-box environment and a website prediction model, and based on a scan of a QR code, QR code data from a computing device associated with an application, wherein the QR code data comprises a uniform resource locator (URL) associated with a login page and communication protocol information; generate and send, by the application, a restriction flag to a system root of the computing device that causes the computing device to restrict access while determining whether the URL is malicious; generate a website score, by the website prediction model, using the URL; execute the URL in the sand-box environment and generate a sand-box score based on a comparison of the website score to a threshold and whether the executed URL matches with known rejected URLs in a sand-box training database, wherein the sand-box score is determined to be: a first value when the website score is below the threshold and the executed URL does not match a known rejected URL stored in the sand-box training database or the website score is above the threshold and the URL matches a known rejected URL stored in the sand-box training database; and a second value when the website score is above the threshold and the URL does not match a known rejected URL stored in the sand-box training database; send the sand-box score, by the sand-box controlled module, to an approval unit; send the communication protocol information to a connections prediction model; generate a connections score, by the connections prediction model, using the communication protocol information; send the connections score to the approval unit; determine, by the approval unit, that at least one of the sand-box score, connections score, or combination thereof, is below a pre-determined threshold; generate, by the approval unit, a rejected flag; and send the rejected flag to the computing device, wherein the rejected flag causes the computing device to delete the QR code data before causing the restriction flag to be removed so that the access to the computing device is no longer restricted.
20 . The one or more non-transitory computer-readable media of claim 19 , wherein the rejected flag causes the computing device to generate a message on a graphical user interface (GUI) of the computing device that indicates the QR code data is rejected and potentially malicious.
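
The sand-box score rule recited in claims 1, 11, and 19 can be read as a small decision function. The Python sketch below is illustrative only and is not part of the claims. The names `SandboxScore` and `sandbox_score` are hypothetical, as are the numeric inputs. The claims do not say which value applies when the website score is below the threshold and the URL matches a known rejected URL, or when the website score equals the threshold; returning `None` for those cases (for example, so a caller could route the item to the manual review queue of claim 5) is an assumption of this sketch.

```python
from enum import Enum
from typing import Optional


class SandboxScore(Enum):
    """The two outcomes named in the claims; the labels are placeholders."""
    FIRST_VALUE = "first"
    SECOND_VALUE = "second"


def sandbox_score(
    website_score: float,
    threshold: float,
    matches_known_rejected: bool,
) -> Optional[SandboxScore]:
    """Map the website score and the database match to a sand-box score.

    Returns None for combinations the claims leave unspecified
    (an assumption of this sketch, not something the claims state).
    """
    below = website_score < threshold
    above = website_score > threshold

    # First value: (below and no match) or (above and match).
    if (below and not matches_known_rejected) or (above and matches_known_rejected):
        return SandboxScore.FIRST_VALUE

    # Second value: above the threshold and no match.
    if above and not matches_known_rejected:
        return SandboxScore.SECOND_VALUE

    # Below-and-match, or a score exactly at the threshold: not specified.
    return None
```

A caller could, for instance, treat a `None` result as "no sand-box score generated" and queue the QR code data and computing device metadata for manual review.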

Description

BACKGROUND

Quick response (QR) codes are machine-readable, two-dimensional barcodes used to store data. QR codes may store a variety of data, including transaction information, authentication information, simple text messages, contact information, and uniform resource locators (URLs). Many mobile devices come with a secure, built-in QR scanner in their camera application, while others rely on third-party QR scanners. Attackers can easily embed a malicious URL containing custom malware into a QR code that can exfiltrate data from a mobile device when scanned. The ability to alter a QR code, especially a dynamic QR code, so that it points to an alternative resource without being detected may be highly effective for malevolent users.

SUMMARY

The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosure. The summary is not an extensive overview of the disclosure. It is neither intended to identify key or critical elements of the disclosure nor to delineate the scope of the disclosure. The following summary merely presents some concepts of the disclosure in a simplified form as a prelude to the description below.

Aspects of this disclosure provide effective, efficient, scalable, and convenient technical solutions that address various issues in the prior art with quick response (QR) code login jacking (QRLjacking) or malicious QR codes used to phish user login information and/or other personal data. Multiple machine learning models and a sand-box environment are used by a deep learning computing platform to detect whether the URL or login page associated with a QR code is potentially malicious and/or attempting to access user login information or personal data on a computing device.

In accordance with one or more embodiments, a system or method comprising a computing device and a deep learning computing platform is disclosed.
The deep learning computing platform, with at least one processor and memory, may receive, by a sand-box controlled module comprising a sand-box environment and a website prediction model, QR code data from a computing device associated with an application. The QR code data may comprise a uniform resource locator (URL) and communication protocol information. The application is configured to generate and send a restriction flag to a system root of the computing device that causes the computing device to restrict access when a QR code is scanned. The deep learning computing platform may then receive computing device metadata from the computing device. The computing device metadata may comprise a location of the computing device, a timestamp, and a device IP address. The website prediction model may generate a website score using the URL. The URL is executed in the sand-box environment, which generates a sand-box score based on the website score and whether the executed URL matches a known rejected URL stored in a sand-box training database. The sand-box controlled module sends the sand-box score to an approval unit. The communication protocol information is then sent to a connections prediction model. The connections prediction model generates a connections score using the communication protocol information, and the connections score is sent to the approval unit. The computing device metadata is then sent to a user-based prediction model. The user-based prediction model generates a user-based score using the computing device metadata and sends the user-based score to the approval unit. The approval unit determines that at least one of the sand-box score, connections score, or user-based score, or combination thereof, is below a pre-determined threshold, and the approval unit generates a rejected flag. The deep learning computing platform sends the rejected flag to the computing device.
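
The approval step and the device-side handling of the flags can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the disclosed implementation: `Scores`, `approval_unit`, and `Device` are hypothetical names, all three scores are modeled as numbers compared against one shared threshold, and the "combination thereof" language is simplified to "any single score below the threshold". The ordering in `on_rejected_flag` follows the claims, which recite that the QR code data is deleted before the restriction flag is removed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Scores:
    """The three scores sent to the approval unit (numeric form is assumed)."""
    sandbox: float
    connections: float
    user_based: float


def approval_unit(scores: Scores, threshold: float) -> bool:
    """Return True when a rejected flag should be generated.

    Simplification: reject when any one score is below the threshold.
    """
    lowest = min(scores.sandbox, scores.connections, scores.user_based)
    return lowest < threshold


@dataclass
class Device:
    """Hypothetical stand-in for the computing device and its application."""
    qr_data: Optional[str] = None
    restricted: bool = False
    events: List[str] = field(default_factory=list)

    def on_scan(self, qr_data: str) -> None:
        # The restriction flag is raised when the QR code is scanned.
        self.qr_data = qr_data
        self.restricted = True
        self.events.append("restrict")

    def on_rejected_flag(self) -> None:
        # Delete the QR code data first, then lift the restriction.
        self.qr_data = None
        self.events.append("delete_qr_data")
        self.restricted = False
        self.events.append("unrestrict")


# Usage with made-up values: the connections score falls below the threshold.
device = Device()
device.on_scan("https://example.test/login")
if approval_unit(Scores(sandbox=0.9, connections=0.3, user_based=0.8), threshold=0.5):
    device.on_rejected_flag()
```

Recording the events in a list makes the required ordering easy to check: the deletion entry always precedes the entry that lifts the restriction.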
The rejected flag causes the computing device to delete the QR code data before causing the restriction flag to be removed so that the access to the computing device is no longer restricted.

In some embodiments, the deep learning computing platform trains and develops the website prediction model based on a sand-box training database. The sand-box training database may store a plurality of known accepted and rejected URLs. The deep learning computing platform also trains the user-based prediction model based on a user-based training database. The user-based training database may store a plurality of known computing device metadata. The deep learning computing platform may also train the connections prediction model based on a connections training database. The connections training database may store a plurality of known communication protocol information.

In some embodiments, the communication protocol information is sent to a connections training database. The deep learning computing platform updates and adds the communication protocol information to the connections training database. The connections prediction model is re-developed by training the con