US-12625956-B2 - Performing automated digital investigations of phishing attempts
Abstract
In some implementations, a method is provided for performing automated digital investigations of phishing attempts. Source identification data is received for a potential source of phishing attempts, and is stored at a data repository. According to a monitoring frequency for the potential source of phishing attempts, monitoring operations are periodically performed, including using the source identification data to retrieve content from the potential source, storing the retrieved content with the source identification data, and executing rules on the source identification data and the retrieved content. Based on a result of executing the rules, it can be determined whether the potential source of phishing attempts is an actual source of phishing attempts, the monitoring frequency can be adjusted, and mitigating actions can optionally be performed.
Inventors
- Eric Robert Brandel
- Daniel Christopher Jay Flettre
Assignees
- TARGET BRANDS, INC.
Dates
- Publication Date: 2026-05-12
- Application Date: 2024-01-05
Claims (19)
- 1 . A computer-implemented method for performing automated digital investigations of phishing attempts, the method comprising: inserting a tracking component into a sensitive web page of a subject system; receiving source identification data that identifies a potential source of phishing attempts, wherein at least a portion of the received source identification data includes an address of a copy of the sensitive web page of the subject system that has been received through the tracking component, wherein the copy of the web page is being hosted by a web server other than that of the subject system; storing the source identification data at a data repository of discovered potential sources of phishing attempts, wherein the source identification data is stored at the data repository with data indicating that a tracked page discovery technique has been used to discover the potential source of phishing attempts; determining that another, different discovery technique has also been used to discover the potential source of phishing attempts; in response to determining that the tracked page discovery technique and the another, different discovery technique have both been used to independently discover the potential source of phishing attempts at different times, increasing a monitoring frequency for the potential source of phishing attempts; and according to the monitoring frequency for the potential source of phishing attempts, periodically performing monitoring operations on the potential source of phishing attempts, the monitoring operations comprising: using the source identification data to retrieve content from the potential source of phishing attempts; storing the retrieved content of the potential source of phishing attempts with the source identification data that identifies the potential source of phishing attempts; executing a set of predefined rules on the source identification data and the retrieved content of the potential source of phishing attempts, 
wherein at least one predefined rule in the set of predefined rules includes a graphical comparison between a subject image included on the sensitive web page of the subject system and a retrieved image included in the retrieved content of the potential source of phishing attempts; and based on a result of executing the set of predefined rules, (i) determining whether the potential source of phishing attempts is an actual source of phishing attempts, and (ii) adjusting the monitoring frequency for the potential source of phishing attempts such that subsequent performances of the monitoring operations occur at a frequency that is different from a current frequency for performing the monitoring operations.
- 2 . The computer-implemented method of claim 1 , further comprising: providing a domain search for a domain registrar, wherein the domain search includes one or more search terms that relate to the subject system; and wherein the received source identification data includes a domain name of the potential source of the phishing attempts, and wherein the domain name has been provided by the domain registrar in response to the domain search.
- 3 . The computer-implemented method of claim 1 , further comprising: providing a search query for a search engine, wherein the search query includes one or more search terms that relate to the subject system; and wherein the received source identification data includes a hyperlink to a landing page of the potential source of phishing attempts, and wherein the hyperlink to the landing page has been provided by the search engine in response to the search query.
- 4 . The computer-implemented method of claim 1 , further comprising: providing an identifier of a platform page of a content platform; and wherein the received source identification data includes (i) an image of a content item being presented by the platform page and (ii) a hyperlink to a landing page of the potential source of phishing attempts that is associated with the content item, and wherein the hyperlink to the landing page and the image of the content item being presented by the platform page have been located using the identifier of the platform page.
- 5 . The computer-implemented method of claim 1 , wherein each rule of the set of predefined rules is associated with a corresponding severity level, and wherein determining whether the potential source of phishing attempts is an actual source of phishing attempts includes determining whether at least one rule that matches the potential source of phishing attempts has a critical severity level.
- 6 . The computer-implemented method of claim 1 , further comprising: in response to determining that the potential source of phishing attempts is an actual source of phishing attempts, (i) generating and transmitting an alert that identifies the source of phishing attempts, and (ii) increasing the monitoring frequency for the source of phishing attempts.
- 7 . The computer-implemented method of claim 6 , wherein the alert is transmitted to a communication channel that had previously been specified through a rule generation interface that had been used to create a rule of the set of predefined rules that matches the source of phishing attempts.
- 8 . The computer-implemented method of claim 6 , wherein the alert is transmitted to a phishing mitigation system that is configured to perform a mitigating action to handle the source of phishing attempts.
- 9 . The computer-implemented method of claim 1 , further comprising: in response to determining that the potential source of phishing attempts is not an actual source of phishing attempts, decreasing the monitoring frequency for the source of phishing attempts.
- 10 . The computer-implemented method of claim 1 , further comprising: executing at least one discovery operation rule on the received source identification data that identifies the potential source of phishing attempts; and wherein the source identification data is stored at the data repository of discovered potential sources of phishing attempts, and the monitoring operations are performed on the potential source of phishing attempts, in response to the source identification data matching the at least one discovery operation rule.
- 11 . The computer-implemented method of claim 1 , further comprising: presenting an interface for specifying a rule to be included in the set of predefined rules, wherein the interface includes a rule definition control for defining computer code that is to be executed against the potential source of phishing attempts.
- 12 . A computer system comprising: one or more data processing apparatuses including one or more processors, memory, and storage devices storing instructions that, when executed, cause the one or more processors to perform operations comprising: inserting a tracking component into a sensitive web page of a subject system; receiving source identification data that identifies a potential source of phishing attempts, wherein at least a portion of the received source identification data includes an address of a copy of the sensitive web page of the subject system that has been received through the tracking component, wherein the copy of the web page is being hosted by a web server other than that of the subject system; storing the source identification data at a data repository of discovered potential sources of phishing attempts, wherein the source identification data is stored at the data repository with data indicating that a tracked page discovery technique has been used to discover the potential source of phishing attempts; determining that another, different discovery technique has also been used to discover the potential source of phishing attempts; in response to determining that the tracked page discovery technique and the another, different discovery technique have both been used to independently discover the potential source of phishing attempts at different times, increasing a monitoring frequency for the potential source of phishing attempts; and according to the monitoring frequency for the potential source of phishing attempts, periodically performing monitoring operations on the potential source of phishing attempts, the monitoring operations comprising: using the source identification data to retrieve content from the potential source of phishing attempts; storing the retrieved content of the potential source of phishing attempts with the source identification data that identifies the potential source of phishing attempts; executing a set of 
predefined rules on the source identification data and the retrieved content of the potential source of phishing attempts, wherein at least one predefined rule in the set of predefined rules includes a graphical comparison between a subject image included on the sensitive web page of the subject system and a retrieved image included in the retrieved content of the potential source of phishing attempts; and based on a result of executing the set of predefined rules, (i) determining whether the potential source of phishing attempts is an actual source of phishing attempts, and (ii) adjusting the monitoring frequency for the potential source of phishing attempts such that subsequent performances of the monitoring operations occur at a frequency that is different from a current frequency for performing the monitoring operations.
- 13 . The computer system of claim 12 , the operations further comprising: providing a domain search for a domain registrar, wherein the domain search includes one or more search terms that relate to the subject system; and wherein the received source identification data includes a domain name of the potential source of the phishing attempts, and wherein the domain name has been provided by the domain registrar in response to the domain search.
- 14 . The computer system of claim 12 , the operations further comprising: providing a search query for a search engine, wherein the search query includes one or more search terms that relate to the subject system; and wherein the received source identification data includes a hyperlink to a landing page of the potential source of phishing attempts, and wherein the hyperlink to the landing page has been provided by the search engine in response to the search query.
- 15 . The computer system of claim 12 , the operations further comprising: providing an identifier of a platform page of a content platform; and wherein the received source identification data includes (i) an image of a content item being presented by the platform page and (ii) a hyperlink to a landing page of the potential source of phishing attempts that is associated with the content item, and wherein the hyperlink to the landing page and the image of the content item being presented by the platform page have been located using the identifier of the platform page.
- 16 . A non-transitory computer-readable storage medium coupled to one or more processors and having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations comprising: inserting a tracking component into a sensitive web page of a subject system; receiving source identification data that identifies a potential source of phishing attempts, wherein at least a portion of the received source identification data includes an address of a copy of the sensitive web page of the subject system that has been received through the tracking component, wherein the copy of the web page is being hosted by a web server other than that of the subject system; storing the source identification data at a data repository of discovered potential sources of phishing attempts, wherein the source identification data is stored at the data repository with data indicating that a tracked page discovery technique has been used to discover the potential source of phishing attempts; determining that another, different discovery technique has also been used to discover the potential source of phishing attempts; in response to determining that the tracked page discovery technique and the another, different discovery technique have both been used to independently discover the potential source of phishing attempts at different times, increasing a monitoring frequency for the potential source of phishing attempts; and according to the monitoring frequency for the potential source of phishing attempts, periodically performing monitoring operations on the potential source of phishing attempts, the monitoring operations comprising: using the source identification data to retrieve content from the potential source of phishing attempts; storing the retrieved content of the potential source of phishing attempts with the source identification data that identifies the potential source of phishing attempts; executing a set of 
predefined rules on the source identification data and the retrieved content of the potential source of phishing attempts, wherein at least one predefined rule in the set of predefined rules includes a graphical comparison between a subject image included on the sensitive web page of the subject system and a retrieved image included in the retrieved content of the potential source of phishing attempts; and based on a result of executing the set of predefined rules, (i) determining whether the potential source of phishing attempts is an actual source of phishing attempts, and (ii) adjusting the monitoring frequency for the potential source of phishing attempts such that subsequent performances of the monitoring operations occur at a frequency that is different from a current frequency for performing the monitoring operations.
- 17 . The non-transitory computer-readable storage medium of claim 16 , the operations further comprising: providing a domain search for a domain registrar, wherein the domain search includes one or more search terms that relate to the subject system; and wherein the received source identification data includes a domain name of the potential source of the phishing attempts, and wherein the domain name has been provided by the domain registrar in response to the domain search.
- 18 . The non-transitory computer-readable storage medium of claim 16 , the operations further comprising: providing a search query for a search engine, wherein the search query includes one or more search terms that relate to the subject system; and wherein the received source identification data includes a hyperlink to a landing page of the potential source of phishing attempts, and wherein the hyperlink to the landing page has been provided by the search engine in response to the search query.
- 19 . The non-transitory computer-readable storage medium of claim 16 , the operations further comprising: providing an identifier of a platform page of a content platform; and wherein the received source identification data includes (i) an image of a content item being presented by the platform page and (ii) a hyperlink to a landing page of the potential source of phishing attempts that is associated with the content item, and wherein the hyperlink to the landing page and the image of the content item being presented by the platform page have been located using the identifier of the platform page.
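The frequency-related steps recited in claims 1, 5, 6, and 9 can be illustrated with a minimal sketch. It is not the claimed implementation: the class and function names, the halving and doubling factors, the interval bounds, and the severity labels are all assumptions made for illustration. As a simplification, any two distinct discovery techniques count as corroboration here, whereas claim 1 specifically pairs the tracked page technique with another technique.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Set

# Illustrative bounds (assumptions, not taken from the claims), in minutes.
MIN_INTERVAL_MIN = 15
MAX_INTERVAL_MIN = 7 * 24 * 60


@dataclass
class Source:
    """A discovered potential source of phishing attempts (hypothetical record)."""
    address: str
    techniques: Set[str] = field(default_factory=set)
    interval_min: int = 24 * 60  # assumed starting interval: once a day
    content: str = ""            # most recently retrieved content


@dataclass
class Rule:
    """A predefined rule with an associated severity level."""
    name: str
    severity: str                      # assumed labels: "low", "medium", "critical"
    matches: Callable[[Source], bool]


def monitor_more_often(src: Source) -> None:
    # A shorter interval means a higher monitoring frequency.
    src.interval_min = max(MIN_INTERVAL_MIN, src.interval_min // 2)


def monitor_less_often(src: Source) -> None:
    src.interval_min = min(MAX_INTERVAL_MIN, src.interval_min * 2)


def record_discovery(src: Source, technique: str) -> None:
    """Store the technique; corroboration by a second technique raises the frequency."""
    is_new = technique not in src.techniques
    src.techniques.add(technique)
    if is_new and len(src.techniques) >= 2:
        monitor_more_often(src)


def evaluate(src: Source, rules: Iterable[Rule]) -> bool:
    """Run the rules; a matching critical rule marks the source as an actual source."""
    matched = [rule for rule in rules if rule.matches(src)]
    is_actual = any(rule.severity == "critical" for rule in matched)
    if is_actual:
        monitor_more_often(src)
    else:
        monitor_less_often(src)
    return is_actual
```

In this sketch, a source first found through the tracked page and later through a domain search moves from a daily check to a twice-daily check, and a critical rule match halves the interval again.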
Description
TECHNICAL FIELD
This specification generally relates to a platform for discovering potential sources of phishing attempts, performing automated digital investigations of the discovered sources, and performing mitigation actions to prevent phishing occurrences.
BACKGROUND
Phishing is a practice in which a malicious actor attempts to deceive users into revealing sensitive information. The malicious actor can generate online content that purports to be from a legitimate source, but is instead under the control of the malicious actor and is designed to harvest the users' sensitive information. Technical approaches to prevent phishing attempts can include content-based analysis, applying content filters, and maintaining lists of known phishing sites.
SUMMARY
This document generally describes computer systems, processes, program products, and devices for discovering potential sources of phishing attempts, performing automated digital investigations of the discovered sources, and performing mitigation actions to prevent phishing occurrences. In general, the Internet can provide a large attack surface area, including a vast and ever-changing pool of potential malicious actors and potential sources of phishing attempts. Tracking and dealing with such potential threats in a proactive (rather than a reactive) manner can be logistically and technically challenging. The presently described technology attempts to detect and mitigate the threats before users are impacted, in a manner that is automated and intelligent. Briefly, the technology described in this document involves performing various discovery operations for identifying potential sources of phishing attempts, and periodically performing automated monitoring operations for determining whether a potential source is an actual source of phishing attempts.
The discovery operations and the automated monitoring operations can be performed independently of each other, and according to customized schedules that are designed to balance the use of limited computing resources against the goal of discovering actual phishing attempts in a timely manner. The discovery techniques, for example, can involve searching for potential sources of phishing attempts from a variety of different online platforms, including trusted third-party sources, search engines, and content platforms. The automated monitoring operations, for example, can involve periodically visiting the discovered potential sources, retrieving content from the sources, and executing preconfigured rules on the retrieved content. The frequency of the automated monitoring operations can be adjusted over time, such that newly discovered potential sources of phishing attempts are monitored frequently at first, and then less frequently as long as they continue to be benign. By independently adjusting the monitoring frequency of the discovered sources, a large number of potential sources of phishing attempts can be concurrently tracked, while focusing the use of limited processing resources on the most likely actual sources. Such techniques, for example, can facilitate the scaling of a discovery/monitoring/mitigation system. In response to the identification of a likely source of an actual phishing attempt, appropriate alerts can be triggered, and appropriate mitigation actions can be performed. User interfaces can be provided to configure the automated discovery and monitoring operations, to configure the rules and alerts, and to facilitate the performance of the mitigation actions. After performing a mitigation action, the source of an actual phishing attempt can continue to be monitored at an appropriate frequency, to verify the performance of the action.
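One way to picture how per-source frequencies focus limited resources is the following minimal simulation. It is only a sketch under stated assumptions: the function name, the priority-queue scheduling, the back-off factor of 2, and the interval bounds (in hours) are hypothetical and are not specified in this document. Each source is visited when its next check is due; a benign result lengthens its interval, and a suspicious result shortens it.

```python
import heapq
from typing import Callable, Dict, List, Tuple


def simulate_monitoring(
    initial_intervals: Dict[str, float],
    looks_malicious: Callable[[str, float], bool],
    horizon: float,
    backoff: float = 2.0,         # assumed multiplier applied after each check
    min_interval: float = 1.0,    # assumed lower bound, in hours
    max_interval: float = 168.0,  # assumed upper bound (one week), in hours
) -> List[Tuple[float, str]]:
    """Return the (time, source) visits that fall within the horizon (hours)."""
    intervals = dict(initial_intervals)
    # Priority queue ordered by the time at which each source is next due.
    queue = [(interval, name) for name, interval in intervals.items()]
    heapq.heapify(queue)
    visits: List[Tuple[float, str]] = []
    while queue and queue[0][0] <= horizon:
        now, name = heapq.heappop(queue)
        visits.append((now, name))
        if looks_malicious(name, now):
            # Suspicious sources are checked more often.
            intervals[name] = max(min_interval, intervals[name] / backoff)
        else:
            # Sources that remain benign are checked less often.
            intervals[name] = min(max_interval, intervals[name] * backoff)
        heapq.heappush(queue, (now + intervals[name], name))
    return visits


if __name__ == "__main__":
    visits = simulate_monitoring(
        {"benign.example": 1.0, "phish.example": 1.0},
        looks_malicious=lambda name, now: name.startswith("phish"),
        horizon=8.0,
    )
    for when, name in visits:
        print(f"t={when:4.1f}h  check {name}")
```

Over an eight-hour horizon with these assumed values, the benign source is checked three times (at hours 1, 3, and 7) while the suspicious source is checked every hour, which is the kind of resource focusing the paragraph above describes.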
By mitigating the sources of actual phishing in a timely manner, for example, sensitive user information can be effectively protected. In some implementations, a method for performing automated digital investigations of phishing attempts, performed by data processing apparatuses, includes receiving source identification data that identifies a potential source of phishing attempts; storing the source identification data at a data repository of discovered potential sources of phishing attempts; and according to a monitoring frequency for the potential source of phishing attempts, periodically performing monitoring operations on the potential source of phishing attempts. The monitoring operations can include using the source identification data to retrieve content from the potential source of phishing attempts; storing the retrieved content of the potential source of phishing attempts with the source identification data that identifies the potential source of phishing attempts; executing a set of predefined rules on the source identification data and the retrieved content of the potential source of phishing attempts; and based on a result of executing the set of predefined rules