US-12626258-B2 - Advanced SIP-based caller identification and voicemail analysis system for fraud prevention in telecommunications

Abstract

Systems and processes are disclosed for a multi-layered approach to fraud prevention by leveraging a machine learning engine integrated with the Session Initiation Protocol (SIP) to attempt caller identification before transitioning to a voice call, allowing for the potential blocking of unwanted calls. If SIP-based identification remains inconclusive, an anomaly detection engine employing the Viterbi algorithm analyzes the caller's speech patterns during voicemail messages. The Viterbi algorithm converts spoken language into text, identifying suspicious characteristics such as unusual speech patterns, inconsistencies, and keywords associated with scams. If suspicious characteristics are detected, the system automatically blocks callback attempts and notifies the customer of potential spam or unwanted calls. This proactive approach addresses both live and recorded fraudulent calls, enhancing the security of telecommunications by preventing fraudulent interactions before they can cause harm. The system continuously learns from new data, adapting to evolving fraud tactics, providing robust, long-term protection for telecommunications users.
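The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation: the function name `screen_call`, the `Decision` fields, the score thresholds, and the keyword check that stands in for the voicemail anomaly engine are all hypothetical, and the SIP-stage score is assumed to come from some upstream model.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class Decision:
    call_blocked: bool       # call rejected at the SIP stage
    callback_blocked: bool   # callbacks to the caller are blocked
    notify_customer: bool    # customer is warned about the call
    stage: str               # which stage produced the decision


def screen_call(
    sip_score: float,
    voicemail_text: Optional[str],
    scam_keywords: FrozenSet[str],
    block_threshold: float = 0.8,   # hypothetical threshold
    allow_threshold: float = 0.2,   # hypothetical threshold
) -> Decision:
    """Apply SIP-stage screening first, then voicemail analysis if inconclusive."""
    # Stage 1: a (hypothetical) model score for the SIP invite.
    if sip_score >= block_threshold:
        return Decision(True, False, False, "sip")
    if sip_score <= allow_threshold:
        return Decision(False, False, False, "sip")

    # Stage 2: SIP result was inconclusive, so inspect the voicemail
    # transcript. A keyword match stands in for the anomaly engine here.
    if voicemail_text is not None:
        text = voicemail_text.lower()
        if any(keyword in text for keyword in scam_keywords):
            return Decision(False, True, True, "voicemail")

    return Decision(False, False, False, "inconclusive")


if __name__ == "__main__":
    keywords = frozenset({"gift card", "wire transfer"})
    print(screen_call(0.5, "Please buy a Gift Card today", keywords))
```

The sketch keeps the ordering from the abstract: the voicemail check runs only when the SIP-stage score falls between the two thresholds, and a voicemail hit triggers both callback blocking and a customer notification.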

Inventors

Sivashalini Sivajothi
Maneesh Kumar Sethia
Boddu Vikas Teja
Ankit Kumar Sahoo

Assignees

BANK OF AMERICA CORPORATION

Dates

Publication Date: 20260512
Application Date: 20240708

Claims (20)

1. An artificial intelligence (AI) method for phishing detection and call screening using web crawling, web scraping, machine learning, and real-time speech analysis, comprising the steps of: deploying a web crawler to identify and retrieve web pages potentially associated with phishing activities by continuously scanning an internet for new or modified web pages that match predefined criteria indicative of phishing attempts; using a web scraping agent to extract content from the retrieved web pages, including text, images, and metadata, to create a comprehensive dataset for analysis; performing feature engineering on the extracted content to generate a set of features indicative of phishing characteristics, the phishing characteristics comprising suspicious Uniform Resource Locators (URLs), unusual domain names, and specific keywords commonly found in phishing attacks; training a random forest algorithm on a dataset comprising known phishing and legitimate web pages to create a trained phishing detection model, utilizing supervised learning techniques to enhance model accuracy; applying the trained phishing detection model to the set of features to classify the web pages as phishing or legitimate, providing a probabilistic score for each classification to indicate a confidence level of a prediction; integrating the phishing detection model into a security infrastructure to provide real-time phishing detection and alerts, ensuring that potential threats are identified and addressed promptly; updating the phishing detection model periodically with new data to improve accuracy and adapt to emerging phishing tactics, utilizing continuous integration and deployment practices to maintain model effectiveness; generating a report on detected phishing activities, including details on the classified phishing web pages, the set of features used for classification, and confidence scores, to provide comprehensive insights into phishing threats; implementing a feedback mechanism to refine the web crawler and web scraping agent based on the detected phishing activities and user inputs, allowing for iterative improvements in performance of a system; ensuring compliance with relevant data protection regulations throughout a phishing detection process, including anonymizing sensitive data and adhering to privacy standards; providing an interface for users to review and manage the detected phishing activities, offering tools for the users to report false positives or confirm detections to further refine the detection model; deploying a machine learning engine integrated with a Session Initiation Protocol (SIP) to attempt caller identification before transitioning to a voice call, enhancing the initial screening process by leveraging detailed caller information; augmenting SIP invite with detailed authentication information about a caller, including identity, location, and call initiation timestamp, to enable a thorough assessment of call legitimacy; analyzing the SIP invite with the machine learning engine, comparing the SIP invite against known patterns of legitimate and fraudulent calls, using a model trained on a vast dataset to identify indicators of potential scams; blocking the call at an initiation stage if the machine learning engine determines the call to be likely unwanted or spam, preventing the call from reaching a recipient and reducing unwanted calls received; employing an anomaly detection engine using a Viterbi algorithm to analyze caller speech patterns during voicemail messages if analyzing the SIP invite remains inconclusive, leveraging advanced speech recognition and analysis techniques; identifying suspicious characteristics in voicemail analysis, the suspicious characteristics comprising unusual speech patterns, inconsistencies, high-pressure tactics, or keywords associated with spam, by comparing voicemail content to known scam scripts and behavioral patterns; automatically blocking any callback attempts from the recipient if the suspicious characteristics are detected during the voicemail analysis, preventing further engagement by potential scammers; notifying a customer of a potential spam or unwanted call and providing detailed information about a detected threat, including a reason for a suspicion and any relevant patterns identified; continuously improving detection capabilities of the system by adapting to new scam tactics through an adaptive learning process, ensuring the system remains effective against evolving threats; integrating the system with existing telecommunications infrastructure to enhance security measures without requiring significant changes, ensuring compatibility and ease of deployment for service providers; and providing insights and analytics to service providers by analyzing call patterns and identifying common characteristics of spam and fraud to develop new strategies for enhancing call security, offering valuable data for proactive threat management.
2. The AI method of claim 1, wherein the web crawler is configured to use multiple search engines and social media platforms to identify a broader range of potentially malicious web pages.
3. The AI method of claim 2, wherein the web scraping agent includes natural language processing (NLP) capabilities to analyze textual content for context and sentiment, enhancing accuracy of feature extraction.
4. The AI method of claim 3, wherein the feature engineering incorporates user behavioral data comprising click patterns and browsing history to improve detection model ability to distinguish between phishing and legitimate web pages.
5. The AI method of claim 4, wherein the phishing detection model is periodically retrained using a federated learning approach, enabling integration of new data from multiple sources without compromising user privacy.
6. The AI method of claim 5, wherein the security infrastructure includes an automated response mechanism that can quarantine or block access to the classified phishing web pages in real-time.
7. The AI method of claim 6, wherein the interface provided for the users includes customizable alert settings, allowing the users to specify preferred notification methods and sensitivity levels for phishing detection.
8. The AI method of claim 7, wherein the machine learning engine integrated with a Session Initiation Protocol (SIP) utilizes a multi-layer perceptron (MLP) network to enhance accuracy of caller identification.
9. The AI method of claim 8, wherein the anomaly detection engine employing the Viterbi algorithm is supplemented with a secondary algorithm, Hidden Markov Models (HMM), to improve detection of complex speech patterns indicative of fraudulent activity.
10. The AI method of claim 9, wherein the system provides a detailed analytics dashboard for service providers, including metrics on call blocking rates, types of detected scams, and geographical distribution of fraud attempts, to aid in strategic decision-making and resource allocation.
11. An artificial intelligence (AI) system for phishing detection and call screening, comprising a storage device storing: a web crawler configured to continuously scan an internet and retrieve web pages potentially associated with phishing activities by matching predefined criteria indicative of phishing attempts; a web scraping agent operatively connected to the web crawler, configured to extract content from the retrieved web pages, including text, images, and metadata; a feature engineering module configured to process the extracted content to generate a set of features indicative of phishing characteristics, the phishing characteristics comprising suspicious Uniform Resource Locators (URLs), unusual domain names, and specific keywords commonly found in phishing attacks; a machine learning model comprising a random forest algorithm trained on a dataset of known phishing and legitimate web pages, configured to classify the web pages as phishing or legitimate based on the generated set of features; a real-time detection engine operatively connected to the machine learning model, configured to apply the machine learning model to the set of features and provide real-time phishing detection and alerts; a model update module configured to periodically update the machine learning model with new data to improve accuracy and adapt to emerging phishing tactics; a reporting module configured to generate reports on detected phishing activities, including details on the classified phishing web pages, the set of features used for classification, and confidence scores; a feedback mechanism integrated into the web crawler and web scraping agent, configured to refine their operations based on the detected phishing activities and user inputs; a compliance module configured to ensure that the system adheres to relevant data protection regulations throughout a phishing detection process; a user interface configured to allow users to review and manage the detected phishing activities, report false positives, and confirm detections; a Session Initiation Protocol (SIP) module configured to process SIP invites with detailed authentication information about a caller, including identity, location, and call initiation timestamp; a caller identification engine comprising a machine learning model trained on a dataset of legitimate and fraudulent calls, configured to analyze the SIP invites and determine legitimacy of the caller; a call blocking module operatively connected to the caller identification engine, configured to block calls at an initiation stage if the calls determined to be likely unwanted or spam; an anomaly detection engine employing a Viterbi algorithm, configured to analyze caller speech patterns during voicemail messages if analyzing the SIP invites remains inconclusive; a voicemail analysis module configured to identify suspicious characteristics in voicemail analysis, the suspicious characteristics comprising unusual speech patterns, inconsistencies, high-pressure tactics, or keywords associated with spam; a callback blocking module configured to automatically block any callback attempts from a recipient if the suspicious characteristics are detected during the voicemail analysis; a notification module configured to notify a customer of a potential spam or unwanted call and provide detailed information about a detected threat; an adaptive learning module configured to continuously improve detection capabilities of the system by adapting to new scam tactics; an integration module configured to ensure compatibility with existing telecommunications infrastructure, enhancing security measures without requiring significant changes; and an analytics dashboard for service providers, configured to provide insights and analytics on call patterns, characteristics of spam and fraud, and metrics on call blocking rates.
12. The AI system of claim 11, wherein the web crawler is configured to use multiple search engines and social media platforms to identify a broader range of potentially malicious web pages.
13. The AI system of claim 12, wherein the web scraping agent includes natural language processing (NLP) capabilities to analyze textual content for context and sentiment, enhancing accuracy of feature extraction.
14. The AI system of claim 13, wherein the feature engineering module incorporates user behavioral data comprising click patterns and browsing history to improve detection model ability to distinguish between phishing and legitimate web pages.
15. The AI system of claim 14, wherein the machine learning model is periodically retrained using a federated learning approach, enabling integration of new data from multiple sources without compromising user privacy.
16. The AI system of claim 15, wherein the real-time detection engine includes an automated response mechanism that can quarantine or block access to the classified phishing web pages in real-time.
17. The AI system of claim 16, wherein the user interface includes customizable alert settings, allowing the users to specify preferred notification methods and sensitivity levels for phishing detection.
18. The AI system of claim 17, wherein the caller identification engine utilizes a multi-layer perceptron (MLP) network to enhance accuracy of caller identification.
19. The AI system of claim 18, wherein the anomaly detection engine employing the Viterbi algorithm is supplemented with a secondary algorithm, Hidden Markov Models (HMM), to improve detection of complex speech patterns indicative of fraudulent activity.
20. An artificial intelligence (AI) method for phishing detection and call screening using web crawling, web scraping, machine learning, and real-time speech analysis, comprising the steps of: deploying a web crawler to identify and retrieve web pages potentially associated with phishing activities; using a web scraping agent to extract content from the retrieved web pages; performing feature engineering on the extracted content to generate a set of features indicative of phishing characteristics; training a random forest algorithm on a dataset comprising known phishing and legitimate web pages to create a trained phishing detection model; applying the trained phishing detection model to the set of features to classify the web pages as phishing or legitimate; integrating the phishing detection model into a system's security infrastructure to provide real-time phishing detection and alerts; updating the phishing detection model periodically with new data to improve accuracy and adapt to emerging phishing tactics; generating a report on detected phishing activities, including details on the classified phishing web pages and the set of features used for classification; implementing a feedback mechanism to refine the web crawler and web scraping agent based on the detected phishing activities and user inputs; ensuring compliance with relevant data protection regulations throughout a phishing detection process; providing an interface for users to review and manage the detected phishing activities; deploying a machine learning engine integrated with a Session Initiation Protocol (SIP) to attempt caller identification before transitioning to a voice call; augmenting SIP invite with detailed authentication information about a caller, including identity, location, and call initiation timestamp; analyzing the SIP invite with the machine learning engine, comparing the SIP invite against known patterns of legitimate and fraudulent calls; blocking the call before an initiation stage if the machine learning engine determines the call to be likely unwanted or spam; employing an anomaly detection engine using a Viterbi algorithm to analyze caller speech patterns during voicemail messages if analyzing the SIP invite remains inconclusive; identifying suspicious characteristics in voicemail analysis, the suspicious characteristics comprising unusual speech patterns, inconsistencies, high-pressure tactics, or keywords associated with spam; automatically blocking any callback attempts from a recipient if the suspicious characteristics are detected during the voicemail analysis; notifying a customer of a potential spam or unwanted call and providing detailed information about a detected threat; continuously improving detection capabilities of the system by adapting to new scam tactics through an adaptive learning process; integrating the system with existing telecommunications infrastructure to enhance security measures without requiring significant changes; and providing insights and analytics to service providers by analyzing call patterns and identifying common characteristics of spam and fraud to develop new strategies for enhancing call security.
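The claims name a Viterbi algorithm, supplemented by Hidden Markov Models, for voicemail analysis without specifying a model. As a point of reference, Viterbi decoding finds the most likely sequence of hidden states in an HMM given a sequence of observations. The sketch below is a generic textbook version; the two hidden states ("benign", "pressure"), the three observation categories, and every probability are invented for illustration and do not come from the patent.

```python
from typing import Dict, List, Sequence, Tuple

Prob = Dict[str, float]
Table = Dict[str, Prob]


def viterbi(
    obs: Sequence[str],
    states: Sequence[str],
    start_p: Prob,
    trans_p: Table,
    emit_p: Table,
) -> Tuple[List[str], float]:
    """Return the most likely hidden-state path and its probability."""
    if not obs:
        return [], 1.0
    # best[t][s]: probability of the best path that ends in state s at time t
    best: List[Prob] = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back: List[Dict[str, str]] = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor that maximizes the path probability.
            prev, score = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score * emit_p[s][obs[t]]
            back[t][s] = prev
    # Backtrack from the most likely final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical toy model: all values below are illustrative assumptions.
STATES = ("benign", "pressure")
START = {"benign": 0.8, "pressure": 0.2}
TRANS = {
    "benign": {"benign": 0.9, "pressure": 0.1},
    "pressure": {"benign": 0.2, "pressure": 0.8},
}
EMIT = {
    "benign": {"neutral": 0.8, "urgent": 0.1, "payment": 0.1},
    "pressure": {"neutral": 0.2, "urgent": 0.4, "payment": 0.4},
}

if __name__ == "__main__":
    words = ["neutral", "urgent", "payment", "urgent"]
    print(viterbi(words, STATES, START, TRANS, EMIT))
```

For long voicemail transcripts, a practical version would typically sum log-probabilities instead of multiplying raw probabilities, to avoid numeric underflow.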

Description

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation of U.S. application Ser. No. 18/763,525 filed on Jul. 3, 2024, which is entitled “Intelligent Technical Protocol Based Approach Leveraging AI-ML to Block Vishing Scammers” which is incorporated by reference herein in its entirety.

TECHNICAL FIELD

The technical field of the invention pertains to information security that integrates advanced artificial intelligence, machine learning algorithms, and speech recognition technologies to detect and prevent fraudulent activities. It focuses on enhancing the security of communication channels and transaction processes by utilizing real-time data analysis, voice pattern recognition, and anomaly detection. The invention operates within telecommunications frameworks such as the Session Initiation Protocol (SIP) and STIR/SHAKEN for call authentication and extends to in-person banking interactions where it analyzes application data and customer conversations to identify potential fraud. This comprehensive approach ensures robust protection against unauthorized access, identity spoofing, and fraudulent transactions.

DESCRIPTION OF THE RELATED ART

Voice phishing, or vishing, has become a significant threat in the digital era, particularly within the financial sector. Fraudsters use this method to deceive individuals into revealing sensitive information over the phone, often by posing as legitimate entities such as banks or government institutions. These scammers exploit the anonymity afforded by telecommunications technologies like Voice over Internet Protocol (VoIP) to mask their true identities and locations. The strategies employed by vishing perpetrators are increasingly sophisticated, involving techniques such as using personal information obtained from illicit sources to build trust and manipulate victims into sharing confidential details like social security numbers, bank account information, and passwords. This sophistication in tactics makes it difficult for individuals to recognize when they are being targeted by fraudsters, leading to significant financial losses and breaches of personal security.

The primary challenge in combatting vishing lies in the limitations of existing technological measures. Traditional security systems and caller identification technologies often fail to effectively detect or block these fraudulent calls. Scammers can easily bypass conventional monitoring and tracking systems by using VoIP, which allows them to make calls inexpensively from anywhere in the world. These systems enable scammers to present any chosen phone number on the recipient's caller ID, making fraudulent calls appear as though they are coming from trusted sources. This spoofing of caller ID is a critical weakness in current telecommunications security, as it undermines the reliability of caller identification systems that consumers rely on to verify the legitimacy of incoming calls.

The complexity of the telecommunications infrastructure further complicates the issue, as calls can pass through multiple networks and service providers before reaching their final destination, making it difficult to trace the origin and verify the authenticity of a call. This lack of transparency in the call's provenance is a critical gap that scammers exploit to their advantage. Additionally, the fragmented nature of telecommunications networks means that no single entity has a complete view of the call path, making coordinated efforts to detect and block fraudulent calls challenging. This fragmentation results in a piecemeal approach to security, where each service provider implements its own measures without a unified strategy, leaving gaps that fraudsters can exploit.

Moreover, the reactive nature of current anti-vishing systems means that they often only identify and block known scam numbers after fraud has been reported. This method is inherently flawed as it fails to prevent the initial wave of scams from new or previously unreported numbers, thereby allowing significant damage before any protective action can be taken. The lack of integration between different anti-fraud systems and telecommunications technologies also hampers the effectiveness of current solutions. Existing systems do not adequately share information about emerging threats or suspicious patterns, which reduces the overall effectiveness of anti-vishing measures. This lack of coordinated defense makes it easier for scammers to modify their strategies and continue targeting victims.

In-person bank fraud presents another significant challenge, particularly as fraudsters develop more sophisticated methods to deceive bank associates and circumvent traditional security measures. Fraudsters often present falsified documents or use stolen identities to open accounts, apply for loans, or conduct other fraudulent transactions. The ability of fraudsters to manipulate and coerce bank associates during in-person interactions further exacerbates the issu