KR-20260064358-A - Deep voice phishing detection method

Abstract

The present invention relates to a method for detecting deep voice phishing. According to one embodiment, the method determines whether the voice of a call partner is genuine or fake by analyzing received voice data with deep learning and spectrogram-based analysis. Voice recordings are converted into spectrogram images, a CNN model is trained on those images, and the authenticity of a voice is then determined with the trained model. This establishes a technical foundation for effectively detecting and preventing deep voice phishing crimes, and contributes to enhanced security by providing rapid analysis results for voice files uploaded by users.

Inventors

조상현

Assignees

조상현

Dates

Publication Date: 20260507
Application Date: 20241031

Claims (4)

1. A deep voice phishing detection method comprising: a step of converting a plurality of audio files into spectrogram images; a step of training a CNN AI model on the spectrogram images; and a step of determining whether deep voice phishing is occurring by analyzing a received voice file through the trained CNN AI model.
2. The deep voice phishing detection method of claim 1, further comprising: a step of saving the trained CNN AI model in .h5 file format, uploading it to an AWS EC2 server, and distributing it to web users; and a step of providing a web-based user interface that allows a user to upload a voice file, and securely transmitting the uploaded voice file to a server and storing it there.
3. The deep voice phishing detection method of claim 1 or 2, further comprising a step of analyzing an uploaded voice file to determine whether the voice is a real voice or a generated voice.
4. The deep voice phishing detection method of claim 3, further comprising a step of indicating the analyzed result as real or fake, and providing the probability that the voice is a real voice or an artificially generated fake voice.

Description

Deep voice phishing detection method

The present invention relates to a method for detecting deep voice phishing, and more specifically, to a method that converts large-scale voice data into spectrogram images to train a deep learning model and analyzes a received voice file to detect whether deep voice phishing has occurred.

As voice phishing crimes have become more sophisticated and diversified, there is an urgent need for technological approaches to detect and block them. Existing voice phishing detection technologies have relied primarily on analyzing call content, comparing voice characteristics, and tracking speakers' voice patterns; however, as voice patterns change and the scale of data increases, these technologies show limitations in accuracy and efficiency. In addition, 'deep voice phishing' crimes, in which a user's voice is recorded and a deep learning model is trained on it to mimic that user's actual voice, are on the rise. Fraud schemes exploiting this technology extend the damage to the user's family and acquaintances, and the need for new countermeasures is being emphasized.

To solve this problem, the present invention aims to improve the accuracy and reliability of voice phishing detection by converting large-scale voice data into spectrogram images and training a CNN model on them. This approach can serve as an effective alternative for determining the authenticity of audio files.

Figure 1 is a flowchart illustrating the overall flow of a deep voice phishing detection method performed by a computer device.
Figure 2 is a flowchart illustrating the detailed training process of a CNN AI model for deep voice phishing detection.
Figure 3 illustrates a spectrogram distribution of a voice file according to an embodiment of the present invention, visually showing the difference in distribution between a real voice and an artificially generated fake voice in a deep voice phishing detection system.
Figures 4 to 8 show the screen of a user terminal during the deep voice phishing detection process illustrated in Figure 1.

The present invention aims to provide a system for detecting deep voice phishing through a voice recognition model utilizing deep learning techniques. The system is based on a CNN model and includes a process of converting voice files into spectrogram images for training, and a step of predicting in real time, using the trained model, whether voice phishing is occurring.

Specifically, when voice data is uploaded through a user terminal, the system, which has been trained with a CNN model, receives and analyzes the voice data. Based on the analysis results, it determines whether the voice is real or an artificially generated fake voice, and by providing the user with the determination result and its reliability, it enables the user to prepare against voice phishing risks in real time.

Because the system performs deep learning analysis at the voice file upload stage, it improves on the previously limited detection accuracy and provides high detection reliability even for voices with various speech patterns. In particular, it is configured to predict the authenticity of voice files with high reliability by utilizing a model pre-trained on a large-scale spectrogram dataset.

In the system configuration of the present invention, when voice data is uploaded to a server, it is automatically converted into a spectrogram image, and a CNN-based deep learning model then learns from this data.
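
The conversion of voice data into a spectrogram can be sketched as follows. This is a minimal, dependency-free illustration and not the implementation described in the patent: the function names `hann_window` and `log_spectrogram`, the Hann window, the frame length of 64 samples, the hop of 32 samples, and the -80 dB floor are all assumptions, since the source does not specify any spectrogram parameters. A practical system would more likely use an FFT library and render the result as an image before passing it to the CNN.

```python
import cmath
import math


def hann_window(length):
    """Return a symmetric Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (length - 1))
            for i in range(length)]


def log_spectrogram(samples, frame_len=64, hop=32, floor_db=-80.0):
    """Compute a log-magnitude spectrogram (frames x frequency bins).

    All parameter defaults are illustrative assumptions.
    """
    window = hann_window(frame_len)
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    # Precompute the DFT basis once: basis[k][n] = exp(-2*pi*i*k*n/N).
    basis = [[cmath.exp(-2j * math.pi * k * n / frame_len)
              for n in range(frame_len)] for k in range(n_bins)]
    spectrogram = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        # Window each overlapping frame before transforming it.
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            magnitude = abs(sum(x * b for x, b in zip(frame, basis[k])))
            db = 20 * math.log10(magnitude) if magnitude > 0 else floor_db
            row.append(max(db, floor_db))  # clip very quiet bins
        spectrogram.append(row)
    return spectrogram


# Usage: a 1 kHz tone sampled at 8 kHz (synthetic stand-in for a voice file).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
spec = log_spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin * sample_rate / 64)
```

Each row of the returned list is one time frame and each column one frequency bin, which is the two-dimensional layout that an image-based CNN would consume.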
In addition, by providing deep voice phishing detection results to the user based on a pre-trained model, the system is designed to identify deep voice phishing faster and more accurately than existing detection methods.
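
The reporting step, in which the user receives a real or fake verdict together with a probability, might look like the sketch below. It assumes the trained model emits a single score between 0 and 1 that is interpreted as the probability that the voice is fake. The function name `format_verdict`, the dictionary keys, and the 0.5 decision threshold are hypothetical; the source does not state how the probability is computed or thresholded.

```python
def format_verdict(p_fake, threshold=0.5):
    """Turn a model score into a user-facing verdict.

    p_fake: assumed probability (0..1) that the voice is artificially generated.
    threshold: assumed decision boundary; not specified in the source.
    """
    if not 0.0 <= p_fake <= 1.0:
        raise ValueError("p_fake must be a probability between 0 and 1")
    is_fake = p_fake >= threshold
    return {
        "label": "fake" if is_fake else "real",
        "probability_fake": round(p_fake, 4),
        "probability_real": round(1.0 - p_fake, 4),
    }


# Usage with two made-up scores (not results from any real model).
print(format_verdict(0.93))
print(format_verdict(0.12))
```

A higher threshold would reduce false alarms on genuine voices at the cost of missing some generated ones, so the value would need to be tuned on validation data.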