CN-122019769-A - AI generated text detection method and system based on multi-granularity hidden space feature enhancement

CN122019769A

Abstract

The invention discloses an AI (artificial intelligence) generated text detection method and system based on multi-granularity hidden space feature enhancement, and belongs to the technical field of AI generated text detection. The method comprises the following steps: 1, obtaining AI generated text and human text data, tokenizing the text data and converting it into word vectors; 2, extracting coarse-granularity document-level features and fine-granularity fragment-level features of the text; 3, injecting noise into the document-level and fragment-level feature hidden spaces; 4, denoising the document-level and fragment-level features with a reconstruction network to restore the original features; 5, performing contrastive learning on the restored document-level and fragment-level features of positive and negative samples and calculating the contrastive loss; 6, fusing the document-level and fragment-level text features for text classification; 7, inputting the test set into the trained model to obtain the recognition result for AI generated text. Compared with the prior art, the method improves the robustness of AI generated text detection under character perturbation.

Inventors

  • LI KAN
  • LIU XIN

Assignees

  • 北京理工大学 (Beijing Institute of Technology)

Dates

Publication Date
2026-05-12
Application Date
2025-12-01

Claims (5)

  1. An AI generated text detection method based on multi-granularity hidden space feature enhancement, characterized by comprising the following steps:
     Step 1, acquiring AI generated text and human text data, and converting the text data into word vectors by means of tokenization;
     Step 2, extracting coarse-granularity document-level features and fine-granularity fragment-level features of the text:
     Step 2.1, inputting the word vectors into a pre-trained language model, obtaining through its computation the hidden state vectors corresponding to each token at each layer of the model, and taking the hidden state vector corresponding to the first token of the last layer as the coarse-granularity document-level feature of the corresponding text data;
     Step 2.2, inputting the word vectors into a disconnected recurrent neural network, and outputting the hidden state vectors corresponding to the text fragments in a sliding window of a specified size;
     Step 2.3, merging the hidden state vectors of the text fragments obtained in step 2.2 into one feature vector through a pooling operation, and taking this feature vector as the fine-granularity fragment-level feature of the corresponding text data;
     Step 3, injecting noise into the document-level and fragment-level feature hidden spaces;
     Step 4, denoising the document-level and fragment-level features with a reconstruction network to restore the original features:
     Step 4.1, inputting the noisy low-dimensional hidden space representation z'_view into the reconstruction network for decoding, the reconstruction network being a multi-layer perceptron, and outputting the denoised reconstructed feature as shown in formula (4):
     ĥ_view = f_rec(z'_view)    (4)
     where ĥ_view is the reconstructed feature vector after noise removal and f_rec is the reconstruction network;
     Step 4.2, calculating the mean square error between the reconstructed feature vector ĥ_view and the original feature vector h_view as shown in formula (5):
     L_view = ‖ĥ_view − h_view‖²    (5)
     Step 4.3, calculating the average reconstruction loss of the reconstruction network over the entire training set as shown in formula (6):
     L_rec = (1/N) Σ L_view + β·R    (6)
     where N is the number of training-set samples and β is a hyper-parameter used to adjust the influence of the regularization term R;
     Step 5, performing contrastive learning with the restored document-level and fragment-level features of the positive and negative samples, where all samples of the same category as a given sample are its positive samples and all samples of a different category are its negative samples, and calculating the contrastive loss shown in formula (7):
     L_con = −(1/n⁺) Σ_p log( exp(sim(ĥ_i, ĥ_p)/τ) / Σ_a exp(sim(ĥ_i, ĥ_a)/τ) )    (7)
     where n⁺ is the number of positive samples, ĥ_p is the reconstructed feature of a positive sample, ĥ_a is the reconstructed feature of any positive or negative sample other than the sample itself, sim denotes a similarity calculation, and τ is the temperature coefficient;
     Step 6, fusing the document-level and fragment-level text features to classify the text and learn the AI generated text detection task; and
     Step 7, inputting the test set into the trained model to obtain the recognition result for AI generated text.
  2. The AI generated text detection method based on multi-granularity hidden space feature enhancement of claim 1, wherein step 1 is implemented by:
     Step 1.1, dividing the AI generated text and human text into a training set and a test set;
     Step 1.2, tokenizing the text with the tokenizer of a pre-trained language model; and
     Step 1.3, converting the tokenized data into word vectors with the word embedding layer of the pre-trained language model.
  3. The AI generated text detection method based on multi-granularity hidden space feature enhancement of claim 1, wherein step 3 is implemented by:
     Step 3.1, inputting the document-level and fragment-level feature vectors into multi-layer perceptrons respectively, and encoding the two feature vectors into a low-dimensional hidden space as shown in formula (1):
     z_doc = MLP_doc(h_doc), z_seg = MLP_seg(h_seg)    (1)
     where z_doc and z_seg respectively represent the document-level and fragment-level feature vectors in the low-dimensional hidden space, MLP_doc and MLP_seg respectively represent the multi-layer perceptrons that encode the document-level and fragment-level feature vectors into the low-dimensional hidden space, and h_doc and h_seg respectively represent the document-level and fragment-level feature vectors in the original feature space;
     Step 3.2, decomposing each low-dimensional hidden space feature vector z_view into a semantic term s_view and a disturbance term d_view using two fully connected layers;
     Step 3.3, injecting Gaussian noise into the disturbance term d_view and combining it with the semantic term s_view to obtain the noisy low-dimensional hidden space representation z'_view as shown in formula (2):
     z'_view = s_view + d_view ⊙ (1 + σ·ε)    (2)
     where ε represents the Gaussian noise, σ represents the noise level, and 1 represents a vector whose elements are all 1;
     Step 3.4, using a regularization term R to constrain the scale of the disturbance term d_view relative to the semantic term s_view and the magnitude of the noise, as shown in formula (3), where α is a hyper-parameter used to adjust the degree of constraint on the noise.
  4. The AI generated text detection method based on multi-granularity hidden space feature enhancement of claim 1, wherein step 6 is implemented by:
     Step 6.1, fusing the document-level reconstructed feature ĥ_doc and the fragment-level reconstructed feature ĥ_seg, inputting the fused feature into a classifier to output the corresponding class probability, and calculating the binary cross-entropy loss as the classification loss L_cls;
     Step 6.2, calculating the loss function of the whole model as shown in formula (8):
     L = L_cls + γ₁·L_rec + γ₂·L_con    (8)
     where γ₁ and γ₂ are two hyper-parameters that balance the importance of the reconstruction and contrastive losses;
     Step 6.3, updating the model parameters by gradient descent to minimize the loss function L, and training with the minimized loss function to obtain the trained model.
  5. An AI generated text detection system based on multi-granularity hidden space feature enhancement for implementing the method of claim 1, comprising a data preprocessing module, a document-level feature extraction module, a fragment-level feature extraction module, a feature hidden space noise enhancement module, a contrastive learning module, a feature fusion module and a classification recognition module;
     the data preprocessing module performs data cleaning and token embedding on the input text, and its output serves as the input of the document-level and fragment-level feature extraction modules;
     the document-level feature extraction module extracts global semantic features from the preprocessed text through a pre-trained language model, which serve as the input of the feature hidden space noise enhancement module;
     the fragment-level feature extraction module extracts local semantic features with a disconnected recurrent neural network, which serve as the input of the contrastive learning module;
     the feature hidden space noise enhancement module applies noise to the global and local semantic features with a variational autoencoder, and its outputs serve as the input of the feature fusion module;
     the contrastive learning module enhances the feature distinction between positive and negative samples with a contrastive loss, and its output serves as the input of the feature fusion module;
     the feature fusion module fuses the document-level and fragment-level features, and its output serves as the input of the classification recognition module;
     and the classification recognition module outputs the AI generated text recognition result.
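The hidden-space noise injection and reconstruction described in claims 1 and 3 can be sketched as follows. This is a minimal illustrative sketch with randomly initialized weights, not the patent's trained model; the layer sizes, the noise level σ, the constraint strength α, and the exact form of the scale constraint are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron with a tanh hidden activation.
    return np.tanh(x @ w1 + b1) @ w2 + b2

dim, hid = 8, 4  # assumed feature / hidden-space sizes

# Randomly initialized encoder, decoder and decomposition weights
# (these would be learned jointly with the detector in practice).
enc = (rng.normal(size=(dim, hid)), np.zeros(hid),
       rng.normal(size=(hid, hid)), np.zeros(hid))
dec = (rng.normal(size=(hid, hid)), np.zeros(hid),
       rng.normal(size=(hid, dim)), np.zeros(dim))
w_sem = rng.normal(size=(hid, hid))   # semantic-term projection
w_dist = rng.normal(size=(hid, hid))  # disturbance-term projection

h = rng.normal(size=dim)              # original feature h_view (document- or fragment-level)
z = mlp(h, *enc)                      # low-dimensional hidden representation z_view
s, d = z @ w_sem, z @ w_dist          # semantic term s_view, disturbance term d_view

sigma = 0.1                           # assumed noise level
z_noisy = s + d * (1.0 + sigma * rng.normal(size=hid))  # noisy z'_view, cf. formula (2)

h_rec = mlp(z_noisy, *dec)            # denoised reconstruction, cf. formula (4)
rec_loss = np.mean((h_rec - h) ** 2)  # mean square error, cf. formula (5)

alpha = 0.5                           # assumed constraint strength
reg = alpha * np.linalg.norm(d) / np.linalg.norm(s)  # one possible scale constraint
```

In a full implementation the encoder, decoder and projections would be trained end to end so that the reconstruction loss, the constraint term and the downstream classification loss are minimized together.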
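The contrastive loss of formula (7), with same-category samples as positives, can likewise be sketched. Cosine similarity and τ = 0.1 are illustrative choices; the claim only specifies a similarity function sim and a temperature coefficient τ.

```python
import numpy as np

def contrastive_loss(feats, labels, tau=0.1):
    """InfoNCE-style supervised contrastive loss over reconstructed features.

    Samples sharing a label are positives; all other samples are negatives,
    matching the positive/negative definition in claim 1, step 5.
    """
    feats = np.asarray(feats, dtype=float)
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = feats @ feats.T / tau          # sim(., .) scaled by the temperature
    n = len(labels)
    total, count = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]           # everyone but the sample itself
        positives = [j for j in others if labels[j] == labels[i]]
        denom = np.sum(np.exp(sim[i, others]))             # all positives and negatives
        for p in positives:
            total += -np.log(np.exp(sim[i, p]) / denom)
            count += 1
    return total / max(count, 1)

# Well-separated classes yield a lower loss than mismatched labels.
feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
aligned = contrastive_loss(feats, [0, 0, 1, 1])
mismatched = contrastive_loss(feats, [0, 1, 0, 1])
```

Here `aligned` is much smaller than `mismatched`, which is the pressure the loss exerts: reconstructed features of the same class are pulled together and those of different classes pushed apart.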

Description

AI generated text detection method and system based on multi-granularity hidden space feature enhancement

Technical Field

The invention relates to an AI (artificial intelligence) generated text detection method and system based on multi-granularity hidden space feature enhancement, belongs to the technical field of AI generated text detection, and is applied to fragment-level identification of AI generated text.

Background

In recent years, the rise of large language models has greatly promoted the development of the field of artificial intelligence (AI). More and more conversational AI products help people substantially improve productivity by virtue of their strong text generation capabilities. Since AI can generate text that is highly similar to human expression, it helps people write but also increases the risk of improper use of AI generated text. In this context, AI generated text detection has emerged; it mainly uses machine learning and deep learning algorithms to efficiently distinguish AI generated text from human text. Current AI generated text detection methods fall mainly into three categories: zero-shot detection methods, training-based detection methods and watermark detection methods. Zero-shot detection methods judge whether text is AI generated by extracting intrinsic statistical features of the text and comparing them with a predetermined threshold. Such methods are easy to deploy, but their detection performance is difficult to guarantee. Training-based detection methods fine-tune a pre-trained model on annotated data to perform detection. Such methods achieve high detection performance, but typically require a large amount of labeled data and computing resources.
Watermark detection methods divide the candidate vocabulary into a red list and a green list, and raise the probability of selecting green-list words during generation, thereby increasing the proportion of green-list words in the generated text. At detection time, the text is considered AI generated when the proportion of green-list words exceeds a specified threshold. This approach also achieves good detection performance, but it must typically be implemented by the AI service provider and therefore lacks flexibility. Although each of these detection methods has its own strengths, all of them suffer from insufficient robustness. It has been found that applying a small perturbation to the text, such as deleting or adding a letter, swapping the order of two letters, or changing a punctuation mark, can interfere with the judgment of the detection model. This greatly affects the reliability of AI generated text detection methods in real application scenarios. Therefore, how to improve the robustness of AI generated text recognition models under character perturbation has become a problem to be solved.

Disclosure of Invention

The invention aims to solve the technical problem of improving the robustness of AI generated text recognition models under character perturbation, and provides an AI generated text detection method and system based on multi-granularity hidden space feature enhancement.
The aim of the invention is realized by the following technical scheme. The invention discloses an AI generated text detection method based on multi-granularity hidden space feature enhancement, which comprises the following steps:
Step 1, acquiring AI generated text and human text data, and converting the text data into word vectors by means of tokenization;
Step 1.1, dividing the AI generated text and human text into a training set and a test set;
Step 1.2, tokenizing the text with the tokenizer of a pre-trained language model;
Step 1.3, converting the tokenized data into word vectors with the word embedding layer of the pre-trained language model;
Step 2, extracting coarse-granularity document-level features and fine-granularity fragment-level features of the text;
Step 2.1, inputting the word vectors into a pre-trained language model, obtaining through its computation the hidden state vectors corresponding to each token at each layer of the model, and taking the hidden state vector corresponding to the first token of the last layer as the coarse-granularity document-level feature of the corresponding text data;
Step 2.2, inputting the word vectors into a disconnected recurrent neural network (Disconnected RNN), and outputting the hidden state vectors corresponding to the text fragments in a sliding window of a specified size;
Step 2.3, merging the hidden state vectors of the text fragments obtained in step 2.2 into one feature vector through a pooling operation, and taking this feature vector as the fine-granularity fragment-level feature
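The fragment-level extraction of steps 2.2 and 2.3 can be sketched as follows. Here a simple window mean stands in for the disconnected RNN's learned hidden state over each window, so this only illustrates the sliding-window-then-pool structure, not the actual recurrent model.

```python
import numpy as np

def fragment_features(word_vecs, window=3):
    """Slide a fixed-size window over the word vectors and max-pool.

    Each window is summarized by its mean (a stand-in for the disconnected
    RNN's hidden state over that window); max pooling then merges the
    per-window vectors into one fragment-level feature, cf. step 2.3.
    """
    word_vecs = np.asarray(word_vecs, dtype=float)
    n, _ = word_vecs.shape
    windows = [word_vecs[max(0, i - window + 1): i + 1].mean(axis=0)
               for i in range(n)]
    return np.max(np.stack(windows), axis=0)

# Example: 5 words with 3-dimensional embeddings.
vecs = np.arange(15.0).reshape(5, 3)
feat = fragment_features(vecs, window=3)
```

Because each window is summarized independently, a character-level perturbation only disturbs the few windows that contain it, which is one intuition behind the fragment-level features' robustness.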