CN-122021796-A - Super digital intelligent police system generation method and program product based on endogenous security element knowledge alignment

Abstract

The invention discloses a method and program product for generating a super digital intelligent police system based on endogenous security element knowledge alignment. The method comprises: determining a basic model; performing forward inference on the basic model with safety question-answer test samples and recording the output activation values; quantifying a safety sensitivity index for each layer of the basic model from the output activation values and screening out the safety-sensitive layers; constructing a weight matrix and a bias matrix as weighted sums of the safety-sensitive layers and a randomly initialized layer; constructing an auxiliary model that takes the weight matrix and the bias matrix as parameters; training the auxiliary model to obtain the optimal values of the weight matrix and the bias matrix, which serve as the endogenous security element knowledge block; expanding a basic digital intelligent police system from the endogenous security element knowledge block; and injecting intelligent police scene knowledge and then fine-tuning to generate the super digital intelligent police system. The invention offers high model reliability and high security.

Inventors

YANG GUANGXUE

Assignees

Nantong Municipal Public Security Bureau (南通市公安局)

Dates

Publication Date: 20260512
Application Date: 20260205

Claims (10)

1. A method for generating a super digital intelligent police system based on endogenous security element knowledge alignment, characterized by comprising the following steps: determining a large language model that has undergone general safety training as a basic model, performing forward inference on the basic model with safety question-answer test samples, and recording the output activation value of each layer of the basic model for each test sample; quantifying a safety sensitivity index for each layer of the basic model from the output activation values, and taking each layer whose safety sensitivity index is lower than a preset threshold as a safety-sensitive layer; constructing a weight matrix and a bias matrix as weighted sums of the safety-sensitive layers and a randomly initialized layer; constructing an auxiliary model that takes the weight matrix and the bias matrix as parameters, taking the basic model as a teacher, training the auxiliary model by distillation, and taking the resulting optimal values of the weight matrix and the bias matrix as an endogenous security element knowledge block; expanding, from the endogenous security element knowledge block, a basic digital intelligent police system of a scale that meets actual requirements; and injecting intelligent police scene knowledge into the basic digital intelligent police system and fine-tuning it to generate the super digital intelligent police system.
2. The method for generating a super digital intelligent police system based on endogenous security element knowledge alignment according to claim 1, wherein the safety question-answer test samples comprise a plurality of positive safety question-answer samples and a plurality of negative safety question-answer samples.
3. The method for generating a super digital intelligent police system based on endogenous security element knowledge alignment according to claim 1, wherein quantifying the safety sensitivity index of each layer of the basic model from the output activation values specifically comprises: for each layer of the basic model, randomly sampling K groups of question-answer samples, each group comprising one positive safety question-answer sample and one negative safety question-answer sample; for each layer of the basic model and each group of question-answer samples, calculating the cosine similarity between the output activation value of the positive safety question-answer sample at that layer and the output activation value of the negative safety question-answer sample at that layer; and for each layer of the basic model, averaging the cosine similarities over all groups of question-answer samples and taking the average as the safety sensitivity index of that layer.
4. The method for generating a super digital intelligent police system based on endogenous security element knowledge alignment according to claim 1, wherein constructing the weight matrix and the bias matrix as weighted sums of the safety-sensitive layers and the randomly initialized layer specifically comprises: constructing a weight matrix and a bias matrix, each with the same dimensions as the single-layer Transformer structure parameters of the basic model, as weighted sums over a randomly initialized layer and each j-th safety-sensitive layer in the set of all safety-sensitive layers, wherein the weighting coefficients are learnable parameters.
5. The method for generating a super digital intelligent police system based on endogenous security element knowledge alignment according to claim 1, wherein the number of layers of the auxiliary model is less than a preset threshold, and the parameters of each layer of the auxiliary model are determined by the weight matrix, the bias matrix, and the normalized position scale of that layer.
6. The method for generating a super digital intelligent police system based on endogenous security element knowledge alignment according to claim 1, wherein expanding, from the endogenous security element knowledge block, a basic digital intelligent police system of a scale that meets actual requirements specifically comprises: determining the number of layers N of the basic digital intelligent police system according to actual deployment requirements; and expanding, from the endogenous security element knowledge block and according to the same linear expansion rule, an N-layer vertical-domain large model to serve as the basic digital intelligent police system, wherein the parameters of the n-th layer of the basic digital intelligent police system are determined by the endogenous security element knowledge block, that is, the optimal values of the weight matrix and the bias matrix, and by the expanded position scale of that layer.
7. The method for generating a super digital intelligent police system based on endogenous security element knowledge alignment according to claim 1, wherein injecting intelligent police scene knowledge into the basic digital intelligent police system and fine-tuning it to generate the super digital intelligent police system specifically comprises: collecting an intelligent police scene dataset; fine-tuning the basic digital intelligent police system using a general safety dialogue dataset; and performing deep instruction fine-tuning on the fine-tuned model using the collected intelligent police scene dataset to obtain the super digital intelligent police system.
8. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the method according to any of claims 1-7.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the computer program to implement the method of any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program/instruction is stored, characterized in that the computer program/instruction, when executed by a processor, implements the method of any of claims 1-7.
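
As an illustration of the layer screening described in claims 1 to 3, the following is a minimal sketch in plain Python, not the patented implementation. It assumes each layer's output activation has already been reduced to one vector per sample (the claims do not state how that reduction is done). The function names, the toy activations, and the threshold of 0.5 are all hypothetical.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def sensitivity_index(pos_acts: List[Vector], neg_acts: List[Vector],
                      k: int, rng: random.Random) -> float:
    """Mean cosine similarity over k randomly sampled (positive, negative) pairs."""
    total = 0.0
    for _ in range(k):
        positive = rng.choice(pos_acts)  # activation of a positive sample at this layer
        negative = rng.choice(neg_acts)  # activation of a negative sample at this layer
        total += cosine_similarity(positive, negative)
    return total / k


def select_sensitive_layers(pos_by_layer: List[List[Vector]],
                            neg_by_layer: List[List[Vector]],
                            k: int, threshold: float,
                            seed: int = 0) -> Tuple[List[float], List[int]]:
    """Return the per-layer indices and the layers whose index is below the threshold."""
    rng = random.Random(seed)
    indices = [sensitivity_index(p, n, k, rng)
               for p, n in zip(pos_by_layer, neg_by_layer)]
    sensitive = [i for i, s in enumerate(indices) if s < threshold]
    return indices, sensitive


# Toy data (hypothetical): layer 0 gives parallel activations for positive and
# negative samples, layer 1 gives orthogonal ones.
pos_by_layer = [[[1.0, 0.0], [2.0, 0.0]], [[1.0, 0.0], [3.0, 0.0]]]
neg_by_layer = [[[3.0, 0.0], [0.5, 0.0]], [[0.0, 1.0], [0.0, 2.0]]]
indices, sensitive = select_sensitive_layers(pos_by_layer, neg_by_layer,
                                             k=4, threshold=0.5)
print(indices, sensitive)
```

On this reading, a low mean similarity indicates that a layer represents positive and negative samples very differently, which is one plausible reason the claims keep layers below the threshold rather than above it.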

Description

Super digital intelligent police system generation method and program product based on endogenous security element knowledge alignment

Technical Field

The invention relates to artificial intelligence technology, and in particular to a method and program product for generating a super digital intelligent police system based on endogenous security element knowledge alignment.

Background

With the rapid development of artificial intelligence technology, large models have shown strong intelligent processing capabilities in many vertical fields. In public safety, building an intelligent large model for a super digital intelligent police system is important for improving the efficiency of police decision-making and enabling intelligent early warning and response. Compared with other vertical fields, however, digital police systems place extremely high requirements on data security, model reliability, and behavior controllability. Directly using a general large model, or applying only simple scene fine-tuning, readily introduces security risks such as generating harmful content, leaking confidential information, or being maliciously induced to produce misleading results.

Currently, the mainstream technical routes for large-model security alignment include reinforcement learning from human feedback (RLHF), direct preference optimization (DPO), and security fine-tuning. Although these methods improve model security to some extent, they have clear limitations. First, they rely on large amounts of high-quality security annotation data, which is costly and can hardly cover all risk scenarios. Second, most are external security mechanisms that do not reach into the internal structure of the model to achieve endogenous security, so they generalize poorly to novel attacks or complex adversarial samples.

Disclosure of Invention

In view of the problems in the prior art, the invention aims to provide a method and program product for generating a super digital intelligent police system based on endogenous security element knowledge alignment that meets the requirements of both data security and model reliability.

To achieve this object, the invention provides the following technical solution: a method for generating a super digital intelligent police system based on endogenous security element knowledge alignment, comprising the following steps:

determining a large language model that has undergone general safety training as a basic model, performing forward inference on the basic model with safety question-answer test samples, and recording the output activation value of each layer of the basic model for each test sample;

quantifying a safety sensitivity index for each layer of the basic model from the output activation values, and taking each layer whose safety sensitivity index is lower than a preset threshold as a safety-sensitive layer;

constructing a weight matrix and a bias matrix as weighted sums of the safety-sensitive layers and a randomly initialized layer;

constructing an auxiliary model that takes the weight matrix and the bias matrix as parameters, taking the basic model as a teacher, training the auxiliary model by distillation, and taking the resulting optimal values of the weight matrix and the bias matrix as an endogenous security element knowledge block;

expanding, from the endogenous security element knowledge block, a basic digital intelligent police system of a scale that meets actual requirements; and

injecting intelligent police scene knowledge into the basic digital intelligent police system and fine-tuning it to generate the super digital intelligent police system.
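
The construction and expansion steps can be pictured with a small sketch. The formulas themselves are not reproduced in this text, so the following makes several assumptions that should be checked against the original filing: one coefficient `alpha` for the randomly initialized layer and one coefficient per safety-sensitive layer in `betas`; a layer rule of the form weight times position scale plus bias (suggested by the "linear expansion rule" wording in claim 6); and a position scale of n/N. All names are hypothetical, and the coefficients are fixed numbers here rather than values learned by distillation.

```python
from typing import List

Matrix = List[List[float]]


def weighted_sum(random_layer: Matrix, sensitive_layers: List[Matrix],
                 alpha: float, betas: List[float]) -> Matrix:
    """Assumed form: alpha * random_layer + sum_j betas[j] * sensitive_layers[j]."""
    rows, cols = len(random_layer), len(random_layer[0])
    out = [[alpha * random_layer[r][c] for c in range(cols)] for r in range(rows)]
    for beta, layer in zip(betas, sensitive_layers):
        for r in range(rows):
            for c in range(cols):
                out[r][c] += beta * layer[r][c]
    return out


def expand_layers(weight: Matrix, bias: Matrix, num_layers: int) -> List[Matrix]:
    """Assumed linear rule: theta_n = weight * (n / num_layers) + bias, n = 1..num_layers."""
    layers = []
    for n in range(1, num_layers + 1):
        scale = n / num_layers  # assumed normalized position scale
        layers.append([[w * scale + b for w, b in zip(w_row, b_row)]
                       for w_row, b_row in zip(weight, bias)])
    return layers


# Toy 1x2 "layers" (hypothetical values).
random_layer = [[1.0, 1.0]]
sensitive_layers = [[[2.0, 0.0]], [[0.0, 2.0]]]
combined = weighted_sum(random_layer, sensitive_layers, alpha=0.5, betas=[1.0, 0.5])

# The same pair (weight, bias) can generate a model of any depth N.
weight, bias = [[2.0, 4.0]], [[1.0, 1.0]]
expanded = expand_layers(weight, bias, num_layers=2)
print(combined, expanded)
```

Under these assumptions, the same weight and bias pair yields a shallow auxiliary model during training and a deeper N-layer model at expansion time, which matches the description of expanding "according to the same linear expansion rule".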

Further, the safety question-answer test samples comprise a plurality of positive safety question-answer samples and a plurality of negative safety question-answer samples.

Further, quantifying the safety sensitivity index of each layer of the basic model from the output activation values specifically comprises: for each layer of the basic model, randomly sampling K groups of question-answer samples, each group comprising one positive safety question-answer sample and one negative safety question-answer sample; for each layer of the basic model and each group of question-answer samples, calculating the cosine similarity between the output activation value of the positive safety question-answer sample at that layer and the output activation value of the negative safety question-answer sample at that layer; and for each layer of the basic model, averaging the cosine similarities over all groups of question-answer samples and taking the average as the safety sensitivity index of that layer.

Further, the cons