CN-122021939-A - Emergency plan digital generation system based on vector knowledge base and large model fine tuning
Abstract
The invention discloses an emergency plan digital generation system based on vector knowledge base and large model fine tuning. It relates to the technical field of electric digital data processing and addresses the problem of how to improve the professional accuracy and scene pertinence of a plan generated by a language model. The system comprises a data acquisition module, a knowledge base construction module, an associated word vector module, a model training module and a plan generation module. The data acquisition module is used for acquiring historical emergency plan data. The knowledge base construction module is used for constructing a vector knowledge base according to the historical emergency plan data. The associated word vector module is used for identifying words that have different semantics in a plurality of demand scenes and constructing, for each such word and according to the vector knowledge base, a plurality of associated word vectors corresponding to the different demand scenes. The model training module is used for performing adjustment training on a preset language model according to the historical emergency plan data and the plurality of associated word vectors to obtain an emergency plan generation model. The plan generation module is used for determining the corresponding associated word vectors from the plurality of associated word vectors according to the demand scene of the latest demand instruction, and determining an emergency plan text in combination with the emergency plan generation model.
Inventors
- ZHAO BINGWEN
Assignees
- 江苏伟岸纵横科技股份有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260403
Claims (10)
- 1. An emergency plan digital generation system based on vector knowledge base and large model fine tuning, characterized by comprising a data acquisition module, a knowledge base construction module, an associated word vector module, a model training module and a plan generation module; the data acquisition module is used for acquiring historical emergency plan data, wherein the historical emergency plan data comprises a demand instruction text for a historical emergency event and a corresponding standard plan text; the knowledge base construction module is used for constructing a vector knowledge base according to the historical emergency plan data, wherein the vector knowledge base comprises keyword vectors obtained by converting text content; the associated word vector module is used for identifying words that have different semantics in a plurality of demand scenes and constructing, for each such word and according to the vector knowledge base, a plurality of associated word vectors corresponding to the different demand scenes, wherein the associated word vectors are vector representations obtained by performing scene-based semantic adjustment on keyword vectors in the vector knowledge base; the model training module is used for performing adjustment training on a preset language model according to the historical emergency plan data and the plurality of associated word vectors to obtain an emergency plan generation model; the plan generation module is used for determining the corresponding associated word vectors from the plurality of associated word vectors according to the demand scene of the latest demand instruction, and determining an emergency plan text according to the associated word vectors corresponding to the latest demand instruction and the emergency plan generation model.
- 2. The emergency plan digital generation system based on vector knowledge base and large model fine tuning of claim 1, wherein the knowledge base construction module, when constructing the vector knowledge base according to the historical emergency plan data, specifically performs the following steps: the knowledge base construction module is further used for identifying non-Chinese characters in the historical emergency plan data; the knowledge base construction module is further used for determining, for each non-Chinese character, candidate words according to the context of that non-Chinese character, wherein each candidate word is formed by a character sequence associated with the non-Chinese character; the knowledge base construction module is further used for determining target keywords from the candidate words according to the recurrence degree of each candidate word in the historical emergency plan data; the knowledge base construction module is further used for vectorizing the conventional text words and the target keywords of all the non-Chinese characters to obtain the vector knowledge base containing conventional keyword vectors and special keyword vectors.
- 3. The emergency plan digital generation system based on vector knowledge base and large model fine tuning of claim 2, wherein the knowledge base construction module, when determining target keywords from the candidate words according to the recurrence degree of each candidate word in the historical emergency plan data, specifically performs the following steps: the knowledge base construction module is further used for determining the recurrence degree of each candidate word according to the number of occurrences of that candidate word in the historical emergency plan data and the average number of occurrences of the candidate words corresponding to each non-Chinese character; the knowledge base construction module is further used for determining each candidate word whose recurrence degree is greater than a preset recurrence threshold as a target keyword.
- 4. The emergency plan digital generation system based on vector knowledge base and large model fine tuning of claim 1, wherein the associated word vector module, when identifying words that have different semantics in a plurality of demand scenes and constructing, for each such word and according to the vector knowledge base, a plurality of associated word vectors corresponding to the different demand scenes, specifically performs the following steps: the associated word vector module is further used for dividing the demand instruction texts into different demand scene clusters according to the semantics of the demand instruction texts; the associated word vector module is further used for determining semantic specificity evaluation values of the words in the vector knowledge base in the different demand scene clusters, and identifying, according to the semantic specificity evaluation values, multi-semantic words that have different semantics in a plurality of demand scenes; the associated word vector module is further used for determining a cross-scene sensitivity according to the semantic specificity evaluation values of each multi-semantic word, and determining target words from the multi-semantic words according to the cross-scene sensitivity, wherein a target word is a word whose cross-scene sensitivity is greater than a preset sensitivity threshold and is used for constructing associated word vectors; the associated word vector module is further used for constructing a plurality of associated word vectors for each target word, and establishing an association relationship between each associated word vector and the corresponding demand scene cluster.
- 5. The emergency plan digital generation system based on vector knowledge base and large model fine tuning of claim 4, wherein the associated word vector module, when dividing the demand instruction texts into different demand scene clusters according to the semantics of the demand instruction texts, specifically performs the following steps: the associated word vector module is further used for extracting the keywords contained in each demand instruction text and obtaining the keyword vectors corresponding to those keywords from the vector knowledge base, so as to form a vectorized representation of each demand instruction text; the associated word vector module is further used for determining the semantic similarity between the vectorized representations of each pair of demand instruction texts, and aggregating the demand instruction texts into a plurality of demand scene clusters according to all the computed semantic similarities and a preset clustering algorithm.
- 6. The emergency plan digital generation system based on vector knowledge base and large model fine tuning of claim 4, wherein the associated word vector module, when determining the semantic specificity evaluation values of the words in the vector knowledge base in the different demand scene clusters, specifically performs the following steps: the associated word vector module is further used for determining the occurrence frequency, in each demand scene cluster, of each word to be analyzed in the vector knowledge base; the associated word vector module is further used for inputting, for each demand scene cluster, the demand instruction texts in that demand scene cluster into the preset language model, obtaining the generated text output by the preset language model before training, and determining the context semantic feature distribution of the word to be analyzed in the generated text; the associated word vector module is further used for determining the semantic specificity evaluation value of the word to be analyzed in the demand scene cluster according to the occurrence frequency and the context semantic feature distribution.
- 7. The emergency plan digital generation system based on vector knowledge base and large model fine tuning of claim 4, wherein the associated word vector module, when determining the cross-scene sensitivity according to the semantic specificity evaluation values of each multi-semantic word, specifically performs the following steps: the associated word vector module is further used for acquiring, for each multi-semantic word, the semantic specificity evaluation values and occurrence frequencies of that multi-semantic word in all the demand scene clusters; the associated word vector module is further used for determining the cross-scene sensitivity of each multi-semantic word according to the semantic specificity evaluation values and the occurrence frequencies of that multi-semantic word in all the demand scene clusters.
- 8. The emergency plan digital generation system based on vector knowledge base and large model fine tuning of claim 1, wherein the model training module, when performing adjustment training on the preset language model according to the historical emergency plan data and the plurality of associated word vectors, specifically performs the following steps: the model training module is further used for calling, according to the demand scene cluster to which the currently input historical demand instruction text belongs, the associated word vectors to replace the original vector representations of the corresponding words in the input text, so as to form a training input; the model training module is further used for determining a training loss value according to the difference between the predicted text output by the model for the training input and the corresponding standard plan text; the model training module is further used for updating the model parameters of the preset language model according to the training loss value to generate the emergency plan generation model.
- 9. The emergency plan digital generation system based on vector knowledge base and large model fine tuning of claim 1, wherein the plan generation module, when determining the corresponding associated word vectors from the plurality of associated word vectors according to the demand scene of the latest demand instruction and determining the emergency plan text according to the associated word vectors corresponding to the latest demand instruction and the emergency plan generation model, specifically performs the following steps: the plan generation module is further used for determining the target demand scene cluster to which the latest demand instruction belongs; the plan generation module is further used for replacing each word in the latest demand instruction for which associated word vectors have been constructed with the associated word vector that has an association relationship with the target demand scene cluster, so as to obtain a scene-adapted input representation; the plan generation module is further used for inputting the scene-adapted input representation into the emergency plan generation model and outputting the emergency plan text.
- 10. The emergency plan digital generation system based on vector knowledge base and large model fine tuning of any one of claims 1-9, wherein the data acquisition module, when acquiring the historical emergency plan data, specifically performs the following steps: the data acquisition module is further used for acquiring emergency event description texts and corresponding emergency response scheme texts from a historical record source; the data acquisition module is further used for performing cleaning, format standardization and pairing association processing on the acquired emergency event description texts and emergency response scheme texts to generate the structured historical emergency plan data.
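As a non-limiting illustration of the keyword selection in claim 3 and the scene-based vector replacement in claim 9, the following Python sketch may help. It is not the patented implementation. The claims do not state the formula for the recurrence degree, so the ratio used here (occurrences of a candidate divided by the average occurrences of all candidates for the same non-Chinese character) is an assumption. The function names `select_target_keywords` and `scene_adapted_input`, the dictionary layouts, the sample texts and the threshold value are likewise hypothetical.

```python
from statistics import mean


def select_target_keywords(candidates_by_char, corpus, threshold):
    """Return candidate words whose recurrence degree exceeds the threshold.

    Assumption: recurrence degree = occurrences of the candidate in the corpus
    divided by the mean occurrences of all candidates that share the same
    non-Chinese character. The claims leave the exact formula open.
    """
    targets = []
    for _char, candidates in candidates_by_char.items():
        # str.count counts non-overlapping occurrences of the candidate.
        counts = {cand: corpus.count(cand) for cand in candidates}
        avg = mean(list(counts.values())) if counts else 0
        for cand, n in counts.items():
            if avg > 0 and n / avg > threshold:
                targets.append(cand)
    return targets


def scene_adapted_input(tokens, scene, base_vectors, associated_vectors):
    """Build a scene-adapted input representation.

    A token that has an associated word vector for the given demand scene
    cluster uses that vector; every other token keeps its base keyword vector.
    Assumes every token has an entry in base_vectors.
    """
    return [associated_vectors.get((tok, scene), base_vectors[tok]) for tok in tokens]


# Hypothetical data, for illustration only.
corpus = "启动II级响应。II级响应期间每日上报。解除II级响应。参见II类表格。"
keywords = select_target_keywords({"II": ["II级响应", "II类表格"]}, corpus, 1.2)

base_vectors = {"隔离": [1.0, 0.0], "现场": [0.0, 1.0]}
associated_vectors = {
    ("隔离", "epidemic"): [0.9, 0.4],
    ("隔离", "chemical_leak"): [0.2, 0.8],
}
adapted = scene_adapted_input(["隔离", "现场"], "epidemic", base_vectors, associated_vectors)
```

In this sketch the word "隔离" receives a different vector depending on the demand scene cluster, while "现场", which has no associated word vector, keeps its base vector. A real system would use learned embeddings rather than hand-written two-dimensional lists.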
Description
Emergency plan digital generation system based on vector knowledge base and large model fine tuning
Technical Field
The invention relates to the technical field of electric digital data processing, and in particular to an emergency plan digital generation system based on vector knowledge base and large model fine tuning.
Background
With the deepening of digital transformation in the field of emergency management, the traditional mode of compiling emergency plans from manual experience is gradually shifting toward intelligent assisted decision making. Under this trend, how to use unstructured text knowledge, such as the existing mass of historical plans and legal cases, to quickly generate high-quality, standards-compliant emergency plans that fit a specific accident situation has become a key technical direction for improving emergency response capability. Currently, the prior art attempts to generate text directly with a large language model, or to obtain relevant knowledge segments with conventional retrieval methods. However, these methods generally treat the text as a generic corpus. They fail to fully consider that emergency plan text is highly specialized and contains a large number of domain-specific acronyms and symbols, and that the same term may carry different operational semantics in different emergency scenarios. As a result, the plan text generated by existing schemes is often deficient in professionalism, accuracy and scene pertinence, and it is difficult to meet the practical requirements of high-reliability emergency decision making.
Disclosure of Invention
In order to solve the technical problem of how to enable a language model, in an emergency plan generation task, to accurately understand and distinguish the specific semantics of ambiguous domain vocabulary under different emergency scenes, and thereby improve the professional accuracy and scene pertinence of the generated plan, the invention aims to provide an emergency plan digital generation system based on vector knowledge base and large model fine tuning. The adopted technical scheme is as follows:
The invention provides an emergency plan digital generation system based on vector knowledge base and large model fine tuning, which comprises a data acquisition module, a knowledge base construction module, an associated word vector module, a model training module and a plan generation module. The data acquisition module is used for acquiring historical emergency plan data, wherein the historical emergency plan data comprises a demand instruction text for a historical emergency event and a corresponding standard plan text. The knowledge base construction module is used for constructing a vector knowledge base according to the historical emergency plan data, wherein the vector knowledge base comprises keyword vectors obtained by converting text content. The associated word vector module is used for identifying words that have different semantics in a plurality of demand scenes and constructing, for each such word and according to the vector knowledge base, a plurality of associated word vectors corresponding to the different demand scenes, wherein the associated word vectors are vector representations obtained by performing scene-based semantic adjustment on the keyword vectors in the vector knowledge base. The model training module is used for performing adjustment training on a preset language model according to the historical emergency plan data and the plurality of associated word vectors to obtain an emergency plan generation model. The plan generation module is used for determining the corresponding associated word vectors from the plurality of associated word vectors according to the demand scene of the latest demand instruction, and determining an emergency plan text according to the associated word vectors corresponding to the latest demand instruction and the emergency plan generation model.
The emergency plan digital generation method based on vector knowledge base and large model fine tuning comprises the steps of: acquiring historical emergency plan data, wherein the historical emergency plan data comprises a demand instruction text for a historical emergency event and a corresponding standard plan text; constructing a vector knowledge base according to the historical emergency plan data, wherein the vector knowledge base comprises keyword vectors obtained by converting text content; identifying words that have different semantics in a plurality of demand scenes and constructing, according to the vector knowledge base, a plurality of associated word vectors corresponding to the different demand scenes, wherein the associated word vectors are vector representations obtained by performing scene-based semantic adjustment on the keyword vectors in the vector knowledge base; performing adjustment training on a preset language model according to the historical emergency plan data and the plurality of associated word vectors, obtaining an emer