CN-122022008-A - Method, device and equipment for processing project data based on large model


Abstract

The embodiments of the present application provide a method, a device and equipment for processing project data based on a large model. The method comprises: obtaining initial project data and prediction requirements of a target project; correcting the initial project data based on a large model to obtain project data to be processed, wherein the project data to be processed comprises static attribute fields and dynamic attribute fields; adjusting a preset initial prediction model based on the large model, the project data to be processed and the prediction requirements to obtain a multi-source feature fusion prediction model matched with the prediction requirements; extracting, based on the multi-source feature fusion prediction model and the prediction requirements, fusion feature vectors matched with the prediction requirements from the static attribute fields and the dynamic attribute fields of the project data to be processed; and processing the fusion feature vectors based on the multi-source feature fusion prediction model and the prediction requirements to obtain a prediction result corresponding to the prediction requirements. The method improves project data processing efficiency and reduces model development cost.

Inventors

LEI XIAOCHEN
REN FAN
LU YIRUI
YAO FUMING
BAI WEIGANG

Assignees

上海远图未来信息技术有限公司

Dates

Publication Date: 20260512
Application Date: 20251226

Claims (14)

1. A method for processing project data based on a large model, the method comprising: acquiring initial project data and prediction requirements of a target project; correcting the initial project data based on a large model to obtain project data to be processed, wherein the project data to be processed comprises static attribute fields and dynamic attribute fields; adjusting, based on the large model, a preset initial prediction model according to the project data to be processed and the prediction requirements to obtain a multi-source feature fusion prediction model matched with the prediction requirements; extracting, based on the multi-source feature fusion prediction model and the prediction requirements, a fusion feature vector matched with the prediction requirements from the static attribute fields and the dynamic attribute fields of the project data to be processed; and processing the fusion feature vector based on the multi-source feature fusion prediction model and the prediction requirements to obtain a prediction result corresponding to the prediction requirements.
2. The method according to claim 1, wherein the correcting the initial project data based on the large model to obtain project data to be processed comprises: classifying field attributes of fields in the initial project data based on the large model to obtain project data to be corrected, wherein each field in the project data to be corrected has a field attribute, and the field attributes comprise static attribute fields and dynamic attribute fields; determining, based on the large model, the problem type of a field to be corrected in the project data to be corrected; determining a correction mode corresponding to the field to be corrected based on a preset knowledge base, the problem type of the field to be corrected and the field attribute of the field to be corrected, wherein the preset knowledge base comprises correction modes corresponding to the problem types of different field attributes; and correcting the field to be corrected according to the correction mode corresponding to the field to be corrected, based on a preset tool chain in the large model, to obtain the project data to be processed, wherein data correction logic is preset in the preset tool chain.
3. The method according to claim 2, wherein the classifying field attributes of fields in the initial project data based on the large model to obtain project data to be corrected comprises: for a field to be classified in the initial project data, constructing a query vector based on field information of the field to be classified; determining, according to the query vector, association information matched with the field to be classified from the preset knowledge base; and determining, based on the large model, the field attribute of the field to be classified according to the association information, wherein the project data to be corrected is the initial project data after field attribute classification.
4. The method according to claim 1, wherein the extracting, based on the multi-source feature fusion prediction model and the prediction requirements, a fusion feature vector matched with the prediction requirements from the static attribute fields and the dynamic attribute fields of the project data to be processed comprises: performing feature vectorization on the dynamic attribute fields of the project data to be processed based on a dynamic feature selection module in the multi-source feature fusion prediction model to obtain initial dynamic attribute feature vectors corresponding to the dynamic attribute fields; screening, based on the dynamic feature selection module, the initial dynamic attribute feature vectors according to the prediction requirements to obtain a dynamic attribute feature vector; extracting a static attribute feature vector from the static attribute fields of the project data to be processed based on a static attribute injection module in the multi-source feature fusion prediction model, wherein the static attribute feature vector has the same dimension as the dynamic attribute feature vector; and concatenating, based on a coding module in the multi-source feature fusion prediction model, the static attribute feature vector and the dynamic attribute feature vector to obtain the fusion feature vector.
5. The method according to claim 4, wherein the dynamic attribute fields comprise a dynamic history attribute field and a dynamic future-known attribute field, and wherein the screening, based on the dynamic feature selection module, the initial dynamic attribute feature vectors according to the prediction requirements to obtain a dynamic attribute feature vector comprises: calculating, based on the dynamic feature selection module and using an attention mechanism, an importance weight of each initial dynamic attribute feature vector with respect to the prediction requirements, wherein the initial dynamic attribute feature vectors corresponding to the dynamic history attribute field and the initial dynamic attribute feature vectors corresponding to the dynamic future-known attribute field are given preference in the weight calculation; and screening the initial dynamic attribute feature vectors based on the importance weights to obtain the dynamic attribute feature vector.
6. The method according to claim 4, wherein the static attribute fields comprise a categorical static attribute field and a numerical static attribute field, and wherein the extracting a static attribute feature vector from the static attribute fields of the project data to be processed based on the static attribute injection module in the multi-source feature fusion prediction model comprises: performing, based on the static attribute injection module, embedding conversion on the categorical static attribute field in the project data to be processed to obtain a first vector; normalizing, based on the static attribute injection module, the numerical static attribute field in the project data to be processed, and applying a linear transformation to the normalized numerical static attribute field to obtain a second vector; and obtaining the static attribute feature vector based on the first vector and the second vector.
7. The method according to claim 1, wherein the fusion feature vector comprises a dynamic attribute feature vector and a static attribute feature vector, and wherein the processing the fusion feature vector based on the multi-source feature fusion prediction model and the prediction requirements to obtain a prediction result corresponding to the prediction requirements comprises: encoding, based on a long sequence modeling module in the multi-source feature fusion prediction model and using a preset gating unit network, a subset of the feature vectors in the fusion feature vector time step by time step to obtain a coding feature matrix, wherein the subset of the feature vectors represents dynamic attribute feature vectors whose time series length is greater than a preset threshold; performing, based on a dual-axis attention module in the multi-source feature fusion prediction model, attention feature fusion on the coding feature matrix according to the prediction requirements to obtain a dual-axis fusion feature matrix; modulating, based on a modulation module in the multi-source feature fusion prediction model, the dual-axis fusion feature matrix according to the prediction requirements to obtain a modulated feature matrix; and performing prediction on the modulated feature matrix based on a prediction head in the multi-source feature fusion prediction model to obtain the prediction result corresponding to the prediction requirements.
8. The method according to claim 7, wherein the performing, based on the dual-axis attention module in the multi-source feature fusion prediction model, attention feature fusion on the coding feature matrix according to the prediction requirements to obtain a dual-axis fusion feature matrix comprises: performing decoding preprocessing on the coding feature matrix to obtain a preprocessed coding feature matrix; determining a time axis attention weight based on the time granularity represented by the prediction requirements and the preprocessed coding feature matrix, wherein the time axis attention weight represents the weights among the dynamic attribute feature vectors of different time steps in the preprocessed coding feature matrix; weighting the preprocessed coding feature matrix based on the time axis attention weight to obtain a time axis attention output feature matrix; determining a variable axis attention weight based on the preprocessed coding feature matrix, wherein the variable axis attention weight represents the weights among different feature vectors in the preprocessed coding feature matrix; weighting the preprocessed coding feature matrix based on the variable axis attention weight to obtain a variable axis attention output feature matrix; and performing weighted fusion on the time axis attention output feature matrix and the variable axis attention output feature matrix to obtain the dual-axis fusion feature matrix.
9. The method according to claim 7, wherein the modulating, based on the modulation module in the multi-source feature fusion prediction model, the dual-axis fusion feature matrix according to the prediction requirements to obtain a modulated feature matrix comprises: screening the feature channels of the dual-axis fusion feature matrix based on a gated residual network in the modulation module to obtain a screened feature matrix; determining, based on a scene tag characterized by the prediction requirements, a scaling coefficient and an offset coefficient corresponding to the scene tag; and modulating, based on a channel-wise linear modulation module in the modulation module, the screened feature matrix according to the scaling coefficient and the offset coefficient to obtain the modulated feature matrix.
10. The method according to any one of claims 1 to 9, wherein the adjusting, based on the large model, a preset initial prediction model according to the project data to be processed and the prediction requirements to obtain a multi-source feature fusion prediction model matched with the prediction requirements comprises: training the initial prediction model based on the project data to be processed and the prediction requirements to obtain an intermediate prediction model; determining a state vector of the intermediate prediction model based on the large model, wherein the state vector characterizes model accuracy of an initial prediction model; if it is determined that the state vector does not meet a preset condition, adjusting, based on the large model, parameters of the intermediate prediction model according to the state vector until the model accuracy of the intermediate prediction model meets the preset condition; wherein the intermediate prediction model obtained when the preset condition is met is the multi-source feature fusion prediction model.
11. The method according to claim 10, wherein the state vector comprises one or more of: training loss curve morphology, gradient stability, model output deviation structure information, data state change information, and a scene description tag; wherein the training loss curve morphology represents the shape of the loss function curve of the intermediate prediction model, the gradient stability represents the gradient norm of the intermediate prediction model, the model output deviation structure information represents the distribution characteristics of the deviation between the prediction results of the intermediate prediction model and the true values, the data state change information represents the distribution characteristics of the project data to be processed, and the scene description tag represents a scene identifier corresponding to the prediction requirements.
12. A large model-based project data processing apparatus, comprising: an acquisition module, configured to acquire initial project data and prediction requirements of a target project; a correction module, configured to correct the initial project data based on a large model to obtain project data to be processed, wherein the project data to be processed comprises static attribute fields and dynamic attribute fields; a training module, configured to adjust, based on the large model, a preset initial prediction model according to the project data to be processed and the prediction requirements to obtain a multi-source feature fusion prediction model matched with the prediction requirements; an extraction module, configured to extract, based on the multi-source feature fusion prediction model and the prediction requirements, a fusion feature vector matched with the prediction requirements from the static attribute fields and the dynamic attribute fields of the project data to be processed; and a prediction module, configured to process the fusion feature vector based on the multi-source feature fusion prediction model and the prediction requirements to obtain a prediction result corresponding to the prediction requirements.
13. An electronic device, comprising a memory and a processor; wherein the memory stores computer-executable instructions; and the processor executes the computer-executable instructions stored in the memory, causing the processor to perform the method of any one of claims 1-11.
14. A computer-readable storage medium having stored therein computer-executable instructions which, when executed by a processor, carry out the method of any one of claims 1-11.
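
As an illustration of the time-axis and variable-axis weighting recited in claim 8, the following is a minimal sketch only, not the claimed implementation. It assumes plain scaled dot-product self-attention without learned projections, which the source does not specify; it ignores the time granularity of the prediction requirements; and the names `self_attention`, `dual_axis_fuse` and the fusion weight `alpha` are hypothetical. Rows of the input matrix are time steps and columns are variables.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows = time steps, columns = variables


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x: Matrix) -> Matrix:
    """Assumed stand-in: scaled dot-product self-attention over the rows of x."""
    d = len(x[0])
    scores = matmul(x, transpose(x))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, x)


def dual_axis_fuse(h: Matrix, alpha: float = 0.5) -> Matrix:
    """Weighted fusion of time-axis and variable-axis attention outputs."""
    time_out = self_attention(h)                       # attends across time steps
    var_out = transpose(self_attention(transpose(h)))  # attends across variables
    return [
        [alpha * t + (1.0 - alpha) * v for t, v in zip(t_row, v_row)]
        for t_row, v_row in zip(time_out, var_out)
    ]


# Toy input: 3 time steps, 2 variables.
fused = dual_axis_fuse([[1.0, 2.0], [2.0, 1.0], [0.5, 0.5]])
```

The output keeps the shape of the input matrix, so a later modulation step could operate on it channel by channel.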

Description

Method, device and equipment for processing project data based on large model

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, and a device for processing project data based on a large model.

Background

In the field of software development, time series prediction based on project data is a core support for keeping projects moving efficiently and allocating resources accurately. Data generated during project operation, such as man-hour consumption, resource investment and progress, show pronounced time series characteristics, and key information such as man-hour requirements, project schedule nodes and resource gaps in a future period can be obtained in advance through time series prediction.

In the prior art, time series prediction based on project data usually involves manually cleaning the project data and then running predictions on the cleaned data with various deep learning models.

However, manual data cleaning is inefficient, and adapting to multiple prediction requirements of the same project requires deploying a plurality of models. This increases model development and training cost, makes it difficult to fully exploit the inherent associations within the data of a single project, and results in poor consistency across the prediction results for different requirements and insufficient adaptability to different scenes.

Disclosure of Invention

The embodiments of the present application provide a method, a device and equipment for processing project data based on a large model, which improve project data processing efficiency and reduce model development cost.
In a first aspect, an embodiment of the present application provides a method for processing project data based on a large model, including: acquiring initial project data and prediction requirements of a target project; correcting the initial project data based on a large model to obtain project data to be processed, wherein the project data to be processed comprises static attribute fields and dynamic attribute fields; adjusting, based on the large model, a preset initial prediction model according to the project data to be processed and the prediction requirements to obtain a multi-source feature fusion prediction model matched with the prediction requirements; extracting, based on the multi-source feature fusion prediction model and the prediction requirements, a fusion feature vector matched with the prediction requirements from the static attribute fields and the dynamic attribute fields of the project data to be processed; and processing the fusion feature vector based on the multi-source feature fusion prediction model and the prediction requirements to obtain a prediction result corresponding to the prediction requirements.
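
The data flow of the first aspect can be pictured with the following sketch, which only wires the four processing stages together. It is a hedged illustration under stated assumptions: the `ProjectData` and `Pipeline` types, their field names, and the toy stand-in functions are all hypothetical and do not come from the source, and no real large model or prediction model is involved.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Model = Any  # placeholder type for the adjusted prediction model


@dataclass
class ProjectData:
    """Project data to be processed, split by field attribute."""
    static_fields: Dict[str, Any]
    dynamic_fields: Dict[str, List[float]]


@dataclass
class Pipeline:
    """Chains the stages in the order the first aspect lists them."""
    correct: Callable[[Dict[str, Any]], ProjectData]
    adjust: Callable[[ProjectData, str], Model]
    extract: Callable[[Model, ProjectData, str], List[float]]
    predict: Callable[[Model, List[float], str], Any]

    def run(self, initial_data: Dict[str, Any], requirement: str) -> Any:
        data = self.correct(initial_data)                # correction stage
        model = self.adjust(data, requirement)           # model adjustment stage
        fused = self.extract(model, data, requirement)   # fusion feature extraction
        return self.predict(model, fused, requirement)   # prediction stage


# Toy stand-ins (assumptions) so the sketch runs end to end.
pipeline = Pipeline(
    correct=lambda raw: ProjectData(
        static_fields={"team_size": raw["team_size"]},
        dynamic_fields={"hours": raw["hours"]},
    ),
    adjust=lambda data, req: {"requirement": req},
    extract=lambda model, data, req: [float(data.static_fields["team_size"])]
    + data.dynamic_fields["hours"],
    predict=lambda model, fused, req: sum(fused) / len(fused),
)

result = pipeline.run({"team_size": 4, "hours": [8.0, 6.0]}, "man_hours")
```

Passing each stage in as a callable keeps the sketch neutral about how the correction, adjustment, extraction and prediction are actually realized.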
In a possible implementation manner, the correcting the initial project data based on the large model to obtain the project data to be processed includes: classifying field attributes of fields in the initial project data based on the large model to obtain project data to be corrected, wherein each field in the project data to be corrected has a field attribute, and the field attributes comprise static attribute fields and dynamic attribute fields; determining, based on the large model, the problem type of a field to be corrected in the project data to be corrected; determining a correction mode corresponding to the field to be corrected based on a preset knowledge base, the problem type of the field to be corrected and the field attribute of the field to be corrected, wherein the preset knowledge base comprises correction modes corresponding to the problem types of different field attributes; and correcting the field to be corrected according to the correction mode corresponding to the field to be corrected, based on a preset tool chain in the large model, to obtain the project data to be processed, wherein data correction logic is preset in the preset tool chain.
In a possible implementation manner, the classifying field attributes of fields in the initial project data based on the large model to obtain the project data to be corrected includes: for a field to be classified in the initial project data, constructing a query vector based on field information of the field to be classified; determining, according to the query vector, association information matched with the field to be classified from the preset knowledge base; and determining, based on the large model, the field attribute of the field to be classified according to the association information, wherein the project data to be corrected is the initial project data after field attribute classification.
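
The knowledge base lookups described above can be sketched as follows. This is an illustrative assumption, not the disclosed implementation: the knowledge base contents, the problem type names, the correction mode names and the helper functions are hypothetical, and a nearest-neighbour cosine match stands in for the step in which the large model determines the field attribute from the retrieved association information.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


# Hypothetical knowledge base: (embedding of a known field, its field attribute).
KNOWLEDGE_BASE: List[Tuple[Vector, str]] = [
    ([1.0, 0.0], "static"),
    ([0.0, 1.0], "dynamic"),
]

# Hypothetical correction modes keyed by (field attribute, problem type).
CORRECTION_MODES: Dict[Tuple[str, str], str] = {
    ("static", "missing_value"): "fill_with_most_frequent",
    ("dynamic", "missing_value"): "interpolate",
    ("dynamic", "outlier"): "clip_to_range",
}


def classify_field(query: Vector) -> str:
    """Return the attribute of the knowledge base entry closest to the query."""
    best = max(KNOWLEDGE_BASE, key=lambda entry: cosine(query, entry[0]))
    return best[1]


def choose_correction(attribute: str, problem_type: str) -> str:
    """Look up the correction mode for a field attribute and problem type."""
    return CORRECTION_MODES[(attribute, problem_type)]


attribute = classify_field([0.1, 0.9])
mode = choose_correction(attribute, "missing_value")
```

Keying the correction modes by both field attribute and problem type mirrors the statement that the knowledge base holds correction modes for the problem types of different field attributes.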
In a possible implementation manner, the extracting, based on the multi-source feature fusion prediction model a