CN-121980269-A - Automatic labeling method, system and related equipment for vehicle diagnostic data

Abstract

The application discloses an automatic labeling method, system and related equipment for vehicle diagnostic data. The method comprises: receiving multi-source heterogeneous vehicle diagnostic data; performing multimodal feature fusion and automatic labeling with a large language model optimized for the vehicle diagnosis field to obtain a first labeling result; performing quality evaluation on the first labeling result to generate an evaluation result comprising an integrity score and a data value rating; when the evaluation result shows that the first labeling result is of high value but incomplete, executing a data modification task on the first labeling result to obtain a second labeling result with complete information; storing the second labeling result in a vector knowledge base according to vehicle attributes; and continuously optimizing the labeling model and the quality evaluation logic based on feedback data generated from the first labeling result and the second labeling result. The application achieves automatic labeling of vehicle diagnostic data and provides a sound data basis for diagnosing vehicle faults.

Inventors

  • LIU XIN
  • YANG MINGZHAO

Assignees

  • 深圳市元征科技股份有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-01-29

Claims (10)

  1. An automatic labeling method for vehicle diagnostic data, comprising: acquiring vehicle diagnostic data, the vehicle diagnostic data including text data and image data; inputting the vehicle diagnostic data into a labeling model, wherein the labeling model fuses the text data and the image data to output a first labeling result, and the labeling model is a large language model optimized with data from the vehicle diagnosis field; performing quality evaluation on the first labeling result based on the labeling model, and generating an evaluation result of the data quality of the first labeling result, wherein the evaluation result comprises first data quality and second data quality, the first data quality indicating that the data in the first labeling result is complete, and the second data quality indicating that the data in the first labeling result is wrong or missing; determining the first labeling result of the first data quality as a second labeling result, wherein the second labeling result contains vehicle attribute information; performing data modification on the first labeling result of the second data quality, re-inputting the modified first labeling result into the labeling model for quality evaluation, and iteratively updating the first labeling result until it is determined to be the second labeling result; classifying the second labeling result based on the vehicle attribute information, and storing the second labeling result in a vector database corresponding to the vehicle attribute information; and determining feedback data from the first labeling result and the second labeling result, and performing incremental training on the labeling model based on the feedback data to optimize the labeling performance of the labeling model.
  2. The method of claim 1, wherein the labeling model aligns and fuses visual features of the image data with semantic features of the text data via a cross-modal attention mechanism, so as to enable joint analysis of multimodal data.
  3. The method of claim 1, wherein performing quality evaluation on the first labeling result based on the labeling model and generating an evaluation result of the data quality of the first labeling result comprises: the labeling model quantitatively scoring the first labeling result along three dimensions, namely integrity, accuracy and value degree, to generate the evaluation result, wherein the integrity indicates how fully the fields in the first labeling result are filled in, the accuracy indicates whether the numerical logic in the first labeling result is reasonable, and the value degree indicates the novelty and complexity of the fault case in the first labeling result.
  4. The method of claim 3, wherein the labeling model quantitatively scoring the first labeling result along the three dimensions of integrity, accuracy and value degree to generate the evaluation result comprises: the labeling model determining a first labeling result whose integrity score exceeds a first threshold, whose accuracy score exceeds a second threshold and whose value score exceeds a third threshold to be of the first data quality; and the labeling model determining a first labeling result whose value score exceeds the third threshold and whose accuracy score does not exceed the second threshold, or whose value score exceeds the third threshold and whose integrity score does not exceed the first threshold, to be of the second data quality.
  5. The method of claim 1, wherein performing data modification on the first labeling result of the second data quality comprises: pushing the first labeling result of the second data quality to a labeling platform for data modification, and sending an interaction prompt to diagnostic equipment for data modification.
  6. The method of claim 1, wherein classifying the second labeling result based on the vehicle attribute information comprises: parsing a brand field, a vehicle type field, an affiliated system field and a fault code field in the vehicle attribute information, and determining the vector database according to the combination of parsed fields, wherein the vector databases are established in advance according to combinations of the vehicle attribute information.
  7. The method of claim 1, wherein the feedback data comprises training sample pairs in which the first labeling result is matched with the second labeling result.
  8. An automatic labeling system for vehicle diagnostic data, comprising: a data access module for receiving and preprocessing vehicle diagnostic data; a data labeling and evaluation module comprising a labeling model for automatically labeling the vehicle diagnostic data, evaluating the quality of the generated first labeling result and generating an evaluation result, wherein the labeling model is a large language model optimized with data from the vehicle diagnosis field; a closed-loop supplementary scheduling module comprising a task priority queue management unit and a notification unit, wherein the task priority queue management unit decides on and triggers a data modification flow according to the evaluation result to generate a second labeling result, and the notification unit generates prompt information and sends it to diagnostic equipment; a database management module configured with a routing rule table, for storing the second labeling result in the corresponding vector database according to the vehicle attribute information extracted from the second labeling result; and a model optimization module for collecting feedback data during operation of the automatic labeling system and iteratively updating the labeling model using the feedback data.
  9. A computer device comprising one or more memories and one or more processors, the memories being coupled to the one or more processors and storing computer program code comprising computer instructions which, when invoked by the one or more processors, cause the computer device to implement the method of any one of claims 1 to 7.
  10. A computer-readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the method of any one of claims 1 to 7.
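
The decision rule of claim 4 and the attribute-based routing of claim 6 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the threshold values, the field names, the `UNDEFINED` outcome and the contents of `ROUTING_TABLE` are assumptions made for illustration, since the claims name three thresholds and a routing rule table without giving values.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Quality(Enum):
    FIRST = "first data quality"    # complete: accepted as a second labeling result
    SECOND = "second data quality"  # valuable but wrong or missing: sent for modification
    UNDEFINED = "undefined"         # value score too low: claim 4 does not cover this case


@dataclass(frozen=True)
class Scores:
    integrity: float  # how fully the fields are filled in
    accuracy: float   # whether the numerical logic is reasonable
    value: float      # novelty and complexity of the fault case


# Hypothetical thresholds; the claims do not state numeric values.
FIRST_THRESHOLD = 0.8   # integrity
SECOND_THRESHOLD = 0.8  # accuracy
THIRD_THRESHOLD = 0.5   # value


def classify(scores: Scores) -> Quality:
    """Apply the two rules of claim 4 to one set of scores."""
    if scores.value <= THIRD_THRESHOLD:
        return Quality.UNDEFINED
    if scores.integrity > FIRST_THRESHOLD and scores.accuracy > SECOND_THRESHOLD:
        return Quality.FIRST
    # Value exceeds its threshold, but integrity or accuracy does not.
    return Quality.SECOND


# Hypothetical routing rule table: one pre-established vector database per
# combination of (brand, vehicle type, affiliated system, fault code).
ROUTING_TABLE = {
    ("BrandA", "ModelX", "engine", "P0300"): "vecdb_branda_engine",
}


def route(attrs: dict) -> Optional[str]:
    """Return the vector database name for the parsed field combination, if any."""
    key = (attrs["brand"], attrs["vehicle_type"], attrs["system"], attrs["fault_code"])
    return ROUTING_TABLE.get(key)
```

Under these assumed thresholds, `classify(Scores(0.5, 0.9, 0.7))` yields `Quality.SECOND`, because the case is valuable but incomplete. `route` returns `None` for a field combination that has no pre-established database, a situation the claims leave unspecified.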

Description

Automatic labeling method, system and related equipment for vehicle diagnostic data

Technical Field

The application relates to the technical field of artificial intelligence and data management, and in particular to an automatic labeling method, an automatic labeling system and related equipment for vehicle diagnostic data.

Background

With the rapid development of connected vehicles and AI-based diagnostic technology, high-quality, large-scale structured diagnostic data has become a key foundation for training high-performance large diagnostic models. However, the labeling and processing of vehicle diagnostic data still depends largely on manual work, which creates a significant bottleneck. Vehicle diagnostic data labeling currently has the following problems:

  1. Data labeling relies on manual work. Engineers must manually extract fields such as fault codes, symptoms and solutions. The process is time-consuming and labor-intensive, severely limits the efficiency and scale of data production, and has become a main bottleneck of model development.
  2. Existing vehicle diagnostic data processing flows cannot automatically evaluate the integrity and accuracy of labeling results. The system cannot identify missing key information (for example, a fault code with no corresponding repair scheme) and therefore cannot actively trigger a supplementing flow. As a result, low-quality, incomplete data silently flows into the knowledge base, pollutes the training data set and reduces the reliability of the final model.
  3. Multi-source heterogeneous data cannot be fused effectively. Diagnosis involves text reports, on-site pictures, data stream waveforms, technical bulletins and the like, yet existing schemes are mostly limited to processing text alone and struggle to support multi-dimensional reasoning about complex faults.
  4. The system lacks iterative optimization capability. Existing data processing flows are mostly open-loop systems: once the labeling model is deployed, its performance and the system knowledge base tend to solidify and cannot adapt to the rapid iteration of vehicle technology or to new fault modes.

Therefore, how to construct a vehicle diagnostic data labeling method and system that can automatically integrate multi-source information, automatically generate labeling results, control the quality of data labeling and continuously optimize its own capability has become a core technical problem to be solved in the art.

Disclosure of Invention

The application provides an automatic labeling method, system and related equipment for vehicle diagnostic data. The method automatically labels vehicle diagnostic data based on a labeling model, performs quality evaluation on the labeling result, and iteratively supplements or modifies the labeling result until it is high-quality, complete data. At the same time, the labeling model is optimized accordingly.
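
The evaluate-and-modify loop summarized above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: `close_loop`, `is_complete`, `modify` and the `REQUIRED` field list are hypothetical names, the evaluator and modifier are simple stand-ins for the labeling model and the labeling platform, and `max_rounds` is a safety cap added here that the source does not mention.

```python
from typing import Callable

Label = dict  # a labeling result, e.g. {"fault_code": ..., "symptom": ..., "solution": ...}


def close_loop(
    first_result: Label,
    is_complete: Callable[[Label], bool],
    modify: Callable[[Label], Label],
    max_rounds: int = 5,  # assumption: the source iterates until acceptance, with no cap
) -> tuple[Label, list[tuple[Label, Label]]]:
    """Re-evaluate and modify a labeling result until it passes evaluation.

    Returns the accepted (second) labeling result together with the
    (before, after) pairs collected on the way, which could serve as
    feedback data for incremental training.
    """
    current = first_result
    feedback: list[tuple[Label, Label]] = []
    for _ in range(max_rounds):
        if is_complete(current):
            return current, feedback
        revised = modify(current)
        feedback.append((current, revised))
        current = revised
    if is_complete(current):
        return current, feedback
    raise RuntimeError("labeling result still incomplete after max_rounds")


# Hypothetical stand-ins for the quality evaluation and the modification step.
REQUIRED = ("fault_code", "symptom", "solution")


def is_complete(label: Label) -> bool:
    # Treats a result as complete when every required field is filled in.
    return all(label.get(field) for field in REQUIRED)


def modify(label: Label) -> Label:
    # Illustrative value only; in practice the missing field would be supplied
    # through a labeling platform or a prompt to the diagnostic equipment.
    return {**label, "solution": "replace ignition coil"}
```

Calling `close_loop({"fault_code": "P0300", "symptom": "misfire", "solution": None}, is_complete, modify)` mirrors the example from the Background of a fault code with no repair scheme: the result is rejected once, modified, and accepted on the next evaluation, leaving one (before, after) pair as feedback.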
The technical scheme is as follows: Vehicle diagnostic data is acquired, the vehicle diagnostic data including text data and image data. The vehicle diagnostic data is input into a labeling model, which fuses and analyzes the text data and the image data and outputs a first labeling result; the labeling model is a large language model optimized with data from the vehicle diagnosis field. Quality evaluation is performed on the first labeling result by the labeling model to generate an evaluation result of the data quality of the first labeling result. The evaluation result comprises first data quality and second data quality: the first data quality indicates that the data in the first labeling result is complete, and the second data quality indicates that the data in the first labeling result is wrong or missing. A first labeling result of the first data quality is determined to be a second labeling result, which contains vehicle attribute information. A first labeling result of the second data quality undergoes data modification and is then input into the labeling model again for quality evaluation, and the first labeling result is iteratively updated until it is determined to be the second labeling result. The second labeling result is classified based on the vehicle attribute information and stored in the vector database corresponding to the vehicle attribute information. Feedback data is determined from the first labeling result and the second labeling result, and the dat