CN-118797031-B - Method for selecting a model for scoring subjective questions in a vertical domain, and method for scoring subjective questions in a vertical domain

Abstract

The embodiments of the disclosure provide a method for selecting a model for scoring subjective questions in a vertical domain and a method for scoring subjective questions in a vertical domain. The selection method includes: constructing a plurality of scoring prompt templates for a vertical domain subjective question; constructing model inputs based on each scoring prompt template; processing the model inputs with the large language models to be selected to obtain single-angle scores for the answers; sorting the single-angle scores corresponding to the same scoring prompt template by magnitude; determining weights based on the sorting order; computing, for each large language model to be selected, a weighted sum of its single-angle scores according to those weights to obtain its multi-angle score; and selecting a preset number of large language models to be selected that have the largest or smallest multi-angle scores as the model for scoring vertical domain subjective questions. With the scheme of the embodiments of the disclosure, the model selected for scoring vertical domain subjective questions is a more reasonable choice for real-world use.

Inventors

REN XINGYU
LIU HAIBO
WANG FANG

Assignees

北京智谱华章科技有限公司

Dates

Publication Date: 20260505
Application Date: 20240808

Claims (10)

1. A method for selecting a model for vertical domain subjective question scoring, comprising: constructing a plurality of scoring prompt templates for a vertical domain subjective question, and constructing model inputs based on the scoring prompt templates, the vertical domain subjective question and an answer to the vertical domain subjective question, wherein each scoring prompt template scores the answer from a different prompt angle, and the prompt angle of each scoring prompt template is determined according to the experience of a vertical domain expert; processing the model inputs with each large language model to be selected to obtain single-angle scores of that large language model for the answer under each scoring prompt template; sorting by magnitude the single-angle scores that the large language models to be selected obtain from model inputs containing the same scoring prompt template, determining the sorting order, and determining, based on the sorting order, the weight of each single-angle score corresponding to that scoring prompt template; computing, for each large language model to be selected, a weighted sum of its single-angle scores according to the weights to obtain its multi-angle score; and, according to the correlation between the single-angle score and the answer quality of the vertical domain subjective question, selecting the large language model to be selected with the largest or smallest multi-angle score as the model for vertical domain subjective question scoring, wherein if the single-angle score is positively correlated with the answer quality of the vertical domain subjective question, the large language model to be selected with the largest multi-angle score is used as the model for vertical domain subjective question scoring, and if the single-angle score is inversely correlated with the answer quality of the vertical domain subjective question, the large language model to be selected with the smallest multi-angle score is used as the model for vertical domain subjective question scoring.
2. The selection method according to claim 1, wherein constructing a plurality of scoring prompt templates for a vertical domain subjective question comprises: determining, based on expert experience, a plurality of prompt angles for the vertical domain subjective question, scoring dimensions under each prompt angle, and scoring criteria under each scoring dimension; and constructing the plurality of scoring prompt templates based on the prompt angles, the corresponding scoring dimensions, and the scoring criteria under the scoring dimensions.
3. The selection method according to claim 2, wherein the plurality of prompt angles determined for the vertical domain subjective question based on expert experience include at least two of an integrity angle, an accuracy angle, and a utility angle.
4. The selection method according to claim 2, further comprising: obtaining a reference answer to the vertical domain subjective question; and adding the reference answer to each scoring prompt template in the process of constructing the scoring prompt templates.
5. The selection method according to any one of claims 1 to 4, wherein the number of vertical domain subjective questions is at least two; constructing model inputs based on the scoring prompt templates, the vertical domain subjective questions and the answers to the vertical domain subjective questions comprises: constructing a separate model input from each scoring prompt template, each vertical domain subjective question and the answer to that question, so that each model input contains only one vertical domain subjective question and the corresponding answer; or constructing a model input from each scoring prompt template, all the vertical domain subjective questions and the answers to each of them, so that the model input contains all the vertical domain subjective questions, the corresponding answers, and the association between each question and its answer; and processing the model inputs with the large language model to be selected to obtain its single-angle scores for the answers under each scoring prompt template comprises: processing the model inputs with the large language model to be selected to obtain its scores for each vertical domain subjective question and corresponding answer under each scoring prompt template; and computing the average or sum of all scores obtained by each large language model to be selected under a given scoring prompt template, and taking the average or sum as the corresponding single-angle score.
6. The selection method according to any one of claims 1 to 4, wherein determining, based on the sorting order, the weight of each single-angle score corresponding to the same scoring prompt template comprises: determining, based on the sorting order, a weight amplification coefficient or a weight reduction coefficient for each single-angle score corresponding to the same model input; and determining the weight of the single-angle score at each position in the sorting order based on the weight amplification coefficient or the weight reduction coefficient and the reference weight of each scoring prompt template.
7. The selection method according to claim 6, further comprising, before determining the weights of the single-angle scores of the same scoring prompt template based on the sorting order: determining the relative importance of each scoring prompt template according to expert experience, and constructing a comparison matrix based on the relative importance; performing column normalization on the comparison matrix to obtain a normalized matrix; and computing, for each scoring prompt template, the average of its corresponding matrix elements in the normalized matrix, and taking the average as the weight of the corresponding single-angle score.
8. A method for scoring a subjective question in a vertical domain, characterized by applying the model for vertical domain subjective question scoring selected according to any one of claims 1 to 7, comprising: constructing model inputs based on a plurality of pre-constructed scoring prompt templates, a vertical domain subjective question and an answer to be scored for the vertical domain subjective question, wherein each scoring prompt template scores the answer from a different prompt angle, and the prompt angle of each scoring prompt template is determined according to the experience of a vertical domain expert; processing each model input with the preselected model for vertical domain subjective question scoring to obtain a corresponding single-angle score; and computing a weighted sum of the single-angle scores with their corresponding weights to obtain a multi-angle score, and taking the multi-angle score as the score for the answer to be scored.
9. A computing device comprising a processor and a memory for storing a computer program which, when loaded by the processor, causes the processor to perform the method for selecting a model for vertical domain subjective question scoring as claimed in any one of claims 1 to 7 and/or the method for scoring a subjective question in a vertical domain as claimed in claim 8.
10. A computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the method for selecting a model for vertical domain subjective question scoring as claimed in any one of claims 1 to 7 and/or the method for scoring a subjective question in a vertical domain as claimed in claim 8.
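
Illustrative note (not part of the claims): the comparison-matrix step recited in claim 7 can be sketched as follows. This is a minimal sketch under stated assumptions. The function name `template_weights`, the three angle names and the pairwise importance values are hypothetical, and "the average of its corresponding matrix elements" is assumed here to mean the row average of the column-normalized matrix.

```python
from typing import List

def template_weights(comparison: List[List[float]]) -> List[float]:
    """Column-normalize a pairwise comparison matrix, then average each row.

    comparison[i][j] states how much more important template i is than
    template j (assumed reciprocal: comparison[j][i] == 1 / comparison[i][j]).
    """
    n = len(comparison)
    # Sum of each column, used as the normalizing constant for that column.
    col_sums = [sum(comparison[i][j] for i in range(n)) for j in range(n)]
    # Column normalization: every column of the normalized matrix sums to 1.
    normalized = [
        [comparison[i][j] / col_sums[j] for j in range(n)] for i in range(n)
    ]
    # Row average: one weight per scoring prompt template.
    return [sum(row) / n for row in normalized]

# Hypothetical expert judgment over three prompt angles, in the order
# accuracy, integrity, utility: accuracy is twice as important as
# integrity and four times as important as utility.
comparison_matrix = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = template_weights(comparison_matrix)
print(weights)
```

Because every column of the normalized matrix sums to 1, the resulting weights also sum to 1, so they can serve directly as per-template weights in a weighted sum.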

Description

Method for selecting a model for scoring subjective questions in a vertical domain, and method for scoring subjective questions in a vertical domain

Technical Field

The disclosure relates to the technical field of text processing, and in particular to a method for selecting a model for scoring vertical domain subjective questions and a method for scoring vertical domain subjective questions.

Background

In order to reduce the influence of expert evaluation bias on the scoring of subjective questions in a specific vertical domain, and to improve the fairness, accuracy and efficiency of scoring, the related art provides schemes that use a large language model to score subjective questions in a specific vertical domain. A precondition for using a large language model to score vertical domain subjective questions is to reasonably evaluate the strengths and weaknesses of large language models against expert scoring standards and to select a large language model with better scoring quality. However, when existing model selection and evaluation methods are compared against manual evaluation, the correlation between their scoring results and the manual evaluation results is low, which in turn indicates that these selection and evaluation methods are unreasonable.

Disclosure of Invention

In order to solve the problem that existing scoring models are selected unreasonably, the embodiments of the disclosure provide a novel method for selecting a model for scoring vertical domain subjective questions and a method for scoring vertical domain subjective questions.
In a first aspect, an embodiment of the present disclosure provides a method for selecting a model for scoring a subjective question in a vertical domain, including: constructing a plurality of scoring prompt templates for a vertical domain subjective question, and constructing model inputs based on each scoring prompt template, the vertical domain subjective question and an answer to the vertical domain subjective question, wherein each scoring prompt template prompts scoring of the answer from a specific prompt angle, the prompt angle of each scoring prompt template is determined according to the experience of a vertical domain expert, and the prompt angles of the scoring prompt templates differ from one another; processing the model inputs with each large language model to be selected to obtain single-angle scores of that large language model for the answer under each scoring prompt template; sorting by magnitude the single-angle scores that the large language models to be selected obtain from model inputs containing the same scoring prompt template, determining the sorting order, and determining, based on the sorting order, the weight of each single-angle score corresponding to that scoring prompt template; computing, for each large language model to be selected, a weighted sum of its single-angle scores according to the weights to obtain its multi-angle score; and selecting, according to the correlation between the single-angle scores and the answer quality of the vertical domain subjective question, a preset number of large language models to be selected that have the largest or smallest multi-angle scores as models for scoring vertical domain subjective questions.
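The ranking, weighting and selection steps of the first aspect can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the model names, scores and reference weights are made up, the function names are hypothetical, and the rank-based coefficient is assumed to vary linearly from an amplification of `1 + spread` for the largest score to a reduction of `1 - spread` for the smallest. The source only says the weights depend on the sorting order.

```python
from typing import Dict, List

def rank_coefficients(n: int, spread: float = 0.2) -> List[float]:
    """Coefficient per rank: rank 0 (largest score) is amplified, the last
    rank is reduced. The linear form is an assumption for illustration."""
    if n == 1:
        return [1.0]
    return [1.0 + spread - 2.0 * spread * r / (n - 1) for r in range(n)]

def multi_angle_scores(
    single: Dict[str, Dict[str, float]],
    reference_weights: Dict[str, float],
    spread: float = 0.2,
) -> Dict[str, float]:
    """single[model][template] is a single-angle score. For each template,
    rank the models by score, scale the template's reference weight by the
    rank coefficient, and accumulate the weighted score per model."""
    models = list(single)
    totals = {m: 0.0 for m in models}
    for template, base in reference_weights.items():
        ordered = sorted(models, key=lambda m: single[m][template], reverse=True)
        coeffs = rank_coefficients(len(ordered), spread)
        for rank, model in enumerate(ordered):
            totals[model] += base * coeffs[rank] * single[model][template]
    return totals

def select_models(
    totals: Dict[str, float], k: int = 1, higher_is_better: bool = True
) -> List[str]:
    """Pick the k models with the largest multi-angle scores (positive
    correlation with answer quality) or the smallest (inverse correlation)."""
    return sorted(totals, key=lambda m: totals[m], reverse=higher_is_better)[:k]

# Hypothetical single-angle scores for three candidate models.
single_scores = {
    "model_a": {"integrity": 8.0, "accuracy": 6.0, "utility": 7.0},
    "model_b": {"integrity": 7.0, "accuracy": 9.0, "utility": 6.0},
    "model_c": {"integrity": 5.0, "accuracy": 7.0, "utility": 8.0},
}
reference_weights = {"integrity": 0.3, "accuracy": 0.5, "utility": 0.2}

totals = multi_angle_scores(single_scores, reference_weights)
print(select_models(totals, k=1))
```

With these made-up numbers, `model_b` ranks first on the most heavily weighted angle and therefore ends up with the largest multi-angle score. Passing `higher_is_better=False` covers the inversely correlated case described in the text.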
Optionally, constructing a plurality of scoring prompt templates for the vertical domain subjective question includes: determining, based on expert experience, a plurality of prompt angles for the vertical domain subjective question, scoring dimensions under each prompt angle, and scoring criteria under each scoring dimension; and constructing the plurality of scoring prompt templates based on the prompt angles, the corresponding scoring dimensions, and the scoring criteria under the scoring dimensions.
Optionally, the plurality of prompt angles determined for the vertical domain subjective question based on expert experience include at least two of an integrity angle, an accuracy angle, and a utility angle.
Optionally, the method further includes: obtaining a reference answer to the vertical domain subjective question; and adding the reference answer to each scoring prompt template in the process of constructing the scoring prompt templates.
Optionally, the number of vertical domain subjective questions is at least two; constructing model inputs based on each scoring prompt template, the vertical domain subjective questions and the answers to the vertical domain subjective questions includes: constructing a separate model input from each scoring prompt template, each vertical domain subjective question and the answer to that question, or constructing model input based on each scoring prompt template,