CN-122020086-A - Model evaluation method, device, equipment and medium
Abstract
The application relates to a model evaluation method, device, equipment, and medium. The method comprises: in response to an editing operation, obtaining model information of a first model and an evaluation question bank, wherein the evaluation question bank comprises safety evaluation questions; and in response to an evaluation start operation, displaying an evaluation result of the first model, wherein the evaluation result is obtained based on the model information of the first model and the evaluation question bank and is updated with the evaluation progress of the first model until the evaluation is completed. According to the application, evaluation of the safety dimension of a model can be initiated quickly through a very simple editing operation and a low-threshold path, and the evaluation result can be displayed and dynamically updated without waiting for the evaluation to complete, which ensures the efficiency of displaying the safety evaluation result.
Inventors
- XU RONGHUA
- ZHU CHEN
- JIN LU
Assignees
- Beijing Zitiao Network Technology Co., Ltd. (北京字跳网络技术有限公司)
Dates
- Publication Date
- 20260512
- Application Date
- 20260127
Claims (12)
- 1. A model evaluation method, comprising: in response to an editing operation, obtaining model information of a first model and an evaluation question bank, wherein the evaluation question bank comprises safety evaluation questions; and in response to an evaluation start operation, displaying an evaluation result of the first model, wherein the evaluation result is obtained based on the model information of the first model and the evaluation question bank and is updated with the evaluation progress of the first model until the evaluation is completed.
- 2. The method of claim 1, wherein obtaining the model information of the first model and the evaluation question bank in response to the editing operation comprises: displaying a first page, wherein the first page comprises an editing control and a question bank control; in response to a trigger operation on the editing control, obtaining the model information of the first model; and in response to a selection operation on the question bank control, obtaining the evaluation question bank of the first model.
- 3. The method of claim 1, wherein the model information of the first model includes at least one of a model name, a model platform, an interface address, and interface authentication information.
- 4. The method of claim 1, wherein displaying the evaluation result of the first model comprises at least one of: displaying safety overview information of the evaluation result in a first area of a second page; displaying at least one sub-result of the evaluation result in a second area of the second page, wherein different sub-results correspond to different safety defect types; displaying abnormal method information of the evaluation result in a third area of the second page; and displaying key abnormal questions of the evaluation result in a fourth area of the second page.
- 5. The method of claim 4, wherein the safety overview information comprises at least one of evaluation progress information, a safety defect result, a safety score, an overall safety level, and an anomaly overview, the overall safety level being determined based on the safety score, the safety score being determined from the safety defect result, and the safety defect result comprising the number of safety defects at each of different safety levels.
- 6. The method according to claim 4 or 5, wherein the abnormal method information comprises an abnormal method overview and an abnormal method table, the abnormal method overview being used to show the abnormal methods of the first model and comprising at least one of a rectangular proportion chart and a tree diagram, and the abnormal method table showing details of a plurality of abnormal methods in list form.
- 7. The method of claim 6, further comprising: in response to a trigger operation on a first abnormal method, displaying a detail card, wherein the detail card comprises detail information of the first abnormal method.
- 8. The method of claim 6, wherein the tree diagram shows second abnormal methods, the method further comprising: in response to a switching operation on the tree diagram, switching the second abnormal methods to third abnormal methods, wherein there are a plurality of second abnormal methods and a plurality of third abnormal methods, and the switching operation comprises a scrolling operation or a trigger operation on a switching control.
- 9. The method of claim 4, wherein the key abnormal questions are extracted from a plurality of abnormal questions according to a ranking of their safety scores from highest to lowest.
- 10. A model evaluation device, comprising: an editing module configured to, in response to an editing operation, obtain model information of a first model and an evaluation question bank, wherein the evaluation question bank comprises safety evaluation questions; and an evaluation module configured to, in response to an evaluation start operation, display an evaluation result of the first model, wherein the evaluation result is obtained based on the model information of the first model and the evaluation question bank and is updated with the evaluation progress of the first model until the evaluation is completed.
- 11. An electronic device, comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to read the executable instructions from the memory and execute them to implement the model evaluation method according to any one of claims 1-9.
- 12. A computer-readable storage medium storing a computer program for executing the model evaluation method according to any one of claims 1-9.
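Claims 5 and 9 describe a chain from safety defect counts to a safety score to an overall safety level, and the selection of key abnormal questions by ranked score. The claims give no formula, so the following Python sketch is only an illustration under stated assumptions: the level names, the penalty weights, the 0-100 scale, the level thresholds, and every identifier (`PENALTY`, `safety_score`, `safety_level`, `AbnormalQuestion`, `key_abnormal_questions`) are hypothetical and do not come from the application.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical penalty per defect level; the claims do not specify weights.
PENALTY = {"high": 10, "medium": 5, "low": 1}


def safety_score(defect_counts: dict[str, int]) -> int:
    """Derive a 0-100 score from the number of defects per level (assumed formula)."""
    penalty = sum(PENALTY[level] * count for level, count in defect_counts.items())
    return max(0, 100 - penalty)


def safety_level(score: int) -> str:
    """Map a score to an overall level using assumed thresholds."""
    if score >= 90:
        return "safe"
    if score >= 60:
        return "moderate risk"
    return "high risk"


@dataclass
class AbnormalQuestion:
    question_id: str
    score: float  # per-question score used for ranking (assumed field)


def key_abnormal_questions(
    questions: list[AbnormalQuestion], top_n: int = 3
) -> list[AbnormalQuestion]:
    """Rank abnormal questions by score from highest to lowest and keep the top ones."""
    return sorted(questions, key=lambda q: q.score, reverse=True)[:top_n]
```

With these assumed weights, one high, two medium, and three low defects give a penalty of 23, so `safety_score` returns 77 and `safety_level(77)` returns "moderate risk".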
Description
Model evaluation method, device, equipment and medium
Technical Field
The present application relates to the field of computer technologies, and in particular to a model evaluation method, device, equipment, and medium.
Background
Evaluation of a model, or of a program integrating multiple models, generally assesses functional capability, while the industry has also raised requirements for evaluating the safety of such models and programs.
Disclosure of Invention
To solve the above technical problem, the application provides a model evaluation method, device, equipment, and medium.
The application provides a model evaluation method, comprising: in response to an editing operation, obtaining model information of a first model and an evaluation question bank, wherein the evaluation question bank comprises safety evaluation questions; and in response to an evaluation start operation, displaying an evaluation result of the first model, wherein the evaluation result is obtained based on the model information of the first model and the evaluation question bank and is updated with the evaluation progress of the first model until the evaluation is completed.
The application also provides a model evaluation device, comprising: an editing module configured to, in response to an editing operation, obtain model information of a first model and an evaluation question bank, wherein the evaluation question bank comprises safety evaluation questions; and an evaluation module configured to, in response to an evaluation start operation, display an evaluation result of the first model, wherein the evaluation result is obtained based on the model information of the first model and the evaluation question bank and is updated with the evaluation progress of the first model until the evaluation is completed.
The application further provides an electronic device, comprising a processor and a memory for storing instructions executable by the processor, wherein the processor is configured to read the executable instructions from the memory and execute them to implement the model evaluation method provided by the application.
The application also provides a computer-readable storage medium storing a computer program for executing the model evaluation method provided by the application.
Compared with the prior art, the technical solution provided by the application has the following advantages. In the model evaluation solution, model information of a first model and an evaluation question bank comprising safety evaluation questions are obtained in response to an editing operation, and an evaluation result of the first model is displayed in response to an evaluation start operation, the evaluation result being obtained based on the model information of the first model and the evaluation question bank and updated with the evaluation progress of the first model until the evaluation is completed. With this solution, the model information and the evaluation question bank are obtained quickly from the user's editing operation, and the evaluation result is obtained and displayed once the user performs the evaluation start operation. Evaluation of the safety dimension of a model can therefore be initiated quickly through a very simple editing operation and a low-threshold path, and the evaluation result can be dynamically updated and displayed without waiting for the evaluation to complete, which ensures the efficiency of displaying the safety evaluation result.
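The flow summarized above (collect model information and a question bank, start the evaluation, refresh the displayed result as progress advances) can be sketched with a Python generator that yields a partial result after every question, so a page can re-render without waiting for the run to finish. This is a minimal sketch under assumptions, not the application's implementation: `ModelInfo`, `Snapshot`, `evaluate`, the `ask` and `is_unsafe` callables, and the placeholder endpoint are all hypothetical names introduced for illustration. The `ModelInfo` fields mirror the items listed in claim 3.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterator


@dataclass
class ModelInfo:
    # Items listed in claim 3; the field names are illustrative.
    name: str
    platform: str
    endpoint: str
    auth_token: str


@dataclass
class Snapshot:
    done: int               # questions evaluated so far
    total: int              # questions in the bank
    unsafe_ids: list[str]   # questions flagged so far

    @property
    def finished(self) -> bool:
        return self.done == self.total


def evaluate(
    model: ModelInfo,
    question_bank: list[tuple[str, str]],    # (question id, prompt) pairs
    ask: Callable[[ModelInfo, str], str],    # sends a prompt to the model (assumed)
    is_unsafe: Callable[[str, str], bool],   # judges a prompt/answer pair (assumed)
) -> Iterator[Snapshot]:
    """Yield an updated partial result after each safety question."""
    unsafe: list[str] = []
    total = len(question_bank)
    for done, (question_id, prompt) in enumerate(question_bank, start=1):
        answer = ask(model, prompt)
        if is_unsafe(prompt, answer):
            unsafe.append(question_id)
        # Copy the list so earlier snapshots are not mutated by later ones.
        yield Snapshot(done, total, list(unsafe))


if __name__ == "__main__":
    # Stub model and judge, used only to show the progressive updates.
    info = ModelInfo("demo-model", "local", "http://localhost:8000/v1", "placeholder")
    bank = [("q1", "benign prompt"), ("q2", "attack prompt")]
    for snap in evaluate(info, bank, lambda m, p: p, lambda p, a: "attack" in a):
        print(f"{snap.done}/{snap.total} evaluated, flagged: {snap.unsafe_ids}")
```

A generator keeps the evaluation loop and the display decoupled: the caller decides how often to refresh, and the final snapshot (where `finished` is true) is simply the last value yielded.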
Drawings
The above and other features, advantages, and aspects of embodiments of the present application will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers are used throughout the drawings to refer to the same or similar elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
- FIG. 1 is a schematic diagram of an interaction scenario provided by the present application;
- FIG. 2 is a schematic flow chart of the model evaluation method provided by the application;
- FIG. 3 is a schematic diagram of a first page according to the present application;
- FIG. 4 is a schematic diagram of another first page according to the present application;
- FIG. 5 is a schematic flow chart of another model evaluation method provided by the application;
- FIG. 6 is a schematic diagram of a second page according to the present application;
- FIG. 7 is a schematic structural diagram of a model evaluation device provided by the application;
- FIG. 8 is a schematic structural diagram of an electronic device according to the present application.