CN-122019368-A - Automatic test method, device and storage medium

CN122019368A

Abstract

The application discloses an automated testing method, an automated testing device, and a storage medium. The method comprises: obtaining a screenshot of a target interface; identifying interface components in the screenshot through an object detection model and obtaining the category, first position information, and visual features of each interface component; performing text detection and recognition on the screenshot to extract the text blocks in the screenshot together with the second position information and text content of each text block; determining the interface component matched with each text block based on the first position information of the interface components and the second position information of the text blocks; constructing a semantic layout graph based on the interface components, the text blocks, and the matching relations between the text blocks and the interface components; inferring at least one user operation sequence from the semantic layout graph; and executing the user operation sequence to perform an automated test.

Inventors

  • ZHU BEILEI
  • ZHAO ZILONG
  • XIANG WEI

Assignees

  • 中科云谷科技有限公司

Dates

Publication Date
2026-05-12
Application Date
2025-12-31

Claims (11)

  1. An automated testing method, the method comprising: obtaining a screenshot of a target interface; identifying interface components in the screenshot through an object detection model, and obtaining the category, first position information, and visual features of each interface component; performing text detection and recognition on the screenshot to extract the text blocks in the screenshot and the second position information and text content of each text block; determining the interface component matched with each text block based on the first position information of the interface components and the second position information of the text blocks; constructing a semantic layout graph based on the interface components, the text blocks, and the matching relations between the text blocks and the interface components; inferring at least one user operation sequence from the semantic layout graph; and executing the user operation sequence to perform an automated test.
  2. The automated testing method of claim 1, wherein obtaining the category, the first position information, and the visual features of each interface component comprises: outputting, for each identified component, the component category, a position-coordinate bounding box, and a confidence score, and determining the position coordinates of the component as the first position information; extracting features from the image region of each component to generate a corresponding visual feature vector; merging detection boxes when multiple detection boxes correspond to the same component; when a component is in a disabled or semi-transparent state, calculating the average color and transparency of the image region of the component and comparing them with preset thresholds to mark the state of the component; and when a component is occluded by another component, analyzing the hierarchical stacking relation of the occluded component in the interface to determine whether the occluded component is on the interactable top layer.
  3. The automated testing method of claim 2, wherein determining whether the occluded component is on the interactable top layer comprises: obtaining style information of the occluded component, the style information comprising the stacking order of the occluded component; determining that the occluded component is on the interactable top layer if its stacking-order property indicates that it is on the top layer; and determining that the occluded component is on the interactable top layer if the position coordinates of the occluded component overlap with the coordinate set of a user interaction event.
  4. The automated testing method of claim 1, wherein extracting the text blocks and the second position information and text content of each text block in the screenshot comprises: obtaining a position-coordinate bounding box of each text region, and determining the position coordinates of the text region as the second position information; extracting, for each identified text region, the text content of that region; performing semantic correction on the text content based on a preset language model; and grouping the corrected text blocks and inferring their reading order to generate structured text information.
  5. The automated testing method of claim 1, wherein determining the interface component matched with each text block based on the first position information of the interface components and the second position information of the text blocks comprises: calculating the intersection-over-union (IoU) between the second position information of each text block and the first position information of each interface component; directly associating the text block with the interface component having the largest IoU when that IoU is greater than or equal to a preset threshold; and, when the IoU is smaller than the preset threshold, calculating the spatial distance between the text block and the interface components and associating the text block with the spatially nearest interface component.
  6. The automated testing method of claim 1, wherein constructing the semantic layout graph based on the interface components, the text blocks, and the matching relations between the text blocks and the interface components comprises: defining each interface component and each text block as a node in the semantic layout graph; creating and storing node attributes for each node, wherein the attributes of an interface-component node comprise at least one of the component category, the first position information, the visual feature vector, and a text feature vector obtained from the associated text block, and the attributes of a text-block node comprise at least one of the text content, the second position information, and a text feature vector extracted from the text content; calculating the relative spatial position between any two nodes based on the first position information and/or the second position information of the nodes; establishing an edge representing a spatial relationship between two nodes when their relative spatial position satisfies a preset adjacency condition; establishing an edge representing a logical relationship between an associated text-block node and an interface-component node based on the matching relation between the text block and the interface component; and fusing the visual feature vector and the text feature vector to generate a unified feature representation for each node.
  7. The automated testing method of claim 1, wherein inferring at least one user operation sequence from the semantic layout graph comprises: inputting the semantic layout graph into a pre-trained graph neural network model; performing representation learning and reasoning on each node in the graph through the graph neural network model, and predicting a node visit sequence that conforms to a target business process, wherein each node in the node visit sequence corresponds to an interface component to be operated; and determining the operation type and operation parameters for each interface component to be operated according to the node visit sequence, so as to generate the user operation sequence; wherein the operation type comprises at least one of clicking, inputting, and sliding, and the operation parameters comprise text content to be input or a sliding trajectory.
  8. The automated testing method of claim 1, further comprising: triggering a repair flow when an operation failure caused by an interface change is detected while executing the user operation sequence; the repair flow comprising: obtaining a latest screenshot of the current target interface; constructing a current semantic layout graph based on the latest screenshot; comparing the current semantic layout graph with the historical semantic layout graph on which the failed step was based; calculating a match score between each node in the historical semantic layout graph and each node in the current semantic layout graph, wherein the match score is a weighted combination of at least one of text semantic similarity, visual feature similarity, positional distance, and component-category similarity between the nodes; determining a mapping relation between the nodes of the historical semantic layout graph and the nodes of the current semantic layout graph based on the match scores; replacing the historical node referenced by the failed step in the user operation sequence with the corresponding current node according to the mapping relation, thereby updating the user operation sequence; and re-executing the previously failed step based on the updated user operation sequence.
  9. The automated testing method of claim 8, further comprising: presetting a first threshold and a second threshold, the first threshold being higher than the second threshold; performing the replacement operation directly if the match score is above the first threshold; if the match score is between the second threshold and the first threshold, verifying the validity of the replaced operation sequence in a sandboxed environment and performing the replacement operation after the verification passes; and generating prompt information for manual confirmation if the match score is below the second threshold.
  10. An automated testing device, the device comprising: a memory configured to store instructions; and a processor configured to invoke the instructions from the memory and, when executing the instructions, to implement the automated testing method according to any one of claims 1 to 9.
  11. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the automated testing method according to any one of claims 1 to 9.
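The text-to-component matching rule of claim 5 can be sketched in a few lines. This is an illustrative Python sketch, not the patent's implementation: the `(x1, y1, x2, y2)` box representation, the center-distance fallback metric, and the threshold value `0.1` are all assumptions, since the claim does not fix them.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def center_dist(a, b):
    """Euclidean distance between box centers (fallback metric)."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5

def match_text_to_component(text_box, component_boxes, iou_threshold=0.1):
    """Claim-5 style matching: the component with the largest IoU wins if
    that IoU reaches the threshold; otherwise fall back to the spatially
    nearest component."""
    best = max(range(len(component_boxes)),
               key=lambda i: iou(text_box, component_boxes[i]))
    if iou(text_box, component_boxes[best]) >= iou_threshold:
        return best
    return min(range(len(component_boxes)),
               key=lambda i: center_dist(text_box, component_boxes[i]))
```

Under this sketch, a label box overlapping an input field is associated with that field via IoU, while a detached label with no overlap still gets associated with the nearest component.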

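The self-repair of claims 8 and 9 reduces to a weighted node-similarity score followed by a three-band decision. A minimal sketch, assuming hypothetical weights and thresholds (the claims name the similarity terms but fix neither the weights nor the threshold values), with each similarity pre-normalized to [0, 1]:

```python
def match_score(text_sim, vis_sim, pos_sim, cat_sim,
                weights=(0.4, 0.3, 0.2, 0.1)):
    """Claim-8 style score: weighted combination of text semantic,
    visual feature, positional, and category similarity."""
    return sum(w * s for w, s in zip(weights, (text_sim, vis_sim, pos_sim, cat_sim)))

def repair_decision(score, high=0.8, low=0.5):
    """Claim-9 three-band policy: replace directly above the first
    threshold, sandbox-verify between the thresholds, and ask for
    manual confirmation below the second threshold."""
    if score >= high:
        return "replace"
    if score >= low:
        return "verify_in_sandbox"
    return "manual_confirmation"
```

The design point of the two thresholds is that a confident match is applied silently, a plausible match is cheap to verify in a sandbox, and only genuinely ambiguous cases interrupt a human.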
Description

Automatic test method, device and storage medium

Technical Field

The application relates to the technical field of automated software testing, and in particular to an automated testing method, an automated testing device, and a storage medium.

Background

Automated software testing is a key link in guaranteeing software quality and improving development efficiency. Current automated testing methods generally rely on internal structural information of the application to locate and operate interface elements. Specifically, the test tool obtains the application's internal UI hierarchy, such as the DOM tree of a Web page, the view tree of a mobile application, or an accessibility tree, through interaction with the application under test, and parses from it the fixed identifier of each interface control, such as a button, a text box, or a list item. Test scripts are written against these predefined identifiers and simulate user operations such as clicking, inputting, and sliding by calling the API of Selenium, Appium, or an underlying test framework such as UI Automator, so as to verify the correctness of preset function points.
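The fragility described next can be illustrated without a real browser. Below is a hypothetical toy model: a script locates a control by a fixed identifier, which breaks as soon as a refactor renames that identifier, even though the visible label is unchanged. The identifiers and labels are invented for illustration; a real Selenium or Appium script fails the same way, typically with a `NoSuchElementException`.

```python
# Toy UI "tree": identifier -> visible label, standing in for a DOM/view tree.
ui_v1 = {"btn_submit": "Submit", "txt_user": "Username"}
# After a refactor the identifiers change but the visible labels do not.
ui_v2 = {"submitButton": "Submit", "usernameField": "Username"}

def find_by_id(ui_tree, control_id):
    """Locate a control by its fixed identifier, as structure-based scripts do."""
    if control_id not in ui_tree:
        raise LookupError(f"no such element: {control_id}")
    return ui_tree[control_id]

def find_by_label(ui_tree, label):
    """Locate a control by its visible text, as a vision/text-based approach would."""
    for cid, text in ui_tree.items():
        if text == label:
            return cid
    raise LookupError(f"no control labeled {label!r}")
```

`find_by_id(ui_v1, "btn_submit")` succeeds but the same call against `ui_v2` raises, while `find_by_label` keeps working across both versions; this is the robustness gap the patent targets.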
However, such test scripts are tightly coupled with the internal implementation details of the application. First, once a version update changes the UI structure (even if the visual appearance is unchanged), the original control identifiers may fail, so that large numbers of test scripts cannot be executed; elements must be relocated and scripts updated manually, maintenance cost is high, and the test suite is extremely fragile. Secondly, this approach copes poorly with cross-platform applications or dynamically generated interface content, because UI control trees differ greatly across platforms and across data, making it hard to adapt across environments with a single set of stable identifiers. More fundamentally, existing methods lack the capability to understand interface semantics: they can only mechanically execute preset scripts and cannot, like a human tester, understand interface functions and interaction logic by comprehensively perceiving the visual components, text information, and spatial layout relations of the interface, and therefore cannot independently infer and generate test operation sequences that conform to business logic. How to break the dependence on internal structure and give automated testing a vision- and text-based semantic understanding capability, so as to realize more robust and intelligent test automation, has therefore become a technical problem to be solved.
Disclosure of Invention

Embodiments of the application aim to provide an automated testing method, an automated testing device, and a storage medium, to solve the problems in the prior art of highly fragile test scripts, high maintenance cost, poor adaptability to cross-platform and dynamic interfaces, and the lack of intelligent operation-generation capability based on semantic understanding, all caused by over-reliance on the UI structure information inside the application.

To achieve the above object, a first aspect of the present application provides an automated testing method, comprising: obtaining a screenshot of a target interface; identifying interface components in the screenshot through an object detection model, and obtaining the category, first position information, and visual features of each interface component; performing text detection and recognition on the screenshot to extract the text blocks in the screenshot and the second position information and text content of each text block; determining the interface component matched with each text block based on the first position information of the interface components and the second position information of the text blocks; constructing a semantic layout graph based on the interface components, the text blocks, and the matching relations between the text blocks and the interface components; inferring at least one user operation sequence from the semantic layout graph; and executing the user operation sequence to perform an automated test.
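The semantic layout graph described above can be sketched as a small data structure: nodes for components and text blocks, spatial edges when two boxes satisfy an adjacency condition, and logical edges carried over from the text-to-component matching. The node fields, the center-distance adjacency test, and the `max_gap` threshold below are illustrative assumptions, not definitions taken from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str      # "component" or "text"
    box: tuple     # (x1, y1, x2, y2) position information
    label: str = ""  # component category or text content

@dataclass
class SemanticLayoutGraph:
    nodes: list = field(default_factory=list)
    spatial_edges: set = field(default_factory=set)  # {(i, j), ...}
    logical_edges: set = field(default_factory=set)  # (text_idx, component_idx)

    def add_node(self, node):
        self.nodes.append(node)
        return len(self.nodes) - 1

    def build_spatial_edges(self, max_gap=40.0):
        """Connect node pairs whose box centers lie within max_gap pixels."""
        def center(b):
            return ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2)
        for i in range(len(self.nodes)):
            for j in range(i + 1, len(self.nodes)):
                (ax, ay) = center(self.nodes[i].box)
                (bx, by) = center(self.nodes[j].box)
                if ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 <= max_gap:
                    self.spatial_edges.add((i, j))
```

A downstream model (claim 7 uses a graph neural network) would then consume the nodes and both edge sets to predict a node visit sequence.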
Obtaining the category, the first position information, and the visual features of each interface component comprises: outputting, for each identified component, the component category, a position-coordinate bounding box, and a confidence score, and determining the position coordinates of the component as the first position information; extracting features from the image region of each component to generate a corresponding visual feature vector; merging detection boxes when multiple detection boxes correspond to the same component; calculating the average color and transparency of