CN-122021641-A - Page interaction method, page interaction device, electronic device, storage medium and program product

Abstract

The invention relates to the field of computer technology and discloses a page interaction method, a device, an electronic device, a storage medium and a program product. The method comprises: receiving a natural language instruction, so as to instruct a first model to determine an interaction task chain according to the natural language instruction and to send the interaction task chain to a second model, wherein the second model is used for determining the interaction action corresponding to the current interaction task in the interaction task chain; and cyclically executing interaction processing until the interaction task chain is finished. Upon receipt of the natural language instruction, the first model automatically generates the interaction task chain, the second model generates an interaction action for the current interaction task according to the interactable page elements, and the interaction processing is carried out automatically according to that action, which addresses the lack of universal adaptability and the weak perception of web page content in the related art. The two models each complete their own task and cooperate through a division of labor, which improves the accuracy of the action required at each step of a compound intention or complex task.

Inventors

  • LI XIAOSHUANG

Assignees

  • 杭州网易智企科技有限公司

Dates

Publication Date
2026-05-12
Application Date
2025-12-31

Claims (10)

  1. A page interaction method, the method comprising: receiving a natural language instruction to instruct a first model to determine, according to the natural language instruction, an interaction task chain comprising at least one interaction task, and to send the interaction task chain to a second model, wherein the second model is used for determining an interaction action corresponding to a current interaction task in the interaction task chain; and cyclically performing interaction processing until the interaction task chain is finished; wherein the interaction processing comprises: acquiring the interaction action corresponding to the current interaction task as determined by the second model; executing the interaction action corresponding to the current interaction task on a target page element in the current page; and extracting first interactable page elements from the current page after the interaction action is executed, and sending a first page element set containing each first interactable page element to the second model, wherein the first page element set is used for enabling the second model to determine, according to the first page element set, the interaction action corresponding to a next interaction task.
  2. The method of claim 1, wherein prior to performing the first round of interaction processing, the method further comprises: extracting second interactable page elements from the current page, and generating a second page element set containing each second interactable page element; wherein the second page element set is used for enabling the second model to determine, according to the second page element set, the interaction action corresponding to the current interaction task in the first round of interaction processing.
  3. The method of claim 1, wherein the first page element set is used for: the second model determining, according to the first page element set, whether an execution result after the current interaction task is executed meets an execution expectation corresponding to the current interaction task; and, in the case that the execution result meets the execution expectation, the second model determining the interaction action corresponding to the next interaction task according to the first page element set.
  4. The method of claim 3, wherein the first page element set is further used for: in the case that the execution result does not meet the execution expectation, the first model or the second model updating the interaction task chain according to the first page element set; and the second model determining, according to the first page element set, the interaction action corresponding to the current interaction task in the updated interaction task chain.
  5. The method of claim 4, wherein in the case that the execution result does not meet the execution expectation, the method further comprises: waiting for the current page to be updated in the case that the second model cannot determine, according to the first page element set, the interaction action corresponding to the current interaction task in the updated interaction task chain; and, in the case that the current page is updated, extracting third interactable page elements from the updated current page, and sending a third page element set containing each third interactable page element to the second model, wherein the third page element set is used for enabling the second model to determine, according to the third page element set, the interaction action corresponding to the current interaction task in the updated interaction task chain.
  6. The method of claim 3, wherein the first model is further used for: determining the execution expectation corresponding to each interaction task; and generating the interaction task chain, wherein the interaction task chain comprises each interaction task together with its execution expectation, and transmitting the interaction task chain to the second model.
  7. A page interaction device, the device comprising: a language instruction receiving module, used for receiving a natural language instruction to instruct a first model to determine, according to the natural language instruction, an interaction task chain comprising at least one interaction task, and to send the interaction task chain to a second model; and an interaction processing module, used for cyclically executing the interaction processing until the interaction task chain is finished.
  8. An electronic device, comprising: a memory and a processor, the memory and the processor being communicatively connected to each other, the memory having stored therein computer instructions, the processor executing the computer instructions to perform the page interaction method of any of claims 1 to 6.
  9. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the page interaction method of any of claims 1 to 6.
  10. A computer program product comprising computer instructions for causing a computer to perform the page interaction method of any of claims 1 to 6.
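
The control flow recited in claims 1 to 4 can be illustrated with a minimal Python sketch. It is an illustration only, not the implementation disclosed in the application: the class names (ToyPage, ToyPlanner, ToyActor), the string encoding of page elements, the encoding of an execution expectation as an element identifier, and the max_rounds safety bound are all assumptions introduced here. ToyPlanner stands in for the first model and ToyActor for the second model; neither calls a real model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    description: str
    expectation: str  # element id expected on the page afterwards (toy encoding)


@dataclass
class Action:
    kind: str          # "type" or "click"
    target: str        # id of the target page element
    value: str = ""


class ToyPage:
    """Hypothetical page: a login form that becomes a home page after login."""

    def __init__(self) -> None:
        self.fields = {}
        self.logged_in = False

    def extract_interactable(self) -> List[str]:
        return ["link#logout"] if self.logged_in else ["input#user", "button#login"]

    def execute(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.value
        elif action.kind == "click" and action.target == "button#login":
            self.logged_in = "input#user" in self.fields


class ToyPlanner:
    """Stand-in for the first model: builds and updates the task chain."""

    def plan(self, instruction: str) -> List[Task]:
        return [Task("click login", "link#logout")]

    def replan(self, remaining: List[Task], elements: List[str]) -> List[Task]:
        # The form is still shown, so fill in the user name before retrying.
        if "input#user" in elements:
            return [Task("type user name", "button#login")] + remaining
        return remaining


class ToyActor:
    """Stand-in for the second model: picks actions and checks expectations."""

    def next_action(self, task: Task, elements: List[str]) -> Action:
        if task.description.startswith("type"):
            return Action("type", "input#user", "alice")
        return Action("click", "button#login")

    def meets_expectation(self, task: Task, elements: List[str]) -> bool:
        return task.expectation in elements


def run_task_chain(instruction, planner, actor, page, max_rounds: int = 10):
    chain = planner.plan(instruction)
    elements = page.extract_interactable()      # second page element set (claim 2)
    index, history = 0, []
    for _ in range(max_rounds):                 # safety bound, not in the claims
        if index >= len(chain):
            break
        task = chain[index]
        action = actor.next_action(task, elements)
        page.execute(action)
        history.append(action)
        elements = page.extract_interactable()  # first page element set (claim 1)
        if actor.meets_expectation(task, elements):
            index += 1                          # move to the next task (claim 3)
        else:
            # Update the chain from the current task onwards (claim 4).
            chain = chain[:index] + planner.replan(chain[index:], elements)
    return history


page = ToyPage()
history = run_task_chain("log me in", ToyPlanner(), ToyActor(), page)
print([a.kind for a in history], page.logged_in)
```

In this toy run the first click fails its expectation because no user name has been entered, so the chain is updated and the loop continues with the new current task. The waiting behaviour of claim 5 is not modelled.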

Description

Page interaction method, page interaction device, electronic device, storage medium and program product

Technical Field

The present invention relates to the field of computer technologies, and in particular to a page interaction method, apparatus, electronic device, storage medium, and program product.

Background

In the current mainstream browser interaction mode, a user operates the browser manually through a physical input device. To improve operating efficiency, the related art generally adopts a script automation scheme: a developer writes a browser extension as a script and precisely parses the HTML (HyperText Markup Language) structure of the target web page in order to locate the specific page elements to be operated on. The developer then writes program code to simulate events such as mouse clicks or keyboard input. Finally, the script is deployed and executed to automatically complete a preset, highly specific task sequence. However, such an automation script must be hard-coded for a single website or a fixed flow, and cannot draw on existing operating experience to migrate adaptively to a new page that has a similar structure but different elements, so it lacks general adaptability when web pages are dynamically updated.

Disclosure of Invention

The invention provides a page interaction method, a page interaction device, an electronic device, a storage medium and a program product, which are used for solving the problem that an automation script cannot draw on existing operating experience to migrate adaptively to a new page with a similar structure but different elements, and therefore lacks general adaptability when web pages are dynamically updated.
In a first aspect, the present invention provides a page interaction method, including: receiving a natural language instruction to instruct a first model to determine, according to the natural language instruction, an interaction task chain comprising at least one interaction task, and to send the interaction task chain to a second model, wherein the second model is used for determining an interaction action corresponding to a current interaction task in the interaction task chain; and cyclically performing interaction processing until the interaction task chain is finished; wherein the interaction processing comprises: acquiring the interaction action corresponding to the current interaction task as determined by the second model; executing the interaction action corresponding to the current interaction task on a target page element in the current page; and extracting first interactable page elements from the current page after the interaction action is executed, and sending a first page element set containing each first interactable page element to the second model, wherein the first page element set is used for enabling the second model to determine, according to the first page element set, the interaction action corresponding to a next interaction task.

According to the page interaction method provided by this embodiment, upon receipt of the natural language instruction, the first model automatically generates the interaction task chain according to the natural language instruction, and the second model generates the corresponding interaction action for the current interaction task in the interaction task chain, so that the browser can carry out the interaction processing automatically according to the interaction action. Because the interaction action is generated according to the interactable page elements, the problem in the related art of lacking general adaptability when web pages are dynamically updated is solved.
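
The step of extracting interactable page elements into a page element set can be sketched as follows. The text above does not specify how extraction is performed, so this is only one possible reading, under stated assumptions: the list of tags treated as interactable, the rule that skips disabled and hidden controls, and the fields kept for each element are all choices made for this illustration. The sketch uses Python's standard html.parser module on a static HTML string rather than a live browser page.

```python
from html.parser import HTMLParser
from typing import Dict, List

# Assumption: which tags count as "interactable" is chosen for this sketch.
INTERACTABLE_TAGS = {"a", "button", "input", "select", "textarea"}


class InteractableExtractor(HTMLParser):
    """Collects a compact description of each interactable element."""

    def __init__(self) -> None:
        super().__init__()
        self.elements: List[Dict[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTABLE_TAGS:
            return
        # Valueless attributes such as `disabled` arrive with value None.
        attr_map = {name: (value or "") for name, value in attrs}
        if "disabled" in attr_map or attr_map.get("type") == "hidden":
            return  # assumption: these cannot be operated on by a user
        self.elements.append({
            "tag": tag,
            "id": attr_map.get("id", ""),
            "type": attr_map.get("type", ""),
            "label": attr_map.get("placeholder") or attr_map.get("aria-label") or "",
        })


def extract_interactable(html: str) -> List[Dict[str, str]]:
    """Return the page element set for one HTML snapshot."""
    parser = InteractableExtractor()
    parser.feed(html)
    parser.close()
    return parser.elements


sample = (
    '<form>'
    '<input id="user" type="text" placeholder="User name">'
    '<input id="csrf" type="hidden" value="x">'
    '<button id="go" disabled>Go</button>'
    '<a id="help" href="/help">Help</a>'
    '<p>Plain text is ignored.</p>'
    '</form>'
)
element_set = extract_interactable(sample)
for element in element_set:
    print(element)
```

A structured set of this kind is what would be sent to the second model after each executed action, in place of a screenshot.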
In addition, the method extracts page elements from the web page, which solves the problem of weak perception of web page content caused by visual screenshot analysis in the related art; the two models each complete their own task and cooperate through a division of labor, which improves the accuracy of the action required at each step of a compound intention or complex task.

In some alternative embodiments, prior to performing the first round of interaction processing, the method further comprises: extracting second interactable page elements from the current page, and generating a second page element set containing each second interactable page element; and sending the second page element set to the second model. The second page element set is used for enabling the second model to determine, according to the second page element set, the interaction action corresponding to the current interaction task in the first round of interaction processing.

In some alternative embodiments, the first page element set is used for: determining whether an execution result after the execution of the current interaction task is completed accords with an execution expectation corresponding to the current interaction tas