CN-122021562-A - Method and device for generating target text and intelligent display equipment


Abstract

The application provides a target text generation method, a target text generation device and intelligent display equipment, and belongs to the technical field of intelligent terminals. The method comprises: determining a target input box in response to a wake-up instruction; obtaining voice data; recognizing the voice data based on the target input box to obtain an initial text; displaying the initial text to a user so that the user can correct the initial text through a mobile device to obtain a target text; and transmitting the target text to the target input box when a confirmation instruction triggered by the user is received. Because the initial text is generated from the voice data and the user can then revise it on the mobile device to obtain the target text, dual-mode cooperative input combining voice and the mobile device is realized, the operation flow is shortened, and the generation efficiency of the target text is improved.

Inventors

FU HUADONG
JIANG RUN
WANG KUN

Assignees

深圳创维显示科技有限公司

Dates

Publication Date: 20260512
Application Date: 20251212

Claims (10)

1. A method for generating a target text, applied to an intelligent display device, the method comprising: determining a target input box in response to a wake-up instruction; acquiring voice data; recognizing the voice data based on the target input box to obtain an initial text, and displaying the initial text to a user so that the user corrects the initial text through a mobile device to obtain a target text; and, upon receiving a confirmation instruction triggered by the user, transmitting the target text to the target input box.
2. The method of claim 1, wherein the determining a target input box in response to a wake-up instruction comprises: acquiring a target input box keyword from the wake-up instruction; acquiring at least one candidate input box; for each candidate input box, matching the target input box keyword against the identification corresponding to that candidate input box and determining a matching degree; and determining, among the matching degrees corresponding to the at least one candidate input box, the candidate input box with the largest matching degree as the target input box.
3. The method of claim 1, wherein the recognizing the voice data based on the target input box to obtain an initial text comprises: converting the voice data into a semantic text; acquiring a recognition vocabulary corresponding to the target input box; and correcting the semantic text according to the recognition vocabulary to obtain the initial text.
4. The method of claim 3, wherein the correcting the semantic text according to the recognition vocabulary to obtain the initial text comprises: matching the semantic text against the words in the recognition vocabulary; when the semantic text is successfully matched with a word in the recognition vocabulary, generating candidate options and displaying the candidate options to the user so that the user generates a selection instruction according to the candidate options; and receiving the selection instruction, and correcting the semantic text in combination with the selection instruction to obtain the initial text.
5. The method of claim 1, wherein the displaying the initial text to a user so that the user corrects the initial text through a mobile device to obtain a target text comprises: generating a two-dimensional code based on the initial text and displaying the two-dimensional code to the user, so that the user scans the two-dimensional code through the mobile device and the initial text is synchronized to the mobile device; and receiving, in real time, a correction operation performed by the user on the initial text, and correcting the initial text based on the correction operation to obtain the target text.
6. The method of claim 1, wherein the user-triggered confirmation instruction comprises a voice submission instruction issued by the user or a submission instruction triggered by the user at the mobile device.
7. The method of claim 1, further comprising, after the recognizing the voice data to obtain an initial text: detecting the initial text, and triggering a collaborative input mode if the initial text is detected to be empty; in the collaborative input mode, generating a collaborative two-dimensional code based on an identification of the intelligent display device and an identification of the target input box; displaying the collaborative two-dimensional code to the user so that the user inputs a collaborative text based on the collaborative two-dimensional code; and receiving the collaborative text, and filling the content of the collaborative text into the initial text.
8. The method of claim 7, wherein the user inputting a collaborative text based on the collaborative two-dimensional code comprises: the user scanning the collaborative two-dimensional code through the mobile device, so that the mobile device starts an editing interface in response to the collaborative two-dimensional code; and the user inputting the collaborative text in the editing interface.
9. A target text generation apparatus, applied to an intelligent display device, the apparatus comprising: an input box determining module, configured to determine a target input box in response to a wake-up instruction; a data acquisition module, configured to acquire voice data; a text generation module, configured to recognize the voice data based on the target input box to obtain an initial text, and to display the initial text to a user so that the user corrects the initial text through a mobile device to obtain a target text; and a text transmission module, configured to transmit the target text to the target input box upon receiving a confirmation instruction triggered by the user.
10. An intelligent display device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus; the memory is configured to store a computer program; and the processor is configured to implement the method of any one of claims 1-8 when executing the program stored in the memory.
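
The selection step recited in claim 2 (match the keyword against each candidate's identification, then take the candidate with the largest matching degree) can be sketched as follows. This is only an illustrative sketch: the claims do not say how the matching degree is computed, so the use of Python's `difflib.SequenceMatcher` ratio is an assumption, and the names `InputBox`, `matching_degree` and `select_target_box` are hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class InputBox:
    """Hypothetical candidate input box shown on the display."""
    box_id: str  # hypothetical internal identifier
    label: str   # the "identification" that the keyword is matched against


def matching_degree(keyword: str, label: str) -> float:
    """Similarity in [0, 1]. The metric is an assumption, not from the claims."""
    return SequenceMatcher(None, keyword.lower(), label.lower()).ratio()


def select_target_box(keyword: str, candidates: list[InputBox]) -> InputBox:
    """Return the candidate whose label has the largest matching degree."""
    if not candidates:
        raise ValueError("at least one candidate input box is required")
    return max(candidates, key=lambda box: matching_degree(keyword, box.label))


if __name__ == "__main__":
    boxes = [
        InputBox("a", "WiFi password"),
        InputBox("b", "search"),
        InputBox("c", "user name"),
    ]
    # Keyword assumed to have been extracted from the wake-up instruction.
    print(select_target_box("password", boxes).box_id)
```

A real device would probably also want a minimum threshold on the matching degree so that an unrelated keyword does not silently select some input box; the claims do not mention one, so it is left out here.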

Description

Method and device for generating target text and intelligent display equipment

Technical Field

The application relates to the technical field of intelligent terminals, and in particular to a method and a device for generating target text and to intelligent display equipment.

Background

As the functions of intelligent display devices continue to expand and become widespread, they have grown from traditional video playback into many fields, including gaming and entertainment, online education and smart home control. Meanwhile, voice input is an important link in the interaction between the intelligent display device and the user, and its application scenarios are becoming increasingly rich and complex.

In the related art, existing intelligent display devices are commonly equipped with a basic voice input function, through which a user can adjust the volume, control content playback and enter simple text by voice instruction. However, in a complex target text input scenario, for example when special characters such as "#" and "@" in a WiFi password must be recognized, the recognition error rate is high. Moreover, when voice recognition fails or the environment is noisy, the intelligent display device lacks an effective alternative, so the user has to fall back to traditional remote-control keys or a virtual keyboard to generate the target text. This makes the operation flow cumbersome and greatly reduces the generation efficiency of the target text.

Disclosure of Invention

The embodiments of the application aim to provide a method and a device for generating target text and intelligent display equipment, so as to improve the generation efficiency of the target text in complex target text input scenarios.

The specific technical scheme is as follows:

In a first aspect of the embodiments of the present application, a method for generating a target text is provided, applied to an intelligent display device, the method including: determining a target input box in response to a wake-up instruction; acquiring voice data; recognizing the voice data based on the target input box to obtain an initial text, and displaying the initial text to a user so that the user corrects the initial text through a mobile device to obtain a target text; and, upon receiving a confirmation instruction triggered by the user, transmitting the target text to the target input box.

In an optional embodiment, the determining a target input box in response to a wake-up instruction includes: acquiring a target input box keyword from the wake-up instruction; acquiring at least one candidate input box; for each candidate input box, matching the target input box keyword against the identification corresponding to that candidate input box and determining a matching degree; and determining, among the matching degrees corresponding to the at least one candidate input box, the candidate input box with the largest matching degree as the target input box.

In an optional embodiment, the recognizing the voice data based on the target input box to obtain an initial text includes: converting the voice data into a semantic text; acquiring a recognition vocabulary corresponding to the target input box; and correcting the semantic text according to the recognition vocabulary to obtain the initial text.
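
The vocabulary-based correction in the embodiment above can be illustrated with a short sketch. It assumes, purely for illustration, that the recognition vocabulary of an input box maps spoken phrases to the characters they stand for (for example the "#" and "@" symbols mentioned in the Background); the application does not define the vocabulary's structure, and `RECOGNITION_VOCABULARIES` and `correct_semantic_text` are hypothetical names.

```python
import re

# Hypothetical per-input-box recognition vocabularies: spoken phrase -> text.
# The structure and the entries are assumptions made for this sketch.
RECOGNITION_VOCABULARIES: dict[str, dict[str, str]] = {
    "wifi_password": {"at sign": "@", "hash": "#", "underscore": "_"},
    "search": {},
}


def correct_semantic_text(semantic_text: str, box_id: str) -> str:
    """Correct speech-to-text output using the target input box's vocabulary."""
    vocabulary = RECOGNITION_VOCABULARIES.get(box_id, {})
    corrected = semantic_text
    # Replace longer phrases first so overlapping entries behave predictably.
    for phrase in sorted(vocabulary, key=len, reverse=True):
        pattern = re.compile(rf"\b{re.escape(phrase)}\b", re.IGNORECASE)
        replacement = vocabulary[phrase]
        # A function replacement avoids backslash handling in the replacement.
        corrected = pattern.sub(lambda _match, r=replacement: r, corrected)
    return corrected


if __name__ == "__main__":
    spoken = "abc hash 123 at sign home"
    print(correct_semantic_text(spoken, "wifi_password"))  # abc # 123 @ home
```

Keying the vocabulary by input box is what lets the same spoken word be treated differently depending on context: "hash" becomes "#" in a password box but stays a plain word in a search box.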

In an optional embodiment, the correcting the semantic text according to the recognition vocabulary to obtain the initial text includes: matching the semantic text against the words in the recognition vocabulary; when the semantic text is successfully matched with a word in the recognition vocabulary, generating candidate options and displaying the candidate options to the user so that the user generates a selection instruction according to the candidate options; and receiving the selection instruction, and correcting the semantic text in combination with the selection instruction to obtain the initial text.

In an optional embodiment, the displaying the initial text to the user so that the user corrects the initial text through the mobile device to obtain the target text includes: generating a two-dimensional code based on the initial text and displaying the two-dimensional code to the user, so that the user scans the two-dimensional code through the mobile device and the initial text is synchronized to the mobile device; and receiving, in real time, a correction operation performed by the user on the initial text, and correcting the initial text based on the correction operation to obtain the target text.

In an optional embodiment, the confirmation instruction triggered by the user includes a voice submission instruction issued by the user or a submission instruction triggered by the user at the mobile device.

In an alternative embodiment, after the