CN-121983041-A - Information transmission method, device and storage medium


Abstract

The application provides an information transmission method, an information transmission device and a storage medium, relates to the technical field of communication, and aims to ensure, as far as possible, that Internet of Things (IoT) devices can conduct voice communication normally. The method comprises: obtaining voice information; converting the voice information into text information through automatic speech recognition (ASR); and sending a first message, wherein the first message is used for indicating information required for converting the text information into the voice information.

Inventors

LIANG BO
LIU HUI
ZHOU JING
SHI YU

Assignees

China United Network Communications Group Co., Ltd. (中国联合网络通信集团有限公司)

Dates

Publication Date: 20260505
Application Date: 20260105

Claims (10)

1. An information transmission method, applied to a first terminal, where the first terminal is located in a low-speed network or a high-speed network, the method comprising: acquiring voice information; converting the voice information into text information through automatic speech recognition (ASR); and sending a first message, wherein the first message is used for indicating information required for converting the text information into the voice information.
2. The method of claim 1, wherein the first message includes tone color information and/or time axis information, the time axis information including a length of time corresponding to each of at least one text in the text information and a time interval between two adjacent texts in the at least one text.
3. The method according to claim 1 or 2, wherein the sending the first message comprises: sending the first message to a second terminal, so that the second terminal converts the text information into the voice information based on the first message.
4. The method according to claim 1 or 2, wherein the sending the first message comprises: sending the first message to a network device, so that the network device converts the text information into the voice information based on the first message and sends the voice information to a second terminal.
5. The method according to claim 1 or 2, wherein the converting the voice information into text information through automatic speech recognition (ASR) comprises: converting the voice information into candidate text information through ASR; and performing error correction on the candidate text information through natural language understanding processing to obtain the text information.
6. An information transmission method, applied to a second terminal, where the second terminal is located in a low-speed network or a high-speed network, the method comprising: receiving a first message, wherein the first message is used for indicating information required for converting text information into voice information; and converting the text information into the voice information through text-to-speech (TTS) processing.
7. An information transmission method, applied to a network device, where the network device is located in a low-speed network or a high-speed network, the method comprising: receiving a first message, wherein the first message is used for indicating information required for converting text information into voice information; converting the text information into the voice information through text-to-speech (TTS) processing; and sending the voice information.
8. An information transmission device, characterized by being applied to a first terminal, wherein the first terminal is located in a low-speed network or a high-speed network, the device comprising a communication unit and a processing unit; the communication unit is used for acquiring voice information; the processing unit is used for converting the voice information into text information through automatic speech recognition (ASR); and the communication unit is further configured to send a first message, where the first message is used to indicate information required for converting the text information into the voice information.
9. An information transmission apparatus comprising a processor and a communication interface, the communication interface and the processor being coupled, the processor being operable to execute a computer program or instructions to implement an information transmission method as claimed in any one of claims 1 to 7.
10. A computer readable storage medium having instructions stored therein, characterized in that when the instructions are executed by a computer, the computer performs the information transmission method as claimed in any one of the preceding claims 1-7.
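
The first message of claims 1 and 2 can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the format defined by the application: the class and function names, the JSON encoding, the choice to carry the text alongside the timing data, and the (text, start_ms, end_ms) input shape (as an ASR engine with word timestamps might produce) are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TimedText:
    """One text unit plus its time axis information (hypothetical layout)."""
    text: str          # recognized text unit
    duration_ms: int   # length of time the unit was spoken
    gap_after_ms: int  # interval before the next unit (0 for the last unit)


@dataclass
class FirstMessage:
    """Hypothetical container for tone color and time axis information."""
    tone_color: Optional[str] = None  # assumed to be a short identifier
    timeline: List[TimedText] = field(default_factory=list)


def build_first_message(
    words: List[Tuple[str, int, int]], tone_color: Optional[str] = None
) -> FirstMessage:
    """Derive per-unit durations and gaps from (text, start_ms, end_ms) tuples."""
    timeline = []
    for i, (text, start, end) in enumerate(words):
        # The gap is measured from the end of this unit to the start of the next.
        next_start = words[i + 1][1] if i + 1 < len(words) else end
        timeline.append(TimedText(text, end - start, next_start - end))
    return FirstMessage(tone_color=tone_color, timeline=timeline)


def encode(msg: FirstMessage) -> bytes:
    """Serialize compactly; JSON is an assumption chosen for readability."""
    return json.dumps(
        asdict(msg), ensure_ascii=False, separators=(",", ":")
    ).encode("utf-8")


def decode(data: bytes) -> FirstMessage:
    """Rebuild the message on the second terminal or network device side."""
    raw = json.loads(data.decode("utf-8"))
    return FirstMessage(
        tone_color=raw["tone_color"],
        timeline=[TimedText(**unit) for unit in raw["timeline"]],
    )
```

A receiver that decodes such a message would have the text, the per-unit durations, the pauses, and a tone color hint available before running text-to-speech synthesis; the synthesis step itself is outside this sketch.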

Description

Information transmission method, device and storage medium

Technical Field

The present application relates to the field of communications technologies, and in particular, to an information transmission method, an information transmission device, and a storage medium.

Background

In a non-terrestrial network (NTN), services supported by Internet of Things (IoT) devices have low requirements on rate, delay, jitter and real-time performance, and can be implemented simply by successfully transmitting a single instruction field. At present, although high-compression-rate coding for narrowband voice communication has been studied, conventional voice calls still cannot be supported when IoT devices are actually deployed in extreme scenarios such as coverage edges and poor signal conditions.

Disclosure of Invention

The application provides an information transmission method, an information transmission device and a storage medium, which ensure, as far as possible, that an Internet of Things (IoT) device can conduct voice communication normally.
In order to achieve the above purpose, the application adopts the following technical solutions.
In a first aspect, the present application provides an information transmission method applied to a first terminal, where the first terminal is located in a low-speed network or a high-speed network. The method includes: obtaining voice information; converting the voice information into text information through automatic speech recognition (ASR); and sending a first message, where the first message is used to indicate information required for converting the text information into the voice information.
The above technical solution has at least the following advantages. When the first terminal wants to conduct voice communication, it can convert the voice information into text information and send the information required for converting the text information back into the voice information, so the voice information itself does not need to be sent directly. This suits bandwidth-constrained devices, for example IoT NTN devices with low-bandwidth characteristics: the data volume of the information required for converting the text information into the voice information is far smaller than that of the original voice information, which reduces the dependence on transmission bandwidth and avoids communication interruption caused by insufficient bandwidth. In addition, transmission stability in extreme scenarios (such as weak-signal and coverage-edge scenarios) can be improved, since a small data volume keeps the probability of packet loss and bit errors low, so that the communication device can conduct voice communication normally as far as possible.
In one possible implementation, the first message includes tone color information and/or time axis information, and the time axis information includes a time length corresponding to each of at least one text in the text information and a time interval between two adjacent texts in the at least one text.
In one possible implementation, sending the first message includes sending the first message to the second terminal, so that the second terminal converts the text information into the voice information based on the first message.
In one possible implementation, sending the first message includes sending the first message to the network device, so that the network device converts the text information into the voice information based on the first message and sends the voice information to the second terminal.
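
The data-volume argument in the advantages above can be illustrated with a rough back-of-the-envelope estimate in Python. Every number here is an assumption made for illustration: the baseline is uncompressed 8 kHz, 16-bit mono PCM, and the hypothetical message layout spends the UTF-8 bytes of each text unit plus 2 bytes for its duration and 2 bytes for the following interval, plus a short tone color identifier. A low-bit-rate voice codec would narrow the gap considerably, so the result is only indicative.

```python
from typing import List, Tuple

# Each unit is (text, duration_ms, gap_after_ms); this shape is hypothetical.
Unit = Tuple[str, int, int]


def utterance_seconds(units: List[Unit]) -> float:
    """Total spoken time, including the intervals between adjacent units."""
    return sum(duration + gap for _, duration, gap in units) / 1000


def pcm_bytes(duration_s: float, sample_rate_hz: int = 8000,
              bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Size of uncompressed PCM audio (assumed baseline, not from the source)."""
    return int(duration_s * sample_rate_hz * bytes_per_sample * channels)


def text_message_bytes(units: List[Unit], tone_color_id: str) -> int:
    """Size of the assumed layout: text bytes + 4 bytes of timing per unit."""
    timing_bytes_per_unit = 4  # 2 bytes duration + 2 bytes interval
    body = sum(len(text.encode("utf-8")) + timing_bytes_per_unit
               for text, _, _ in units)
    return len(tone_color_id.encode("utf-8")) + body


units = [("turn", 300, 50), ("on", 150, 120), ("pump", 380, 0)]
seconds = utterance_seconds(units)
raw_size = pcm_bytes(seconds)
message_size = text_message_bytes(units, "a1")
print(f"{seconds:.1f} s of speech: {raw_size} bytes as PCM, "
      f"{message_size} bytes as text plus timing")
```

Under these assumptions a one-second command occupies 16000 bytes as raw PCM and 24 bytes as text plus timing information, which is the kind of reduction the advantages paragraph relies on.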
In one possible implementation, converting the voice information into text information through automatic speech recognition (ASR) includes: converting the voice information into candidate text information through ASR; and performing error correction on the candidate text information through natural language understanding processing to obtain the text information.
In a second aspect, the present application provides an information transmission method applied to a second terminal, where the second terminal is located in a low-speed network or a high-speed network. The method includes: receiving a first message, where the first message is used to indicate information required for converting text information into voice information; and converting the text information into the voice information through text-to-speech (TTS) processing.
In one possible implementation, the first message includes tone color information and/or time axis information, and the time axis information includes a time length corresponding to each of at least one text in the text information and a time interval between two adjacent texts in the at least one text.
In a third aspect, the present application provides an information transmission method applied to a network device, where the network device is located in a low-speed network or a high-speed network, the method including receiving a first message, where the first message is used to indicate information required for co