CN-117395230-A - Low-delay medical intercom control method, system, equipment and storage medium


Abstract

The invention relates to a low-delay medical intercom control method, system, equipment and storage medium, belonging to the technical field of medical intercom. The method comprises the following steps: receiving an audio stream and storing it in the audio stream queue corresponding to its type, where the audio streams comprise an intercom audio stream and non-intercom audio streams, and the intercom audio stream is the audio stream sent by a sending end; cyclically reading the audio frames in each audio stream queue at a preset period; and judging whether the read audio frames contain valid audio to be played, and, if any audio stream queue has valid audio to be played, executing a mixing operation and outputting the mixed audio to a ring data buffer. With this method, the intercom delay index can be kept at an excellent level at all times, even when multiple voice services are mixed and superimposed.

Inventors

ZHANG HENG
YUN XIANFU
LU SHIWEI
YANG CAIJUN

Assignees

SPEECH TECH CO LTD

Dates

Publication Date: 20240112
Application Date: 20231108
Priority Date: 20231108

Claims (10)

1. A low-latency medical intercom control method, characterized by being applied to a receiving end and comprising: receiving an audio stream, and storing the audio stream in the audio stream queue corresponding to the type of the audio stream, wherein the audio streams comprise an intercom audio stream and non-intercom audio streams, and the intercom audio stream is an audio stream sent by a sending end; cyclically reading the audio frames in each audio stream queue at a preset period; and judging whether the read audio frames contain valid audio to be played, and, if any audio stream queue has valid audio to be played, executing a mixing operation and outputting the mixed audio to a ring data buffer.
2. The method of claim 1, wherein prior to performing the mixing operation, the method further comprises: querying the current free interval and the current occupied interval of the ring data buffer; if the current free interval is smaller than the frame length of the mixed audio, or the current occupied interval is larger than a first preset threshold, directly entering the next read period; if the current occupied interval is larger than a second preset threshold, deleting the existing data in the ring data buffer and then entering the next read period; wherein the first preset threshold is less than the second preset threshold.
3. The method of claim 2, wherein the first and second preset thresholds are positively correlated with a third preset threshold; the third preset threshold is a data volume threshold of the ring data buffer during audio output.
4. The method of claim 1, wherein when the audio stream is an intercom audio stream, prior to reading the audio frames in the intercom audio stream queue, the method further comprises: judging whether the intercom audio stream is in a cut-off state; and if the intercom audio stream is not in the cut-off state, executing the operation of reading the audio frames in the intercom audio stream queue.
5. The method according to claim 1, wherein the method further comprises: reading the intercom audio frames in the intercom audio stream queue in a blocking manner, and reading the non-intercom audio frames in the non-intercom audio stream queues in a non-blocking manner.
6. The method of claim 5, wherein when the intercom audio frames in the intercom audio stream queue are read in a blocking manner, the next read period is entered if, after a preset timeout, the remaining audio length in the intercom audio stream queue is still smaller than the length of one intercom audio frame; the preset timeout is 2 to 4 times the frame length of the intercom audio frame.
7. The method according to any one of claims 1 to 6, wherein when the network of the receiving end has been abnormal, after the first intercom audio frame is read following recovery to the normal state, the intercom audio frames within a preset time in the intercom audio stream queue are deleted; the preset time is determined based on how long the receiving end was in the abnormal network state.
8. A low-latency medical intercom control system employing the low-latency medical intercom control method of any of claims 1 to 7, said system comprising: a sending end, used for collecting the intercom audio signal, preprocessing the intercom audio signal and sending it to the receiving end; and a receiving end, used for receiving an audio stream and storing the audio stream in the audio stream queue corresponding to the type of the audio stream, wherein the audio streams comprise an intercom audio stream and non-intercom audio streams, and the intercom audio stream is an audio stream sent by the sending end; cyclically reading the audio frames in each audio stream queue at a preset period; and judging whether the read audio frames contain valid audio to be played, and, if any audio stream queue has valid audio to be played, executing a mixing operation and outputting the mixed audio to a ring data buffer.
9. An electronic device comprising a processor and a memory; the memory stores at least one instruction for execution by the processor to implement the low latency medical intercom control method of any of claims 1 to 7.
10. A computer readable storage medium storing at least one instruction for execution by a processor to implement the low latency medical intercom control method of any of claims 1 to 7.
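
For illustration only, the buffer check of claim 2 and the timed blocking read of claims 5 and 6 might be sketched as follows. This is not the patented implementation: the function names, the 20 ms frame length, the default timeout factor and the decision to test the second threshold before the first (so that the flush branch stays reachable, since the first threshold is the smaller one) are all assumptions made for the sketch.

```python
import queue


def gate(free, occupied, frame_len, t1, t2):
    """Decide what to do before mixing; returns "mix", "skip" or "flush".

    free/occupied are the current free and occupied sizes of the ring
    buffer, frame_len is the length of one mixed frame, t1 < t2 are the
    first and second preset thresholds (hypothetical units: bytes).
    """
    if not t1 < t2:
        raise ValueError("first threshold must be less than the second")
    # Assumption: check the larger threshold first, otherwise the
    # "occupied > t1" test would always win and nothing would be flushed.
    if occupied > t2:
        return "flush"   # delete buffered data, then go to the next period
    if free < frame_len or occupied > t1:
        return "skip"    # go straight to the next read period
    return "mix"


def read_intercom_frame(q, frame_ms=20, timeout_factor=3):
    """Blocking read with a timeout of 2 to 4 frame lengths.

    Returns None on timeout so the caller can enter the next read period.
    frame_ms=20 is a hypothetical frame length.
    """
    if not 2 <= timeout_factor <= 4:
        raise ValueError("timeout factor should be between 2 and 4")
    try:
        return q.get(timeout=frame_ms * timeout_factor / 1000.0)
    except queue.Empty:
        return None


def read_other_frame(q):
    """Non-blocking read for non-intercom streams; None if nothing queued."""
    try:
        return q.get_nowait()
    except queue.Empty:
        return None
```

Under these assumptions the intercom stream is the only one allowed to wait for data, and a backlog in the ring buffer either pauses mixing or is discarded, which is how the sketch keeps buffered audio from accumulating into playback delay.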

Description

Low-delay medical intercom control method, system, equipment and storage medium

Technical Field

The invention belongs to the technical field of medical intercom, and particularly relates to a low-delay medical intercom control method, system, equipment and storage medium.

Background

Conventional networked intercom equipment uses the standard TCP/IP or UDP network transport protocols, and wireless transmission has greatly improved communication efficiency, so such equipment is widely used in the medical field. At present, to make it easier for patients in a hospital ward to page medical staff, a paging device is generally installed; when a patient feels unwell, the patient or a family member calls the medical staff by pressing a call button. In a medical imaging examination room, the machine may be very noisy once started, so an intercom device is generally installed to let the doctor in the room talk with the patient in the examination room.

However, in a medical intercom system or product with AI voice interaction, that is, when multiple voice services are mixed and superimposed, the intercom function is the most important service, yet all functions share the hardware and software resources of one embedded terminal system. The services can therefore preempt one another's resources (CPU, memory, network bandwidth and the like), and scheduling different voice services in and out can cause the real-time performance of the intercom to fluctuate. Maintaining the real-time performance and low latency of the medical intercom function has therefore been a continuing concern and target of improvement in the field.

Currently, researchers in the related art often hand the logic for mixing, playback control, etc. of multiple services over to the ALSA audio system for management and control, in order to reduce the complexity of application development. As a result, they may overlook the need to judge the priorities of different services and to monitor, control and improve delay at the application layer. How to keep the intercom delay index at an excellent level when multiple voice services are mixed and superimposed is therefore a problem to be solved urgently.

Disclosure of Invention

The invention aims to provide a low-delay medical intercom control method, system, equipment and storage medium that keep the intercom delay index at an excellent level at all times, even when multiple voice services are mixed and superimposed.

In order to achieve the above purpose, the present invention provides the following technical solutions.

In a first aspect, an embodiment of the present invention provides a low-latency medical intercom control method, the method including: receiving an audio stream, and storing the audio stream in the audio stream queue corresponding to the type of the audio stream, where the audio streams comprise an intercom audio stream and non-intercom audio streams, and the intercom audio stream is an audio stream sent by a sending end; cyclically reading the audio frames in each audio stream queue at a preset period; and judging whether the read audio frames contain valid audio to be played, and, if any audio stream queue has valid audio to be played, executing the mixing operation and outputting the mixed audio to the ring data buffer.
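
As a rough illustration of the first-aspect method, one read period can be sketched as below. The class and function names, the tiny frame size, the stream-type labels and the rule that an all-zero frame counts as "no valid audio" are assumptions made for the sketch; the source does not define how valid audio is detected or how samples are combined.

```python
from collections import deque


def mix_frames(frames):
    """Sum 16-bit PCM frames sample by sample, clipping to the int16 range."""
    mixed = []
    for samples in zip(*frames):
        total = sum(samples)
        mixed.append(max(-32768, min(32767, total)))
    return mixed


class Receiver:
    """Hypothetical receiving end: one queue per audio stream type."""

    def __init__(self, stream_types):
        self.queues = {t: deque() for t in stream_types}
        # A bounded deque stands in for the ring data buffer here.
        self.ring = deque(maxlen=64)

    def on_frame(self, stream_type, frame):
        """Store an incoming frame in the queue matching its stream type."""
        self.queues[stream_type].append(frame)

    def run_period(self):
        """One read period: poll every queue and mix if anything is playable."""
        frames = [q.popleft() for q in self.queues.values() if q]
        # Assumption: a frame with any non-zero sample is valid audio.
        valid = [f for f in frames if any(f)]
        if not valid:
            return False
        self.ring.append(mix_frames(valid))
        return True
```

In a real device `run_period` would be called from a timer or loop at the preset period, and the playback side would drain `ring`; both are left out of this sketch.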
Further, before the mixing operation is performed, the method further includes: querying the current free interval and the current occupied interval of the ring data buffer; if the current free interval is smaller than the frame length of the mixed audio, or the current occupied interval is larger than a first preset threshold, directly entering the next read period; if the current occupied interval is larger than a second preset threshold, deleting the existing data in the ring data buffer and then entering the next read period; wherein the first preset threshold is less than the second preset threshold.

The first preset threshold and the second preset threshold are positively correlated with a third preset threshold; the third preset threshold is a data volume threshold of the ring data buffer during audio output.

Further, when the audio stream is an intercom audio stream, before reading the audio frames in the intercom audio stream queue, the method further comprises: judging whether the intercom audio stream is in a cut-off state; and if the intercom audio stream is not in the cut-off state, executing the operation of reading the audio frames in the intercom audio stream queue.

Further, the method further comprises: the intercom audio frames in the intercom audio stream queue are read in a blocking manner and the non-intercom a