CN-121979472-A - Bluetooth call audio processing method and device
Abstract
The invention provides a Bluetooth call audio processing method and device. The method comprises: receiving original audio data from at least one Bluetooth call device; packaging the original audio data with the corresponding Bluetooth call device type identifier to form an audio data packet; obtaining, based on a pre-stored Bluetooth call device configuration file, processing configuration information corresponding to the device type identifier in the audio data packet, the processing configuration information comprising at least audio format parameters and processing parameters; standardizing the original audio data according to the processing configuration information to generate standard audio data conforming to a unified format; and notifying at least one AI application of an application layer of the storage location of the standard audio data. The invention reduces development complexity and integration cost on the AI application side, and significantly improves system stability, processing efficiency, and the reliability of the overall voice interaction experience in multi-model collaborative processing scenarios.
Inventors
- Sun Tianhong
Assignees
- 惠州华阳通用电子有限公司
Dates
- Publication Date: 2026-05-05
- Application Date: 2025-12-24
Claims (10)
- 1. A vehicle-mounted Bluetooth call audio processing method, executed by a framework layer of a vehicle-mounted system, comprising: receiving original audio data from at least one Bluetooth call device; packaging the original audio data with the corresponding Bluetooth call device type identifier to form an audio data packet; obtaining, based on a pre-stored Bluetooth call device configuration file, processing configuration information corresponding to the device type identifier in the audio data packet, wherein the processing configuration information comprises at least audio format parameters and processing parameters; standardizing the original audio data according to the processing configuration information to generate standard audio data conforming to a unified format; and notifying at least one AI application of an application layer of the storage location of the standard audio data.
- 2. The method of claim 1, wherein the Bluetooth call device configuration file is pre-configured with a configuration information entry for each of a plurality of different Bluetooth call device types, each configuration information entry including a device type identifier and the corresponding audio basic format parameters and audio processing parameters, the audio basic format parameters including at least one of a sampling rate, a channel number, and a bit width, and the audio processing parameters including a gain factor and a noise reduction factor.
- 3. The method of claim 2, wherein standardizing the original audio data according to the processing configuration information comprises: resampling and/or recoding the original audio data according to a result of comparing the audio basic format parameters with a preset standard audio format, wherein the standard audio format comprises a standard sampling rate, a standard channel number, and a standard bit width; performing gain amplification on the audio data according to the gain factor, and performing amplitude limiting on the gain-amplified data to keep its amplitude within a preset numerical range; and performing noise reduction on the audio data by spectral subtraction according to the noise reduction factor.
- 4. The method of claim 3, wherein the noise reduction is performed by spectral subtraction using the formula $|Y(k)| = |X(k)| - \mathrm{NoiseFactor} \times |N(k)|$, wherein $|Y(k)|$ is the enhanced amplitude spectrum, $|X(k)|$ is the amplitude spectrum of the noisy speech, $|N(k)|$ is the estimated noise amplitude spectrum, and NoiseFactor is the noise reduction factor, and wherein the noise amplitude spectrum $|N(k)|$ is estimated during an initial silence period after a call starts.
- 5. The method of claim 2, wherein the standardization processing further comprises: adding a data header to the processed audio data, wherein the data header comprises parameter information of the unified format and a data trailer format identifier corresponding to the device type identifier; and storing, in a memory, the standard audio data set to which the data header and the corresponding data trailer have been added.
- 6. The method of claim 1, wherein the Bluetooth call device configuration file supports updating over a network.
- 7. A vehicle-mounted Bluetooth call audio processing system, comprising a framework layer module and an application layer module, wherein the framework layer module includes: a recording module for receiving original audio data from at least one Bluetooth call device and packaging the original audio data with the device type identifier; an algorithm engine module, connected to the recording module and a storage module, for obtaining processing configuration information from the Bluetooth call device configuration file in the storage module according to the device type identifier, and for standardizing the packaged original audio data to generate standard audio data; and the storage module, for storing the standard audio data and the Bluetooth call device configuration file; and wherein the application layer module comprises at least one AI processing module that is communicatively connected to the algorithm engine module and, after receiving a notification, obtains the standard audio data from the storage module for business processing.
- 8. The system of claim 7, wherein the algorithm engine module comprises: a configuration query unit for querying the corresponding audio basic format parameters and audio processing parameters from the configuration file according to the device type identifier; a format conversion unit for resampling and/or recoding the audio data according to the difference between the audio basic format parameters and a preset standard format; a gain processing unit for performing gain amplification and amplitude limiting on the audio data according to the gain factor; and a noise reduction processing unit for reducing noise in the audio data by spectral subtraction according to the noise reduction factor.
- 9. The system according to claim 7 or 8, wherein the algorithm engine module further comprises a data encapsulation unit for adding, to the processed audio data, a data header containing the unified format parameters and the trailer format identifier, and combining them into a standard audio data set.
- 10. The system of claim 7, wherein the Bluetooth call device configuration file is stored in the storage module, and the system further comprises a configuration update interface for receiving configuration update information from a network to update the configuration file.
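For illustration only, the gain, amplitude-limiting, and spectral-subtraction steps recited in claims 3 and 4 might be sketched as below. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names, the signed 16-bit limiting range, the averaging used to estimate the noise spectrum, and the flooring of negative magnitudes at zero are all hypothetical choices that the claims do not specify. The functions operate on magnitude spectra that have already been computed, so no FFT library is needed.

```python
from typing import List, Sequence


def apply_gain_with_limit(samples: Sequence[int], gain_factor: float,
                          lo: int = -32768, hi: int = 32767) -> List[int]:
    """Amplify each sample by gain_factor, then clamp it to [lo, hi].

    The default range is signed 16-bit PCM (an assumption; the claims
    only say "a preset numerical range").
    """
    return [max(lo, min(hi, int(round(s * gain_factor)))) for s in samples]


def estimate_noise_spectrum(silence_frames: Sequence[Sequence[float]]) -> List[float]:
    """Estimate |N(k)| as the per-bin mean magnitude over the frames
    captured in the initial silence period (averaging is an assumption)."""
    frame_count = len(silence_frames)
    bin_count = len(silence_frames[0])
    return [sum(frame[k] for frame in silence_frames) / frame_count
            for k in range(bin_count)]


def spectral_subtract(noisy_mag: Sequence[float], noise_mag: Sequence[float],
                      noise_factor: float) -> List[float]:
    """Compute |Y(k)| = |X(k)| - NoiseFactor * |N(k)| for every bin k.

    Negative results are floored at zero (an assumption), since a
    magnitude cannot be negative.
    """
    return [max(0.0, x - noise_factor * n) for x, n in zip(noisy_mag, noise_mag)]


if __name__ == "__main__":
    print(apply_gain_with_limit([1000, -20000, 30000], 2.0))
    noise = estimate_noise_spectrum([[1.0, 2.0], [3.0, 4.0]])
    print(spectral_subtract([5.0, 1.0], noise, 1.5))
```

In this sketch a larger noise reduction factor removes more of the estimated noise floor, at the cost of zeroing more low-energy bins, which is why a per-device factor in the configuration file would be useful.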
Description
Bluetooth call audio processing method and device
Technical Field
The invention relates to the technical field of audio processing, and in particular to a Bluetooth call audio processing method and device.
Background
As large AI models are increasingly deployed in vehicle-mounted voice interaction, demand for high-quality, standardized voice input data keeps growing. Bluetooth call devices are an important voice input source, but because they come in many brands and models, the audio data they output differs significantly in sampling rate, channel number, coding format, noise characteristics, volume gain, and so on. In the prior art, each large AI model in a vehicle-mounted system typically has to interface with and process the original audio data from different Bluetooth devices independently. With this distributed approach, every AI application must embed complex audio preprocessing logic to adapt to the various devices. This imposes a heavy development and maintenance burden, and because the processing logic is inconsistent and the parameter configuration is not unified, the accuracy and stability of subsequent AI analysis results are hard to guarantee, which severely limits the overall efficiency and user experience of the vehicle-mounted AI system.
Disclosure of Invention
The invention provides a Bluetooth call audio processing method and device that aim to overcome the shortcomings of the prior art, reduce development complexity and integration cost on the AI application side, and significantly improve system stability, processing efficiency, and the reliability of the overall voice interaction experience in multi-model collaborative processing scenarios.
To achieve the above purpose, the invention adopts the following technical scheme. In one aspect, the invention provides a Bluetooth call audio processing method, including: receiving original audio data from at least one Bluetooth call device; packaging the original audio data with the corresponding Bluetooth call device type identifier to form an audio data packet; obtaining, based on a pre-stored Bluetooth call device configuration file, processing configuration information corresponding to the device type identifier in the audio data packet, wherein the processing configuration information comprises at least audio format parameters and processing parameters; standardizing the original audio data according to the processing configuration information to generate standard audio data conforming to a unified format; and notifying at least one AI application of an application layer of the storage location of the standard audio data.
Specifically, the Bluetooth call device configuration file is preset with a configuration information entry for each of a plurality of different Bluetooth call device types. Each configuration information entry comprises a device type identifier and the corresponding audio basic format parameters and audio processing parameters. The audio basic format parameters comprise at least one of a sampling rate, a channel number, and a bit width, and the audio processing parameters comprise a gain factor and a noise reduction factor.
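The packaging and configuration-lookup steps described above can be sketched as follows. This is a minimal sketch under stated assumptions: the class names, field names, device type identifiers (`headset_a`, `phone_b`), the numeric values in the entries, and the chosen standard format of 16 kHz, mono, 16-bit are all hypothetical and do not come from the source, which does not fix any concrete values.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DeviceConfig:
    """One configuration information entry for a device type."""
    sample_rate_hz: int   # audio basic format parameters
    channels: int
    bit_width: int
    gain_factor: float    # audio processing parameters
    noise_factor: float


@dataclass(frozen=True)
class AudioPacket:
    """Original audio data packaged with its device type identifier."""
    device_type_id: str
    raw_audio: bytes


# Hypothetical configuration file contents, keyed by device type identifier.
CONFIG_FILE: Dict[str, DeviceConfig] = {
    "headset_a": DeviceConfig(8000, 1, 16, gain_factor=1.5, noise_factor=0.8),
    "phone_b": DeviceConfig(16000, 2, 16, gain_factor=1.0, noise_factor=1.0),
}

# Assumed unified format: (sampling rate, channel number, bit width).
STANDARD_FORMAT = (16000, 1, 16)


def package(raw_audio: bytes, device_type_id: str) -> AudioPacket:
    """Form an audio data packet from raw audio and its device type."""
    return AudioPacket(device_type_id=device_type_id, raw_audio=raw_audio)


def lookup_config(packet: AudioPacket,
                  config_file: Dict[str, DeviceConfig]) -> DeviceConfig:
    """Fetch the entry for the packet's device type (KeyError if unknown)."""
    return config_file[packet.device_type_id]


def needs_conversion(cfg: DeviceConfig) -> bool:
    """True when the device format differs from the standard format,
    i.e. when resampling and/or recoding would be required."""
    return (cfg.sample_rate_hz, cfg.channels, cfg.bit_width) != STANDARD_FORMAT


if __name__ == "__main__":
    pkt = package(b"\x00\x01\x02\x03", "headset_a")
    cfg = lookup_config(pkt, CONFIG_FILE)
    print(cfg, needs_conversion(cfg))
```

Keeping the per-device parameters in a table keyed by device type identifier means a new device model can be supported by adding an entry, without changing any AI application.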
Specifically, standardizing the original audio data according to the processing configuration information includes: resampling and/or recoding the original audio data according to a result of comparing the audio basic format parameters with a preset standard audio format, wherein the standard audio format comprises a standard sampling rate, a standard channel number, and a standard bit width; performing gain amplification on the audio data according to the gain factor, and performing amplitude limiting on the gain-amplified data to keep its amplitude within a preset numerical range; and performing noise reduction on the audio data by spectral subtraction according to the noise reduction factor.
The noise reduction is specifically performed using the formula $|Y(k)| = |X(k)| - \mathrm{NoiseFactor} \times |N(k)|$, wherein $|Y(k)|$ is the enhanced amplitude spectrum, $|X(k)|$ is the amplitude spectrum of the noisy speech, $|N(k)|$ is the estimated noise amplitude spectrum, and NoiseFactor is the noise reduction factor. The noise amplitude spectrum $|N(k)|$ is estimated during an initial silence period after the call starts.
Further, the standardization processing includes: adding a data header to the processed audio data, wherein the data header comprises parameter information of the unified format and a data trailer format identifier corresponding to the device type identifier; and storing, in a memory, the standard audio data set to which the data header and the corresponding data trailer have been added.
Specifically, the Bluetooth call device configuration file su