CN-122024767-A - Automatic identification method, device and equipment for audio data format and readable storage medium

Abstract

An automatic identification method, apparatus, device, and readable storage medium for an audio data format. The method determines a consistency score for each of at least two hypothesis formats. Determining the consistency score for a hypothesis format comprises: parsing the original audio data according to the hypothesis format to obtain a parsing result; extracting features from the parsing result; converting the parsing result into a target format to obtain converted audio data; extracting features from the converted audio data; and calculating a consistency score between the two sets of extracted features. The hypothesis format with the highest consistency score is taken as the real format of the original audio data. The application can automatically and accurately infer the real format of the original audio data inside a DSP without relying on any external header information or manual intervention.
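
As a rough illustration of the scoring loop summarized in the abstract, the following Python sketch interprets the same raw bytes under two hypothesis formats, compares features taken before and after conversion to a target format, and keeps the best-scoring hypothesis. Everything concrete here is an assumption made for illustration and is not taken from the application: the two hypothesis formats (`q15` and `f32`), the little-endian byte order, the target format (finite floats clipped to [-1.0, 1.0]), the reduced feature set (peak and RMS only), the relative-difference similarity, and the equal weights.

```python
import math
import struct

# Hypothetical hypothesis table: name -> (struct code, bytes per sample, scale).
# "q15" is a 16-bit fixed-point fraction; "f32" is an IEEE-754 single float.
HYPOTHESES = {
    "q15": ("h", 2, 1.0 / 32768.0),
    "f32": ("f", 4, 1.0),
}


def parse(raw, name):
    """Interpret raw little-endian bytes under one hypothesis format."""
    code, width, scale = HYPOTHESES[name]
    count = len(raw) // width
    values = struct.unpack("<%d%s" % (count, code), raw[: count * width])
    return [v * scale for v in values]


def to_target(samples):
    """Assumed target format: finite floats clipped to [-1.0, 1.0]."""
    return [max(-1.0, min(1.0, s)) if math.isfinite(s) else 0.0 for s in samples]


def features(samples):
    """Peak and RMS only, to keep the sketch short (non-finite values skipped)."""
    finite = [s for s in samples if math.isfinite(s)]
    if not finite:
        return {"peak": 0.0, "rms": 0.0}
    peak = max(abs(s) for s in finite)
    rms = math.sqrt(sum(s * s for s in finite) / len(finite))
    return {"peak": peak, "rms": rms}


def similarity(a, b):
    """1.0 when equal, decreasing as the relative gap grows (a, b >= 0)."""
    largest = max(abs(a), abs(b))
    if largest == 0.0:
        return 1.0
    return 1.0 - abs(a - b) / largest


def consistency_score(raw, name, weights=None):
    """Weighted sum of per-component similarities for one hypothesis."""
    weights = weights or {"peak": 0.5, "rms": 0.5}  # assumed equal weights
    parsed = parse(raw, name)
    before = features(parsed)
    after = features(to_target(parsed))
    return sum(w * similarity(before[k], after[k]) for k, w in weights.items())


def identify_format(raw):
    """Return the best-scoring hypothesis name and all scores."""
    scores = {name: consistency_score(raw, name) for name in HYPOTHESES}
    return max(scores, key=scores.get), scores
```

For example, `identify_format(struct.pack("<4h", 0, 0x4000, 0, 0x4000))` selects `q15`, because reading those bytes as 32-bit floats yields values of 2.0 that the assumed target format clips. With such a reduced feature set the sketch cannot separate every pair of formats; the claims additionally list correlation and quantization noise components.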

Inventors

ZHAO SHIQI
JIN HUIJIE
YANG LING
Xu Lantu

Assignees

湖北芯擎科技股份有限公司

Dates

Publication Date: 20260512
Application Date: 20260415

Claims (10)

1. An automatic identification method for an audio data format, characterized in that the method comprises the following steps: determining a consistency score for each of at least two hypothesis formats, wherein determining the consistency score corresponding to each hypothesis format comprises: parsing the original audio data according to the hypothesis format to obtain a parsing result; extracting first features from the parsing result; converting the parsing result into a target format to obtain converted audio data; extracting second features from the converted audio data, wherein the first features and the second features contain identical feature components, the feature components comprise at least one of statistical feature components, correlation feature components, and quantization noise feature components, the statistical feature components comprise mean, variance, peak value, and root mean square, the correlation feature components comprise inter-channel correlation coefficients, and the quantization noise feature components comprise quantization noise variance and histogram distribution difference; calculating the similarity between corresponding feature components of the first features and the second features; and computing a weighted sum of all the similarities to obtain the consistency score; and taking the hypothesis format corresponding to the highest consistency score as the real format of the original audio data.
2. The method of claim 1, wherein the hypothesis format includes a hypothetical data type and a hypothetical audio channel layout, and wherein different hypothesis formats have different hypothetical data types and/or hypothetical audio channel layouts.
3. The method of claim 1, wherein the original audio data is audio data inside a DSP that is to be input to an audio algorithm module.
4. The method according to any one of claims 1 to 3, further comprising, after the hypothesis format corresponding to the highest consistency score is taken as the real format of the original audio data: determining a format conversion path according to the real format and the target format; and converting the original audio data into the target format according to the format conversion path, and inputting the conversion result into an audio algorithm module.
5. An automatic audio data format recognition apparatus, characterized in that the apparatus comprises: a hypothesis analysis module for determining a consistency score for each of at least two hypothesis formats, wherein determining the consistency score corresponding to each hypothesis format comprises: parsing the original audio data according to the hypothesis format to obtain a parsing result; extracting first features from the parsing result; converting the parsing result into a target format to obtain converted audio data; extracting second features from the converted audio data, wherein the first features and the second features contain identical feature components, the feature components comprise at least one of statistical feature components, correlation feature components, and quantization noise feature components, the statistical feature components comprise mean, variance, peak value, and root mean square, the correlation feature components comprise inter-channel correlation coefficients, and the quantization noise feature components comprise quantization noise variance and histogram distribution difference; calculating the similarity between corresponding feature components of the first features and the second features; and computing a weighted sum of all the similarities to obtain the consistency score; and an identification module for taking the hypothesis format corresponding to the highest consistency score as the real format of the original audio data.
6. The apparatus of claim 5, wherein the hypothesis format includes a hypothetical data type and a hypothetical audio channel layout, and wherein different hypothesis formats have different hypothetical data types and/or hypothetical audio channel layouts.
7. The apparatus of claim 5, wherein the original audio data is audio data inside a DSP that is to be input to an audio algorithm module.
8. The apparatus according to any one of claims 5 to 7, further comprising a conversion module for: determining a format conversion path according to the real format and the target format; and converting the original audio data into the target format according to the format conversion path, and inputting the conversion result into an audio algorithm module.
9. An automatic audio data format recognition device, characterized in that it comprises a processor, a memory, and an automatic audio data format recognition program stored on the memory and executable by the processor, wherein the automatic audio data format recognition program, when executed by the processor, implements the steps of the automatic audio data format recognition method according to any one of claims 1 to 4.
10. A computer-readable storage medium, wherein an audio data format automatic recognition program is stored on the computer-readable storage medium, wherein the audio data format automatic recognition program, when executed by a processor, implements the steps of the audio data format automatic recognition method according to any one of claims 1 to 4.
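
The feature components and the weighted similarity sum recited in claim 1 can be pictured with a short Python sketch. It covers the statistical components (mean, variance, peak, root mean square) and one inter-channel correlation coefficient for a two-channel signal; the quantization noise components are left out. The function names, the relative-difference similarity measure, and the normalization by the total weight are assumptions made for illustration, since the claims do not fix a particular similarity formula or weight values.

```python
import math


def statistical_features(x):
    """Mean, variance, peak, and RMS of one sample sequence."""
    n = len(x)
    mean = sum(x) / n
    variance = sum((v - mean) ** 2 for v in x) / n
    peak = max(abs(v) for v in x)
    rms = math.sqrt(sum(v * v for v in x) / n)
    return {"mean": mean, "variance": variance, "peak": peak, "rms": rms}


def channel_correlation(left, right):
    """Pearson correlation coefficient between two channels."""
    n = len(left)
    ml, mr = sum(left) / n, sum(right) / n
    cov = sum((a - ml) * (b - mr) for a, b in zip(left, right))
    norm = math.sqrt(
        sum((a - ml) ** 2 for a in left) * sum((b - mr) ** 2 for b in right)
    )
    return cov / norm if norm > 0.0 else 0.0


def feature_vector(left, right):
    """Feature components for a two-channel signal given as two lists."""
    feats = statistical_features(left + right)
    feats["correlation"] = channel_correlation(left, right)
    return feats


def component_similarity(a, b):
    """Assumed measure: 1.0 for equal values, clamped at 0.0 for large gaps."""
    scale = max(abs(a), abs(b))
    return 1.0 if scale == 0.0 else max(0.0, 1.0 - abs(a - b) / scale)


def weighted_score(f1, f2, weights):
    """Weighted sum of component similarities, normalized by total weight."""
    total = sum(weights.values())
    return sum(
        w * component_similarity(f1[k], f2[k]) for k, w in weights.items()
    ) / total
```

Two feature sets computed from the same samples score 1.0; any component that changes during conversion, such as a peak reduced by clipping, pulls the score below 1.0 in proportion to its weight.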

Description

Automatic identification method, device and equipment for audio data format and readable storage medium

Technical Field

The present application relates to the field of audio data processing technologies, and in particular to an automatic identification method, apparatus, device, and computer-readable storage medium for an audio data format.

Background

In application scenarios where audio algorithms are integrated with a digital signal processor (DSP), adapting the audio data format is a critical and ubiquitous problem. Audio algorithms generally use floating-point (float) data types and planar layouts to achieve greater dynamic range and avoid arithmetic overflow, whereas DSP hardware often uses fixed-point fractional (fract) data types and interleaved layouts for data transmission and storage in pursuit of higher computational efficiency and lower power consumption.

In the prior art, this format difference is generally handled by manually writing conversion code: a developer hand-implements the format conversion and reverse conversion logic before and after algorithm processing. The conversion code must be custom developed for the characteristics of the specific target DSP platform (e.g., data type, layout, and alignment requirements).

However, this prior-art solution has the following significant drawbacks:

The algorithm developer must study the data format, memory layout, and instruction set of the target DSP platform in depth in order to implement the data conversion correctly. When an algorithm needs to be migrated to a different DSP platform, the conversion logic must be rewritten from scratch, resulting in a large amount of repeated development work.
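
The kind of hand-written conversion code described in the background can be sketched in Python as follows. The sketch assumes a 16-bit Q15 fixed-point fraction as the fract type, which the text does not specify, and the function names are hypothetical. It converts interleaved fixed-point samples into per-channel float lists and back, with rounding and saturation on the return path.

```python
def q15_interleaved_to_float_planar(samples, channels):
    """Split interleaved Q15 integers into one float list per channel."""
    # samples[ch::channels] picks every sample belonging to channel ch.
    return [
        [s / 32768.0 for s in samples[ch::channels]] for ch in range(channels)
    ]


def float_planar_to_q15_interleaved(planes):
    """Inverse conversion: interleave channels and saturate to int16 range."""
    out = []
    for frame in zip(*planes):  # one tuple of per-channel values per frame
        for value in frame:
            q = int(round(value * 32768.0))
            out.append(max(-32768, min(32767, q)))
    return out
```

For a stereo buffer `[16384, -16384, 8192, -8192]`, the first function returns `[[0.5, 0.25], [-0.5, -0.25]]`, and the second maps that result back to the original buffer. Both functions depend on knowing the data type and channel count in advance, which is the hard-coded knowledge the application aims to infer automatically.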
Existing systems cannot automatically obtain, at run time, the format of the original data that is input to the algorithm module in the DSP. The format information must be obtained in advance by a developer through consulting documentation or manual debugging, and then hard-coded in the code. Once the source of the input data is uncertain, or the data comes from DSPs of different manufacturers, the algorithm module cannot automatically determine the data type and layout.

Disclosure of Invention

The application provides an automatic identification method, apparatus, device, and computer-readable storage medium for an audio data format, which can solve the technical problem in the prior art that the data format of original audio data in a DSP cannot be automatically identified.

In a first aspect, an embodiment of the present application provides an automatic identification method for an audio data format, the method comprising:

determining a consistency score for each of at least two hypothesis formats, wherein determining the consistency score corresponding to each hypothesis format comprises: parsing the original audio data according to the hypothesis format to obtain a parsing result; extracting first features from the parsing result; converting the parsing result into a target format to obtain converted audio data; extracting second features from the converted audio data, wherein the first features and the second features contain identical feature components, the feature components comprise at least one of statistical feature components, correlation feature components, and quantization noise feature components, the statistical feature components comprise mean, variance, peak value, and root mean square, the correlation feature components comprise inter-channel correlation coefficients, and the quantization noise feature components comprise quantization noise variance and histogram distribution difference; calculating the similarity between corresponding feature components of the first features and the second features; and computing a weighted sum of all the similarities to obtain the consistency score; and

taking the hypothesis format corresponding to the highest consistency score as the real format of the original audio data.

With reference to the first aspect, in one implementation, the hypothesis format comprises a hypothetical data type and a hypothetical audio channel layout, wherein different hypothesis formats have different hypothetical data types and/or hypothetical audio channel layouts.

With reference to the first aspect, in one implementation, the original audio data is audio data inside the DSP that is to be input to the audio algorithm module.

With reference to the first aspect, in one implementation, after the hypothesis format corresponding to the highest consistency score is taken as the real format of the original audio data, the method further includes: determining a format conversion path according to the real format an