
CN-121996811-A - Music style identification and classification method and system

CN121996811A

Abstract

The invention relates to the technical field of music information retrieval and discloses a method and system for identifying and classifying music styles. The method comprises: collecting target audio and multi-style reference audio data; extracting and combining music acoustic features from the audio data to construct a target music acoustic feature vector; searching, via an optimization algorithm over a preset standard feature vector matrix, for the most suitable music style classifier type; performing classification on the target feature vector with the matched classifier to obtain target music style classification data containing a predicted style label and its probability; and finally evaluating the credibility of the classification result and generating a report. The system comprises an audio signal processing module, a feature processing and matching module, and a music style classification execution module. The method improves the robustness and discriminability of the feature layer and lays a reliable data foundation for high-precision music style classification.

Inventors

  • LIN NANXI

Assignees

  • 闽南师范大学 (Minnan Normal University)

Dates

Publication Date
2026-05-08
Application Date
2026-04-10

Claims (10)

  1. A method for identifying and classifying a music style, the method comprising the steps of: S1, collecting target audio data and multi-style reference audio data; S2, performing sampling rate measurement processing on the target audio data and the multi-style reference audio data to generate target audio sampling rate data and reference audio sampling rate data; S3, performing sampling rate consistency judgment processing on the target audio and the reference audio to generate audio sampling rate consistency judgment data, and directly executing step S5 when the sampling rates are consistent; S4, when the sampling rates are inconsistent, performing sampling rate adjustment processing on the target audio data to generate target audio adjustment data; S5, performing music acoustic feature parameter extraction and combination processing based on the target audio adjustment data (which is the unmodified original target audio data when the sampling rates are consistent) to construct a target music acoustic feature vector; S6, performing music style classifier type matching processing based on the target audio adjustment data to generate target music style classifier type feature data; S7, performing music style classification processing based on the target music acoustic feature vector, the target music style classifier type feature data and a music style tag library to generate target music style classification data; and S8, performing reliability evaluation processing on the music style classification result according to the target music style classification data to generate a music style classification reliability evaluation report.
  2. The method for identifying and classifying a music style according to claim 1, wherein the step of collecting the target audio data and the multi-style reference audio data in S1 comprises the steps of: S11, acquiring a music file and an audio stream to be identified from an audio database through audio acquisition equipment as the target audio data; S12, acquiring an audio file set covering a plurality of predefined music styles from a pre-constructed music style sample library as the multi-style reference audio data, wherein the music styles comprise three or more of classical, jazz, rock, pop, electronic, folk, hip hop, blues and country.
  3. The method according to claim 1, wherein the step of performing the sampling rate measurement processing of the target audio data and the multi-style reference audio data in S2 and generating the target audio sampling rate data and the reference audio sampling rate data comprises the steps of: S21, respectively importing the target audio data and the multi-style reference audio data into an audio analysis platform; S22, extracting, on the audio analysis platform, the audio sampling rate information corresponding to the target audio data by adopting a metadata analysis algorithm and a time-frequency analysis algorithm to generate the target audio sampling rate data, and extracting the audio sampling rate information corresponding to the multi-style reference audio data to generate the reference audio sampling rate data.
  4. The method for identifying and classifying a music style according to claim 1, wherein the step S3 of performing the sampling rate consistency judgment processing of the target audio and the reference audio, generating the audio sampling rate consistency judgment data, and directly performing step S5 when the sampling rates are consistent includes the steps of: S31, acquiring the target audio sampling rate data and the reference audio sampling rate data; S32, numerically comparing the target audio sampling rate data with the reference audio sampling rate data and generating the audio sampling rate consistency judgment data according to the comparison result; when the target audio sampling rate data and the reference audio sampling rate data match, the sampling rates are consistent, the audio sampling rate consistency judgment data is output as a consistency mark, and step S5 is directly executed; when they do not match, the sampling rates are inconsistent, and the audio sampling rate consistency judgment data is output as an inconsistency mark.
  5. The method for identifying and classifying a music style according to claim 1, wherein, when the sampling rates are inconsistent in S4, performing the sampling rate adjustment processing on the target audio data and generating the target audio adjustment data includes the step of: S41, when the audio sampling rate consistency judgment data is an inconsistency mark, adopting an audio resampling algorithm to adjust the sampling rate of the target audio data according to the reference audio sampling rate data, and generating the target audio adjustment data.
  6. The method for identifying and classifying a music style according to claim 1, wherein the step of performing the music acoustic feature parameter extraction and combination processing based on the target audio adjustment data in S5 to construct the target music acoustic feature vector comprises the steps of: S51, extracting music acoustic features from the target audio data with a consistent sampling rate or from the target audio adjustment data, wherein the extracted acoustic features comprise time domain features, frequency domain features and time-frequency domain features; the time domain features comprise the zero-crossing rate, short-time energy and amplitude envelope; the frequency domain features comprise Mel-frequency cepstral coefficients, spectral centroid, spectral bandwidth and spectral roll-off point; and the time-frequency domain features comprise chromagrams and Mel spectrograms; S52, normalizing and splicing all the extracted music acoustic features to construct a multidimensional, unified digital feature representation, namely the target music acoustic feature vector.
  7. The method for identifying and classifying a music style according to claim 1, wherein the step of performing the music style classifier type matching processing based on the target audio adjustment data in S6 and generating the target music style classifier type feature data includes the steps of: S61, establishing standard music acoustic feature vector matrices corresponding to different music style classification algorithms, wherein the matrices store feature space centers and representative feature vectors obtained by learning of the various classifiers on a standard training set, and the different music style classification algorithms comprise three or more of a support vector machine, a random forest, a gradient boosting decision tree, a convolutional neural network, a recurrent neural network and an attention mechanism model; S62, performing similarity calculation and matching between the target music acoustic feature vector and each standard feature vector in the standard music acoustic feature vector matrix, searching out the music style classification algorithm type information corresponding to the standard feature vector that best matches the target music acoustic feature vector, and generating the target music style classifier type feature data through data identification, wherein the specific operation steps for generating the target music style classifier type feature data are as follows: S621, initialization: setting the maximum number of iterations of the optimization algorithm and randomly initializing the positions of the intelligent optimization algorithm population in the search space of the standard music acoustic feature vector matrix, wherein each individual position corresponds to a candidate classifier type; S622, exploration stage: simulating the global exploration behavior of the intelligent optimization algorithm population in the search space, calculating and updating a new position for each individual according to its current position, a random step length and the currently known optimal solution position, wherein the new position represents a potential optimal classifier type selection, and calculating the matching fitness value of the classifier corresponding to the new position with respect to the target music acoustic feature vector; the matching fitness value F is defined by the following formula: F = α·cos(V_t, V_s) + (1 − α)·e^(−β·d(V_t, V_s)); wherein F is the matching fitness value, which measures the degree of matching between the candidate classifier type and the target audio, a higher value indicating a better match; V_t is the target music acoustic feature vector; V_s is a standard feature vector in the standard music acoustic feature vector matrix; cos(·,·) is the function calculating the cosine similarity between the two vectors; d(·,·) is the function calculating the Euclidean distance between the two vectors; α is an adjustable weight coefficient between 0 and 1 that balances the influence of similarity and distance; β is a positive coefficient adjusting the influence intensity of the distance term; and e is the natural exponential function; S623, exploitation stage: simulating the local exploitation behavior of the intelligent optimization algorithm population in the search space, finely searching near the current position of each individual, calculating a finer new position and evaluating the matching fitness value of the classifier corresponding to that position; S624, after each position update in the exploration stage and the exploitation stage, comparing the matching fitness values of the new and old positions, replacing the old position with the new one when the fitness of the new position is better, and recording the optimal individual position in the current population and the classifier type corresponding to it; S625, repeatedly executing steps S622 to S624 until the preset maximum number of iterations is reached, and finally outputting the music style classifier type corresponding to the optimal individual position and marking it as the target music style classifier type feature data.
  8. The method for identifying and classifying a music style according to claim 1, wherein the step of generating the target music style classification data in S7 based on the target music acoustic feature vector, the target music style classifier type feature data and a music style tag library includes the steps of: S71, combining the target music acoustic feature vector and the target music style classifier type feature data to construct a music style classification input data packet; S72, loading, by the audio analysis platform, the corresponding music style classifier model according to the target music style classifier type feature data; S73, inputting the target music acoustic feature vector into the loaded music style classifier model, the classifier model calculating the probability distribution of the feature vector over each predefined style in the music style tag library; and S74, selecting the music style label with the highest probability as the final classification result, and generating classification data comprising the target audio identification, the predicted music style label and its probability, namely the target music style classification data.
  9. The method for identifying and classifying a music style according to claim 1, wherein S8 comprises the steps of: S81, acquiring the target music style classification data, in particular its predicted probability value; S82, obtaining the similarity score of the standard feature vector that best matches the target music acoustic feature vector, obtained in the matching process of S6; S83, based on the predicted probability value and the similarity score, performing a fusion calculation in combination with a preset reliability calculation rule to generate a comprehensive reliability score; the comprehensive reliability score C is defined by the following formula: C = w1·P + w2·S − λ·√((P − P̄)² + (S − S̄)²); wherein C is the comprehensive reliability score; P is the predicted highest probability value derived from the target music style classification data; S is the similarity score of the standard feature vector that best matches the target music acoustic feature vector; w1 and w2 are the weight coefficients of P and S respectively; P̄ and S̄ are the reference mean values of P and S respectively, obtained by statistics on historical data; λ is the penalty coefficient adjusting the degree of negative influence of result dispersion on the reliability; and √ is the square root operation; S84, judging the reliability level of the classification result according to the threshold interval in which the comprehensive reliability score falls, and formatting and generating a music style classification reliability evaluation report containing the target audio identification, the classification result, the reliability score and level, and the evaluation description.
  10. A music style identification and classification system for implementing the music style identification and classification method according to any one of claims 1-9, the system comprising: an audio signal processing module, which acquires target and reference audio data through an audio acquisition and processing unit, analyzes audio sampling rate information with a sampling rate measurement unit, generates consistency judgment data through a sampling rate consistency judgment unit, and, when the judgment is inconsistent, calls a sampling rate adjustment unit to process and output adjusted target audio data; a feature processing and matching module, which receives the output of the audio signal processing module, calculates the time domain, frequency domain and time-frequency domain features of the audio through an acoustic feature extraction unit, integrates them with a feature vector construction unit to generate the target music acoustic feature vector, and, combining a pre-stored standard feature vector matrix and an optimization algorithm, searches for and outputs the optimal target music style classifier type feature data according to the matching fitness value through a classifier matching unit; and a music style classification execution module, which receives the output of the feature processing and matching module, packages the feature vector and the classifier type data through a classification input construction unit, executes style classification inference with a classifier invocation and calculation unit to generate the target music style classification data, analyzes the reliability of the classification result according to the comprehensive reliability score through a reliability evaluation unit, and generates and outputs the music style classification reliability evaluation report.
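The two scoring formulas in claims 7 and 9 can be illustrated numerically. The following is a minimal sketch, not the patented implementation: the parameter values (alpha, beta, w1, w2, the reference means p_ref and s_ref, and the penalty coefficient lam) are hypothetical placeholders, and the dispersion term in the reliability score is reconstructed from the claim's verbal definitions.

```python
import numpy as np

def match_fitness(v_t, v_s, alpha=0.6, beta=0.1):
    """Claim 7 matching fitness: F = alpha*cos(v_t, v_s) + (1 - alpha)*exp(-beta*d(v_t, v_s)).

    v_t: target music acoustic feature vector; v_s: a standard feature vector.
    alpha in (0, 1) balances similarity against distance; beta > 0 scales the distance term.
    """
    cos_sim = np.dot(v_t, v_s) / (np.linalg.norm(v_t) * np.linalg.norm(v_s))
    dist = np.linalg.norm(v_t - v_s)
    return alpha * cos_sim + (1 - alpha) * np.exp(-beta * dist)

def confidence_score(p, s, w1=0.5, w2=0.5, p_ref=0.8, s_ref=0.7, lam=0.3):
    """Claim 9 comprehensive reliability score (reconstructed form):
    C = w1*P + w2*S - lam*sqrt((P - P_ref)**2 + (S - S_ref)**2),
    where P_ref and S_ref are historical reference means and lam penalises dispersion.
    """
    return w1 * p + w2 * s - lam * np.sqrt((p - p_ref) ** 2 + (s - s_ref) ** 2)

# An identical target and standard vector yields the maximum fitness of 1.0, and a
# result sitting exactly at the historical means incurs no dispersion penalty.
v = np.array([0.2, 0.5, 0.3])
print(round(match_fitness(v, v), 6))         # cos = 1, dist = 0 -> alpha + (1 - alpha) = 1.0
print(round(confidence_score(0.8, 0.7), 6))  # 0.5*0.8 + 0.5*0.7 = 0.75
```

Both formulas are monotone in the intuitive directions: fitness grows with cosine similarity and shrinks with Euclidean distance, while the reliability score is highest when the predicted probability and the matching similarity are both high and close to their historical norms.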

Description

Music style identification and classification method and system

Technical Field

The invention relates to the technical field of music information retrieval, in particular to a music style identification and classification method and system.

Background

Music information retrieval is an interdisciplinary research area concerned with extracting, analyzing and retrieving information from music; it studies the computational tools, systems and knowledge representations required to find, organize, analyze and interact with music-related data. At present, when automatic identification and classification of music styles is performed, sampling rate differences exist between the collected target audio and the audio of a reference audio library because of their different sources. When no effective, adaptive preprocessing is applied uniformly and features are extracted directly from the differing data, the extracted acoustic features carry a reference deviation, which in turn affects the accuracy of all subsequent analysis. A method and system for identifying and classifying music styles are therefore provided to solve the above problems.

Disclosure of the Invention

Aiming at the defects of the prior art, the invention provides a music style identification and classification method and system, which solve the problem described in the background art that the collected target audio and the audio of a reference audio library have a sampling rate difference due to different sources, affecting the accuracy of all subsequent analysis.
In order to achieve the above purpose, the invention provides a method and system for identifying and classifying music styles, wherein the method comprises the following steps: S1, collecting target audio data and multi-style reference audio data; S2, performing sampling rate measurement processing on the target audio data and the multi-style reference audio data to generate target audio sampling rate data and reference audio sampling rate data; S3, performing sampling rate consistency judgment processing on the target audio and the reference audio to generate audio sampling rate consistency judgment data, and directly executing step S5 when the sampling rates are consistent; S4, when the sampling rates are inconsistent, performing sampling rate adjustment processing on the target audio data to generate target audio adjustment data; S5, performing music acoustic feature parameter extraction and combination processing based on the target audio adjustment data (which is the unmodified original target audio data when the sampling rates are consistent) to construct a target music acoustic feature vector; S6, performing music style classifier type matching processing based on the target audio adjustment data to generate target music style classifier type feature data; S7, performing music style classification processing based on the target music acoustic feature vector, the target music style classifier type feature data and a music style tag library to generate target music style classification data; and S8, performing reliability evaluation processing on the music style classification result according to the target music style classification data to generate a music style classification reliability evaluation report.
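The feature construction of steps S5/S51-S52 can be sketched with a small time-domain-only example. This is an assumption-laden simplification: it covers only the three time-domain features named above (zero-crossing rate, short-time energy, amplitude envelope), summarises each per-frame series by its mean and standard deviation, and uses min-max normalization before splicing; the frame length, hop size and summary statistics are hypothetical choices, and the full method additionally splices frequency-domain and time-frequency-domain features.

```python
import numpy as np

def frame_signal(y, frame_len=1024, hop=512):
    """Split a mono signal into overlapping frames (shape: n_frames x frame_len)."""
    n_frames = 1 + max(0, len(y) - frame_len) // hop
    return np.stack([y[i * hop : i * hop + frame_len] for i in range(n_frames)])

def time_domain_features(y):
    """Per-frame zero-crossing rate, short-time energy and amplitude envelope (S51)."""
    frames = frame_signal(y)
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    energy = np.mean(frames ** 2, axis=1)
    envelope = np.max(np.abs(frames), axis=1)
    return zcr, energy, envelope

def build_feature_vector(y):
    """Summarise each feature series, then normalise and splice into one vector (S52)."""
    series = time_domain_features(y)
    raw = np.array([stat for f in series for stat in (f.mean(), f.std())])
    lo, hi = raw.min(), raw.max()
    return (raw - lo) / (hi - lo + 1e-12)  # min-max normalisation to [0, 1]

# Example: one second of a 440 Hz sine at a 22050 Hz sampling rate.
sr = 22050
t = np.arange(sr) / sr
vec = build_feature_vector(np.sin(2 * np.pi * 440 * t))
print(vec.shape)  # (6,)
```

Summarising per-frame series into fixed-length statistics is one common way to obtain the "multidimensional, unified digital feature representation" the method calls for, since classifiers require inputs of constant dimensionality regardless of audio duration.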
Preferably, the step of collecting the target audio data and the multi-style reference audio data in step S1 includes the following steps: S11, acquiring a music file and an audio stream to be identified from an audio database through audio acquisition equipment as the target audio data; S12, acquiring an audio file set covering a plurality of predefined music styles from a pre-constructed music style sample library as the multi-style reference audio data, wherein the music styles comprise three or more of classical, jazz, rock, pop, electronic, folk, hip hop, blues and country. Preferably, the step of performing the sampling rate measurement processing on the target audio data and the multi-style reference audio data in S2 and generating the target audio sampling rate data and the reference audio sampling rate data includes the following steps: S21, respectively importing the target audio data and the multi-style reference audio data into an audio analysis platform; S22, extracting, on the audio analysis platform, the audio sampling rate information corresponding to the target audio data by adopting a metadata analysis algorithm and a time-frequency analysis algorithm to generate the target audio sampling rate data, and extracting the audio sampling rate information corresponding to the multi-style reference audio data to generate the reference audio sampling rate data. Preferably, in step S3, the sampling rate consistency judgment processing of the target audio and the reference audio is performed, the audio sampling rate consistency judgment data is generated, and when the sampling rates are consistent, step S5 is directly performed, including the foll