CN-121997055-A - Singing matching method, singing matching device, electronic device, storage medium and program product

Abstract

The embodiments of the present disclosure provide a singing matching method, a singing matching device, an electronic device, a storage medium and a program product. The method comprises: obtaining a training set of a singer and extracting a first singing characteristic of each singing content sample in the training set under a target singing attribute, wherein the training set is the set of singing content samples used when the singer is obtained through training; determining a second singing characteristic of the singer under the target singing attribute based on the first singing characteristics of the singing content samples; and determining a singing matching degree between the singer and content to be sung according to the second singing characteristic and a third singing characteristic of the content to be sung under the target singing attribute. With this technical scheme, the embodiments of the present disclosure broaden the ways in which the singing matching degree between the singer and the content to be sung can be determined.

Inventors

YANG CIYUAN
GENG ZIHAO
WANG XIAOYAO
HONG CHANGQING
WU XIAOQING

Assignees

北京字跳网络技术有限公司

Dates

Publication Date: 2026-05-08
Application Date: 2024-11-01

Claims (14)

1. A singing matching method, comprising: acquiring a training set of a singer, and extracting a first singing characteristic of each singing content sample in the training set under a target singing attribute, wherein the training set is the set of singing content samples used when the singer is obtained through training; determining a second singing characteristic of the singer under the target singing attribute based on the first singing characteristics of the singing content samples; and determining a singing matching degree between the singer and content to be sung according to the second singing characteristic and a third singing characteristic of the content to be sung under the target singing attribute.
2. The method of claim 1, wherein the target singing attribute comprises a vocal range attribute, and wherein the determining a second singing characteristic of the singer under the target singing attribute based on the first singing characteristics of the singing content samples comprises: counting, according to the first singing characteristic of each singing content sample, a first occurrence count of each attribute value of the target singing attribute in the singing content samples; generating an occurrence count curve of the target singing attribute according to the first occurrence counts; and determining, based on the derivative value at each attribute value in the occurrence count curve, a first attribute value range of the singer under the target singing attribute as the second singing characteristic of the singer under the target singing attribute.
3. The method of claim 2, further comprising, prior to the generating an occurrence count curve of the target singing attribute according to the first occurrence counts: determining valid occurrence counts and invalid occurrence counts among the first occurrence counts based on a preset attribute value range, wherein the attribute value corresponding to a valid occurrence count lies within the preset attribute value range, and the attribute value corresponding to an invalid occurrence count lies outside the preset attribute value range; and obtaining, from the invalid occurrence counts, a target occurrence count that meets a preset condition, and reclassifying any valid occurrence count smaller than the target occurrence count as an invalid occurrence count; wherein the generating an occurrence count curve of the target singing attribute according to the first occurrence counts comprises: generating the occurrence count curve of the target singing attribute according to the remaining valid occurrence counts.
4. The method of claim 2, wherein the determining a first attribute value range of the singer under the target singing attribute based on the derivative value at each attribute value in the occurrence count curve comprises: acquiring, based on the derivative value at each attribute value in the occurrence count curve, the minimum attribute value and the maximum attribute value at which the absolute value of the derivative of the occurrence count curve is greater than or equal to a preset threshold; and determining the first attribute value range of the singer under the target singing attribute according to the minimum attribute value and the maximum attribute value.
5. The method of claim 2, wherein the third singing characteristic is a second attribute value range of the content to be sung under the target singing attribute, and wherein the determining a singing matching degree between the singer and the content to be sung according to the second singing characteristic and the third singing characteristic of the content to be sung under the target singing attribute comprises: acquiring a sub-attribute value range in which the first attribute value range and the second attribute value range overlap; and calculating the proportion of attribute values in the second attribute value range that lie within the sub-attribute value range, as the singing matching degree between the singer and the content to be sung under the target singing attribute.
6. The method of claim 5, further comprising, prior to the acquiring a sub-attribute value range in which the first attribute value range and the second attribute value range overlap: extracting dry vocal data of the content to be sung from original singing data of the content to be sung, and counting, based on the dry vocal data, a second occurrence count of each attribute value of the target singing attribute in the content to be sung; and determining the second attribute value range of the content to be sung under the target singing attribute according to the second occurrence counts.
7. The method of claim 1, wherein the target singing attribute comprises a musical style attribute comprising a plurality of attribute types, and the second singing characteristic is a type matching degree between the singer and each of the attribute types.
8. The method of claim 7, wherein the determining a singing matching degree between the singer and the content to be sung according to the second singing characteristic and the third singing characteristic of the content to be sung under the target singing attribute comprises: for each attribute type of the target singing attribute, calculating the geometric mean of a first type matching degree of the singer and a second type matching degree of the content to be sung as the singing matching degree between the singer and the content to be sung under the attribute type, wherein the first type matching degree is the type matching degree between the singer and the attribute type, and the second type matching degree is the type matching degree between the content to be sung and the attribute type.
9. The method of claim 1, wherein the target singing attribute comprises a timbre attribute, and wherein the determining a singing matching degree between the singer and the content to be sung according to the second singing characteristic and the third singing characteristic of the content to be sung under the target singing attribute comprises: calculating the similarity between the second singing characteristic and the third singing characteristic of the content to be sung under the target singing attribute, and taking the similarity as the singing matching degree between the singer and the content to be sung under the target singing attribute.
10. The method according to any one of claims 1-9, wherein the singer is a candidate singer in a singer set, and the method further comprises, after the determining a singing matching degree between the singer and the content to be sung according to the second singing characteristic and the third singing characteristic of the content to be sung under the target singing attribute: selecting, according to the singing matching degree, at least one candidate singer from the singer set as a target singer of the content to be sung; and performing voice conversion on the original singing data of the content to be sung based on the target singer to obtain target singing data of the content to be sung.
11. A singing matching apparatus, comprising: a feature extraction module, configured to acquire a training set of a singer and extract a first singing characteristic of each singing content sample in the training set under a target singing attribute, wherein the training set is the set of singing content samples used when the singer is obtained through training; a feature determining module, configured to determine a second singing characteristic of the singer under the target singing attribute based on the first singing characteristics of the singing content samples; and a matching degree determining module, configured to determine a singing matching degree between the singer and content to be sung according to the second singing characteristic and a third singing characteristic of the content to be sung under the target singing attribute.
12. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the singing matching method of any one of claims 1-10.
13. A computer readable storage medium storing computer instructions for causing a processor to implement the singing matching method of any of claims 1-10 when executed.
14. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the singing matching method of any of claims 1-10.
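
The range-based matching of claims 2, 4 and 5 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes that attribute values are integer pitch numbers (for example MIDI note numbers), that the derivative is approximated by a central difference, and that ranges are inclusive; the function names `occurrence_curve`, `vocal_range` and `range_match` are hypothetical.

```python
from collections import Counter


def occurrence_curve(pitches_per_sample):
    """Count how often each pitch value occurs across all training samples."""
    counts = Counter()
    for pitches in pitches_per_sample:
        counts.update(pitches)
    return counts


def vocal_range(counts, threshold):
    """Return (min, max) pitch at which |derivative| >= threshold, or None.

    Assumption: the derivative of the occurrence count curve is approximated
    by a central difference over integer pitch steps.
    """
    lo, hi = min(counts), max(counts)
    # Pad one step on each side so the curve edges have a defined derivative.
    values = list(range(lo - 1, hi + 2))
    curve = [counts.get(v, 0) for v in values]
    steep = [
        values[i]
        for i in range(1, len(values) - 1)
        if abs((curve[i + 1] - curve[i - 1]) / 2.0) >= threshold
    ]
    return (min(steep), max(steep)) if steep else None


def range_match(singer_range, song_range):
    """Proportion of the song's range that lies inside the singer's range."""
    overlap_lo = max(singer_range[0], song_range[0])
    overlap_hi = min(singer_range[1], song_range[1])
    if overlap_hi < overlap_lo:
        return 0.0  # the two ranges do not overlap
    # Inclusive integer ranges (an assumption of this sketch).
    return (overlap_hi - overlap_lo + 1) / (song_range[1] - song_range[0] + 1)


# Toy occurrence counts shaped like a bell curve around pitch 61.
counts = Counter({57: 1, 58: 4, 59: 9, 60: 12, 61: 13, 62: 12, 63: 9, 64: 4, 65: 1})
singer = vocal_range(counts, threshold=3)   # first attribute value range
score = range_match(singer, (60, 67))       # song range is a made-up example
```

With a threshold of 3, only the steep flanks of the toy curve qualify, so the estimated range is narrower than the full span of observed pitches. The outlier filtering of claim 3 is not covered by this sketch.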

Description

Singing matching method, singing matching device, electronic device, storage medium and program product

Technical Field

The embodiments of the present disclosure relate to the technical field of computers, and in particular to a singing matching method, a singing matching device, an electronic device, a storage medium and a program product.

Background

In music technology, singing voice conversion (SVC) is a product of artificial intelligence and computer technology that converts the singing voice in a vocal track into the singing voice of another singer.

However, in the related art, when singing voice conversion is performed, the singing matching degree between a singer and a song is determined in only a single way, and the accuracy of the determined singing matching degree is poor.

Disclosure of Invention

The embodiments of the present disclosure provide a singing matching method, a singing matching device, an electronic device, a storage medium and a program product, which broaden the ways in which the singing matching degree can be determined and improve the accuracy of the determined singing matching degree.

In a first aspect, an embodiment of the present disclosure provides a singing matching method, including: acquiring a training set of a singer, and extracting a first singing characteristic of each singing content sample in the training set under a target singing attribute, wherein the training set is the set of singing content samples used when the singer is obtained through training; determining a second singing characteristic of the singer under the target singing attribute based on the first singing characteristics of the singing content samples; and determining a singing matching degree between the singer and content to be sung according to the second singing characteristic and a third singing characteristic of the content to be sung under the target singing attribute.
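
The three steps of the first aspect can be sketched in Python for one attribute, here a musical style attribute with several types. This is only an illustration under stated assumptions: the per-sample style scores are taken as already extracted by some classifier that is not shown, averaging them into a singer-level profile is an assumption, and summing the per-type matching degrees to rank candidate singers is an assumption as well. The per-type geometric mean follows claim 8. All function and variable names are hypothetical.

```python
import math


def aggregate_singer_profile(sample_scores):
    """Combine per-sample style scores (first characteristics) into one
    singer-level profile (second characteristic).

    Assumption: plain averaging; the source does not fix the aggregation.
    """
    styles = sample_scores[0].keys()
    n = len(sample_scores)
    return {s: sum(sample[s] for sample in sample_scores) / n for s in styles}


def style_matching_degree(singer_profile, song_profile):
    """Per-type geometric mean of the singer's and the song's type scores."""
    return {s: math.sqrt(singer_profile[s] * song_profile[s]) for s in singer_profile}


def select_target_singer(candidates, song_profile):
    """Pick the candidate with the highest total matching degree.

    Assumption: per-type degrees are summed into one ranking score.
    """
    def total(name):
        return sum(style_matching_degree(candidates[name], song_profile).values())

    return max(candidates, key=total)


# Made-up per-sample style scores for two candidate singers.
training_scores = {
    "singer_a": [{"pop": 1.0, "rock": 0.0}, {"pop": 0.5, "rock": 0.5}],
    "singer_b": [{"pop": 0.0, "rock": 1.0}, {"pop": 0.5, "rock": 0.5}],
}
profiles = {name: aggregate_singer_profile(s) for name, s in training_scores.items()}
song = {"pop": 0.25, "rock": 0.75}  # third characteristic of the content to be sung
best = select_target_singer(profiles, song)
```

Because the singer-level profile is derived from the training samples themselves, no manual labeling of the singer is required in this sketch.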
In a second aspect, an embodiment of the present disclosure further provides a singing matching apparatus, including: a feature extraction module, configured to acquire a training set of a singer and extract a first singing characteristic of each singing content sample in the training set under a target singing attribute, wherein the training set is the set of singing content samples used when the singer is obtained through training; a feature determining module, configured to determine a second singing characteristic of the singer under the target singing attribute based on the first singing characteristics of the singing content samples; and a matching degree determining module, configured to determine a singing matching degree between the singer and content to be sung according to the second singing characteristic and a third singing characteristic of the content to be sung under the target singing attribute.

In a third aspect, an embodiment of the present disclosure further provides an electronic device, including: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the singing matching method described in the embodiments of the present disclosure.

In a fourth aspect, an embodiment of the present disclosure further provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the singing matching method described in the embodiments of the present disclosure.

In a fifth aspect, an embodiment of the present disclosure further provides a computer program product which, when executed by a computer, causes the computer to implement the singing matching method described in the embodiments of the present disclosure.
The singing matching method, device, electronic device, storage medium and program product provided by the embodiments of the present disclosure obtain a training set of a singer and extract a first singing characteristic of each singing content sample in the training set under a target singing attribute, wherein the training set is the set of singing content samples used when the singer is obtained through training; determine a second singing characteristic of the singer under the target singing attribute based on the first singing characteristics of the singing content samples; and determine a singing matching degree between the singer and content to be sung according to the second singing characteristic of the singer under the target singing attribute and a third singing characteristic of the content to be sung under the target singing attribute. In this technical scheme, the singing characteristics of the singer are determined from the singing characteristics of each singing content sample in the singer's training set, so no manual listening and labeling is needed; the ways of determining the singing characteristics of the singer can be broadened, and the time consumed for determining the singing chara