
CN-122006254-A - Self-adaptive adjustment method for personalized character modeling

CN122006254A

Abstract

The invention discloses a self-adaptive adjustment method, device, and computer-readable storage medium for personalized character modeling. The method comprises: processing music audio data to extract static theme features and a music time-series emotion vector; acquiring and processing the player's voice signal to extract static age features and a real-time player state vector; performing weighted fusion of the music time-series emotion vector and the real-time player state vector, and combining the result with the static theme features and static age features to generate a fusion condition vector; inputting the fusion condition vector into a preset conditional generation model to generate a modeling parameter vector; and applying the modeling parameter vector to the player's basic character model to adjust the player's basic character styling in real time. The invention achieves a deeply personalized, temporally dynamic immersive experience while preserving the player's recognizable identity.

Inventors

  • ZHENG LINLIN
  • ZHU WEI
  • YU SHUYANG

Assignees

  • 深圳葫乐科技有限公司

Dates

Publication Date
2026-05-12
Application Date
2025-11-25

Claims (10)

  1. A method for adaptively adjusting a personalized character model, comprising: processing music audio data to extract static theme features characterizing the global style of the music and a music time-series emotion vector characterizing the real-time fluctuation of the music; acquiring and processing a player voice signal to extract static age features characterizing physiological attributes of the player and a real-time player state vector characterizing the real-time emotional state of the player; performing weighted fusion of the music time-series emotion vector and the real-time player state vector, and combining the result with the static theme features and static age features to generate a fusion condition vector; inputting the fusion condition vector into a preset conditional generation model to generate a modeling parameter vector; and applying the modeling parameter vector to the player's basic character model to adjust the player's basic character styling in real time.
  2. The personalized character modeling adaptive adjustment method according to claim 1, wherein processing the music audio data to extract static theme features characterizing the global style of the music and a music time-series emotion vector characterizing the real-time fluctuation of the music comprises: extracting a mel spectrogram of the music audio data; processing the mel spectrogram through a pre-trained convolutional recurrent neural network to generate the music time-series emotion vector; and processing metadata of the music audio data through a pre-trained natural language processing model to generate the static theme features.
  3. The personalized character modeling adaptive adjustment method according to claim 2, wherein processing the mel spectrogram through the pre-trained convolutional recurrent neural network to generate the music time-series emotion vector comprises: extracting a time-series feature sequence within a preset time window based on the mel spectrogram; extracting local features through a convolutional layer of the convolutional recurrent neural network and capturing temporal dependencies through a recurrent layer of the convolutional recurrent neural network to generate an intermediate emotion representation; and mapping the intermediate emotion representation through a fully connected layer to generate the music time-series emotion vector, wherein the music time-series emotion vector comprises an arousal dimension value and a valence (pleasure) dimension value.
  4. The personalized character modeling adaptive adjustment method of claim 1, wherein acquiring and processing the player voice signal to extract static age features characterizing physiological attributes of the player and a real-time player state vector characterizing the real-time emotional state of the player comprises: extracting mel-frequency cepstral coefficients, fundamental frequency, and formant frequencies of the player voice signal; processing the mel-frequency cepstral coefficients, the fundamental frequency, and the formant frequencies through a pre-trained deep neural network to generate the static age features; and extracting real-time acoustic features of the player voice signal within a rolling time window and processing the real-time acoustic features through a pre-trained convolutional bidirectional long short-term memory network to generate the real-time player state vector.
  5. The personalized character modeling adaptive adjustment method according to claim 4, wherein extracting real-time acoustic features of the player voice signal within the rolling time window and processing them through the pre-trained convolutional bidirectional long short-term memory network to generate the real-time player state vector comprises: computing spectrogram features and prosodic features of the player voice signal over the rolling time window to generate the real-time acoustic features; extracting local patterns through a convolutional layer of the convolutional bidirectional long short-term memory network and fusing forward and backward temporal context through a bidirectional long short-term memory layer of the network to generate a real-time emotion embedding; and weighting the real-time emotion embedding through an attention mechanism to generate the real-time player state vector, wherein the real-time player state vector comprises an arousal dimension value and a valence (pleasure) dimension value.
  6. The personalized character modeling adaptive adjustment method of claim 1, wherein performing weighted fusion of the music time-series emotion vector and the real-time player state vector and combining the result with the static theme features and static age features to generate the fusion condition vector comprises: computing, through a preset weight controller and based on the real-time player state vector, a music contribution weight for the music time-series emotion vector and a player contribution weight for the real-time player state vector, wherein the weight controller smooths the music feature sequence and the player state sequence over a historical time window so that the music contribution weight and the player contribution weight vary continuously over time; multiplying the music time-series emotion vector by the music contribution weight to generate a weighted music time-series emotion vector; multiplying the real-time player state vector by the player contribution weight to generate a weighted real-time player state vector; and concatenating the weighted music time-series emotion vector, the weighted real-time player state vector, the static theme features, and the static age features to generate the fusion condition vector.
  7. The personalized character modeling adaptive adjustment method according to claim 1, wherein inputting the fusion condition vector into the preset conditional generation model to generate the modeling parameter vector comprises: sampling a noise vector from a standard normal distribution; compressing, by an encoder of the conditional generation model, the fusion condition vector and the noise vector to generate a latent representation; and reconstructing the latent representation by a decoder of the conditional generation model to generate the modeling parameter vector, which comprises at least one of garment self-illumination intensity, garment self-illumination color, garment material UV scrolling speed, hair highlight intensity, makeup density, and accessory visibility.
  8. The personalized character modeling adaptive adjustment method according to claim 1, wherein applying the modeling parameter vector to the player's basic character model to adjust the player's basic character styling in real time comprises: extracting parameter values from the modeling parameter vector; setting the parameter values as material properties and component properties of the basic character model through a preset shader script of a rendering engine, so as to adjust the materials, lighting effects, makeup density, and accessory visibility of the basic character model; and rendering the adjusted basic character model through the rendering engine to achieve real-time adjustment of the player's basic character styling.
  9. A personalized character modeling adaptive adjustment apparatus, comprising a memory, a processor, and a personalized character modeling adaptive adjustment program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the personalized character modeling adaptive adjustment method according to any one of claims 1-8.
  10. A computer-readable storage medium, on which a personalized character modeling adaptive adjustment program is stored, wherein the program, when executed by a processor, implements the personalized character modeling adaptive adjustment method according to any one of claims 1 to 7.
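Claims 6 and 7 describe the fusion step as vector operations. The sketch below illustrates one plausible reading of claim 6: an exponential-moving-average weight controller gives the contribution weights temporal continuity, and the weighted vectors are concatenated with the static features. The smoothing formula, the arousal-driven weighting rule, the function names, and the vector sizes are all illustrative assumptions, not the patent's actual design.

```python
import numpy as np

class WeightController:
    """Hypothetical weight controller (claim 6): smooths the player state
    over time so the contribution weights vary continuously."""

    def __init__(self, alpha: float = 0.2):
        self.alpha = alpha            # smoothing factor (assumed)
        self.smoothed = None          # running average of player state

    def weights(self, player_state: np.ndarray):
        if self.smoothed is None:
            self.smoothed = player_state.copy()
        else:
            self.smoothed = (self.alpha * player_state
                             + (1 - self.alpha) * self.smoothed)
        # Assumption: higher player arousal -> player state drives the
        # styling; otherwise the music dominates.
        w_player = float(np.clip(self.smoothed[0], 0.0, 1.0))
        return 1.0 - w_player, w_player

def fuse(music_emotion, player_state, theme_feat, age_feat, controller):
    """Weighted fusion, then concatenation into the fusion condition vector."""
    w_music, w_player = controller.weights(player_state)
    return np.concatenate([w_music * music_emotion,
                           w_player * player_state,
                           theme_feat, age_feat])

ctrl = WeightController()
cond = fuse(np.array([0.6, 0.4]),   # music [arousal, valence]
            np.array([0.8, 0.5]),   # player [arousal, valence]
            np.ones(4), np.ones(2), ctrl)
print(cond.shape)  # (10,)
```

Because the smoothed player state is carried across calls, the weights cannot jump abruptly between frames, which is the temporal-continuity property claim 6 requires.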

Description

Self-adaptive adjustment method for personalized character modeling

Technical Field

The present invention relates to the field of virtual character design technologies, and in particular to a method, an apparatus, and a computer-readable storage medium for adaptively adjusting a personalized character model.

Background

In fields such as somatosensory games, virtual character interaction, and immersive entertainment, users increasingly demand personalized and dynamic virtual character appearances. Mainstream virtual social platforms and gaming systems typically provide users with a rich pool of character apparel, hairstyles, and makeup. Users assemble the appearance of their virtual characters mainly through manual selection, in-game store purchases, or completion of specific unlock tasks. This approach forms the technical basis of existing character personalization. To make the experience more dynamic, the industry has made several attempts. One common path is music-driven animation generation, such as generating dance video synchronized with music from static images through a multimodal framework; however, such solutions focus mainly on real-time rendering of action sequences and ignore dynamic adjustment of appearance attributes such as materials or makeup, leaving the music and the visual styling disconnected. Another path is audio-guided avatar customization, such as changing accessories according to the virtual environment context; however, such customization is often static: once generated, it is applied unchanged throughout the session and cannot respond to fluctuations in musical emotion or real-time user input. In summary, the prior art leaves significant gaps in dynamically fusing music with character styling and in preserving character continuity, and cannot achieve fine-grained, adaptive adjustment of a user's basic character styling.
Therefore, how to adaptively adjust personalized character styling in a dual-dynamic way, responding to both the temporal variation of the music and the player's real-time emotion while preserving the continuity of the player's basic character image, is a technical problem to be solved by those skilled in the art.

Disclosure of Invention

The embodiments of the present application provide a self-adaptive adjustment method for personalized character modeling, aiming to solve the technical problems in the prior art that virtual character styling is disconnected from the musical atmosphere, that the continuity of the player's basic image is broken, and that the adaptation effect is static and uniform. To achieve the above object, an embodiment of the present application provides a method for adaptively adjusting a personalized character model, including: processing music audio data to extract static theme features characterizing the global style of the music and a music time-series emotion vector characterizing the real-time fluctuation of the music; acquiring and processing a player voice signal to extract static age features characterizing physiological attributes of the player and a real-time player state vector characterizing the real-time emotional state of the player; performing weighted fusion of the music time-series emotion vector and the real-time player state vector, and combining the result with the static theme features and static age features to generate a fusion condition vector; inputting the fusion condition vector into a preset conditional generation model to generate a modeling parameter vector; and applying the modeling parameter vector to the player's basic character model to adjust the player's basic character styling in real time.
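The five claimed steps can be read as a single adjustment loop that runs once per update. The skeleton below makes that loop concrete; every function body, feature dimension, and parameter name is an illustrative stand-in (the patent does not disclose its models), and the "conditional generator" here is a toy random projection, not a trained encoder-decoder.

```python
import numpy as np

def process_music(audio: np.ndarray):
    """Step 1 (stand-in): static theme features + time-series emotion vector."""
    theme = np.array([1.0, 0.0])                     # e.g. a one-hot genre tag
    emotion = np.array([audio.std(), audio.mean()])  # toy [arousal, valence]
    return theme, emotion

def process_voice(voice: np.ndarray):
    """Step 2 (stand-in): static age feature + real-time player state vector."""
    age = np.array([0.3])                            # normalized age estimate
    state = np.array([voice.std(), voice.mean()])
    return age, state

def fuse(music_emotion, player_state, theme, age, w_music=0.5):
    """Step 3: weighted fusion plus concatenation."""
    return np.concatenate([w_music * music_emotion,
                           (1 - w_music) * player_state, theme, age])

def generate_params(cond: np.ndarray) -> np.ndarray:
    """Step 4: conditional generation (here a fixed random projection)."""
    rng = np.random.default_rng(0)
    W = rng.standard_normal((6, cond.size))
    return 1 / (1 + np.exp(-W @ cond))               # six styling params in (0, 1)

def apply_params(base_model: dict, params: np.ndarray) -> None:
    """Step 5: write the parameters into the base character model."""
    keys = ["glow_intensity", "glow_color", "uv_scroll_speed",
            "hair_highlight", "makeup_density", "accessory_visibility"]
    base_model.update(zip(keys, params))

model = {}
theme, emo = process_music(np.sin(np.linspace(0, 8, 512)))
age, state = process_voice(np.random.default_rng(1).standard_normal(256))
apply_params(model, generate_params(fuse(emo, state, theme, age)))
print(sorted(model))  # the six adjusted styling properties
```

In a real system the loop would be driven by the rolling analysis windows of claims 3-5 and the parameters handed to a rendering engine's shader properties rather than a plain dictionary.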
In one embodiment, processing the music audio data to extract static theme features characterizing the global style of the music and a music time-series emotion vector characterizing the real-time fluctuation of the music includes: extracting a mel spectrogram of the music audio data; processing the mel spectrogram through a pre-trained convolutional recurrent neural network to generate the music time-series emotion vector; and processing metadata of the music audio data through a pre-trained natural language processing model to generate the static theme features. In one embodiment, processing the mel spectrogram through the pre-trained convolutional recurrent neural network to generate the music time-series emotion vector comprises: extracting a time-series feature sequence within a preset time window based on the mel spectrogram; extracting local features through a convolutional layer of the convolutional recurrent neural network and capturing temporal dependencies through a recurrent layer of the network to generate an intermediate emotion representation; and mapping the intermediate emotion representation through a fully connected layer to generate the music time-series emotion vector.
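As a rough illustration of this path (convolution over windowed features, a recurrent layer for temporal dependencies, then a fully connected head emitting an arousal/valence pair), the NumPy toy below uses random placeholder weights and a single feature band in place of a real mel spectrogram; a practical implementation would use a trained convolutional recurrent network over actual mel-spectrogram frames.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels):
    """Local feature extraction (the 'convolutional layer')."""
    return np.stack([np.convolve(x, k, mode="valid") for k in kernels])

def simple_rnn(seq, Wx, Wh):
    """Capture temporal dependencies (the 'recurrent layer')."""
    h = np.zeros(Wh.shape[0])
    for x_t in seq.T:                      # iterate over time steps
        h = np.tanh(Wx @ x_t + Wh @ h)
    return h                               # intermediate emotion representation

def emotion_head(h, Wo):
    """Fully connected mapping to the music time-series emotion vector."""
    return np.tanh(Wo @ h)                 # [arousal, valence] in (-1, 1)

frames = rng.standard_normal(128)          # stand-in for one mel-band envelope
feat   = conv1d(frames, rng.standard_normal((4, 5)))   # 4 filters, width 5
hidden = simple_rnn(feat, rng.standard_normal((8, 4)),
                    rng.standard_normal((8, 8)))
emotion = emotion_head(hidden, rng.standard_normal((2, 8)))
print(emotion.shape)  # (2,)
```

The two output dimensions correspond to the arousal and valence values that claim 3 requires the emotion vector to carry.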