CN-122019800-A - Management system for digital processing of sound image files and retrieval method thereof


Abstract

The invention relates to the technical field of file management, and in particular to a management system for digital processing of audio-visual files. The system comprises a data acquisition and monitoring module, a content perception processing module, a data generation management module, a hierarchical storage and preservation module, and a retrieval module, connected in sequence. The data acquisition and monitoring module comprises a vibration sensor, a temperature and humidity sensor, and a signal analyzer; the content perception processing module comprises a content perception analysis unit; the hierarchical storage and preservation module predicts file access heat based on an ARIMA time series model and performs file integrity verification and dynamic storage classification by means of a Merkle tree; and the retrieval module supports voice content retrieval through Word2Vec word embedding vectors.
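The Merkle-tree integrity check mentioned above can be illustrated with a minimal sketch. It assumes SHA-256 hashing and a tiny fixed block size, neither of which is specified in the source; the function names and sample payload are hypothetical. The claims name a Rabin-Karp algorithm for locating the damaged block, whereas this sketch simply compares stored and current leaf hashes, which is a simpler substitute and not the patented method.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical block size in bytes, kept tiny for demonstration


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a file's bytes into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]


def leaf_hashes(data: bytes) -> list[bytes]:
    """SHA-256 hash of every block; these are the leaves of the tree."""
    return [hashlib.sha256(block).digest() for block in split_blocks(data)]


def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash pairs of nodes level by level until a single root remains."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


def damaged_blocks(original_leaves: list[bytes], current_data: bytes) -> list[int]:
    """Indices of blocks whose hash differs from the stored original."""
    current = leaf_hashes(current_data)
    n = max(len(original_leaves), len(current))
    return [i for i in range(n)
            if i >= len(original_leaves) or i >= len(current)
            or original_leaves[i] != current[i]]


# Hypothetical file content; the hashes are stored when the file is archived.
original = b"audio-visual archive payload"
stored_leaves = leaf_hashes(original)
stored_root = merkle_root(stored_leaves)

# Simulate corruption: flip one byte inside block 2 (bytes 8..11).
corrupted = bytearray(original)
corrupted[9] ^= 0xFF
corrupted = bytes(corrupted)

root_matches = merkle_root(leaf_hashes(corrupted)) == stored_root
bad = damaged_blocks(stored_leaves, corrupted)
print(root_matches, bad)
```

Comparing only the root is enough for the periodic check; the leaf comparison runs only when the roots differ, so the cost of locating damage is paid only for files that actually changed.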

Inventors

YANG JIAN
ZHANG ZHIDONG
WANG CHAO
WANG WEI
LIU XIAO
ZHANG XIAOHONG
SUN YU
PAN XIAOMIN
ZHANG AILING

Assignees

金川集团股份有限公司

Dates

Publication Date: 20260512
Application Date: 20251226

Claims (6)

1. A management system for digital processing of audio-visual files, characterized by comprising a data acquisition and monitoring module, a content perception processing module, a data generation management module, a hierarchical storage and preservation module, and a retrieval module which are sequentially connected, wherein: the data acquisition and monitoring module comprises a vibration sensor, a temperature and humidity sensor, and a signal analyzer, wherein the acquired data are fused using a Kalman filtering algorithm and the parameters of the acquisition equipment are dynamically adjusted; the content perception processing module comprises a content perception analysis unit, matches an audio processing strategy based on an XGBoost decision tree model, and performs environmental noise reduction through a convolutional neural network (CNN); the data generation management module constructs an event knowledge graph by combining a Whisper voice transcription model, a BERT entity tag extraction model, and a graph neural network (GNN); the hierarchical storage and preservation module predicts file access heat based on an ARIMA time series model and performs file integrity verification and dynamic storage classification by means of a Merkle tree; and the retrieval module supports voice content retrieval through Word2Vec word embedding vectors.
2. The management system for digital processing of audio-visual files of claim 1, wherein said data acquisition and monitoring module operates by the following steps: a1, calculating a state estimate through the Kalman filtering algorithm according to the formula $\hat{x}_k = \hat{x}_k^- + K_k\,(z_k - H\,\hat{x}_k^-)$, wherein $\hat{x}_k$ is the estimated medium state, $\hat{x}_k^-$ is its predicted value before the observation is applied, $K_k$ is the Kalman gain, $z_k$ is the sensor observation, and $H$ is the observation matrix; a2, dividing the audio and video signals into frames, extracting spectral features, and inputting them to an LSTM model to predict the normal signal distribution, and, if the Euclidean distance between the current frame and the predicted value exceeds a threshold, triggering an abnormality warning and recording a fault timestamp.
3. The management system for digital processing of audio-visual files of claim 1, wherein said content perception analysis unit processes the audio-visual files by: extracting a voice activity detection duty ratio and a signal-to-noise ratio from the audio; extracting a motion vector and a color histogram from the video; taking the motion vector and the color histogram as input feature vectors of an XGBoost decision tree model; performing audio noise reduction with a dual-channel CNN model, wherein the training data are pairs of noisy and clean audio and the loss function is a weighted sum of mean square error and perceptual loss; and repairing video scratches with a generative adversarial network based on a U-Net structure, the generator being optimized by minimizing the texture difference between the repaired region and its neighborhood.
4. The management system for digital processing of audio-visual files of claim 1, wherein said data generation management module is implemented by: performing end-to-end speech recognition on long audio using a Whisper model with a Transformer architecture and outputting timestamped text; applying TF-IDF weighting to the transcribed text to extract keywords and performing semantic analysis in combination with a BERT model to generate entity tags; and extracting triples from the metadata, generating event node vectors by GNN embedding, and constructing an event knowledge graph.
5. The system according to claim 1, wherein the hierarchical storage and preservation module predicts file access heat based on an ARIMA time series model and dynamically allocates hot/cold storage tiers, the prediction model being as follows: [formula not reproduced in the source text]; a hash value is calculated for each block of the file using a Merkle tree, the current hash is periodically compared with the original hash, and, if they are inconsistent, a Rabin-Karp algorithm is triggered to locate the damaged block, which is automatically repaired from RAID 6 redundant storage.
6. A retrieval method for digital processing of audio-visual files, based on the management system for digital processing of audio-visual files according to claim 1, comprising the steps of: converting the transcribed text into word embedding vectors through the retrieval module; after a user inputs query words, calculating cosine similarity and returning the timestamp of the matched segment; and applying a particle filtering algorithm to predict the user interaction trajectory and optimize rendering performance.
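The Kalman-filter state estimation of claim 2 can be sketched for a single scalar sensor. This is a minimal illustration assuming the standard measurement update, in which the new estimate equals the predicted state plus the Kalman gain times the difference between the observation and the predicted observation. The constant-state model, the noise variances `q` and `r`, and the humidity readings are hypothetical values chosen for the example and do not come from the patent.

```python
def kalman_update(x_prior, p_prior, z, h=1.0, r=0.5):
    """One measurement update for a scalar state.

    x_prior, p_prior: predicted state and its variance
    z: sensor observation
    h: observation coefficient (scalar stand-in for the matrix H)
    r: measurement noise variance (assumed value)
    """
    k = p_prior * h / (h * p_prior * h + r)    # Kalman gain K_k
    x_post = x_prior + k * (z - h * x_prior)   # corrected state estimate
    p_post = (1.0 - k * h) * p_prior           # corrected variance
    return x_post, p_post, k


def run_filter(observations, x0=0.0, p0=1.0, q=0.01, h=1.0, r=0.5):
    """Filter a sequence, assuming the state is constant apart from process noise q."""
    x, p = x0, p0
    estimates = []
    for z in observations:
        p = p + q  # predict step: state unchanged, uncertainty grows by q
        x, p, _ = kalman_update(x, p, z, h, r)
        estimates.append(x)
    return estimates


# Hypothetical relative-humidity readings (%) from the temperature and humidity sensor.
readings = [45.2, 44.8, 45.5, 45.1, 44.9]
estimates = run_filter(readings, x0=45.0)
print(estimates)
```

Because the gain stays between 0 and 1 here, each estimate is a weighted average of the previous estimate and the new reading, which is what smooths out sensor noise before acquisition parameters are adjusted.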
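The cosine-similarity retrieval step of claim 6 can be sketched as follows. The three-dimensional vectors stand in for trained Word2Vec embeddings, and the segment texts and timestamps are invented for illustration. Averaging word vectors to represent a segment is an assumption, since the claim does not say how a segment is embedded. The particle-filter prediction of user interaction is not covered here.

```python
import math

# Hypothetical word vectors standing in for trained Word2Vec embeddings.
EMBEDDINGS = {
    "furnace": [0.9, 0.1, 0.0],
    "smelting": [0.8, 0.2, 0.1],
    "nickel": [0.7, 0.0, 0.3],
    "ceremony": [0.0, 0.9, 0.3],
    "speech": [0.1, 0.8, 0.4],
}

# Hypothetical transcribed segments: (start_seconds, end_seconds, text).
SEGMENTS = [
    (0.0, 12.5, "ceremony speech"),
    (12.5, 40.0, "nickel smelting furnace"),
]


def embed(text):
    """Average the vectors of known words; return None if no word is known."""
    vectors = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, segments=SEGMENTS, top_k=1):
    """Return up to top_k (similarity, start, end) tuples, best match first."""
    query_vector = embed(query)
    if query_vector is None:
        return []
    scored = []
    for start, end, text in segments:
        segment_vector = embed(text)
        if segment_vector is not None:
            scored.append((cosine(query_vector, segment_vector), start, end))
    scored.sort(reverse=True)
    return scored[:top_k]


hits = search("furnace")
print(hits)
```

Returning the start and end times rather than the text lets a player jump straight to the matching passage of the recording.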

Description

Management system for digital processing of sound image files and retrieval method thereof

Technical Field

The invention relates to the technical field of file management, and in particular to a management system for digital processing of audio-visual files and a retrieval method thereof.

Background

Audio-visual files are important carriers for recording historical events, cultural activities, and social change. They are widely held by archives, radio and television stations, universities, and other institutions, and their media include analog storage carriers such as video tapes, magnetic tapes, and films. With the development of information technology, digitizing audio-visual files has become a key path to overcoming the limits of traditional storage and achieving long-term preservation and open use, and it is also a core requirement for passing on cultural heritage and sharing information resources.

However, traditional digital processing has technical bottlenecks in its core links of acquisition, processing, management, storage, and utilization. It struggles to meet current requirements for efficient digitization, high-quality processing results, intelligent file management, and sustainable long-term storage. The specific pain points are as follows.

Firstly, in the data acquisition and monitoring link, the prior art relies on manual operation or fixed-parameter equipment to transcribe analog signals. It lacks real-time sensing and dynamic regulation of the physical state of the medium (such as tape tension and film humidity) and of signal quality during acquisition, so medium aging and environmental fluctuation easily cause signal distortion or medium damage and hence irreversible data loss.

Secondly, in the content processing link, conventional systems apply a single noise reduction algorithm uniformly to all audio and video and cannot adaptively match processing strategies to the audio type (such as speech, music, or environmental sound). As a result, important voice information is erroneously filtered out or noise is incompletely suppressed, which reduces the usability of the digitized files.

Thirdly, in the file management link, most systems still rely on manual indexing to catalogue, classify, and label audio and video. This consumes a great deal of labor and time and suffers from high subjectivity and inconsistent indexing standards. The resulting file metadata is poorly structured, making accurate association and efficient retrieval difficult.

Finally, in the storage link, conventional storage schemes adopt a fixed-tier architecture that cannot be adjusted dynamically, which wastes storage resources and makes file access inefficient.

In view of the foregoing, a full-process solution integrating multi-source sensing, intelligent algorithms, knowledge modeling, and system cooperation is needed to make the digital processing of audio-visual files efficient, accurate, and sustainable.

Disclosure of Invention

The invention aims to provide a management system for digital processing of audio-visual files and a retrieval method thereof, so as to solve the technical problems of unstable acquisition, low processing efficiency, poor retrieval accuracy, and high storage cost in the traditional digital processing of audio-visual files.
To achieve the above purpose, the application provides the following technical scheme: a management system for digital processing of audio-visual files, comprising a data acquisition and monitoring module, a content perception processing module, a data generation management module, a hierarchical storage and preservation module, and a retrieval module which are sequentially connected, wherein: the data acquisition and monitoring module comprises a vibration sensor, a temperature and humidity sensor, and a signal analyzer, wherein the acquired data are fused using a Kalman filtering algorithm and the parameters of the acquisition equipment are dynamically adjusted; the content perception processing module comprises a content perception analysis unit, matches an audio processing strategy based on an XGBoost decision tree model, and performs environmental noise reduction through a convolutional neural network (CNN); the data generation management module constructs an event knowledge graph by combining a Whisper voice transcription model, a BERT entity tag extraction model, and a graph neural network (GNN); the hierarchical storage and preservation module predicts file access heat based on an ARIMA time series model and performs file integrity verification and dynamic storage classification by means of a Merkle tree; and the retrieval module supports voice content retrieval through Word2