CN-122027858-A - Method and system for acquiring network video and audio information, performing track separation processing, and generating media in combination with generative artificial intelligence output


Abstract

A method for acquiring network video and audio information, performing track separation processing, and generating media in combination with generative artificial intelligence output. A device server of a video and audio playing device, or a generative artificial intelligence server signal-connected to the device server, acquires network video and audio information. A sampling module separates a plurality of pieces of track-separated information from the network video and audio information, at least one piece of generated information is generated, the generated information is output as generated media, and the device server plays the generated media in a playing interface of the video and audio playing device. In this way, network video and audio information can be sampled in real time to obtain track-separated information such as audio, lyrics, and images, and generative artificial intelligence then adds generated information on multiple tracks, so that new generated media containing added audio, added video, added text, and the like is output quickly.
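
For illustration only, the mixing step summarized in the abstract can be sketched as summing tracks that share one time axis. This is a minimal sketch under stated assumptions, not the patented implementation: the `Track` class, the `mix_on_timeline` function, the track names, and the sample values are all hypothetical, and real separated tracks would come from a sampling module rather than hand-written lists.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Track:
    """One separated or generated track (hypothetical structure)."""
    name: str             # e.g. "vocals" (hypothetical label)
    start: int            # offset on the shared time axis, in samples
    samples: List[float]  # mono audio samples
    gain: float = 1.0     # volume adjustment applied before mixing


def mix_on_timeline(tracks: List[Track]) -> List[float]:
    """Sum all tracks sample by sample at their time-axis offsets."""
    length = max((t.start + len(t.samples) for t in tracks), default=0)
    mixed = [0.0] * length
    for t in tracks:
        for i, s in enumerate(t.samples):
            mixed[t.start + i] += t.gain * s
    return mixed


# Assumed output of a separation step: two tracks aligned at offset 0.
separated = [
    Track("vocals", 0, [0.5, 0.5, 0.5, 0.5]),
    Track("accompaniment", 0, [0.25, 0.25, 0.25, 0.25], gain=0.5),
]
# Assumed output of a generative step: a new track placed at offset 2.
generated = Track("generated_harmony", 2, [0.25, 0.25])

# Select only part of the separated tracks, then mix with the generated one.
selected = [t for t in separated if t.name == "vocals"]
generated_media = mix_on_timeline(selected + [generated])
```

Keeping an explicit `start` offset per track is one simple way to model "aligned and positioned on a time axis"; the per-track `gain` stands in for the volume adjustment mentioned in claims 9 and 20.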

Inventors

WU JIANZHOU

Assignees

吴建州

Dates

Publication Date: 20260512
Application Date: 20241112

Claims (20)

1. A method for acquiring network video and audio information, performing track separation processing, and generating media in combination with generative artificial intelligence output, comprising the steps of: a device server of a video and audio playing device, or a generative artificial intelligence server signal-connected to the device server, acquires network video and audio information, and the device server or the generative artificial intelligence server then separates a plurality of pieces of track-separated information from the network video and audio information through a sampling module; the generative artificial intelligence server selects some or all of the track-separated information, the selected track-separated information being aligned and positioned on a time axis, and the generative artificial intelligence server generates at least one piece of generated information according to the selected track-separated information, the generated information being aligned and positioned on the time axis; the generative artificial intelligence server mixes the selected track-separated information and the generated information, outputs the mix as generated media, and transmits the generated media back to the device server; and the device server plays the generated media in a playing interface of the video and audio playing device.
2. The method of claim 1, wherein the device server obtains a link code corresponding to the network video and audio information; and the device server or the generative artificial intelligence server parses the link code using an application programming interface (API) technology and, according to the link code, acquires the network video and audio information from an API server of a streaming platform using the API technology; or the network video and audio information is downloaded from the streaming platform to an electronic device signal-connected to the device server or the generative artificial intelligence server, and the device server or the generative artificial intelligence server acquires the network video and audio information from the electronic device.
3. The method of claim 2, wherein when the device server or the generative artificial intelligence server acquires the network video and audio information from the API server, the video and audio playing device has a search sorting interface signal-connected to the device server; the device server sends a video search request to the API server according to a keyword using the API technology; the API server performs a search according to the video search request to obtain a search result corresponding to the keyword; the device server obtains a plurality of link codes of the search result from the API server using the API technology, each link code corresponding to different network video and audio information; and the search sorting interface sorts the link codes in the video and audio playing device so as to display the search result.
4. The method of claim 3, wherein after the search sorting interface displays the search result, a user selects at least one link code, the search sorting interface adds the selected link code to a playlist, and the device server or the generative artificial intelligence server obtains the link codes from the playlist in sequence.
5. The method of claim 3, wherein there are a plurality of the streaming platforms, and the search sorting interface has a single search mode and an integrated search mode; when the search sorting interface operates in the single search mode, the device server sends the video search request to the API server of one of the streaming platforms so as to display the search result of that streaming platform in the video and audio playing device; when the search sorting interface operates in the integrated search mode, the device server sends the video search request to the API servers of all of the streaming platforms so as to display the search results of all of the streaming platforms in the video and audio playing device; and the device server merges and displays together the link codes that come from different streaming platforms but correspond to the same network video and audio information.
6. The method of claim 3, wherein, before the search sorting interface displays the search result, the device server excludes any link code whose corresponding network video and audio information has an invalid status.
7. The method of claim 2, wherein when the device server or the generative artificial intelligence server acquires the network video and audio information from the API server, the video and audio playing device has a database signal-connected to the device server, and when the number of times that one of the link codes has been obtained by the device server or the generative artificial intelligence server reaches a threshold, the device server stores that link code in the database so that the device server or the generative artificial intelligence server can obtain it again.
8. The method of claim 2, wherein when the device server or the generative artificial intelligence server acquires the network video and audio information from the API server, the device server or the generative artificial intelligence server verifies whether the streaming platform has been started before acquiring the network video and audio information from the streaming platform; if the streaming platform has been started, the device server or the generative artificial intelligence server continues to acquire the network video and audio information; and if the streaming platform has not been started, the device server displays a start interface corresponding to the streaming platform in the video and audio playing device.
9. The method of claim 1, wherein, before outputting the generated media, the generative artificial intelligence server adjusts the volume of at least some of the selected track-separated information or applies an additional change to at least some of the selected track-separated information, the additional change including one of a track change, a tempo change, a melody change, and a pitch change.
10. The method of claim 1, wherein the sampling module comprises a pre-neural network that samples the track-separated information from the network video and audio information according to frequency characteristics, the pre-neural network being one of a multi-source separation model, an audio separation network, a voice separation model, a sound separation model, and a dual-path transformer network.
11. The method of claim 1, wherein the track-separated information comprises one of separated audio, a separated image, a main-character image, lyric text, dialogue text, and narration text; the generated information comprises one of generated audio, a generated image, a generated animation, and generated text; the sampling module comprises an object recognition model and an edge detection algorithm for obtaining the main-character image from the network video and audio information; the object recognition model is one of YOLOv8, vision Master, Vision AI, and Amazon Rekognition; and the edge detection algorithm is one of the Canny edge detection algorithm, the Laplacian edge detection algorithm, and the Sobel edge detection algorithm.
12. A system for acquiring network video and audio information, performing track separation processing, and generating media in combination with generative artificial intelligence output, comprising: a video and audio playing device having a device server and a playing interface signal-connected to the device server; a sampling module signal-connected to the device server; and a generative artificial intelligence server signal-connected to the device server; wherein the device server or the generative artificial intelligence server acquires network video and audio information, and the device server or the generative artificial intelligence server then separates a plurality of pieces of track-separated information from the network video and audio information through the sampling module; the generative artificial intelligence server selects some or all of the track-separated information, the selected track-separated information being aligned and positioned on a time axis, and the generative artificial intelligence server generates at least one piece of generated information according to the selected track-separated information, the generated information being aligned and positioned on the time axis; the generative artificial intelligence server mixes the selected track-separated information and the generated information, outputs the mix as generated media, and transmits the generated media back to the device server; and the device server plays the generated media in the playing interface.
13. The system of claim 12, wherein either a streaming platform has an application programming interface (API) server signal-connected to the device server, the device server obtains a link code corresponding to the network video and audio information, and the device server or the generative artificial intelligence server parses the link code using an API technology and acquires the network video and audio information from the API server according to the link code; or an electronic device is signal-connected to the streaming platform and the device server, the network video and audio information is downloaded from the streaming platform to the electronic device, and the device server or the generative artificial intelligence server acquires the network video and audio information from the electronic device.
14. The system of claim 13, wherein when the device server or the generative artificial intelligence server acquires the network video and audio information from the API server, the video and audio playing device has a search sorting interface signal-connected to the device server; the device server sends a video search request to the API server according to a keyword using the API technology; the API server performs a search to obtain a search result corresponding to the keyword; the device server obtains a plurality of link codes corresponding to the search result from the API server using the API technology, each link code corresponding to different network video and audio information; and the search sorting interface sorts the link codes in the video and audio playing device so as to display the search result.
15. The system of claim 14, wherein after the search sorting interface displays the search result, a user selects at least one link code, the search sorting interface adds the selected link code to a playlist, and the device server or the generative artificial intelligence server obtains the link codes from the playlist in sequence.
16. The system of claim 14, wherein the system comprises a plurality of streaming platforms, and the search sorting interface has a single search mode and an integrated search mode; when the search sorting interface operates in the single search mode, the device server sends the video search request to the API server of one of the streaming platforms so as to display the search result of that streaming platform in the video and audio playing device; when the search sorting interface operates in the integrated search mode, the device server sends the video search request to the API servers of all of the streaming platforms so as to display the search results of all of the streaming platforms in the video and audio playing device; and the device server merges and displays together the link codes that come from different streaming platforms but correspond to the same network video and audio information.
17. The system of claim 14, wherein, before the search sorting interface displays the search result, the device server excludes any link code whose corresponding network video and audio information has an invalid status.
18. The system of claim 13, wherein when the device server or the generative artificial intelligence server acquires the network video and audio information from the API server, the video and audio playing device has a database signal-connected to the device server, and when the number of times that one of the link codes has been obtained by the device server or the generative artificial intelligence server reaches a threshold, the device server stores that link code in the database so that the device server or the generative artificial intelligence server can obtain it again.
19. The system of claim 13, wherein when the device server or the generative artificial intelligence server acquires the network video and audio information from the API server, the video and audio playing device has a start interface signal-connected to the device server, the start interface corresponding to the streaming platform; the device server or the generative artificial intelligence server verifies whether the streaming platform has been started before acquiring the network video and audio information from the streaming platform; if the streaming platform has been started, the device server or the generative artificial intelligence server continues to acquire the network video and audio information; and if the streaming platform has not been started, the device server displays the start interface corresponding to the streaming platform in the video and audio playing device.
20. The system of claim 12, wherein, before outputting the generated media, the generative artificial intelligence server adjusts the volume of at least a portion of the selected track-separated information or applies an additional change to at least a portion of the selected track-separated information, the additional change including one of a track change, a tempo change, a melody change, and a pitch change.
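
The link-code handling in claims 5 to 7 (and 16 to 18) can be illustrated with a short sketch. The claims do not say how two link codes are recognized as pointing to the same video or how the database is organized, so everything below is an assumption made for illustration: the `LinkCode` fields, the `video_id` matching key, the `integrate_results` function, and the in-memory `LinkCodeCache` are hypothetical names, not part of the claimed system.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class LinkCode:
    platform: str       # streaming platform name (hypothetical)
    code: str           # link code returned by the platform's API server
    video_id: str       # assumed key identifying the underlying video
    valid: bool = True  # False models an "invalid status"


def integrate_results(results: List[LinkCode]) -> Dict[str, List[LinkCode]]:
    """Drop invalid link codes, then group the rest by underlying video."""
    merged: Dict[str, List[LinkCode]] = {}
    for link in results:
        if not link.valid:
            continue  # excluded before the search result is displayed
        merged.setdefault(link.video_id, []).append(link)
    return merged


class LinkCodeCache:
    """Store a link code once it has been obtained `threshold` times."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.counts: Counter = Counter()
        self.database: Set[str] = set()  # stand-in for the claimed database

    def record_fetch(self, code: str) -> None:
        self.counts[code] += 1
        if self.counts[code] >= self.threshold:
            self.database.add(code)


# Integrated search mode: results from two platforms, one entry invalid.
results = [
    LinkCode("platform_a", "a1", "song-1"),
    LinkCode("platform_b", "b7", "song-1"),
    LinkCode("platform_a", "a2", "song-2", valid=False),
]
merged = integrate_results(results)

cache = LinkCodeCache(threshold=2)
cache.record_fetch("a1")
cache.record_fetch("a1")  # second fetch reaches the threshold
```

Grouping by a shared key keeps every platform's link code for a video available for display in one entry, which is one plausible reading of "integrates and displays"; other matching strategies, such as title or fingerprint comparison, would fit the claim wording equally well.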

Description

Method and system for acquiring network video and audio information, performing track separation processing, and generating media in combination with generative artificial intelligence output

Technical Field

The invention relates to a method and a system for acquiring network video and audio information, performing track separation processing, and generating media in combination with generative artificial intelligence output, and in particular to a method and a system that sample network video and audio information in real time by category to obtain a plurality of pieces of track-separated information, automatically generate corresponding generated information from the selected track-separated information using generative artificial intelligence, and superimpose and output the result as new video and audio media.

Background

Regarding audio track separation technology for videos, Chinese patent publication No. CN 116939323 A provides a music matching method, a device, an electronic apparatus, and a computer-readable storage medium. In that patent, a music matching interface for music to be matched is displayed, the music matching interface including a matching control. In response to a trigger operation on the matching control, an audio-video interface is displayed, the audio-video interface including a target audio track of the music to be matched and a target video set matched with the target audio track, where the target audio track is at least one audio track obtained by separating the audio tracks of the music to be matched.

However, the foregoing patent mainly performs track separation on the music to be matched to obtain its target tracks, which include a target vocal track, a target accompaniment track, a target bass track, and a target drum-beat track. A first target drum-beat sequence and a second target drum-beat sequence are obtained from the target drum-beat track, and the terminal selects at least one piece of target music from the candidate music and uses a video to be matched that includes the target music as the target video. The foregoing patent has the following limitations:

1. Matching of the target music and the target video depends on the target drum-beat track.
2. A generated track that matches the target track cannot be generated automatically.
3. Only a plurality of target tracks can be separated from the music to be matched; image operations such as object recognition, edge detection, and main-character extraction cannot be performed on a video source that includes images.
4. A new matching image, matching animation, or matching video cannot be generated automatically from a target track selected on demand.

Disclosure of Invention

Therefore, to overcome the limitations of the above patent, the present invention is directed to a method and a system for acquiring network video and audio information, performing track separation processing, and generating media in combination with generative artificial intelligence output.

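
To make the edge detection mentioned in the third limitation above more concrete, the following minimal sketch computes a Sobel gradient magnitude on a tiny grayscale frame. It is an illustration only: the pure-Python `sobel_magnitude` function and the 4x4 test frame are assumptions, the document does not specify an implementation, and a practical system would more likely rely on an image-processing library and combine the edges with an object recognition model.

```python
from typing import List

Image = List[List[float]]  # grayscale frame as rows of pixel intensities

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img: Image) -> Image:
    """Return the gradient magnitude; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = img[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


# Hypothetical 4x4 frame: dark left half, bright right half.
frame = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
edges = sobel_magnitude(frame)
```

Interior pixels next to the dark-to-bright boundary receive a large magnitude, while flat regions stay at zero; thresholding such a map is one common way to outline a subject before cropping it.
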
The invention provides a method for acquiring network video and audio information, performing track separation processing, and generating media in combination with generative artificial intelligence output, comprising the following steps: a device server of a video and audio playing device, or a generative artificial intelligence server signal-connected to the device server, acquires the network video and audio information; the device server or the generative artificial intelligence server separates a plurality of pieces of track-separated information from the network video and audio information through a sampling module; the generative artificial intelligence server selects some or all of the track-separated information, the selected track-separated information being aligned and positioned on a time axis; the generative artificial intelligence server generates at least one piece of generated information according to the selected track-separated information, the generated information being aligned and positioned on the time axis; the generative artificial intelligence server mixes the selected track-separated information and the generated information, outputs the mix as generated media, and returns the generated media to the device server; and the device server plays the generated media in a playing interface of the video and audio playing device.

The invention further provides a system for acquiring network video and audio information, performing track separation processing, and generating media in combination with generative artificial intelligence output, comprising a video and audio playing device, a sampling module, and a generative artificial intelligence server, wherein the video and audio playing device is pro