EP-4736073-A1 - OVER-FITTING A TRAINING SET FOR A MACHINE LEARNING (ML) MODEL FOR A SPECIFIC GAME AND GAME SCENE FOR ENCODING


Abstract

Techniques are described for over-training (e.g., 1002) an ML model on multiple gameplay videos of individual scenes of a computer game to better configure the model to reconstruct or enhance portions of the computer game at a receiver as the computer game is received over a streaming network. Reconstruction (904) of individual missing slices (810) of a frame is contemplated such that a frame missing a slice need not be entirely discarded.

Inventors

ARYA, Deepali
CHEN, ERIC HSUMING
WANG, JASON
LEE, HUNG-JU

Assignees

Sony Interactive Entertainment Inc.

Dates

Publication Date: 20260506
Application Date: 20240910

Claims (20)

1. An apparatus comprising: at least one processor assembly configured to: over-fit at least one machine learning (ML) model by training the ML model on plural ground truth gameplay video recordings of a computer game to produce a trained ML model; and use the trained ML model to reconstruct at least a portion of at least one frame of video from the computer game during streaming of the computer game to a receiver over a network.
2. The apparatus of Claim 1, wherein the processor assembly is configured to: train the ML model on plural gameplay video recordings of plural individual scenes in the computer game; and signal a scene indication to the trained ML model along with the frame of video to be reconstructed.
3. The apparatus of Claim 1, wherein the portion comprises an individual slice of the frame.
4. The apparatus of Claim 3, wherein the slice is a first slice and the frame is a first frame, and the processor assembly is configured to: use the trained ML model to reconstruct missing compressed domain information of the first slice and only the first slice for use in presenting a second frame referencing the first frame.
5. The apparatus of Claim 4, wherein the compressed domain information comprises motion vectors.
6. The apparatus of Claim 3, wherein the slice is a first slice and the frame is a first frame and the processor assembly is configured to: use a slice and only a slice from a frame prior to the first frame to reconstruct missing compressed domain information of the first slice for use in presenting a second frame referencing the first frame.
7. The apparatus of Claim 1, wherein the portion of the frame of video is received with a first quality, and the processor assembly is configured to: use the trained ML model to reconstruct the portion of the frame of video by enhancing the first quality to be a second quality.
8. An apparatus comprising: at least one computer medium that is not a transitory signal and that comprises instructions executable by at least one processor assembly to: identify a missing or low quality slice of a frame of video; and use a machine learning (ML) model to enhance at least the slice.
9. The apparatus of Claim 8, wherein the instructions are executable to: use the ML model to enhance at least the slice by reconstructing the slice.
10. The apparatus of Claim 8, wherein the instructions are executable to: use the ML model to enhance at least the slice by enhancing quality of the slice.
11. The apparatus of Claim 8, wherein the frame is a first frame and the instructions are executable to: use the ML model to reconstruct missing compressed domain information of the slice for use in presenting a second frame referencing the first frame.
12. The apparatus of Claim 11, wherein the compressed domain information comprises motion vectors.
13. The apparatus of Claim 8, wherein the frame is a first frame and the instructions are executable to: use a slice from a frame prior to the first frame to reconstruct missing compressed domain information of the slice for use in presenting a second frame referencing the first frame.
14. The apparatus of Claim 8, wherein the video comprises at least one computer game.
15. A method, comprising: identifying at least a portion of a frame of video received over a computer network is missing or is low quality; and reconstructing or enhancing at least the portion using a machine learning (ML) model over-trained on the video.
16. The method of Claim 15, wherein the video comprises at least one computer game.
17. The method of Claim 15, wherein the portion of the frame is identified as missing, and the method comprises reconstructing the portion using the ML model.
18. The method of Claim 15, wherein the portion of the frame is identified as low quality, and the method comprises enhancing the portion using the ML model.
19. The method of Claim 15, wherein the frame is a first frame and the method comprises: using the ML model to reconstruct missing compressed domain information of the slice for use in presenting a second frame referencing the first frame.
20. The method of Claim 19, wherein the compressed domain information comprises motion vectors.
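
The receiver-side flow recited in Claims 1, 2, 8 and 15 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `Slice` record, the 0-to-1 `quality` score, the `min_quality` threshold, and the functions `fit_per_scene` and `repair_frame` are all hypothetical names introduced here, and the "model" is a toy memorizer (a per-scene mean of ground truth slices) standing in for an over-fit ML model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Slice:
    index: int                   # position of the slice within the frame
    pixels: Optional[List[int]]  # None when the slice never arrived
    quality: float = 1.0         # hypothetical 0..1 quality score


# Hypothetical training record: (scene id, slice index, ground truth pixels).
Recording = Tuple[str, int, List[int]]
Model = Dict[Tuple[str, int], List[int]]


def fit_per_scene(recordings: List[Recording]) -> Model:
    """Toy stand-in for over-fitting: memorize the mean slice per (scene, index)."""
    sums: Dict[Tuple[str, int], List[int]] = {}
    counts: Dict[Tuple[str, int], int] = {}
    for scene, idx, pixels in recordings:
        key = (scene, idx)
        if key not in sums:
            sums[key] = [0] * len(pixels)
            counts[key] = 0
        sums[key] = [a + b for a, b in zip(sums[key], pixels)]
        counts[key] += 1
    return {k: [v // counts[k] for v in total] for k, total in sums.items()}


def repair_frame(slices: List[Slice], model: Model, scene: str,
                 min_quality: float = 0.5) -> List[Slice]:
    """Replace only missing or low quality slices; keep the rest of the frame."""
    repaired = []
    for s in slices:
        needs_repair = s.pixels is None or s.quality < min_quality
        if needs_repair and (scene, s.index) in model:
            # The scene id plays the role of the scene indication of Claim 2.
            repaired.append(Slice(s.index, list(model[(scene, s.index)]), 1.0))
        else:
            repaired.append(s)
    return repaired


# Two recordings of the same (made-up) scene, then a frame missing slice 0.
model = fit_per_scene([
    ("lava_cave", 0, [10, 20]),
    ("lava_cave", 0, [30, 40]),
    ("lava_cave", 1, [5, 5]),
])
frame = [Slice(0, None), Slice(1, [5, 6], quality=0.9)]
fixed = repair_frame(frame, model, "lava_cave")
```

The point of the sketch is the control flow: only the damaged slice is replaced, so a frame missing one slice need not be discarded, and the scene identifier selects what the model has memorized for that scene.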

Description

OVER-FITTING A TRAINING SET FOR A MACHINE LEARNING (ML) MODEL FOR A SPECIFIC GAME AND GAME SCENE FOR ENCODING

FIELD

The present application relates to technically inventive, non-routine solutions that are necessarily rooted in computer technology and that produce concrete technical improvements, and more specifically to over-fitting a training set for a machine learning (ML) model for a specific game and game scene for encoding.

BACKGROUND

In video streaming over a network such as computer game streaming, frames and/or portions of a frame such as slices may be lost entirely or may arrive with low quality.

SUMMARY

As understood herein, not only does the above problem deleteriously affect presentation of the frame itself, but also potentially the presentation of other frames that may reference the missing portions.

Accordingly, an apparatus includes at least one processor assembly configured to over-fit at least one machine learning (ML) model by training the ML model on plural ground truth gameplay video recordings of a computer game to produce a trained ML model. The processor assembly is configured to use the trained ML model to reconstruct at least a portion of at least one frame of video from the computer game during streaming of the computer game to a receiver over a network.

In some embodiments the processor assembly can be configured to train the ML model on plural gameplay video recordings of plural individual scenes in the computer game, and signal a scene indication to the trained ML model along with the frame of video to be reconstructed.

In example embodiments, the portion of the frame can be an individual slice of the frame. The slice can be a first slice and the frame can be a first frame, and the processor assembly can be configured to use the trained ML model to reconstruct missing compressed domain information of the first slice and only the first slice for use in presenting a second frame referencing the first frame.
For example, the compressed domain information may include motion vectors. In other implementations the processor assembly can be configured to use a slice and only a slice from a frame prior to the first frame to reconstruct missing compressed domain information of the first slice for use in presenting a second frame referencing the first frame.

In non-limiting examples, the portion of the frame of video can be received with a first quality, and the processor assembly can be configured to use the trained ML model to reconstruct the portion of the frame of video by enhancing the first quality to be a second quality.

In another aspect, an apparatus includes at least one computer medium that is not a transitory signal and that in turn includes instructions executable by at least one processor assembly to identify a missing or low quality slice of a frame of video, and use a machine learning (ML) model to enhance at least the slice.

In another aspect, a method includes identifying at least a portion of a frame of video received over a computer network is missing or is low quality, and reconstructing or enhancing at least the portion using a machine learning (ML) model over-trained on the video.
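
The prior-frame alternative described above, in which a slice from an earlier frame supplies the missing compressed domain information, might be sketched as follows. The disclosure does not say how the prior slice is used, so this sketch assumes the simplest case: the motion vectors of the co-located slice in the prior frame are copied into the missing slice. The `(dx, dy)` vector layout, the dictionary keyed by slice index, and the name `conceal_motion_vectors` are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

MotionVector = Tuple[int, int]            # (dx, dy), hypothetical layout
SliceMVs = Dict[int, List[MotionVector]]  # slice index -> vectors of its blocks


def conceal_motion_vectors(prev: SliceMVs, curr: SliceMVs,
                           missing: int) -> SliceMVs:
    """Fill only the missing slice's vectors from the co-located prior slice."""
    if missing not in prev:
        raise KeyError(f"prior frame has no slice {missing}")
    # Copy the slices that did arrive, so the input is left unmodified.
    patched = {idx: list(mvs) for idx, mvs in curr.items()}
    # Assumption: reuse the prior frame's vectors for the lost slice as-is.
    patched[missing] = list(prev[missing])
    return patched


prev_frame = {0: [(1, 0)], 1: [(2, -1), (0, 3)]}
first_frame = {0: [(4, 4)]}               # slice 1 was lost in transit
patched = conceal_motion_vectors(prev_frame, first_frame, missing=1)
```

Only the lost slice is touched, so a second frame that references the first frame can still be decoded against a complete set of motion vectors. An ML-based variant would replace the copy step with a prediction from the trained model.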
The details of the present disclosure, both as to its structure and operation, can be best understood in reference to the accompanying drawings, in which like reference numerals refer to like parts, and in which:

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a block diagram of an example system including an example consistent with present principles;
Figure 2 illustrates an example encoder-decoder system;
Figure 3 illustrates example overfitting logic in example flow chart format;
Figure 4 illustrates example transmission logic in example flow chart format;
Figure 5 illustrates example receiver logic in example flow chart format;
Figure 6 illustrates an example streaming service with three example computer games;
Figure 7 illustrates an example overfitting training set consistent with Figure 6;
Figure 8 illustrates a missing frame slice in a streaming system;
Figure 9 illustrates example logic in example flow chart format to enhance a low quality slice at the decoder;
Figure 10 illustrates example training logic for the machine learning (ML) model used in Figure 9;
Figure 11 illustrates example logic in example flow chart format to process slices while reconstructing missing compression data; and
Figures 12-14 illustrate example logic in example flow chart format for training respective ML models used in the example of Figure 12.

DETAILED DESCRIPTION

This disclosure relates generally to computer ecosystems including aspects of consumer electronics (CE) device networks such as but not limited to computer game networks. A system herein may include server and client components which may be connected over a network such that data may be exchanged between the client and server components. The client components may include one or more computing devices including game consoles such as Sony PlayStation® or a game console ma