EP-4740139-A1 - PREDICTION MODEL MANAGEMENT

Abstract

Methods, devices, and computer program products for prediction model management are proposed. In the method, gradient information associated with the prediction model is obtained based on sample data for a time slot in a predetermined time period. An offset of the time slot in the predetermined time period is acquired. A step size is determined for updating a parameter of the prediction model based on the gradient information, the offset, and historical gradient information that is determined based on historical sample data for a group of historical time slots before the time slot. With these implementations, the whole training procedure may be divided into multiple time periods, and each time period may further include multiple time slots. During each time period, the offset may be used to control the relative importance of the historical gradient information and the gradient information in determining the step size.
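The period and slot structure described in the abstract can be illustrated with a minimal sketch. It assumes that time slots are numbered consecutively from zero and that every period holds the same number of slots; the names `slot_offset`, `period_index`, and `slots_per_period` are hypothetical and do not come from the application.

```python
def period_index(global_slot_index: int, slots_per_period: int) -> int:
    """Return which time period a slot falls in (zero-based)."""
    if slots_per_period <= 0:
        raise ValueError("slots_per_period must be positive")
    if global_slot_index < 0:
        raise ValueError("global_slot_index must be non-negative")
    return global_slot_index // slots_per_period


def slot_offset(global_slot_index: int, slots_per_period: int) -> int:
    """Return the zero-based offset of a slot inside its time period."""
    if slots_per_period <= 0:
        raise ValueError("slots_per_period must be positive")
    if global_slot_index < 0:
        raise ValueError("global_slot_index must be non-negative")
    return global_slot_index % slots_per_period
```

Under these assumptions, with hourly slots and a one-day period (`slots_per_period = 24`), slot 25 belongs to period 1 and has offset 1.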

Inventors

  • XIN, Meng
  • WANG, Silun
  • ZHANG, Yu

Assignees

  • Lemon Inc.

Dates

Publication Date
20260513
Application Date
20240618

Claims (20)

  1. A method for managing a prediction model, comprising: obtaining gradient information associated with the prediction model based on sample data for a time slot in a predetermined time period; acquiring an offset of the time slot in the predetermined time period; and determining a step size for updating a parameter of the prediction model based on the gradient information, the offset, and historical gradient information that is determined based on historical sample data for a group of historical time slots before the time slot.
  2. The method according to claim 1, wherein determining the step size comprises: determining a weight for the historical gradient information based on the offset; and generating the step size based on the gradient information, the historical gradient information, and the weight for the historical gradient information.
  3. The method according to claim 2, wherein the weight is within a predefined area and increases with the offset.
  4. The method according to claim 2, wherein generating the step size comprises: determining an intermediate parameter associated with the time slot based on the gradient information and a weighted historical gradient information that is determined based on the historical gradient information and the weight; and creating the step size based on the intermediate parameter and the gradient information.
  5. The method according to claim 4, wherein creating the step size comprises: obtaining an attenuation factor for the intermediate parameter based on the offset; and determining the step size based on the gradient information and an attenuated intermediate parameter that is determined based on the intermediate parameter and the attenuation factor.
  6. The method according to claim 1, wherein obtaining the gradient information comprises: obtaining a prediction for a label portion in the sample data based on a data portion in the sample data and the prediction model; determining a loss between the prediction for the label portion and the label portion; and acquiring the gradient information based on a gradient of the loss and the parameter of the prediction model.
  7. The method according to claim 6, wherein the data portion represents features associated with a user and an object, the label portion represents an event between the user and the object, and the predetermined time period has a length of one or more days.
  8. The method according to claim 1, further comprising determining the historical gradient information by: obtaining respective gradient information based on respective historical sample data for the group of historical time slots before the time slot; and acquiring the historical gradient information based on the obtained respective gradient information.
  9. The method according to claim 8, wherein acquiring the historical gradient information comprises: determining respective squares of respective gradient information associated with the respective historical time slots in the group of historical time slots, the group of historical time slots being within the predefined time period; and determining the historical gradient information based on a sum of the respective squares.
  10. The method according to claim 1, further comprising: updating the parameter of the prediction model with the step size.
  11. An electronic device, comprising a computer processor coupled to a computer-readable memory unit, the memory unit comprising instructions that when executed by the computer processor implement a method for managing a prediction model, the method comprising: obtaining gradient information associated with the prediction model based on sample data for a time slot in a predetermined time period; acquiring an offset of the time slot in the predetermined time period; and determining a step size for updating a parameter of the prediction model based on the gradient information, the offset, and historical gradient information that is determined based on historical sample data for a group of historical time slots before the time slot.
  12. The device according to claim 11, wherein determining the step size comprises: determining a weight for the historical gradient information based on the offset; and generating the step size based on the gradient information, the historical gradient information, and the weight for the historical gradient information.
  13. The device according to claim 12, wherein the weight is within a predefined area and increases with the offset.
  14. The device according to claim 12, wherein generating the step size comprises: determining an intermediate parameter associated with the time slot based on the gradient information and a weighted historical gradient information that is determined based on the historical gradient information and the weight; and creating the step size based on the intermediate parameter and the gradient information.
  15. The device according to claim 14, wherein creating the step size comprises: obtaining an attenuation factor for the intermediate parameter based on the offset; and determining the step size based on the gradient information and an attenuated intermediate parameter that is determined based on the intermediate parameter and the attenuation factor.
  16. The device according to claim 11, wherein obtaining the gradient information comprises: obtaining a prediction for a label portion in the sample data based on a data portion in the sample data and the prediction model; determining a loss between the prediction for the label portion and the label portion; and acquiring the gradient information based on a gradient of the loss and the parameter of the prediction model.
  17. The device according to claim 16, wherein the data portion represents features associated with a user and an object, the label portion represents an event between the user and the object, and the predetermined time period has a length of one or more days, and the method further comprises: updating the parameter of the prediction model with the step size.
  18. The device according to claim 11, wherein the method further comprises determining the historical gradient information by: obtaining respective gradient information based on respective historical sample data for the group of historical time slots before the time slot; and acquiring the historical gradient information based on the obtained respective gradient information.
  19. The device according to claim 18, wherein acquiring the historical gradient information comprises: determining respective squares of respective gradient information associated with the respective historical time slots in the group of historical time slots, the group of historical time slots being within the predefined time period; and determining the historical gradient information based on a sum of the respective squares.
  20. A non-transitory computer program product, the non-transitory computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by an electronic device to cause the electronic device to perform a method for managing a prediction model, the method comprising: obtaining gradient information associated with the prediction model based on sample data for a time slot in a predetermined time period; acquiring an offset of the time slot in the predetermined time period; and determining a step size for updating a parameter of the prediction model based on the gradient information, the offset, and historical gradient information that is determined based on historical sample data for a group of historical time slots before the time slot.
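The claims do not give explicit formulas, so the following is only one possible reading of claims 1 to 4, 9, and 10, written as a minimal sketch for a single scalar parameter. It assumes an AdaGrad-style update in which the historical gradient information is a running sum of squared gradients. The linear weight ramp, the square-root denominator, the learning rate `lr`, the constant `eps`, and all function names are assumptions made for illustration. The attenuation factor of claims 5 and 15 is left out.

```python
import math


def history_weight(offset: int, slots_per_period: int,
                   w_min: float = 0.0, w_max: float = 1.0) -> float:
    """Weight for the historical term: bounded by [w_min, w_max], grows with offset.

    The linear ramp is an assumption; the claims only ask for a bounded weight
    that increases with the offset.
    """
    if slots_per_period <= 1:
        return w_max
    clamped = min(max(offset, 0), slots_per_period - 1)
    return w_min + (w_max - w_min) * clamped / (slots_per_period - 1)


def step_size(grad: float, hist_sq_sum: float, offset: int,
              slots_per_period: int, lr: float = 0.1, eps: float = 1e-8) -> float:
    """Assumed AdaGrad-style step that uses an offset-weighted history."""
    weight = history_weight(offset, slots_per_period)
    # Intermediate parameter: weighted history plus the current squared gradient.
    intermediate = weight * hist_sq_sum + grad * grad
    return lr * grad / (math.sqrt(intermediate) + eps)


def update(param: float, grad: float, hist_sq_sum: float,
           offset: int, slots_per_period: int) -> tuple[float, float]:
    """Apply one update; return the new parameter and the new sum of squares."""
    step = step_size(grad, hist_sq_sum, offset, slots_per_period)
    return param - step, hist_sq_sum + grad * grad
```

In this sketch the history is ignored at offset 0 and counts fully at the last offset of the period, so early slots take larger steps for the same gradient. A caller that restricts the history to the current period, which is one reading of claim 9, would reset `hist_sq_sum` to zero when a new period starts.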

Description

PREDICTION MODEL MANAGEMENT

CROSS-REFERENCE

[0001] This application claims the benefit of U.S. Patent Application No. 18/219,158 filed on July 7, 2023, entitled “PREDICTION MODEL MANAGEMENT”, which is hereby incorporated by reference in its entirety.

FIELD

[0002] The present disclosure generally relates to prediction model management, and more specifically, to methods, devices and computer program products for prediction model management based on a periodic reset during a training procedure.

BACKGROUND

[0003] Machine learning techniques are now widely used in data processing. For example, in a recommendation environment, objects such as an article, an advertisement, a message, an audio clip, a video, a game and so on may be provided to users. The users may then subscribe to a channel in which the article is provided, buy a product that is recommended in the advertisement, and so on. At this point, events (such as a subscription event, a purchase event, and the like) between the users and the corresponding objects may be detected. Solutions have been proposed for training a prediction model with sample data associated with the users, the objects, and the events, and the prediction model may then be used to output a trend of future events between the users and the objects. However, the prediction model is gradually trained on historical data that covers a long time duration, and it cannot accurately reflect recent data distributions in the training data. How to make the prediction model learn knowledge from the most recent data has therefore become a focus of attention.

SUMMARY

[0004] In a first aspect of the present disclosure, there is provided a method for managing a prediction model. In the method, gradient information associated with the prediction model is obtained based on sample data for a time slot in a predetermined time period. An offset of the time slot in the predetermined time period is acquired.
A step size is determined for updating a parameter of the prediction model based on the gradient information, the offset, and historical gradient information that is determined based on historical sample data for a group of historical time slots before the time slot.

[0005] In a second aspect of the present disclosure, there is provided an electronic device. The electronic device comprises: a computer processor coupled to a computer-readable memory unit, the memory unit comprising instructions that when executed by the computer processor implement a method according to the first aspect of the present disclosure.

[0006] In a third aspect of the present disclosure, there is provided a computer program product, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by an electronic device to cause the electronic device to perform a method according to the first aspect of the present disclosure.

[0007] This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

[0008] Through the more detailed description of some implementations of the present disclosure in the accompanying drawings, the above and other objects, features and advantages of the present disclosure will become more apparent, wherein the same reference generally refers to the same components in the implementations of the present disclosure.

[0009] Fig. 1 illustrates an example environment for prediction model management according to the machine learning technique;

[0010] Fig. 2 illustrates an example diagram of sample data for various time slots in a predetermined time period according to implementations of the present disclosure;

[0011] Fig. 3 illustrates an example diagram for managing a prediction model according to implementations of the present disclosure;

[0012] Fig. 4 illustrates an example diagram of sample data according to implementations of the present disclosure;

[0013] Fig. 5 illustrates an example diagram for determining a step size for updating a parameter of a prediction model according to implementations of the present disclosure;

[0014] Fig. 6 illustrates an example flowchart for a method for updating a prediction model according to implementations of the present disclosure;

[0015] Fig. 7 illustrates an example flowchart of a method for managing a prediction model according to implementations of the present disclosure; and

[0016] Fig. 8 illustrates a block diagram of a computing device in which various implementations of the present disclosure can be implemented.

DETAILED DESCRIPTION

[0017] Principles of the present disclosure will now be described with reference to some implementations. It is to