US-20260127444-A1 - NETWORK TRAINING, AND ASSOCIATED NETWORKS, MODELS, SYSTEMS, DEVICES, METHODS, AND COMPUTER-READABLE MEDIA

US 20260127444 A1

Abstract

Various embodiments relate to network training, and associated systems and non-transitory computer-readable media. In some embodiments, a system may include a classifier configured to receive a first dataset and a second dataset. The classifier may include a backbone having a number of layers and configured to generate an output based on at least one of the first dataset or the second dataset. The system may further include a first classification head configured to receive the output of the backbone and generate a first loss function. The system may also include a second classification head configured to receive the output of the backbone and generate a second loss function.
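The arrangement described in the abstract, a single shared backbone whose output feeds two different classification heads, each producing its own loss, can be sketched in a few lines. Everything concrete below (the layer arithmetic, the loss choices, the targets) is an illustrative assumption, not taken from the specification.

```python
# Toy sketch of the claimed architecture: one shared backbone, two heads,
# each head paired with its own loss. Sizes, weights, and loss choices
# here are invented for illustration only.

def backbone(x):
    """Stand-in feature extractor: a fixed linear layer (in place of many layers)."""
    return [2.0 * v + 1.0 for v in x]

def head_a(features):
    """First classification head: pools features into a single logit."""
    return sum(features)

def head_b(features):
    """Second, different classification head."""
    return max(features)

def loss_a(pred, target):
    return (pred - target) ** 2   # squared error for the first head

def loss_b(pred, target):
    return abs(pred - target)     # a different loss for the second head

x = [0.5, 1.5]                    # one sample (e.g., from the first dataset)
feats = backbone(x)               # single backbone output ...
la = loss_a(head_a(feats), 6.0)   # ... consumed by both heads in parallel
lb = loss_b(head_b(feats), 4.0)
print(la, lb)                     # → 0.0 0.0
```

Note that both heads consume the same `feats`; only the losses differ, which is what lets one backbone serve two tasks.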

Inventors

  • Weijun Tan

Assignees

  • LiveView Technologies, LLC

Dates

Publication Date
2026-05-07
Application Date
2024-11-05

Claims (20)

  1. A system, comprising: a classifier configured to receive a first dataset and a second, different dataset, the classifier including: a backbone having a number of layers and configured to generate an output based on at least one of the first dataset or the second, different dataset; a first classification head configured to receive the output of the backbone and generate a first loss function; and a second, different classification head configured to receive the output of the backbone and to generate a second, different loss function.
  2. The system of claim 1, further comprising a mobile surveillance unit including the classifier.
  3. The system of claim 1, wherein the classifier is configured to train the backbone and the first classification head based on the first loss function.
  4. The system of claim 1, wherein the classifier is configured to train the backbone and the second, different classification head based on the second, different loss function.
  5. The system of claim 1, wherein the classifier is configured to train the second, different classification head and the first classification head substantially simultaneously.
  6. The system of claim 1, wherein the classifier is configured to train the second, different classification head and the first classification head separately during different phases.
  7. The system of claim 1, wherein the first dataset comprises a personal attribute recognition (PAR) dataset and the second, different dataset comprises a personal protection equipment (PPE) dataset.
  8. A method, comprising: generating first data via a backbone of a classifier responsive to receipt of a first dataset; calculating a first loss function via a first classification head of the classifier responsive to the first data; training the first classification head based on the first loss function; generating second data via the backbone responsive to receipt of a second dataset; calculating a second loss function via a second classification head of the classifier responsive to the second data; and training the second classification head based on the second loss function.
  9. The method of claim 8, wherein: generating the first data comprises generating the first data responsive to receipt of a personal attribute recognition (PAR) dataset; calculating the first loss function via the first classification head comprises calculating a PAR loss function via a PAR classification head; and training the first classification head based on the first loss function comprises training the PAR classification head based on the PAR loss function.
  10. The method of claim 9, wherein: generating the second data comprises generating the second data responsive to receipt of a personal protection equipment (PPE) dataset; calculating the second loss function via the second classification head comprises calculating a PPE loss function via a PPE classification head; and training the second classification head based on the second loss function comprises training the PPE classification head based on the PPE loss function.
  11. The method of claim 8, further comprising combining the first loss function and the second loss function to generate a prediction.
  12. The method of claim 8, further comprising one of: removing the first classification head and deploying the classifier; or removing the second classification head and deploying the classifier.
  13. The method of claim 8, further comprising receiving an image at the backbone and generating a prediction based on at least one of the trained first classification head or the trained second classification head.
  14. The method of claim 8, wherein training the second classification head comprises training the second classification head while training the first classification head.
  15. The method of claim 8, wherein: training the second classification head comprises training the second classification head during one phase; and training the first classification head comprises training the first classification head during another, different phase.
  16. A non-transitory computer-readable media having computer instructions stored thereon that, in response to being executed by a processing device of a system, cause the system to perform or control performance of operations comprising: generating first data responsive to a first dataset; calculating a first loss function via a first classification head responsive to the first data; training the first classification head based on the first loss function; generating second data responsive to receipt of a second dataset; calculating a second loss function via a second classification head responsive to the second data; and training the second classification head based on the second loss function.
  17. The non-transitory computer-readable media of claim 16, wherein: training the first classification head comprises training the first classification head during one phase; and training the second classification head comprises training the second classification head during another, different phase.
  18. The non-transitory computer-readable media of claim 16, the operations further comprising generating a prediction based on at least one of the first loss function or the second loss function.
  19. The non-transitory computer-readable media of claim 16, the operations further comprising: receiving an image; and responsive to the image, generating a prediction based on at least one of the trained first classification head or the trained second classification head.
  20. The non-transitory computer-readable media of claim 16, wherein training the second classification head comprises training the second classification head while training the first classification head.
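As a rough illustration of the phased training recited in claims 8 and 15, phase one jointly trains the backbone and the first head on one dataset, then phase two jointly trains the backbone and the second head on the other, the loop below fits a toy one-parameter backbone and two one-parameter heads by gradient descent. The "layers", the stand-in PAR/PPE datasets, the learning rate, and the step count are all invented for illustration; they are not from the specification.

```python
# Phased two-head training sketch: each phase updates the shared backbone
# weight w together with exactly one head, using that head's own loss.

def train_phase(w, h, data, lr=0.01, steps=200):
    """Jointly update the backbone weight w and one head weight h
    on (input, target) pairs, minimizing squared error."""
    for _ in range(steps):
        for x, y in data:
            pred = h * (w * x)        # backbone output, then this head
            g = 2.0 * (pred - y)      # d(loss)/d(pred) for squared error
            w -= lr * g * h * x       # backpropagate into the backbone
            h -= lr * g * w * x       # ...and into this head only
    return w, h

w, head_a, head_b = 1.0, 1.0, 1.0
par_data = [(1.0, 2.0), (2.0, 4.0)]   # stand-in for a PAR dataset (y = 2x)
ppe_data = [(1.0, 3.0), (2.0, 6.0)]   # stand-in for a PPE dataset (y = 3x)

w, head_a = train_phase(w, head_a, par_data)   # phase one
fit_a = head_a * w                             # learned PAR slope, ~2.0
w, head_b = train_phase(w, head_b, ppe_data)   # phase two
fit_b = head_b * w                             # learned PPE slope, ~3.0
print(round(fit_a, 2), round(fit_b, 2))
```

Because the backbone weight keeps moving in phase two, the first head's fit can drift afterward, which is why claims 5 and 14 alternatively allow training both heads at the same time.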

Description

TECHNICAL FIELD

This disclosure relates generally to network training. More specifically, this disclosure relates to training of multi-head neural networks with a number of datasets, and to related networks, models, systems, devices, methods, and computer-readable media.

BACKGROUND

Improved processing power, better algorithms, and the availability of big data are facilitating the implementation of machine learning functionality into a variety of different applications. Machine learning is an enabling technology for the revolution currently underway in artificial intelligence, driving advances in fields such as object detection, image classification, speech recognition, natural language processing, and many more. Machine learning models receive an input and generate an output (e.g., a predicted output) based on the received input. Some machine learning models are deep models that employ multiple layers to generate an output for a received input. For example, a deep neural network is a deep machine learning model that includes an output layer and one or more hidden layers that each apply a non-linear transformation to a received input to generate an output.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A depicts two networks, each of which is configured for receiving an input and generating a predictive output.
FIG. 1B depicts an example network including a feature extraction phase and a classification phase, according to various embodiments of the disclosure.
FIG. 2 depicts an example system including a model and a classifier, in accordance with various embodiments of the disclosure.
FIG. 3 illustrates an example classifier, according to various embodiments of the disclosure.
FIG. 4 depicts an example classifier for receiving multiple datasets, according to various embodiments of the disclosure.
FIG. 5 illustrates an example classifier, according to various embodiments of the disclosure.
FIG. 6 depicts an example classifier including a personal attribute recognition classification head, in accordance with various embodiments of the disclosure.
FIG. 7 depicts an example classifier including a personal protection equipment classification head, in accordance with various embodiments of the disclosure.
FIG. 8 illustrates an example system including a classifier and a post processor, according to various embodiments of the disclosure.
FIG. 9 is another illustration of a system including a classifier and a post processor, according to various embodiments of the disclosure.
FIG. 10 is a flowchart of an example method of training a network, according to various embodiments of the disclosure.
FIG. 11 depicts an example system including a unit, in accordance with various embodiments of the disclosure.
FIG. 12 depicts another example system including a mobile unit, in accordance with various embodiments of the disclosure.
FIG. 13 illustrates another example system, according to one or more embodiments of the disclosure.

DETAILED DESCRIPTION

Referring in general to the accompanying drawings, various embodiments of the disclosure are illustrated to show example embodiments related to network training and associated networks including classifiers. It should be understood that the drawings presented are not meant to be illustrative of actual views of any particular portion of an actual circuit, device, system, or structure, but are merely representations that are employed to more clearly depict various embodiments of the disclosure.

The following provides a more detailed description of the present disclosure and various representative embodiments thereof. In this description, functions may be shown in block diagram form in order not to obscure the present disclosure in unnecessary detail. Additionally, block definitions and partitioning of logic between various blocks are exemplary of a specific implementation. It will be readily apparent to one of ordinary skill in the art that the present disclosure may be practiced by numerous other partitioning solutions. For the most part, details concerning timing considerations and the like have been omitted where such details are not necessary to obtain a complete understanding of the present disclosure and are within the abilities of persons of ordinary skill in the relevant art.

Various embodiments of the disclosure relate to networks including classifiers, and associated training methods. More specifically, various embodiments relate to classifiers of a network including one or more classification heads, and associated training thereof via the use of multiple (e.g., two) datasets. It is noted that each of the terms "network" and "classifier" may also include and/or be referred to as a "model" or a "system." Embodiments of the disclosure will now be explained with reference to the accompanying drawings.

Conventional machine learning neural networks may be trained via a single dataset. For example, FIG. 1A depicts two networks, each of which receives an input and generates a predictive output