EP-4742171-A2 - DEEP LEARNING SYSTEM


Abstract

A machine learning system is provided to enhance various aspects of machine learning models. In some aspects, a substantially photorealistic three-dimensional (3D) graphical model of an object is accessed and a set of training images of the 3D graphical model is generated, the set of training images generated to add imperfections and degrade the photorealistic quality of the training images. The set of training images is provided as training data to train an artificial neural network.
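The degradation step described in the abstract (lowering the photorealistic quality of synthetic renders so they better resemble real camera images) can be illustrated with a minimal sketch. The specific operations below, contrast scaling, brightness shift, Gaussian sensor noise, and downsampling, are common image-degradation choices assumed here for illustration; they are not taken from the patent text.

```python
import numpy as np

def degrade(image: np.ndarray, rng: np.random.Generator,
            brightness: float = -0.1, contrast: float = 0.8,
            noise_sigma: float = 0.05, downsample: int = 2) -> np.ndarray:
    """Degrade a synthetic render (values in [0, 1]) to mimic a real photo."""
    out = image.astype(np.float64)
    out = (out - 0.5) * contrast + 0.5 + brightness      # contrast and brightness
    out = out + rng.normal(0.0, noise_sigma, out.shape)  # simulated sensor noise
    out = out[::downsample, ::downsample]                # reduce resolution
    return np.clip(out, 0.0, 1.0)                        # keep a valid image

rng = np.random.default_rng(0)
synthetic = rng.random((64, 64))        # stand-in for a photorealistic render
training_sample = degrade(synthetic, rng)
print(training_sample.shape)            # (32, 32): lower than the source image
```

In a full pipeline each degraded image would be paired with labels derived from the 3D model, so the training set is generated without manual annotation.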

Inventors

  • MOLONEY, David Macdara
  • PALLA, Alessandro
  • BUCKLEY, Léonie Raideen
  • RODRIGUEZ MARTÍN DE LA SIERRA, Luis M.
  • MÁRQUEZ RODRÍGUEZ-PERAL, Carlos
  • BRICK, Cormac M.
  • BYRNE, Jonathan David
  • XU, XIAOFAN
  • PEÑA CARRILLO, Dexmont Alejandro
  • PARK, MI SUN

Assignees

  • MOVIDIUS LTD.

Dates

Publication Date
2026-05-13
Application Date
2019-05-21

Claims (14)

  1. A method, comprising: generating one or more synthetic training samples, a synthetic training sample comprising a synthetic image; generating one or more photorealistic training samples by degrading a quality of the one or more synthetic training samples; forming a training set with the one or more photorealistic training samples; and training a neural network using the training set, wherein the neural network comprises a first network and a second network, and after the training, the first network and the second network are to process two different input images for estimating a degree of similarity between the two different input images.
  2. The method of claim 1, wherein degrading the quality of the one or more synthetic training samples comprises: applying a filter on the one or more synthetic training samples.
  3. The method of any one of the preceding claims, wherein a photorealistic training sample has a resolution that is lower than a resolution of a synthetic training sample from which the photorealistic training sample is generated.
  4. The method of any one of the preceding claims, wherein the quality of the one or more synthetic training samples comprises a brightness, a level of noise, or a contrast.
  5. The method of any one of the preceding claims, wherein the first network and the second network have the same weights.
  6. The method of any one of the preceding claims, wherein the synthetic image comprises a three-dimensional model of an object, and degrading the quality of the one or more synthetic training samples comprises: degrading a quality of the synthetic image based on one or more materials of the object.
  7. The method of any one of the preceding claims, wherein the first network is to generate a first output from one of the two different input images, the second network is to generate a second output from the other of the two different input images, and the neural network is to determine the degree of similarity based on the first output and the second output.
  8. One or more non-transitory computer-readable media storing instructions executable to perform a method according to any one of the preceding claims.
  9. An apparatus, comprising: a computer processor for executing computer program instructions; and one or more non-transitory computer-readable media storing computer program instructions executable by the computer processor to perform operations comprising: generating one or more synthetic training samples, a synthetic training sample comprising a synthetic image, generating one or more photorealistic training samples by degrading a quality of the one or more synthetic training samples, forming a training set with the one or more photorealistic training samples, and training a neural network using the training set, wherein the neural network comprises a first network and a second network, and after the training, the first network and the second network are to process two different input images for estimating a degree of similarity between the two different input images.
  10. The apparatus of claim 9, wherein degrading the quality of the one or more synthetic training samples comprises: applying a filter on the one or more synthetic training samples.
  11. The apparatus of claim 9 or 10, wherein a photorealistic training sample has a resolution that is lower than a resolution of a synthetic training sample from which the photorealistic training sample is generated.
  12. The apparatus of any one of claims 9 to 11, wherein the first network and the second network have the same weights.
  13. The apparatus of any one of claims 9 to 12, wherein the synthetic image comprises a three-dimensional model of an object, and degrading the quality of the one or more synthetic training samples comprises: degrading a quality of the synthetic image based on one or more materials of the object.
  14. The apparatus of any one of claims 9 to 13, wherein the first network is to generate a first output from one of the two different input images, the second network is to generate a second output from the other of the two different input images, and the neural network is to determine the degree of similarity based on the first output and the second output.
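The twin-network arrangement in the claims (a first and second network with the same weights, each processing one of two input images, with similarity derived from the two outputs) is the classic Siamese-network pattern. The following is a minimal sketch of that pattern only; the single linear layer here is a hypothetical stand-in for the convolutional branches a real implementation would use, and the similarity measure (exponential of embedding distance) is one common choice, not one specified by the claims.

```python
import numpy as np

def embed(image: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """One branch of the twin network: a toy linear layer plus ReLU,
    standing in for a full feature-extraction network."""
    return np.maximum(weights @ image.ravel(), 0.0)

def similarity(img_a: np.ndarray, img_b: np.ndarray,
               weights: np.ndarray) -> float:
    """Both branches share the same weights; similarity in (0, 1] is
    derived from the distance between the two embeddings."""
    ea = embed(img_a, weights)   # first network's output
    eb = embed(img_b, weights)   # second network's output
    return float(np.exp(-np.linalg.norm(ea - eb)))

rng = np.random.default_rng(1)
weights = rng.normal(size=(8, 16 * 16)) * 0.1  # shared across both branches
img = rng.random((16, 16))
print(similarity(img, img, weights))  # 1.0: identical inputs, zero distance
```

Because the weights are shared, the two branches define the same embedding function, so training on degraded synthetic pairs teaches a single comparable feature space.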

Description

RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Patent Application Serial No. 62/675,601, filed May 23, 2018 and incorporated by reference herein in its entirety.

TECHNICAL FIELD

This disclosure relates in general to the field of computer systems and, more particularly, to machine learning systems.

BACKGROUND

The worlds of computer vision and graphics are rapidly converging with the emergence of Augmented Reality (AR), Virtual Reality (VR) and Mixed-Reality (MR) products such as those from MagicLeap™, Microsoft™ HoloLens™, Oculus™ Rift™, and other VR systems such as those from Valve™ and HTC™. The incumbent approach in such systems is to use a separate graphics processing unit (GPU) and computer vision subsystem, which run in parallel. These parallel systems can be assembled from a pre-existing GPU in parallel with a computer vision pipeline implemented in software running on an array of processors and/or programmable hardware accelerators.

BRIEF DESCRIPTION OF THE DRAWINGS

Various objects, features, and advantages of the disclosed subject matter can be more fully appreciated with reference to the following detailed description of the disclosed subject matter when considered in connection with the following drawings, in which like reference numerals identify like elements. The accompanying figures are schematic and are not intended to be drawn to scale. For purposes of clarity, not every component is labelled in every figure. Nor is every component of each embodiment of the disclosed subject matter shown where illustration is not necessary to allow those of ordinary skill in the art to understand the disclosed subject matter.

  • FIG. 1 illustrates a conventional augmented or mixed reality rendering system;
  • FIG. 2 illustrates a voxel-based augmented or mixed reality rendering system in accordance with some embodiments;
  • FIG. 3 illustrates the difference between dense and sparse volumetric representations in accordance with some embodiments;
  • FIG. 4 illustrates a composite view of a scene in accordance with some embodiments;
  • FIG. 5 illustrates the level of detail in an example element tree structure in accordance with some embodiments;
  • FIG. 6 illustrates applications which can utilize the data-structure and voxel data of the present application in accordance with some embodiments;
  • FIG. 7 illustrates an example network used to recognize 3D digits in accordance with some embodiments;
  • FIG. 8 illustrates multiple classifications performed on the same data structure using implicit levels of detail in accordance with some embodiments;
  • FIG. 9 illustrates operation elimination by 2D convolutional neural networks in accordance with some embodiments;
  • FIG. 10 illustrates the experimental results from analysis of example test images in accordance with some embodiments;
  • FIG. 11 illustrates hardware for culling operations in accordance with some embodiments;
  • FIG. 12 illustrates a refinement to the hardware for culling operations in accordance with some embodiments;
  • FIG. 13 illustrates hardware in accordance with some embodiments;
  • FIG. 14 illustrates an example system employing an example training set generator in accordance with at least some embodiments;
  • FIG. 15 illustrates an example generation of synthetic training data in accordance with at least some embodiments;
  • FIG. 16 illustrates an example Siamese network in accordance with at least some embodiments;
  • FIG. 17 illustrates example use of a Siamese network to perform an autonomous comparison in accordance with at least some embodiments;
  • FIG. 18 illustrates an example voxelization of a point cloud in accordance with at least some embodiments;
  • FIG. 19 is a simplified block diagram of an example machine learning model in accordance with at least some embodiments;
  • FIG. 20 is a simplified block diagram illustrating aspects of an example training of a model in accordance with at least some embodiments;
  • FIG. 21 illustrates an example robot using a neural network to generate a 3D map for navigation in accordance with at least some embodiments;
  • FIG. 22 is a block diagram illustrating an example machine learning model for use with inertial measurement data in accordance with at least some embodiments;
  • FIG. 23 is a block diagram illustrating an example machine learning model for use with image data in accordance with at least some embodiments;
  • FIG. 24 is a block diagram illustrating an example machine learning model combining aspects of the models in the examples of FIGS. 22 and 23;
  • FIGS. 25A-25B are graphs illustrating results of a machine learning model similar to the example machine learning model of FIG. 24;
  • FIG. 26 illustrates an example system including an example neural network optimizer in accordance with at least some embodiments;
  • FIG. 27 is a block diagram illustrating an example optimization of a neural network model in accordance with at least some embodiments;
  • FIG. 28 is a table illustrating example results generated and used during an optimization of an example neural ne