US-12620210-B2 - Machine-learning techniques for detecting artifact pixels in images
Abstract
Methods and systems for using a machine-learning model to detect predicted artifacts at a target image resolution are provided. A machine-learning model trained to detect artifact pixels in images at a target image resolution is accessed. An image depicting at least part of a biological sample at an initial image resolution can be converted to the target image resolution. The machine-learning model is applied to the converted image to identify one or more artifact pixels from the converted image. Methods and systems for training the machine-learning model to detect predicted artifacts at the target image resolution are also provided.
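The convert-then-detect flow described in the abstract can be illustrated with a minimal sketch. It assumes a greyscale image stored as nested lists and block averaging as the resolution conversion; the abstract does not specify either. The names `downsample`, `detect_artifacts`, and `toy_model` are hypothetical, and `toy_model` is only a stand-in for a trained model.

```python
from typing import Callable, List

Image = List[List[float]]  # greyscale image as rows of pixel intensities
Mask = List[List[bool]]    # True marks a predicted artifact pixel


def downsample(image: Image, factor: int) -> Image:
    """Reduce resolution by averaging non-overlapping factor x factor blocks.

    Block averaging is an assumption; any resampling method could be used.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    rows, cols = len(image) // factor, len(image[0]) // factor
    out: Image = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [
                image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def detect_artifacts(image: Image, factor: int,
                     model: Callable[[Image], Mask]) -> Mask:
    """Convert the image to the target resolution, then apply the model."""
    return model(downsample(image, factor))


def toy_model(image: Image) -> Mask:
    # Hypothetical stand-in for a trained model: flags very dark pixels.
    return [[pixel < 0.2 for pixel in row] for row in image]
```

Passing the model as a callable keeps the resolution conversion separate from the detector, so the same conversion step could feed any model trained at the target resolution.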
Inventors
- Qinle Ba
- Jim F. Martin
- Karel J. Zuiderveld
- Uwe Horchner
Assignees
- VENTANA MEDICAL SYSTEMS, INC.
Dates
- Publication Date: 20260505
- Application Date: 20240319
Claims (20)
- 1. A method comprising: accessing an image depicting at least part of a biological sample; applying an image pre-processing algorithm to the image to generate a pre-processed image, wherein the pre-processed image includes a plurality of labeled pixels, and wherein each labeled pixel of the plurality of labeled pixels is associated with a label predicting whether the pixel accurately depicts a corresponding point or region of the at least part of the biological sample; applying a machine-learning model to the pre-processed image to identify one or more labeled pixels from the plurality of labeled pixels, wherein the one or more labeled pixels are predicted to have been erroneously labeled by the image pre-processing algorithm; modifying a label of each of the one or more labeled pixels; generating a training image that includes at least the one or more labeled pixels with the modified labels; and outputting the training image.
- 2. The method of claim 1, wherein the label further identifies a type of artifact, wherein the pixel is further predicted to depict at least part of an artifact corresponding to the type of artifact.
- 3. The method of claim 1, further comprising: applying a blur threshold to each labeled pixel of the plurality of labeled pixels; determining, based on applying the blur threshold, that an additional labeled pixel of the plurality of labeled pixels has been erroneously labeled; and modifying a label corresponding to the additional labeled pixel.
- 4. The method of claim 3, wherein the blur threshold is determined based on performance of a downstream algorithm on a set of z-axis images depicting the at least part of a biological sample across a depth dimension.
- 5. The method of claim 1, wherein the image pre-processing algorithm includes image segmentation, morphological operation, image thresholding, image filtering, image contrast enhancement, blur detection, or combination thereof.
- 6. The method of claim 1, wherein the label further predicts whether the pixel depicts at least part of an artifact associated with a particular artifact type.
- 7. The method of claim 6, wherein the particular artifact type includes a blurry region, a tissue fold, and a foreign object.
- 8. A method comprising: accessing an image depicting at least part of a biological sample; applying an image pre-processing algorithm to the image to generate a pre-processed image, wherein the pre-processed image includes a plurality of labeled pixels, and wherein each labeled pixel of the plurality of labeled pixels is associated with a label predicting whether the pixel accurately depicts a corresponding point or region of the at least part of the biological sample; applying a machine-learning model to the pre-processed image to identify one or more labeled pixels from the plurality of labeled pixels, wherein the one or more labeled pixels are predicted to have been erroneously labeled by the image pre-processing algorithm; modifying a label of each of the one or more labeled pixels; generating a training image that includes at least the one or more labeled pixels with the modified labels; training a machine-learning model that includes a set of convolutional layers to detect one or more artifact pixels in images at a target image resolution, wherein the machine-learning model is configured to apply each convolutional layer of the set of convolutional layers to a feature map representing an input image, wherein an artifact pixel of the one or more artifact pixels is predicted to not accurately depict a point or region of the at least part of the biological sample, and wherein training the machine-learning model includes: for each labeled pixel of the plurality of labeled pixels of the training image: determining a first loss of the labeled pixel at a first image resolution by applying a first convolution layer of the set of convolutional layers to a first feature map representing the training image at the first image resolution, determining a second loss of the labeled pixel at a second image resolution by applying a second convolution layer of the set of convolutional layers to a second feature map representing the training image at a second image resolution, wherein the second image resolution has a higher image resolution relative to the first image resolution, determining a total loss for the labeled pixel based on the first loss and the second loss, and determining, based on the total loss, that the machine-learning model has been trained to detect the one or more artifact pixels at the target image resolution; and outputting the machine-learning model.
- 9. The method of claim 8, further comprising converting the training image to a greyscale training image, wherein the machine-learning model is trained using the greyscale training image.
- 10. The method of claim 8, further comprising converting the plurality of labeled pixels of the training image from a first color space to a second color space to generate a modified training image, wherein the machine-learning model is trained using the modified training image.
- 11. The method of claim 8, wherein the total loss is determined based on a sum of the first loss and the second loss.
- 12. The method of claim 8, wherein the total loss is determined based on an average between the first loss and the second loss.
- 13. The method of claim 8, wherein the target image resolution is the first image resolution.
- 14. A method comprising: accessing an image depicting at least part of a biological sample, wherein the image is at a first image resolution; accessing a machine-learning model trained to detect artifact pixels in images at a second image resolution, wherein the first image resolution has a higher image resolution relative to the second image resolution, wherein the machine-learning model is trained by: accessing an image depicting at least part of a biological sample, applying an image pre-processing algorithm to the image to generate a pre-processed image, wherein the pre-processed image includes a plurality of labeled pixels, and wherein each labeled pixel of the plurality of labeled pixels is associated with a label predicting whether the pixel accurately depicts a corresponding point or region of the at least part of the biological sample, applying the machine-learning model to the pre-processed image to identify one or more labeled pixels from the plurality of labeled pixels, wherein the one or more labeled pixels are predicted to have been erroneously labeled by the image pre-processing algorithm, modifying a label of each of the one or more labeled pixels, generating a training image that includes at least the one or more labeled pixels with the modified labels, for each labeled pixel of the plurality of labeled pixels of the training image: determining a first loss of the labeled pixel at a first image resolution by applying a first convolution layer of a set of convolutional layers to a first feature map representing the training image at the first image resolution, determining a second loss of the labeled pixel at the second image resolution by applying a second convolution layer of the set of convolutional layers to a second feature map representing the training image at a second image resolution, wherein the second image resolution has a higher image resolution relative to the first image resolution, determining a total loss for the labeled pixel based on the first loss and the second loss, and determining, based on the total loss, that the machine-learning model has been trained to detect one or more artifact pixels at a target image resolution; and converting the image to generate a converted image that depicts the at least part of the biological sample at the second image resolution; applying the machine-learning model to the converted image to identify one or more artifact pixels from the converted image, wherein an artifact pixel of the one or more artifact pixels is predicted to not accurately depict a point or region of the at least part of the biological sample; and generating an output that includes the one or more artifact pixels.
- 15. The method of claim 14, further comprising: applying a blur threshold to each labeled pixel of the plurality of labeled pixels; determining, based on applying the blur threshold, that an additional labeled pixel of the plurality of labeled pixels has been erroneously labeled; and modifying a label corresponding to the additional labeled pixel.
- 16. The method of claim 15, wherein the blur threshold is determined based on performance of a downstream algorithm on a set of z-axis images depicting the at least part of a biological sample across a depth dimension.
- 17. The method of claim 14, further comprising: converting the training image to a greyscale training image, wherein the machine-learning model is trained using the greyscale training image; and converting the plurality of labeled pixels of the training image from a first color space to a second color space to generate a modified training image, wherein the machine-learning model is trained using the modified training image.
- 18. The method of claim 14, wherein the output is an image mask that includes the one or more artifact pixels, the method further comprising: overlaying the image mask on the image to distinguish a set of pixels in the image from the one or more artifact pixels; and applying a cell-classification model to the set of pixels.
- 19. The method of claim 14, wherein the output identifies a quantity of the one or more artifact pixels.
- 20. The method of claim 14, further comprising using the output to adjust one or more scanning parameters of a scanning device.
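The per-pixel loss combination recited in claims 8, 11, and 12 can be sketched as follows. This is only an illustration: the claims do not name a loss function, so binary cross-entropy is assumed here, and `pixel_loss`, `total_loss`, and the `mode` parameter are hypothetical names. The predictions stand in for the outputs obtained by applying the first and second convolutional layers to feature maps at the two resolutions.

```python
import math


def pixel_loss(predicted: float, label: int, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one pixel (label 1 = artifact, 0 = not).

    Cross-entropy is an assumption; the claims leave the loss unspecified.
    """
    p = min(max(predicted, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def total_loss(first_loss: float, second_loss: float,
               mode: str = "sum") -> float:
    """Combine the losses from the two resolutions (claims 11 and 12)."""
    if mode == "sum":
        return first_loss + second_loss
    if mode == "average":
        return (first_loss + second_loss) / 2.0
    raise ValueError(f"unknown mode: {mode}")


# Example: one labeled artifact pixel scored at two resolutions.
low_res = pixel_loss(predicted=0.7, label=1)   # first (lower) resolution
high_res = pixel_loss(predicted=0.9, label=1)  # second (higher) resolution
combined = total_loss(low_res, high_res, mode="average")
```

Summing and averaging differ only by a constant factor of two, so with two resolutions the choice mainly affects the scale of the loss rather than which predictions are penalized.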
Description
CROSS-REFERENCES TO RELATED APPLICATIONS
This application is a continuation of International Application No. PCT/US2022/046096, filed on Oct. 7, 2022, which claims priority to U.S. Provisional Patent Application No. 63/256,328, entitled “Machine-Learning Techniques For Detecting Artifact Pixels In Images,” filed on Oct. 15, 2021, each of which is hereby incorporated by reference in its entirety for all purposes.
BACKGROUND OF THE INVENTION
Immunohistochemistry (IHC) assays enable the visualization and quantification of biomarker location, which plays a critical role in both cancer diagnostics and oncology research. In addition to the “gold-standard” DAB (3,3′-diaminobenzidine)-based IHC assays, recent years have seen advances in both brightfield multiplex IHC assays and multiple fluorescent IHC assays. These multiplex IHC assays can be used, among other things, to identify multiple biomarkers in the same slide image. Such assays not only improve efficiency in identifying biomarkers in a single slide, but also facilitate the identification of additional properties associated with such biomarkers (e.g., co-localized biomarkers).
Quality control of slide images can be performed to improve performance and reduce errors in digital pathology analyses. In particular, quality control allows the digital pathology analyses to accurately detect diagnostic or prognostic biomarkers from the slide images. Quality control may include, among other things, detecting and excluding pixels of the slide images that are predicted to depict one or more image artifacts. The artifacts can include tissue folds, foreign objects, blurry image portions, and any other distortions that prevent an accurate depiction of a corresponding region of the biological sample. For example, a tissue fold present in the biological sample may cause one or more portions of the image to be blurry. These artifacts can contribute to errors or inaccurate results in subsequent digital pathology analyses. For example, artifacts detected in a slide image can result in the digital pathology analysis miscounting the number of detected cells, misidentifying a set of tumor cells as being normal, etc. In effect, the artifacts can contribute to an inaccurate diagnosis for a subject associated with the slide image.
SUMMARY
In some embodiments, a method of generating training data for training a machine-learning model to detect predicted artifacts in an image is provided. The method can include accessing an image depicting at least part of a biological sample. The method can also include applying an image pre-processing algorithm to the image to generate a pre-processed image. In some instances, the pre-processed image includes a plurality of labeled pixels. Each labeled pixel of the plurality of labeled pixels can be associated with a label predicting whether the pixel accurately depicts a corresponding point or region of the at least part of the biological sample. The method can also include applying a machine-learning model to the pre-processed image to identify one or more labeled pixels from the plurality of labeled pixels. In some instances, the one or more labeled pixels are predicted to have been erroneously labeled by the image pre-processing algorithm. The method can also include modifying a label of each of the one or more labeled pixels. The method can also include generating a training image that includes at least the one or more labeled pixels with the modified labels. The method can also include outputting the training image.
In some embodiments, a method of training a machine-learning model to detect predicted artifacts in an image at a target image resolution is provided. The method can include accessing a training image depicting at least part of a biological sample.
In some instances, the training image includes a plurality of labeled pixels, in which each labeled pixel of the plurality of labeled pixels is associated with a label predicting whether the pixel accurately depicts a corresponding point or region of the at least part of the biological sample. The method can also include accessing a machine-learning model that includes a set of convolutional layers. In some instances, the machine-learning model is configured to apply each convolutional layer of the set of convolutional layers to a feature map representing an input image. The method can also include training the machine-learning model to detect one or more artifact pixels in images at a target image resolution. In some instances, an artifact pixel of the one or more artifact pixels is predicted to not accurately depict a point or region of the at least part of the biological sample. In some instances, the training includes, for each labeled pixel of the plurality of labeled pixels of the training image: (i) determining a first loss of the labeled pixel at the first image resolution by applying a first convolution layer of the set of convolutional layers to a first feature map representing t