CN-122003677-A - Defending patient re-identification attacks
Abstract
A method for de-identifying a particular type of image indicative of a type of imaged object is disclosed. The method includes providing a first machine learning model for performing an object recognition task and a second machine learning model for performing an image analysis task, determining a first rate of change of a first penalty associated with the first machine learning model for the image using the first machine learning model, determining a second rate of change of an output of the second machine learning model for the image using the second machine learning model, determining a perturbation of the image using the first rate and the second rate such that a change of the output of the second machine learning model between the image and the image perturbed by the perturbation is mitigated, and modifying the image using the perturbation.
Inventors
- A. Zarbach
- H. Nicki Shih
Assignees
- Koninklijke Philips N.V. (皇家飞利浦有限公司)
Dates
- Publication Date
- 20260508
- Application Date
- 20240912
- Priority Date
- 20230929
Claims (14)
- 1. A method for de-identifying a particular type of image, the particular type being indicative of a type of imaged object, the method comprising: providing a first machine learning model for performing an object recognition task and a second machine learning model for performing an image analysis task, the first machine learning model having been trained on the particular type of image and the second machine learning model having been trained on the particular type of image; determining a first rate of change of a metric loss relative to the image using the first machine learning model; determining a second rate of change of an output of the second machine learning model relative to the image using the second machine learning model; determining a perturbation of the image using the first rate and the second rate such that a change in the output of the second machine learning model between the image and the image perturbed by the perturbation is mitigated; and modifying the image using the perturbation.
- 2. The method of claim 1, wherein the image is provided in a format representing image data, the particular type further indicating the format.
- 3. The method of claim 1 or 2, wherein the image is provided in a format representing image data and metadata describing the imaged object and/or the image, the particular type further being indicative of the format.
- 4. A method according to claim 3, wherein the perturbation is used to modify the image data of the image and/or the metadata of the image.
- 5. A method according to claim 3, wherein the metadata of the image is de-identified using a predefined de-identification method before determining the first and second rates of change.
- 6. The method of any of the preceding claims, the first rate of change being a first vector, an element of the first vector representing the image, the second rate of change being a second vector, an element of the second vector representing the image, wherein the perturbation is a vector rejection of the first vector from the second vector, the vector rejection being scaled by a scaling factor.
- 7. The method of any of the preceding claims 1 to 5, wherein the perturbation comprises a first perturbation that increases the metric loss and a second perturbation that reduces the loss of the second machine learning model, wherein the modifying comprises modifying a first portion of the image using the first perturbation and modifying a second portion of the image using the second perturbation.
- 8. The method of any of the preceding claims, wherein the image is represented by an array, wherein the perturbation is an array of the same size as the array of the image, wherein the modifying of the image comprises scaling each element of the array of the image by a corresponding value of the array of the perturbation.
- 9. The method of any of the preceding claims, the first machine learning model being a re-identification machine learning model configured to perform the object recognition task using the particular type of image and one or more reference images as inputs.
- 10. The method of any preceding claim, the first rate of change of the metric loss being a gradient of the metric loss, the second rate of change of the output being a gradient of the output.
- 11. The method according to any of the preceding claims, the particular type being indicative of a medical image type.
- 12. The method of any of the preceding claims, the particular type indicating an image format, wherein the image format comprises a Digital Imaging and Communications in Medicine (DICOM) format.
- 13. A computer program product comprising a computer readable medium having computer readable code embodied therein, the computer readable code being configured such that when executed by a suitable computer or processor causes the computer or processor to perform the method of any of claims 1 to 12.
- 14. An apparatus comprising at least one processor, and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus to at least perform the method of any one of claims 1-12.
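The scaled vector rejection recited in claim 6 can be illustrated with a minimal sketch. It assumes that both rates of change are gradients (as in claim 10) flattened into plain lists of equal length. The function names `dot`, `vector_rejection` and `perturbation`, and the example values, are hypothetical and do not come from the patent. The rejection of the first vector from the second vector is the component of the first vector orthogonal to the second, so a step along it should leave the output of the second model unchanged to first order while still moving along the gradient of the metric loss.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def vector_rejection(a: Vector, b: Vector) -> List[float]:
    """Component of `a` orthogonal to `b`: a - (a.b / b.b) * b."""
    bb = dot(b, b)
    if bb == 0.0:
        # Assumption: with a zero second gradient there is nothing to remove.
        return list(a)
    coeff = dot(a, b) / bb
    return [x - coeff * y for x, y in zip(a, b)]


def perturbation(grad_metric_loss: Vector,
                 grad_task_output: Vector,
                 scale: float) -> List[float]:
    """Scaled rejection of the first gradient from the second gradient.

    `scale` stands in for the scaling factor of claim 6; how it is chosen
    is not specified here and is left to the caller.
    """
    rejected = vector_rejection(grad_metric_loss, grad_task_output)
    return [scale * r for r in rejected]


# Hypothetical gradients for a three-element "image".
g1 = [3.0, 4.0, 0.0]  # first rate of change (metric loss of the first model)
g2 = [1.0, 0.0, 0.0]  # second rate of change (output of the second model)
delta = perturbation(g1, g2, scale=0.5)

# First-order effect on the second model's output is zero,
# while the first-order effect on the metric loss is positive.
print(delta, dot(delta, g2), dot(delta, g1))
```

In practice the two gradients would come from an automatic differentiation framework applied to the trained models. This sketch covers only the geometric step that combines them.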
Description
Defending patient re-identification attacks
Technical Field
The present invention relates to a system, related apparatus and method, computer program element and computer readable medium for processing data.
Background
Person re-identification in camera images or video streams is an active research area with many applications, in particular in the security area. Recently, such techniques have been successfully applied to chest X-ray images, as reported by K. Packhäuser et al. in "Deep learning-based patient re-identification is able to exploit the biometric nature of medical chest X-ray data" (Nature Scientific Reports, vol. 12:14851, 2022). The authors use a Siamese neural network (SNN) with a contrastive loss to generate a low-dimensional mapping of the input data. With this setup, the network learns to map images from the same patient to nearby points in the low-dimensional space, while images from different patients are mapped to distant points. Using different data sets, they consistently achieve very high re-identification scores (precision@1 between 0.882 and 0.996). A more detailed analysis also indicates strong robustness with respect to the presence of abnormalities and the time span between acquisitions. These results indicate that common de-identification techniques for medical images, which cover Digital Imaging and Communications in Medicine (DICOM) tags or burned-in annotations, may not be sufficient to protect the identity of the patient.
Disclosure of Invention
Accordingly, improved methods and systems for defending against re-identification attacks, particularly with respect to personal data (e.g., medical data), may be desired. The object of the invention is achieved by the subject matter of the independent claims, wherein further embodiments are incorporated in the dependent claims.
It should be noted that the aspects of the invention described below apply equally to the related apparatus and method, the computer program element and the computer readable medium.
Example embodiments provide a method for de-identifying a particular type of image (referred to as a target image), the particular type indicating a type of imaged object. The method comprises: providing a first machine learning model for performing an object recognition task and a second machine learning model for performing an image analysis task, the first machine learning model having been trained on the particular type of image and the second machine learning model having been trained on the particular type of image; determining a first rate of change of a metric loss relative to the target image using the first machine learning model; determining a second rate of change of an output of the second machine learning model relative to the target image using the second machine learning model; determining a perturbation of the target image using the first rate and the second rate such that a change in the output of the second machine learning model between the target image and the target image perturbed by the perturbation is mitigated; and modifying the target image using the perturbation.
Example embodiments provide a computer program product comprising a computer readable medium having computer readable code embodied therein, the computer readable code being configured such that, when executed by a suitable computer or processor, it causes the computer or processor to perform the method.
Example embodiments provide an apparatus comprising at least one processor and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus to at least perform the method.
Drawings
Examples of the invention will now be described with reference to the following drawings, which are not drawn to scale unless otherwise indicated, wherein:
FIG. 1 is a flow chart of a method for de-identifying an image according to one example of the present subject matter;
FIG. 2 illustrates a block diagram of a data processing system;
FIG. 3 shows a block diagram of a data processing component configured to defend against a re-identification attack;
FIG. 4 illustrates the operation of the re-identification defender component of FIG. 3 in accordance with one example of the present subject matter; and
FIG. 5 shows a flow chart of a computer-implemented method for defending against a re-identification attack.
Detailed Description
As used herein, the terms "first," "second," and the like are used as labels for the nouns that follow them and do not imply any type of ordering (e.g., spatial, temporal, logical) unless explicitly defined. The first machine learning model may be configured to perform an object recognition task. The first machine learning model may have been trained on a particular type of image to minimize a first loss function of the first machine learning model. The first machine learning model may be a third party model with a known or assumed loss function or may be a proprietary model specifically developed for this pur