
CN-122029512-A - Data-to-sound interactive feedback

CN122029512A

Abstract

A first aspect of the invention relates to a method for representing biomedical image data, the method comprising the steps of providing biomedical image data, assigning a region of interest (R) of the biomedical image data to a physical vibration model (116) of an acoustic model, mapping data features of the region of interest (R) to parameters of the physical vibration model (116), and generating a sound output (112) of the acoustic model based on vibrations of the physical vibration model (116). Another aspect of the invention relates to a computer program. Yet another aspect of the invention relates to a system (20) for generating a sound output representing biomedical image data.

Inventors

  • Nassir Navab
  • Sasan Matinfar
  • Shervin Dehghani
  • Michael Sommersperger

Assignees

  • Technical University of Munich, an institution of the Free State of Bavaria

Dates

Publication Date
2026-05-12
Application Date
2024-09-24
Priority Date
2023-10-06

Claims (20)

  1. A method for representing biomedical image data, the method comprising the steps of: providing biomedical image data; assigning a region of interest (R) of the biomedical image data to a physical vibration model (116) of an acoustic model; mapping data features of the region of interest (R) to parameters of the physical vibration model (116); and generating a sound output (112) of the acoustic model based on vibrations of the physical vibration model (116).
  2. The method of claim 1, wherein the biomedical image data comprises one or more of an OCT scan (preferably an OCT A-scan), a CT scan, a PET scan, a SPECT scan, and an MRI scan.
  3. The method according to claim 1 or 2, wherein the biomedical image data is or comprises an OCT B-scan (106), and/or wherein the region of interest is or comprises an OCT A-scan (104).
  4. The method according to any one of claims 1 to 3, further comprising the step of: determining a local attenuation map (108) from the provided biomedical image data, wherein, when assigning the region of interest (R) of the biomedical image data, the region of interest (R) is assigned as a region of interest (R) of the local attenuation map (108), such that, when mapping data features, the data features of the region of interest (R) of the local attenuation map (108) are mapped to parameters of the physical vibration model (116).
  5. The method of claim 4, wherein, when determining the local attenuation map (108), a local attenuation of at least one OCT A-scan (104) is determined.
  6. The method according to any one of the preceding claims, wherein the biomedical image data is provided in real time and/or wherein the sound output is generated in real time.
  7. The method according to any one of the preceding claims, wherein at least one data feature is isomorphically mapped to at least one parameter.
  8. The method according to any one of the preceding claims, wherein the physical vibration model (116) comprises two or more nodes (N), preferably coupled in sequence.
  9. The method of claim 8, wherein an external force (F) is applied to at least one node (N) of the physical vibration model (116).
  10. The method according to claim 9, wherein the external force (F) is a parameter of the physical vibration model (116), and/or wherein the external force (F) corresponds to a virtual instrument.
  11. The method according to any one of claims 8 to 10, wherein at least one node (N) is or comprises a listening node (L), wherein the sound output (112) of the acoustic model is based on vibrations of the listening node (L).
  12. The method of any one of the preceding claims, wherein the parameters of the physical vibration model (116) comprise one or more of: one or more spring constants (k), one or more masses (m), and one or more damping coefficients (c).
  13. The method according to any one of the preceding claims, wherein the physical vibration model (116) comprises at least one fixed node (M), preferably two or more fixed nodes (M), the fixed nodes (M) being configured to have zero displacement when the sound output (112) is generated.
  14. The method according to any one of the preceding claims, wherein the region of interest (R) is segmented into segments (S) when or before mapping data features to parameters of the physical vibration model (116), wherein preferably at least one data feature of each segment (S) is mapped to at least one parameter of the physical vibration model (116).
  15. The method according to claim 14, wherein a first one of the segments (S) starts at the occurrence of a unique structure, and/or wherein the first one of the segments (S) at least partially comprises a unique structure, preferably an ERM region of the biomedical image data.
  16. The method according to claim 14 or 15, wherein the region of interest (R) is segmented according to a Fibonacci sequence or an exponential sequence.
  17. The method according to any one of claims 14 to 16, wherein each segment (S) is represented by one or more nodes (N) of the physical vibration model (116), preferably by two nodes (N) of the physical vibration model (116).
  18. The method according to any one of claims 14 to 17, wherein, when mapping the data features to the parameters of the physical vibration model (116), a maximum value (V) of the data feature of each segment (S) is mapped to a parameter of the physical vibration model (116).
  19. The method of any one of the preceding claims, wherein generating the sound output (112) of the acoustic model comprises concatenating acoustic particles (114), each acoustic particle (114) corresponding to a different physical vibration model (116) of a respective different region of interest (R).
  20. A computer program comprising executable instructions which, when executed by a processor, cause the processor to carry out the method of any one of the preceding claims.
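The claims above describe a chain of coupled nodes with masses, spring constants, and damping coefficients derived from image features (claims 8 and 12), excited by an external force (claim 9), with fixed end nodes (claim 13), a listening node (claim 11), Fibonacci segmentation (claim 16), and per-segment maxima as features (claim 18). The following Python sketch illustrates how such a pipeline could be wired together; the linear feature-to-parameter mapping, all numeric ranges, the impulse excitation, and the synthetic A-scan are illustrative assumptions, not values or methods taken from the patent.

```python
import numpy as np

def fibonacci_segments(length, n_segments):
    """Split [0, length) into segments whose lengths follow a Fibonacci
    sequence (cf. claim 16); returns the segment boundary indices."""
    fib = [1.0, 1.0]
    while len(fib) < n_segments:
        fib.append(fib[-1] + fib[-2])
    fib = np.array(fib[:n_segments])
    bounds = np.rint(np.cumsum(fib / fib.sum()) * length).astype(int)
    return np.concatenate([[0], bounds])

def features_to_params(values, m_range=(1e-4, 1e-3), k_range=(2e2, 2e4)):
    """Map normalised per-segment feature values to node masses and spring
    constants (cf. claim 12) -- an assumed linear mapping for illustration."""
    v = np.clip(np.asarray(values, dtype=float), 0.0, 1.0)
    masses = m_range[0] + v * (m_range[1] - m_range[0])
    springs = k_range[0] + v * (k_range[1] - k_range[0])
    return masses, springs

def sonify(masses, springs, damping=2.0, impulse=1.0,
           listen_node=-1, sr=44100, duration=0.25):
    """Excite a fixed-fixed mass-spring-damper chain with an impulse (an
    external force F in the sense of claim 9) and return the displacement
    of the listening node (claim 11) at audio rate, integrated with a
    semi-implicit Euler scheme."""
    n = len(masses)
    k = np.concatenate([springs, springs[-1:]])  # n + 1 springs
    x = np.zeros(n + 2)   # x[0] and x[-1] are fixed nodes (claim 13)
    v = np.zeros(n + 2)
    v[1] = impulse / masses[0]  # impulse applied to the first movable node
    dt = 1.0 / sr
    out = np.empty(int(sr * duration))
    for t in range(out.size):
        # net force on each movable node from its two neighbouring springs,
        # minus viscous damping
        f = (k[:-1] * (x[:-2] - x[1:-1])
             + k[1:] * (x[2:] - x[1:-1])
             - damping * v[1:-1])
        v[1:-1] += dt * f / masses
        x[1:-1] += dt * v[1:-1]
        out[t] = x[1:-1][listen_node]
    return out

# Per-segment maximum of a synthetic stand-in A-scan (cf. claim 18),
# normalised and mapped to model parameters, then rendered to audio.
a_scan = np.abs(np.sin(np.linspace(0, 8, 256)))
seg = fibonacci_segments(a_scan.size, 6)
feats = np.array([a_scan[a:b].max() for a, b in zip(seg[:-1], seg[1:])])
masses, springs = features_to_params(feats / feats.max())
audio = sonify(masses, springs)
```

With the parameter ranges above, the chain rings in the audible range and the damping term makes the impulse response decay within a fraction of a second, so each region of interest yields a short "acoustic particle" that could be concatenated with others as in claim 19.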

Description

Data-to-sound interactive feedback

Technical Field

The invention belongs to the field of data analysis and particularly relates to a data-to-sound interactive feedback method. The method is implemented by a method for generating a sound output from biomedical image data, by a related computer program, and by a related system. The method may also be used to train a machine learning algorithm.

Background

Interpretation and analysis of data sets by human users is typically limited to visual perception, whether by direct visual inspection of the data or by visualization of the data on a graphical representation device (e.g., a screen). This places a limitation on the user: visual perception cannot be used for other purposes while analyzing the (visualized) data. In some fields, such as medicine, this is a major limitation. For example, during a surgical treatment, a surgeon may have to choose between focusing his or her visual perception on a data representation and/or data interpretation device (e.g., a screen displaying scanner or sensor results) or on the body of the patient receiving the surgical treatment.

For example, in retinal surgery, delicate anatomy needs to be manipulated with micron-scale accuracy, while a surgical microscope typically provides the primary visual access to the surgical site at the back of the eye. An example of such complex surgery is the dissection of the epiretinal membrane (ERM), a thin film, on average 60 μm thick, that forms on top of the retina and may lead to reduced central vision if left untreated. A synthetic dye is typically applied to the retina prior to peeling the membrane with microsurgical forceps. This dye stains the inner limiting membrane (ILM) and enables the surgeon to visually distinguish the ERM located on top of it. A challenging but critical aspect of the subsequent procedure is identifying the appropriate area at which to grasp the membrane and initiate the dissection.
Although the ERM is usually attached to the retina, a gap as small as 45±8 μm may occur between the ERM and the retina, especially in patients where a larger ERM hardens and pulls on the retina. In most cases, it is not possible to accurately visualize and quantify this subtle elevation of the ERM in the microscope view. However, optical coherence tomography (OCT) has been demonstrated to provide the required micron-scale resolution. In conjunction with surgical microscopy, intraoperative OCT (iOCT) has shown benefits for surgical decisions and surgical outcomes in ERM dissection surgery.

However, a fundamental problem with iOCT in combination with surgical microscopes is finding an appropriate channel for the simultaneous presentation of multimodal information. In currently available systems, the microscope view and the iOCT images are typically presented side by side. However, moving the line of sight between different visual monitors during such subtle tasks increases the cognitive load of the surgeon and affects their proprioception, potentially interfering with the surgical procedure and slowing it down. Thus, there is a need for a more advanced and intuitive way of providing multimodal information without distraction during critical surgical phases.

Disclosure of Invention

According to the invention, this problem is overcome with a method according to claim 1, a computer program according to claim 20, and a system according to claim 21. The dependent claims relate to particularly preferred embodiments of the invention.
A first aspect of the invention relates to a method for representing biomedical image data, the method comprising the steps of: providing biomedical image data; assigning a region of interest of the biomedical image data to a physical vibration model of an acoustic model; mapping data features of the region of interest to parameters of the physical vibration model; and generating a sound output of the acoustic model based on vibrations of the physical vibration model.

The invention allows information about biomedical image data, in particular data features, to be presented as an audible sound output. Thus, a user, such as a surgeon, does not need to look at a display to receive the information. As a result, cognitive load is reduced and user decisions are improved.

The biomedical image data may include one or more biomedical images. The biomedical image data may be or may include one-dimensional, two-dimensional, three-dimensional, and/or higher-dimensional data. In one or more preferred examples, the biomedical image data includes one or more of OCT scans, CT scans, PET scans, SPECT scans, and MRI scans. The OCT scan may be or may include an OCT B-scan and/or an OCT A-scan. In one or more specific examples, the biomedical image data is or includes an OCT B-scan and/or the region of interest is or includes an OCT A-scan. Providing biomedical image data may include capturing and/or obtaining one or more biomedical images. In some embodime