
US-12626612-B2 - System and method of tactile based display (image) adaptation of videoconference proceedings

US 12626612 B2

Abstract

A conferencing system and method detects one or more two-dimensional (2D) images in presented 2D data and/or objects shown during a videoconference and renders them on a participant's three-dimensional (3D) tactile screen. The system includes a conferencing server that is in communication with one or more participant devices. An image recognition engine is in communication with the conferencing server and identifies the one or more 2D images. An imaging processor is in communication with the image recognition engine, and may be in communication with a camera that recognizes objects shown during the videoconference. The imaging processor creates a 3D rendering of the 2D images and/or the objects and directly or indirectly sends them to the participant's 3D tactile screen.

Inventors

  • Birendra Kumar Sahu
  • Logendra Naidoo

Assignees

  • MITEL NETWORKS CORPORATION

Dates

Publication Date
2026-05-12
Application Date
2023-08-07

Claims (19)

  1. A conferencing system for detecting one or more two-dimensional (2D) images in conference data and rendering the images on a three-dimensional (3D) tactile screen, wherein the conferencing system comprises: a conferencing server that is in communication with one or more participant devices and that is configured to (a) play a 2D conference that includes 2D data comprising the one or more 2D images, and (b) make the 2D data available to one or more participant devices; an image recognition engine in communication with, or that is part of, the conferencing server and that electronically identifies the one or more 2D images and their position in relation to text in the 2D data; an imaging processor in communication with the conferencing server and with the image recognition engine, and that includes imaging software configured to create a 3D rendering of the one or more 2D images contained in the 2D data; an image description engine in communication with the imaging processor and/or the conferencing server, wherein the image description engine is programmed to recognize each of the one or more 2D images and generate a tactile description of each; and a tactile 3D screen in communication with the imaging processor and the image description engine, wherein the tactile 3D screen is configured to receive the 3D rendering generated by the imaging processor, receive the tactile description generated by the image description engine, project the 3D rendering and the tactile description so they can be touched and perceived by a conference participant, and provide greater definition of the 3D rendering in response to touch commands by the conference participant on the tactile 3D screen of the conference participant device.
  2. The conferencing system of claim 1 that further includes a video camera in communication with the imaging processor, wherein the video camera is configured to detect an object within a defined video conferencing field and communicate an image of the object to the imaging processor, wherein the imaging processor creates a 3D rendering of the object image, and the 3D rendering of the object image is transmitted to the tactile 3D screen so that it can be touched and perceived by the conference participant.
  3. The conferencing system of claim 2, wherein the 3D rendering of the object is transmitted to the image description engine, which is configured to recognize the object, generate a tactile description of it, and transmit the tactile description to the tactile 3D screen of the participant device so that the tactile description can be touched and perceived by the conference participant.
  4. The conferencing system of claim 1, wherein the tactile description is interposed within a Braille rendering of text presented in the 2D data.
  5. The conferencing system of claim 2, wherein the object is recognized by the imaging processor using AutoML Vision Object Detection.
  6. The conferencing system of claim 1 that further includes a text-to-Braille processor that converts text in the 2D data to Braille, and the Braille is displayed on the tactile 3D screen so that the conference participant can touch and perceive it.
  7. The conferencing system of claim 6, wherein the tactile 3D screen includes a refreshable Braille display.
  8. The conferencing system of claim 1 that further includes a database to store the 2D data and computerized speech of the 2D data text.
  9. The conferencing system of claim 1 that further includes a scanner to scan a 2D image and transfer the scanned 2D image to the imaging processor, which creates a 3D rendering of the scanned 2D image and is configured to transmit the 3D rendering to the tactile 3D screen where it can be touched and perceived by the conference participant.
  10. A computerized method for detecting one or more 2D images in 2D data and/or one or more 3D objects, converting each of the one or more 2D images and the one or more 3D objects to 3D renderings, and transmitting the 3D renderings to a 3D tactile screen, wherein the method comprises the steps of: a conferencing server operating a 2D conference file that includes 2D data, wherein the 2D data comprises the one or more 2D images therein, and the conferencing server making the 2D data available to one or more conference devices, wherein each conference device is unique to a particular conference participant; utilizing an imaging processor in communication with the conferencing server, creating a 3D rendering of at least one of the one or more 2D images embedded in the 2D data, and transmitting the 3D rendering to a tactile 3D screen; utilizing an image description engine in communication with the imaging processor, (a) recognizing the 3D rendering or the at least one of the one or more 2D images, (b) generating a tactile image description of the at least one of the one or more 2D images or the 3D rendering, and (c) transmitting the tactile image description to the tactile 3D screen; and utilizing the tactile 3D screen in communication with the imaging processor and the image description engine, receiving the 3D rendering from the imaging processor and receiving the tactile image description from the image description engine, projecting them so they can be touched and perceived by the particular conference participant, and providing greater definition of the 3D rendering in response to touch commands by the conference participant on the tactile 3D screen of the conference participant device.
  11. The method of claim 10, wherein the image description engine includes image caption generator software that analyzes the 2D image or the 3D rendering and generates a tactile image description of it.
  12. The method of claim 10 that further includes the step of the tactile 3D screen refreshing when a new slide of 2D data is presented.
  13. The method of claim 10, wherein (a) a first graphical user interface on the participant device for the unique conference participant includes a control that permits the unique conference participant to choose the duration for which the 3D rendering and the tactile image description remain on the tactile 3D screen, and (b) a second graphical user interface on the conferencing server includes a second control that permits a conference host to choose the duration for which the 3D rendering and the tactile image description remain on the tactile 3D screen.
  14. The method of claim 10, wherein the one or more 3D objects are recognized by a video camera within a defined video conferencing field, and that further includes the step of a conference host adjusting the video conferencing field to eliminate extraneous objects, wherein the step of adjusting is based on the location of the one or more 3D objects, the size of the one or more 3D objects, and/or the movement of the one or more 3D objects.
  15. The method of claim 10, wherein the participant device includes a GUI configured to permit the unique conference participant to provide comments via Braille.
  16. The method of claim 15, wherein the unique conference participant can also input commands via a Braille device utilizing the GUI of the participant device.
  17. A conferencing system for detecting one or more 2D images in 2D data and rendering the images on a three-dimensional (3D) tactile screen, wherein the conferencing system comprises: a conferencing server that is in communication with one or more participant devices and that is configured to (a) play a 2D videoconference that includes the 2D data comprising the one or more 2D images, and (b) make the 2D data available to one or more participant devices; an imaging processor in communication with the conferencing server and that includes imaging software configured to create a 3D rendering of the one or more 2D images contained in the 2D data; an image description engine in communication with the imaging processor and/or the conferencing server, wherein the image description engine is programmed to recognize each of the one or more 2D images and generate a tactile description of each; a tactile 3D screen in communication with the imaging processor and the image description engine, wherein the tactile 3D screen is configured to receive the 3D rendering from the imaging processor and receive the tactile description from the image description engine and project them so they can be touched and perceived by a conference participant; a conference device having a GUI configured to permit a unique conference participant to provide comments via Braille and input commands via Braille; and one or more other conference devices each unique to one or more other conference participants, wherein the comments or commands of the unique conference participant create a vibratory motion on the one or more other conference devices so as to be recognized by the one or more other conference participants.
  18. The conferencing system of claim 17, wherein the comment or command includes the name of the unique conference participant who sent it.
  19. The conferencing system of claim 17, wherein emphasized portions of the conference text are presented with a vibratory motion on the one or more other conference devices.

Description

BACKGROUND

Most available collaboration tools support video conferencing, and with widely available work-from-home options, video conferencing usage has increased. There are a variety of collaboration tools that can be used by visually-impaired people (which includes people who have low vision). These include audio-conferencing tools, text-to-speech software, text-based communication tools, screen reader software, and online collaboration platforms. However, none of these tools has a feature that assists visually-impaired people in understanding and interacting with visual images or objects (as opposed to supplied braille text). There are existing tools that can read text displayed on a screen and convert it to braille, but this is limited to text and does not help in detecting objects or two-dimensional (2D) images in a videoconference and/or within a defined videoconferencing field and converting them to a three-dimensional (3D) rendering. Including such a feature in a videoconferencing tool would provide visually-impaired people an expanded perception of the videoconference.

Some collaboration tools are available for visually-impaired people, and examples include:

(1) Audio conferencing tools, which allow users to participate in audio-only conference calls. These include Zoom, Skype, and Google Meet.
(2) Text-to-speech software, which converts written text into spoken words and can be helpful for people who are visually impaired when reading written documents. Examples include NaturalReader and TextAloud.
(3) Text-based communication tools such as Slack and Microsoft Teams, which allow users to communicate with each other using text-based messages. This can be easier for people who are visually impaired as compared to video conferencing tools.
(4) Screen reader software, which speaks the contents of a computer screen to users who are visually impaired. Examples include JAWS and NVDA.
(5) Online collaboration platforms such as Google Docs and Microsoft Office 365, which allow multiple users to work on the same document or spreadsheet in real time. This can be useful for people who are visually impaired and must collaborate with others.

SUMMARY

Systems and methods according to this disclosure provide, along with converting the text displayed on a videoconference screen to braille for visually-impaired participants, the detection of 2D images, and optionally objects (which may be 3D), and the display (or rendering) of them on a tactile 3D screen of the visually-impaired participant. Such a system and method can also generate a caption/description of the 2D image or of the object, wherein the caption/description is displayed in braille. The systems and methods herein thus involve turning the imagery of a videoconference into tactile 3D objects, wherein a visually-impaired person can feel the shapes of things such as faces, 2D images, and/or objects (which may be 3D). The systems and methods of this disclosure provide automated recognition and identification of a 2D image or of an object that is being displayed, a 3D rendering of it, and optionally an identification of the 3D rendering in braille. This 2D image/object information can be analyzed and detected using object identification (e.g., AutoML Vision Object Detection) and then represented in braille on a participant device. A method/system according to this disclosure may also convert videoconference text to braille on a participant's device.

BRIEF DESCRIPTION OF THE DRAWINGS

The subject matter of the present disclosure is particularly pointed out and distinctly claimed in the concluding portion of this specification.
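The summary above describes rendering image captions/descriptions in braille on a participant device. As a minimal illustrative sketch of that one step (not taken from this patent; the mapping table and function names are assumptions, and a real system would use a certified braille translator and drive a refreshable tactile display), uncontracted Grade 1 braille letters can be represented as Unicode braille cells:

```python
# Illustrative sketch only: convert a caption to uncontracted (Grade 1)
# braille, represented as Unicode braille cells (U+2800 block). Dots 1-8 of a
# braille cell map to bits 0-7 of the code point offset from U+2800.
_DOTS = {
    "a": 0x01, "b": 0x03, "c": 0x09, "d": 0x19, "e": 0x11, "f": 0x0B,
    "g": 0x1B, "h": 0x13, "i": 0x0A, "j": 0x1A, "k": 0x05, "l": 0x07,
    "m": 0x0D, "n": 0x1D, "o": 0x15, "p": 0x0F, "q": 0x1F, "r": 0x17,
    "s": 0x0E, "t": 0x1E, "u": 0x25, "v": 0x27, "w": 0x3A, "x": 0x2D,
    "y": 0x3D, "z": 0x35, " ": 0x00,
}

def to_braille(text: str) -> str:
    """Convert ASCII letters and spaces to Unicode braille cells."""
    return "".join(chr(0x2800 + _DOTS.get(ch, 0)) for ch in text.lower())

def tactile_caption(image_label: str) -> str:
    """Render a detected image's label as a braille caption string.

    The label would come from an image description engine (e.g., an image
    caption generator); the "image" prefix here is a hypothetical convention.
    """
    return to_braille(f"image {image_label}")
```

For example, `to_braille("cab")` yields the three cells for c, a, and b; the resulting string would then be sent to the tactile 3D screen alongside the 3D rendering it describes.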
A more complete understanding of the present disclosure, however, may best be obtained by referring to the detailed description and claims when considered in connection with the drawing figures, wherein like numerals denote like elements and wherein:

FIG. 1 is an exemplary system according to aspects of this disclosure.
FIG. 2 is an exemplary application of the system of FIG. 1 converting a 2D image into a 3D rendering on a 3D screen.
FIG. 3 is an example of an application of the system of FIG. 1 providing a Braille description of an image.
FIG. 4 illustrates participants in a videoconference.
FIG. 5 illustrates a 3D rendering of the participants of FIG. 4.
FIG. 6 illustrates a method according to aspects of this disclosure.
FIG. 7 illustrates a method according to aspects of this disclosure.

It will be appreciated that elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions of some of the elements in the figures may be exaggerated relative to other elements to help to improve understanding of illustrated embodiments of the present invention.

DETAILED DESCRIPTION

Visually-impaired users can join a videoconference via their preferred collaboration tool, e.g., MiTeam Meetings, Zoom, Microsoft Teams, et