US-20260129398-A1 - METHOD AND DEVICE FOR TRANSFERRING SPEECH THROUGH VIRTUAL SPACE

US 20260129398 A1

Abstract

An electronic device is provided. The electronic device includes a processor, memory storing instructions, a display, and a speaker, wherein the instructions, when executed by the processor, cause the electronic device to receive, from another electronic device connected via communication, first acoustic data comprising voice data, obtain second acoustic data by reducing or eliminating, from the first acoustic data, an acoustic characteristic according to a physical space around the other electronic device, display a virtual object corresponding to the other electronic device through the display, identify a position and a heading direction of the other electronic device, obtain a voice output by adjusting the second acoustic data based on the identified position and the identified heading direction of the other electronic device, and reproduce the obtained voice output through the speaker.
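The first processing step in the abstract, reducing or eliminating the acoustic characteristic of the sender's physical space from the received audio, can be illustrated with a deliberately simplified model. The patent does not disclose a specific algorithm; the sketch below models the "physical space" as a single one-tap echo and removes it with the exact recursive inverse filter, purely as an assumption for illustration.

```python
def add_room_echo(dry, gain=0.5, delay=40):
    """Toy model of a 'physical space': one reflection arriving
    `delay` samples later at `gain` amplitude (illustrative only)."""
    wet = list(dry)
    for n in range(delay, len(dry)):
        wet[n] = dry[n] + gain * dry[n - delay]
    return wet


def remove_room_echo(wet, gain=0.5, delay=40):
    """Exact inverse of the one-tap echo: x[n] = y[n] - g * x[n - d].
    Real systems would instead estimate a room impulse response
    (e.g., from image data of the space, cf. claim 7) and apply a
    dereverberation method; this is only a minimal sketch."""
    dry = list(wet)
    for n in range(delay, len(wet)):
        dry[n] = wet[n] - gain * dry[n - delay]
    return dry
```

Round-tripping a signal through `add_room_echo` and then `remove_room_echo` with the same (assumed known) gain and delay recovers the original samples up to floating-point error.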

Inventors

  • Myoungwoo NAM
  • Hyunsoo Kim
  • Nagyeom YOO
  • Jeaguk SHIM
  • Yoonho LEE
  • Sunghoon Yim

Assignees

  • SAMSUNG ELECTRONICS CO., LTD.

Dates

Publication Date
2026-05-07
Application Date
2026-01-02
Priority Date
2023-07-20

Claims (20)

  1. An electronic device comprising: a processor; memory storing instructions; a display; and a speaker, wherein the instructions, when executed by the processor, cause the electronic device to: receive, from another electronic device connected via communication, first acoustic data comprising voice data, obtain second acoustic data by reducing or eliminating, from the first acoustic data, an acoustic characteristic according to a physical space around the other electronic device, display a virtual object corresponding to the other electronic device through the display, identify a position and a heading direction of the other electronic device, obtain a voice output by adjusting the second acoustic data based on the identified position and the identified heading direction of the other electronic device, and reproduce the obtained voice output through the speaker.
  2. The electronic device of claim 1, wherein the instructions, when executed by the processor, cause the electronic device to attenuate a high-pitched component of the second acoustic data based on a speaking angle between a first reference direction from the other electronic device to the electronic device and the heading direction of the other electronic device.
  3. The electronic device of claim 1, wherein the instructions, when executed by the processor, cause the electronic device to obtain the second acoustic data by preserving an acoustic characteristic of the first acoustic data, based on a space, where the electronic device and the other electronic device are located, being constructed in correspondence with the physical space around the other electronic device.
  4. The electronic device of claim 1, wherein the instructions, when executed by the processor, cause the electronic device to generate a voice output having an acoustic characteristic according to a physical space around the electronic device, based on a space, where the electronic device and the other electronic device are located, being constructed in correspondence with the physical space around the electronic device.
  5. The electronic device of claim 1, wherein the instructions, when executed by the processor, cause the electronic device to obtain the second acoustic data by eliminating, from the first acoustic data, the acoustic characteristic according to the physical space around the other electronic device, based on a space, where the electronic device and the other electronic device are located, being constructed independently from the physical space around the other electronic device.
  6. The electronic device of claim 1, wherein the instructions, when executed by the processor, cause the electronic device to: obtain, from the electronic device, third acoustic data comprising voice data; determine the acoustic characteristic according to the physical space around the electronic device; obtain fourth acoustic data by reducing or eliminating, from the third acoustic data, the acoustic characteristic according to the physical space around the electronic device; and transmit the obtained fourth acoustic data to the other electronic device.
  7. The electronic device of claim 1, wherein the instructions, when executed by the processor, cause the electronic device to determine the acoustic characteristic according to the physical space around the electronic device from image data of the physical space around the electronic device, based on obtaining third acoustic data.
  8. The electronic device of claim 1, wherein the speaker comprises: a first speaker, and a second speaker, and wherein the instructions, when executed by the processor, cause the electronic device to determine a first volume for the first speaker and a second volume for the second speaker, based on the position of the other electronic device and the heading direction of the other electronic device in a space where the electronic device and the other electronic device are located.
  9. The electronic device of claim 1, wherein the speaker comprises: a first speaker, and a second speaker, and wherein the instructions, when executed by the processor, cause the electronic device to adjust a first volume for the first speaker and a second volume for the second speaker, based on at least one of a speaking angle between a first reference direction and the heading direction of the other electronic device, or a listening angle between a second reference direction that is opposite to the first reference direction and a heading direction of the electronic device.
  10. The electronic device of claim 1, wherein the speaker comprises: a first speaker, and a second speaker, and wherein the instructions, when executed by the processor, cause the electronic device to: determine a first rotation direction of the heading direction of the other electronic device with respect to a first reference direction from the other electronic device to the electronic device to be one of a clockwise direction or a counterclockwise direction, determine a second rotation direction of the heading direction of the electronic device with respect to a second reference direction that is opposite to the first reference direction to be one of a clockwise direction or a counterclockwise direction, adjust a first volume for the first speaker and a second volume for the second speaker to have a first volume difference, based on the first rotation direction being equal to the second rotation direction, and adjust the first volume and the second volume to have a second volume difference that is less than the first volume difference, based on the first rotation direction being different from the second rotation direction.
  11. The electronic device of claim 1, wherein the instructions, when executed by the processor, cause the electronic device to adjust a volume at which the voice output is reproduced, based on at least a portion of the physical space around the electronic device being equal to at least a portion of the physical space around the other electronic device.
  12. A method performed by an electronic device, the method comprising: receiving, from another electronic device connected via communication, first acoustic data comprising voice data; obtaining second acoustic data by reducing or eliminating, from the first acoustic data, an acoustic characteristic according to a physical space around the other electronic device; displaying a virtual object corresponding to the other electronic device through a display; identifying a position and a heading direction of the other electronic device; obtaining a voice output by adjusting the second acoustic data based on the identified position and the identified heading direction of the other electronic device; and reproducing the obtained voice output through a speaker.
  13. The method of claim 12, wherein the obtaining of the voice output comprises attenuating a high-pitched component of the second acoustic data based on a speaking angle between a first reference direction from the other electronic device to the electronic device and the heading direction of the other electronic device.
  14. The method of claim 12, wherein the obtaining of the second acoustic data comprises obtaining the second acoustic data by preserving an acoustic characteristic of the first acoustic data, based on a space, where the electronic device and the other electronic device are located, being constructed in correspondence with the physical space around the other electronic device.
  15. The method of claim 12, further comprising: generating a voice output having an acoustic characteristic according to a physical space around the electronic device, based on a space, where the electronic device and the other electronic device are located, being constructed in correspondence with the physical space around the electronic device.
  16. The method of claim 12, further comprising: obtaining the second acoustic data by eliminating, from the first acoustic data, the acoustic characteristic according to the physical space around the other electronic device, based on a space, where the electronic device and the other electronic device are located, being constructed independently from the physical space around the other electronic device.
  17. The method of claim 12, further comprising: obtaining, from the electronic device, third acoustic data comprising voice data; determining the acoustic characteristic according to the physical space around the electronic device; obtaining fourth acoustic data by reducing or eliminating, from the third acoustic data, the acoustic characteristic according to the physical space around the electronic device; and transmitting the obtained fourth acoustic data to the other electronic device.
  18. The method of claim 12, further comprising: determining the acoustic characteristic according to the physical space around the electronic device from image data of the physical space around the electronic device, based on obtaining third acoustic data.
  19. One or more non-transitory computer-readable storage media storing one or more computer programs including computer-executable instructions that, when executed by one or more processors of an electronic device individually or collectively, cause the electronic device to perform operations, the operations comprising: receiving, from another electronic device connected via communication, first acoustic data comprising voice data; obtaining second acoustic data by reducing or eliminating, from the first acoustic data, an acoustic characteristic according to a physical space around the other electronic device; displaying a virtual object corresponding to the other electronic device through a display; identifying a position and a heading direction of the other electronic device; obtaining a voice output by adjusting the second acoustic data based on the identified position and the identified heading direction of the other electronic device; and reproducing the obtained voice output through a speaker.
  20. The one or more non-transitory computer-readable storage media of claim 19, the operations further comprising: attenuating a high-pitched component of the second acoustic data based on a speaking angle between a first reference direction from the other electronic device to the electronic device and the heading direction of the other electronic device.
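The directional rendering recited in claims 2 and 8-10 can be sketched in a few lines: a stereo volume split whose inter-channel difference is larger when the two devices' heading rotations go the same way (claim 10), and a high-frequency roll-off that grows with the speaking angle (claim 2). The patent does not specify numeric mappings; the 0.6/0.2 volume-difference maxima, the 90-degree normalization, and the one-pole low-pass with an 8 kHz-to-1 kHz cutoff sweep are all illustrative assumptions.

```python
import math


def stereo_volume_pair(base_volume, heading_offset_deg, same_rotation):
    """Sketch of claims 8-10: split one volume across two speakers.

    Per claim 10, the volume difference between the channels is larger
    when the remote speaker's and local listener's heading rotations
    (relative to the line between the devices) match. The constants
    0.6 / 0.2 and the 90-degree cap are assumptions, not from the patent.
    """
    max_diff = 0.6 if same_rotation else 0.2
    pan = max(-1.0, min(1.0, heading_offset_deg / 90.0))  # -1..1
    diff = max_diff * pan
    return base_volume * (1.0 + diff / 2.0), base_volume * (1.0 - diff / 2.0)


def attenuate_high_band(samples, speaking_angle_deg, sample_rate=16000):
    """Sketch of claim 2: the farther the remote device's heading turns
    away from the listener (larger speaking angle), the more high-pitched
    content is attenuated, mimicking a talker facing away.

    Implemented as a one-pole low-pass whose cutoff falls from 8 kHz
    (facing) to 1 kHz (turned 90 degrees or more); the mapping is an
    illustrative assumption.
    """
    turn = min(abs(speaking_angle_deg), 90.0) / 90.0
    cutoff_hz = 8000.0 - 7000.0 * turn
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)  # y[n] = y[n-1] + a * (x[n] - y[n-1])
        out.append(y)
    return out
```

With matching rotation directions the left/right split is wider than with opposing ones, reproducing the first-versus-second volume difference of claim 10; feeding a rapidly alternating (high-frequency) signal through `attenuate_high_band` yields less output energy at a 90-degree speaking angle than at 0 degrees.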

Description

CROSS-REFERENCE TO RELATED APPLICATION(S)

This application is a continuation application, claiming priority under 35 U.S.C. § 365(c), of International application No. PCT/KR2024/007052, filed on May 24, 2024, which is based on and claims the benefit of Korean patent application No. 10-2023-0094758, filed on Jul. 20, 2023, in the Korean Intellectual Property Office, and of Korean patent application No. 10-2023-0127047, filed on Sep. 22, 2023, in the Korean Intellectual Property Office, the disclosure of each of which is incorporated by reference herein in its entirety.

BACKGROUND

1. Field

The disclosure relates to a technology for transferring speech through a virtual space.

2. Description of Related Art

Recently, virtual reality (VR), augmented reality (AR), and mixed reality (MR) technologies based on computer graphics have been developed. Virtual-reality technology uses a computer to construct a virtual space that does not exist in the real world and makes that space feel real, while augmented-reality or mixed-reality technology adds computer-generated information to the real world; that is, it combines a virtual world with the real world and enables real-time interaction with a user. Among these technologies, AR and MR are used in conjunction with technologies in various fields (e.g., broadcast technology, medical technology, game technology, etc.). Representative examples of augmented reality in broadcasting are the smoothly changing weather map shown in front of a weather caster delivering a forecast on television (TV), and an advertisement image that does not physically exist in a stadium but is inserted into the screen of a sports broadcast as if it were real.
A representative service for providing a user with AR or MR is the "metaverse." The metaverse is a compound of 'meta,' meaning virtual or abstract, and 'universe,' meaning a world, and refers to three-dimensional virtual reality. The metaverse is a more advanced concept than a typical virtual-reality environment: it provides an augmented-reality environment that absorbs virtual worlds, such as the web and the Internet, into the real world.

The above information is presented as background information only to assist with an understanding of the disclosure. No determination has been made, and no assertion is made, as to whether any of the above might be applicable as prior art with regard to the disclosure.

SUMMARY

Aspects of the disclosure are to address at least the above-mentioned problems and/or disadvantages and to provide at least the advantages described below. Accordingly, an aspect of the disclosure is to provide a technology for transferring speech through a virtual space. Additional aspects will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the presented embodiments.

In accordance with an aspect of the disclosure, an electronic device is provided.
The electronic device includes a processor, memory storing instructions, a display, and a speaker, wherein the instructions, when executed by the processor, cause the electronic device to receive, from another electronic device connected via communication, first acoustic data including voice data, obtain second acoustic data by reducing or eliminating, from the first acoustic data, an acoustic characteristic according to a physical space around the other electronic device, display a virtual object corresponding to the other electronic device through the display, identify a position and a heading direction of the other electronic device, obtain a voice output by adjusting the second acoustic data based on the identified position and the identified heading direction of the other electronic device, and reproduce the obtained voice output through the speaker.

In accordance with another aspect of the disclosure, a method performed by an electronic device is provided. The method includes receiving, from another electronic device connected via communication, first acoustic data including voice data, obtaining second acoustic data by reducing or eliminating, from the first acoustic data, an acoustic characteristic according to a physical space around the other electronic device, displaying a virtual object corresponding to the other electronic device through a display, identifying a position and a heading direction of the other electronic device, obtaining a voice output by adjusting the second acoustic data based on the identified position and the identified heading direction of the other electronic device, and reproducing the obtained voice output through a speaker.

In accordance with another aspect of the disclosure, one or more non-transitory computer-readable storage medi