EP-4248647-B1 - PROCESSING AND DISTRIBUTION OF AUDIO SIGNALS IN A MULTI-PARTY CONFERENCING ENVIRONMENT

Inventors

MALAN, D. HUGO

Dates

Publication Date: 20260506
Application Date: 20211110

Claims (15)

A method for distributing audio signals among a plurality of communication devices (108) that respectively correspond to a plurality of users of a team (102a, 102b, 102c), the method comprising: facilitating, by a telephony controller (200), a one-to-one call over a communication network, between each communication device of the team and a corresponding remote communication device (114) of a remote person (106a, 106b, 106c) who is not part of the team; distributing, by the telephony controller, to each communication device of the team, the outbound communication signal of each of the other communication devices of the team, in addition to the inbound communication signal of the corresponding remote communication device in one-to-one call with the communication device of the team, while isolating from the communication device of the team the inbound communication signals of the other remote communication devices in one-to-one call with the other communication devices of the team; applying, by the telephony controller, for each communication device of the team, audio processing involving three-dimensional, 3D, speech localization to each outbound communication signal distributed to the communication device of the team, thus enabling each user of the team to readily differentiate the outbound communication signal of each of the other users of the team; during an audio connection between a first user (102a) of the plurality of users and a remote person (106a) via the telephony controller, receiving a first outbound signal via an audio processing module, wherein the first outbound signal encodes audio being transmitted to a remote communication device of the remote person from a first communication device of the plurality of communication devices, wherein the first communication device corresponds to the first user; receiving a first inbound signal, wherein the first inbound signal encodes audio being transmitted to the first communication device from the remote communication device; receiving a set of outbound signals from the plurality of communication devices other than the first communication device; generating a first combined signal by combining, via a team combination module, the set of outbound signals with the first inbound signal, wherein the first combined signal excludes inbound signals transmitted to the plurality of communication devices other than the first communication device; and transmitting the first combined signal to the first communication device.
The method of claim 1 further comprising forwarding the first outbound signal to the remote communication device.
The method of claim 1 further comprising: generating a second combined signal by combining the set of outbound signals excluding a second outbound signal, wherein the second outbound signal encodes audio from a second communication device corresponding to a second user; and transmitting the second combined signal to the second communication device.
The method of claim 1 wherein generating the first combined signal includes combining the set of outbound signals with corresponding time delays for a subset of outbound signals included in the first combined signal.
The method of claim 4 wherein the corresponding time delays prevent the set of outbound signals included in the first combined signal from overlapping.
The method of claim 4 further comprising: for each outbound signal of the set of outbound signals included in the first combined signal, adjusting a volume of the outbound signal based on the first inbound signal.
The method of claim 6 wherein adjusting the volume of each outbound signal of the set of outbound signals includes implementing a machine learning algorithm to normalize each outbound signal of the set of outbound signals included in the first combined signal.
The method of claim 1 further comprising: transmitting the first outbound signal to a set of remote communication devices.
The method of claim 1 wherein the first communication device includes: binaural headphones for receiving the first combined signal, and a microphone for transmitting the first outbound signal.
A system for distributing audio signals among a plurality of communication devices that respectively correspond to a plurality of users, the system comprising: at least one processor; and a memory coupled to the at least one processor, wherein the memory stores instructions for execution by the at least one processor, and wherein the instructions include: facilitating, by a telephony controller (200), a one-to-one call over a communication network, between each communication device of the team (102a, 102b, 102c) and a corresponding remote communication device of a remote person (106a, 106b, 106c) who is not part of the team; distributing, by the telephony controller, to each communication device of the team, the outbound communication signal of each of the other communication devices of the team, in addition to the inbound communication signal of the corresponding remote communication device in one-to-one call with the communication device of the team, while isolating from the communication device of the team the inbound communication signals of the other remote communication devices in one-to-one call with the other communication devices of the team; applying, by the telephony controller, for each communication device of the team, audio processing involving three-dimensional, 3D, speech localization to each outbound communication signal distributed to the communication device of the team, thus enabling each user of the team to readily differentiate the outbound communication signal of each of the other users of the team; during an audio connection between a first user (102a) of the plurality of users and a remote person (106a) via the telephony controller, receiving a first outbound signal via an audio processing module, wherein the first outbound signal encodes audio being transmitted to the remote person from a first communication device (108) corresponding to the first user; receiving a first inbound signal, wherein the first inbound signal encodes audio being transmitted to the first user from a remote communication device (114) of the remote person; receiving a set of outbound signals from the plurality of communication devices other than the first communication device; generating a first combined signal, via a team combination module, by combining the set of outbound signals with the first inbound signal, wherein the first combined signal excludes inbound signals transmitted to the plurality of communication devices other than the first communication device; and transmitting the first combined signal to the first communication device.
The system of claim 10 wherein the instructions include: transmitting the first outbound signal to the remote communication device corresponding to the remote person.
The system of claim 10 wherein the instructions include: generating a second combined signal by combining the set of outbound signals excluding a second outbound signal, wherein the second outbound signal encodes audio from a second communication device corresponding to a second user, and transmitting the second combined signal to the second communication device.
The system of claim 10 wherein generating the first combined signal includes combining the set of outbound signals with corresponding time delays for a subset of outbound signals included in the first combined signal.
The system of claim 10, wherein the first combined signal excludes inbound signals transmitted to the plurality of communication devices other than the first communication device.
The system of claim 10, wherein the first combined signal includes no one of the inbound signals transmitted to the plurality of communication devices other than the first communication device.
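
The signal routing recited in claims 1 and 4 can be illustrated with a minimal sketch. It is not the patented implementation: the frame representation (lists of float samples), the function names `mix`, `delay`, and `combined_signals`, and the per-user delay table are all assumptions made for illustration, and the 3D speech localization step is deliberately left out. The sketch only shows which signals reach which team member: each user's combined signal contains the outbound audio of the other team members plus that user's own inbound audio, and never the inbound audio of another member's remote party.

```python
from typing import Dict, List, Optional

Signal = List[float]  # one frame of mono samples (assumed representation)


def mix(signals: List[Signal]) -> Signal:
    """Sum equal-length frames sample by sample; no input gives an empty frame."""
    if not signals:
        return []
    return [sum(samples) for samples in zip(*signals)]


def delay(signal: Signal, n: int) -> Signal:
    """Shift a frame later by n samples, padding with silence and keeping its length."""
    return ([0.0] * n + signal)[: len(signal)]


def combined_signals(
    outbound: Dict[str, Signal],
    inbound: Dict[str, Signal],
    delays: Optional[Dict[str, int]] = None,
) -> Dict[str, Signal]:
    """Build one combined frame per team member.

    outbound[u]: audio user u sends to their remote party.
    inbound[u]:  audio user u receives from their remote party.
    delays[u]:   hypothetical delay, in samples, applied to u's outbound
                 audio when it is mixed for teammates (cf. claim 4).
    """
    delays = delays or {}
    result: Dict[str, Signal] = {}
    for user in outbound:
        # Teammates' outbound audio only; the user's own voice is excluded.
        teammates = [
            delay(sig, delays.get(other, 0))
            for other, sig in outbound.items()
            if other != user
        ]
        # Only this user's own inbound signal is added, so the remote
        # parties of other team members stay isolated.
        result[user] = mix(teammates + [inbound[user]])
    return result


if __name__ == "__main__":
    out = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [4.0, 4.0]}
    inb = {"a": [10.0, 10.0], "b": [20.0, 20.0], "c": [30.0, 30.0]}
    print(combined_signals(out, inb))
```

In this sketch, a real controller would additionally spatialize each teammate's signal before mixing and would operate on streaming audio rather than fixed frames; both are outside what the example attempts to show.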

Description

CROSS-REFERENCE TO RELATED APPLICATION

This application is a PCT International Application of U.S. Patent Application No. 17/453,949, filed November 8, 2021. This application claims the benefit and priority of U.S. Provisional Application No. 63/115,596, filed November 18, 2020.

FIELD

The present disclosure relates to teleconference systems and more particularly to telephony systems to process and distribute audio signals in a multi-party conferencing environment.

BACKGROUND

In a physical office space for a business (e.g., a call center, etc.), employees of the business who work at the office (e.g., staffing recruiters, salespeople, etc.) often benefit from overhearing conversations among their colleagues at the office, as well as one side of the conversations their colleagues are having (e.g., via phone, etc.) with individuals external to the business (e.g., potential recruits, potential clients, etc.). However, when employees work virtually, they lose these important elements of working in the office with their colleagues, including overhearing their colleagues talk.

US 2008/159507 A1 discloses a distributed teleconference multichannel architecture.

In a distributed call center, one or more employees may work remotely (for example, from home), such that they are physically distanced from other colleagues. The inability to hear conversations among their colleagues and between their colleagues and individuals external to the business can slow mentoring, create friction in spreading information among employees, and prevent beneficial discoveries arising from overheard conversations.

For example, a salesperson at the call center might overhear a recruiter stationed nearby at the call center talking to a candidate about the candidate's skills and realize one of the recruiter's clients is looking for these skills. Or, a recruiter at the call center might overhear a salesperson stationed nearby at the call center talking to a client about the client's requirements and realize, based on what the salesperson is saying to the client, that the recruiter recently spoke to a perfect candidate for the client's requirements. Or, in a more indirect fashion, a junior recruiter might overhear what a senior recruiter is saying to potential recruits and learn from the senior recruiter about how to manage a complex client/candidate interaction. Or, a manager might overhear what a salesperson is saying to a potential client and identify a potential coaching opportunity for the salesperson based on how the manager hears the salesperson interact with the potential client.

Conventional teleconferencing systems allow a group of colleagues to have a conference call. These systems, however, are typically only useful when the group is discussing internal matters amongst itself and are not suitable for use when one or more of the colleagues desires to separately converse with an individual outside the business. Even within a conference call, it can be difficult to discern which colleague in the group is speaking on the conference call or to otherwise focus on what a particular colleague is saying, especially as the number of colleagues participating in the conference call increases.

The background description provided here is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.
SUMMARY

A method for distributing audio signals among a plurality of communication devices that respectively correspond to a plurality of users of a team according to claim 1 includes facilitating, by a telephony controller, a one-to-one call over a communication network, between each communication device of the team and a corresponding remote communication device of a remote person who is not part of the team; distributing, by the telephony controller, to each communication device of the team, the outbound communication signal of each of the other communication devices of the team, in addition to the inbound communication signal of the corresponding remote communication device in one-to-one call with the communication device of the team, while isolating from the communication device of the team the inbound communication signals of the other remote communication devices in one-to-one call with the other communication devices of the team; applying, by the telephony controller, for each communication device of the team, audio processing involving three-dimensional, 3D, speech localization to each outbound communication signal distributed to the communication device of the team, thus enabling each user of the team to readily differentiate the outbound communication signal of each of the other users of the team; during an audio connection between a first user o