US-12621546-B2 - Rotating camera and microphone configurations

Abstract

An apparatus comprising a first part, the first part having: at least one camera configured to capture images, and at least two microphones configured to capture at least two audio signals; and a second part having at least one microphone configured to capture at least one audio signal, wherein one of the first part or the second part is configured to perform a move relative to the other part, wherein the apparatus is configured to: determine a parameter associated with the move; select at least one of: the at least two audio signals, or the at least one audio signal based, at least partially, on the determined parameter; and generate at least one output audio signal based on the parameter associated with the move and the selected at least one of: the at least two audio signals or, the at least one audio signal.

Inventors

  • Miikka Tapani Vilermo
  • Antero Tossavainen
  • Ari Juhani KOSKI
  • Matti Sakari Hamalainen

Assignees

  • NOKIA TECHNOLOGIES OY

Dates

  • Publication Date: 2026-05-05
  • Application Date: 2022-09-08
  • Priority Date: 2019-12-20

Claims (20)

  1. An apparatus comprising: a first part, the first part having: at least one camera configured to capture images, and at least one microphone configured to capture at least one audio signal; a second part having at least one respective microphone configured to capture at least one respective audio signal, wherein one of the first part or the second part is configured to perform a move relative to the other part; at least one processor; and at least one memory storing instructions that, when executed with the at least one processor, cause the apparatus at least to: determine a parameter associated with the move, wherein the parameter is configured to indicate at least one of: an angle between the first part and the second part resulting from the move, or a distance between the first part and the second part resulting from the move; select at least one of: the at least one audio signal from the first part, or the at least one audio signal from the second part based, at least partially, on the determined parameter; and generate at least one output audio signal based on the parameter associated with the move and the selected at least one of: the at least one audio signal from the first part, or the at least one audio signal from the second part.
  2. The apparatus of claim 1, wherein generating the at least one output audio signal comprises the at least one memory storing instructions that, when executed with the at least one processor, cause the apparatus to: determine one or more audio directions based on the selected at least one of: the at least one audio signal from the first part, or the at least one audio signal from the second part; modify the one or more audio directions based on the parameter; and generate the at least one output audio signal based on the one or more modified audio directions.
  3. The apparatus of claim 1, wherein the at least one audio signal from the first part and the at least one audio signal from the second part are selected in response to the determined parameter being configured to further indicate an orientation between the first part and the second part resulting from the move.
  4. The apparatus of claim 1, wherein the at least one audio signal from the first part is selected in response to the determined parameter comprising a second parameter, wherein the second parameter is configured to indicate an orientation of the at least one camera of the first part relative to the second part.
  5. The apparatus of claim 4, wherein generating the at least one output audio signal comprises the at least one memory storing instructions that, when executed with the at least one processor, cause the apparatus to: beamform the at least one selected audio signal from the first part based, at least partially, on the second parameter to generate at least one focused audio signal; and generate the at least one output audio signal based, at least partially, on the at least one focused audio signal.
  6. The apparatus of claim 1, wherein the determined parameter is associated with the at least one microphone of the first part and the at least one microphone of the second part substantially being in a same plane, wherein the at least one audio signal from the first part and the at least one audio signal from the second part are selected in response to the determined parameter.
  7. The apparatus of claim 1, wherein the at least one audio signal from the first part and the at least one audio signal from the second part are selected, wherein the at least one generated output audio signal comprises at least one spatial audio signal.
  8. The apparatus of claim 1, wherein generating the at least one output audio signal comprises the at least one memory storing instructions that, when executed with the at least one processor, cause the apparatus to: select a direction analysis method based, at least partially, on the determined parameter; determine one or more audio directions using the selected direction analysis method; and generate the at least one output audio signal based on the one or more determined audio directions.
  9. The apparatus of claim 8, wherein the selected direction analysis method comprises a first direction analysis method in response to the determined parameter, comprising at least an orientation between the first part and the second part resulting from the move, being in at least one predetermined range of orientations, wherein the selected direction analysis method comprises a second direction analysis method in response to the determined parameter, comprising at least the orientation, being outside the at least one predetermined range of orientations, wherein the first direction analysis method is at least partially different from the second direction analysis method.
  10. The apparatus of claim 1, wherein the first part or the second part is configured to move relative to a common reference point, wherein the first part and the second part are at least partially physically connected to each other.
  11. The apparatus of claim 1, wherein the move is at least one of: a rotation about an axis in common between the first part and the second part; a pitch and/or yaw and/or roll movement between the first part and the second part; a movement of the first part relative to the second part; or a movement of the second part relative to the first part.
  12. The apparatus of claim 1, wherein the at least one microphone of the second part comprises at least three microphones arranged with respect to the second part, wherein generating the at least one output audio signal based on the parameter associated with the move and the selected at least one of: the at least one audio signal of the first part, or the at least one audio signal of the second part comprises the at least one memory storing instructions that, when executed with the at least one processor, cause the apparatus to: obtain a parameter defining an arrangement of the at least three microphones; obtain a parameter defining an orientation of the apparatus; and generate the at least one output audio signal further based on the parameter defining the arrangement of the at least three microphones and the parameter defining the orientation of the apparatus.
  13. The apparatus of claim 12, wherein generating the at least one output audio signal further based on the parameter defining the arrangement of the at least three microphones and the parameter defining the orientation of the apparatus comprises the at least one memory storing instructions that, when executed with the at least one processor, cause the apparatus to: generate the at least one output audio signal for at least one frequency band based on the parameter defining the arrangement of the at least three microphones and the parameter defining the orientation of the apparatus.
  14. The apparatus of claim 1, wherein the at least one output audio signal comprises at least one of: at least one spatial audio signal; at least one non-spatial audio signal; a mono audio signal; a beamformed audio signal; or a shotgun audio signal.
  15. A method comprising: providing an apparatus, the apparatus comprising: a first part, the first part having: at least one camera configured to capture images, and at least one microphone configured to capture at least one audio signal; a second part having at least one respective microphone configured to capture at least one respective audio signal, wherein one of the first part or the second part is configured to perform a move relative to the other part; determining a parameter associated with the move, wherein the parameter is configured to indicate at least one of: an angle between the first part and the second part resulting from the move, or a distance between the first part and the second part resulting from the move; selecting at least one of: the at least one audio signal from the first part, or the at least one audio signal from the second part based, at least partially, on the determined parameter; and generating at least one output audio signal based on the parameter associated with the move and the selected at least one of: the at least one audio signal from the first part, or the at least one audio signal from the second part.
  16. The method of claim 15, wherein the generating of the at least one output audio signal comprises: determining one or more audio directions based on the selected at least one of: the at least one audio signal from the first part, or the at least one audio signal from the second part; modifying the one or more audio directions based on the parameter; and generating the at least one output audio signal based on the one or more modified audio directions.
  17. The method of claim 15, wherein the at least one audio signal from the first part and the at least one audio signal from the second part are selected in response to the determined parameter being configured to further indicate an orientation between the first part and the second part resulting from the move.
  18. The method of claim 15, wherein the at least one audio signal from the first part is selected in response to the determined parameter comprising a second parameter, wherein the second parameter is configured to indicate an orientation of the at least one camera of the first part relative to the second part.
  19. The method of claim 15, wherein the generating of the at least one output audio signal comprises: selecting a direction analysis method based, at least partially, on the determined parameter; determining one or more audio directions using the selected direction analysis method; and generating the at least one output audio signal based on the one or more determined audio directions.
  20. A non-transitory computer-readable medium comprising instructions stored thereon for performing at least the following: providing an apparatus, the apparatus comprising: a first part, the first part having: at least one camera configured to capture images, and at least one microphone configured to capture at least one audio signal; a second part having at least one respective microphone configured to capture at least one respective audio signal, wherein one of the first part or the second part is configured to perform a move relative to the other part; determining a parameter associated with the move, wherein the parameter is configured to indicate at least one of: an angle between the first part and the second part resulting from the move, or a distance between the first part and the second part resulting from the move; selecting at least one of: the at least one audio signal from the first part, or the at least one audio signal from the second part based, at least partially, on the determined parameter; and causing generation of at least one output audio signal based on the parameter associated with the move and the selected at least one of: the at least one audio signal from the first part, or the at least one audio signal from the second part.
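
Illustrative sketch (not part of the claims): the selection logic recited in claims 1, 6, 8 and 9 can be pictured as a small decision function that maps the determined angle to a set of microphone signals and a direction analysis method. The claims do not specify any numeric range, method names, or data structure; the `CaptureConfig` type, the 170 to 190 degree "coplanar" range, and the method labels below are all hypothetical choices made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureConfig:
    """Hypothetical container for the outcome of the selection step."""
    use_first_part_mics: bool
    use_second_part_mics: bool
    direction_method: str  # hypothetical label, not a term from the claims


def select_capture_config(angle_deg, coplanar_range=(170.0, 190.0)):
    """Pick signals and an analysis method from the angle between the parts.

    Assumption: when the angle lies inside `coplanar_range` the microphones
    of both parts are roughly in one plane, so both sets of signals are used
    with one analysis method; outside the range only the first (camera)
    part's microphones are used with a different method.
    """
    low, high = coplanar_range
    if low <= angle_deg <= high:
        return CaptureConfig(True, True, "joint_array")
    return CaptureConfig(True, False, "first_part_only")


if __name__ == "__main__":
    for angle in (90.0, 180.0):
        print(angle, select_capture_config(angle))
```

Keeping the predetermined range as a parameter rather than a constant reflects that claim 9 speaks of "at least one predetermined range of orientations" without fixing its bounds.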

Description

RELATED APPLICATION

This application is a continuation of U.S. patent application Ser. No. 17/121,925, filed Dec. 15, 2020, which is hereby incorporated by reference in its entirety, and claims priority to GB 1919060.2 filed Dec. 20, 2019.

FIELD

The present application relates to apparatus and methods for rotating camera and microphone configurations, but not exclusively for rotating camera and microphone configurations within spatial audio capture apparatus.

BACKGROUND

Spatial audio capture is a rapidly developing field of investigation. Conventionally a capture device has a microphone configuration which is fixed relative to the camera. In such configurations the spatial relationship between the camera or cameras and the microphones is fixed, and aligning the spatial audio signal and video images is a simple operation.

For example, spatial audio from which audio directions in a plane can be determined can be captured using a device comprising 3 microphones, and spatial audio from which audio directions in all directions can be determined can be captured using a device comprising 4 microphones.

Audio directions can typically be analysed based on level and phase/time differences of microphone signals. The physical configuration affects audio signals coming from different directions differently, and different microphone locations cause sound from different directions to arrive at the microphones at different times. The differences in arrival time, TDOA (Time Difference of Arrival), can be used to determine directions using known methods. With the fixed distances and locations of the microphones relative to the camera these directions can be aligned with the camera direction in a simple manner.

There may be in some situations a capture device which has the ability to move or rotate the camera relative to the microphones.
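
As a minimal sketch of the time-difference-of-arrival analysis mentioned in the background, the following example estimates the lag between two microphone signals by brute-force cross-correlation and converts it to a far-field arrival angle for a single microphone pair. The document only refers to "known methods"; the function names, the 343 m/s speed of sound, the 48 kHz sample rate and the 5 cm spacing are assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C


def estimate_lag_samples(x, y, max_lag):
    """Return the lag L (in samples) that maximises sum(x[n] * y[n + L]).

    A positive result means the sound reached microphone y later than x.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, xn in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):
                score += xn * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def tdoa_to_angle_deg(tdoa_s, mic_spacing_m, speed=SPEED_OF_SOUND_M_S):
    """Far-field arrival angle in degrees from broadside for a two-mic pair."""
    ratio = speed * tdoa_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp so noise cannot break asin
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    sample_rate_hz = 48000          # assumed sample rate
    x = [0, 0, 1, 0, 0, 0, 0, 0]    # impulse at microphone x
    y = [0, 0, 0, 0, 0, 1, 0, 0]    # same impulse, three samples later at y
    lag = estimate_lag_samples(x, y, max_lag=4)
    print(lag, tdoa_to_angle_deg(lag / sample_rate_hz, mic_spacing_m=0.05))
```

A single pair only resolves the angle up to a front/back ambiguity, which is consistent with the background's statement that three microphones are needed for directions in a plane.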
In capture devices where the camera can move or rotate relative to the microphones, there is a need to be able to more efficiently handle the audio signals generated, for example to maintain a ‘correct’ alignment; otherwise the difference between objects in the video images and the audio directions may be distracting for the user of the playback apparatus.

SUMMARY

There is provided according to a first aspect an apparatus comprising: a first part, the first part having at least one camera configured to capture images; a second part having at least one microphone configured to capture at least one audio signal, wherein one of the first part or second part is able to move relative to the other part and the apparatus comprising means configured to: determine a parameter associated with the move; generate at least one output audio signal based on the parameter associated with the move and the at least one audio signal.

The first part or the second part may be able to move relative to a common reference point.

The move may be at least one of: a rotation about an axis in common between the first part and the second part; a pitch and/or yaw and/or roll between the first part and the second part; a movement of the first part relative to the second part; and a movement of the second part relative to the first part.

The means may be further configured to: multiplex the at least one output audio signal and the images captured by the camera; and output the multiplexed at least one output audio signal and the images captured by the camera.

The first part may further have at least one further microphone configured to capture at least one further audio signal, wherein the means configured to generate at least one output audio signal based on the parameter associated with the move and the at least one audio signal may be configured to generate the at least one output audio signal further based on the at least one further audio signal.
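
One way to picture generating output audio "based on the parameter associated with the move" is to re-express audio directions, estimated in the microphone part's frame, relative to the camera after it has rotated. The sketch below is an illustration under stated assumptions, not the method of the application: it assumes the move is a pure rotation about a shared vertical axis, that azimuths and the rotation use the same sign convention, and the names `wrap_deg` and `align_directions_to_camera` are hypothetical.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def align_directions_to_camera(azimuths_deg, camera_rotation_deg):
    """Convert azimuths from the microphone frame to the camera frame.

    Assumption: the camera has rotated by `camera_rotation_deg` relative to
    the microphone part, so a source at azimuth `a` in the microphone frame
    appears at `a - camera_rotation_deg` as seen from the camera.
    """
    return [wrap_deg(a - camera_rotation_deg) for a in azimuths_deg]


if __name__ == "__main__":
    # Sources at 0, 90 and -170 degrees; camera rotated by 30 degrees.
    print(align_directions_to_camera([0.0, 90.0, -170.0], 30.0))
```

Wrapping keeps every corrected direction in one interval, so a source just behind the device does not jump to an out-of-range value after the correction.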
The means configured to generate the at least one output audio signal further based on the at least one further audio signal may be configured to align the at least one output audio signal and the at least one further audio signal based on the parameter associated with the move.

The at least one microphone may comprise at least three microphones arranged with respect to the second part, and the means configured to generate at least one output audio signal based on the parameter associated with the move and the at least one audio signal may be configured to: obtain a parameter defining the arrangement of the at least three microphones; obtain a parameter defining an orientation of the apparatus; and generate the at least one output audio signal further based on the parameter defining the arrangement of the at least three microphones and the parameter defining an orientation of the apparatus.

The means configured to generate the at least one output audio signal further based on the parameter defining the arrangement of the at least three microphones and the parameter defining an orientation of the apparatus may be configured to generate the at least one output audio signal for at least one frequency band based on the parameter defining the arrangement of the at least three mi