EP-4513478-B1 - FIRE ALARM VOICE GENERATOR


Inventors

DORANT, Dean
FORTIN, Maxime
HOUNKPATIN, Arius
ROUSSEAU, William

Dates

Publication Date: 2026-05-13
Application Date: 2024-08-22

Claims (15)

A method (500) comprising: a) providing a user interface (201) for configuring an alarm control panel (110) to generate audio voice messages, wherein the user interface (201) displays at least a text field (205) for inputting text, a list for selecting audio configuration settings (213), a first icon (203) for generating text-to-speech, and a second icon (209) for saving; b) obtaining, via the text field (205) in user interface (201), text to be converted to an audio voice message; c) recording the own voice of a user; d) training a neural voice network using the user's recorded voice; e) obtaining, via the list in the user interface (201), audio configuration settings (213) for generating the audio voice message; f) transmitting, to a cloud platform (303) in response to receiving a selection of the first icon (203) via the user interface (201), a text-to-speech request comprising the text to be converted and the audio configuration settings (213); g) obtaining, from the cloud platform (303), a speech response generated using the neural voice network based on the text and the audio configuration settings (213); and h) storing the speech response in an audio file library, in response to receiving a selection of the second icon (209) via the user interface (201), for playback in the alarm control panel (110).
The method (500) of claim 1, further comprising: - outputting the speech response, in response to receiving a selection of a third icon (207) via the user interface (201), wherein the user interface (201) further displays the third icon (207) for playback.
The method (500) of claim 1 or 2, further comprising: - deleting the speech response, in response to receiving a selection of a fourth icon (211) via the user interface (201), wherein the user interface (201) further displays the fourth icon (211) for deletion.
The method (500) of one of claims 1 to 3, wherein the text-to-speech request comprises an authentication key, and wherein the speech response is generated from the cloud platform (303) based on the authentication key being validated by the cloud platform (303), wherein the method (500) preferably further comprises: - displaying an error message based on the authentication key not being validated by the cloud platform (303).
The method (500) of one of claims 1 to 4, wherein the audio configuration settings (213) include at least one of: language, gender, tone, accent, pre-set profile, pitch, speech rate, cloud platform (303) engine, or audio quality setting; and/or wherein the audio configuration settings (213) are obtained by loading a pre-configured audio voice profile.
The method (500) of one of claims 1 to 5, further comprising: - storing the audio configuration settings (213) as an audio voice profile; and - exporting the speech response to an alarm control panel (110).
The method (500) of one of claims 1 to 6, further comprising: - obtaining, via an additional text field (205) via the user interface (201), in response to receiving a selection of a fifth icon (215) via the user interface (201), an additional text to be converted to an audio voice message and additional audio configuration settings (213) for generating the audio voice message, wherein the user interface (201) further displays the additional text field (205) and the fifth icon (215) for adding additional speech; - transmitting a text-to-speech request to a cloud platform (303) in response to receiving an additional selection of the first icon (203) via the user interface (201), the text-to-speech request comprising the additional text to be converted and the additional audio configuration settings (213); - obtaining, from the cloud platform (303), an additional speech response generated using the neural voice network based on the additional text and the additional audio configuration settings (213); and - storing the additional speech response in an audio file library for playback in the alarm control panel (110), in response to receiving a sixth user input via the user interface (201), wherein the user interface (201) comprises a sixth icon for playback.
The method (500) of one of claims 1 to 7, wherein the text to be converted to the audio voice message is obtained by dragging and dropping pre-generated phrases from a second portion of the user interface (201), wherein the user interface (201) further displays the second portion displaying a list of available pre-generated phrases.
The method (500) of one of claims 1 to 8, further comprising: - adding tones to the speech response by selecting the tones from a third portion of the user interface (201), wherein the user interface (201) further displays the third portion displaying a list of available tones.
A computing device (400), comprising: - one or more memories (191), individually or in combination, having instructions; and - one or more processors (405) each coupled to at least one of the one or more memories (191) and configurable to execute the instructions to: a) provide a user interface (201) having a display for configuring an alarm control panel (110) to generate audio voice messages, wherein the user interface (201) displays at least a text field (205) for inputting text, a list for selecting audio configuration settings (213), a first icon (203) for generating text-to-speech, and a second icon (209) for saving, b) obtain, via the text field (205) in user interface (201), text to be converted to an audio voice message, c) record the own voice of a user; d) train a neural voice network using the user's recorded voice; e) obtain, via the list in the user interface (201), audio configuration settings (213) for generating the audio voice message, f) transmit, to a cloud platform (303) in response to receiving a selection of the first icon (203) via the user interface (201), a text-to-speech request comprising the text to be converted and the audio configuration settings (213), g) obtain, from the cloud platform (303), a speech response generated using the neural voice network based on the text and the audio configuration settings (213), and h) store the speech response in an audio file library, in response to receiving a selection of the second icon (209) via the user interface (201), for playback in the alarm control panel (110).
The computing device (400) of claim 10, wherein the one or more processors (405) each coupled to at least one of the one or more memories (191) and configurable to further execute the instructions to: - output the speech response, in response to receiving a selection of a third icon (207) via the user interface (201), wherein the user interface (201) further displays the third icon (207) for playback; and/or wherein the one or more processors (405) each coupled to at least one of the one or more memories (191) and configurable to further execute the instructions to: - delete the speech response, in response to receiving a selection of a fourth icon (211) via the user interface (201), wherein the user interface (201) further displays the fourth icon (211) for deletion; and/or wherein the text-to-speech request comprises an authentication key, and wherein the speech response is generated from the cloud platform (303) based on the authentication key being validated by the cloud platform (303), and wherein the one or more processors (405) each coupled to at least one of the one or more memories (191) and configurable to further execute the instructions to: - display an error message based on the authentication key not being validated by the cloud platform (303).
The computing device (400) of claim 10 or 11, wherein the audio configuration settings (213) include at least one of: language, gender, tone, accent, pre-set profile, pitch, speech rate, cloud platform (303) engine, or audio quality setting; and/or wherein the audio configuration settings (213) are obtained by loading a pre-configured audio voice profile.
The computing device (400) of one of claims 10 to 12, wherein the one or more processors (405) each coupled to at least one of the one or more memories (191) and configurable to further execute the instructions to: - store the audio configuration settings (213) as an audio voice profile, and - export the speech response to an alarm control panel (110).
The computing device (400) of one of claims 10 to 13, wherein the one or more processors (405) each coupled to at least one of the one or more memories (191) and configurable to further execute the instructions to: - obtain, via an additional text field (205) via the user interface (201), in response to receiving a selection of a fifth icon (215) via the user interface (201), an additional text to be converted to an audio voice message and additional audio configuration settings (213) for generating the audio voice message, wherein the user interface (201) further displays the additional text field (205) and the fifth icon (215) for adding additional speech; - transmit a text-to-speech request to a cloud platform (303) in response to receiving an additional selection of the first icon (203) via the user interface (201), the text-to-speech request comprising the additional text to be converted and the additional audio configuration settings (213); - obtain, from the cloud platform (303), an additional speech response generated using the neural voice network based on the additional text and the additional audio configuration settings (213); and - store the additional speech response in an audio file library for playback in the alarm control panel (110), in response to receiving a sixth user input via the user interface (201), wherein the user interface (201) comprises a sixth icon for playback.
A computer program product configured to configure audio voice messages, the computer program product comprising one or more non-transitory computer-readable media, having instructions stored thereon that when executed by one or more processors (405) cause the one or more processors (405), individually or in combination, to perform a method (500) comprising: a) providing a user interface (201) having a display for configuring an alarm control panel (110) to generate audio voice messages, wherein the user interface (201) displays at least a text field (205) for inputting text, a list for selecting audio configuration settings (213), a first icon (203) for generating text-to-speech, and a second icon (209) for saving; b) obtaining, via the text field (205) in user interface (201), text to be converted to an audio voice message; c) recording the own voice of a user; d) training a neural voice network using the user's recorded voice; e) obtaining, via the list in the user interface (201), audio configuration settings (213) for generating the audio voice message; f) transmitting, to a cloud platform (303) in response to receiving a selection of the first icon (203) via the user interface (201), a text-to-speech request comprising the text to be converted and the audio configuration settings (213); g) obtaining, from the cloud platform (303), a speech response generated using a neural voice network based on the text and the audio configuration settings (213); and h) storing the speech response in an audio file library, in response to receiving a selection of the second icon (209) via the user interface (201), for playback in the alarm control panel (110).
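
The flow recited in claim 1 (steps b and e to h), together with the authentication key check of claim 4, can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the claimed implementation: every name in it (`VoiceMessageSession`, `AudioConfigurationSettings`, `fake_cloud_platform`, the setting defaults, and the key value "valid-key") is hypothetical, the cloud platform is replaced by a local stub that returns placeholder bytes rather than audio, and the voice recording and neural voice network training steps (c and d) are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class AudioConfigurationSettings:
    """Hypothetical subset of the settings listed in the claims."""
    language: str = "en-US"
    gender: str = "female"
    pitch: float = 1.0
    speech_rate: float = 1.0


@dataclass(frozen=True)
class TextToSpeechRequest:
    """Request sent to the cloud platform: text, settings, and key."""
    text: str
    settings: AudioConfigurationSettings
    authentication_key: str


@dataclass(frozen=True)
class SpeechResponse:
    """Reply from the cloud platform: audio on success, error otherwise."""
    ok: bool
    audio: bytes = b""
    error: str = ""


class VoiceMessageSession:
    """Models the icon handlers of the user interface (assumed design)."""

    def __init__(
        self,
        cloud_platform: Callable[[TextToSpeechRequest], SpeechResponse],
        authentication_key: str,
    ) -> None:
        self._cloud = cloud_platform
        self._key = authentication_key
        self._pending: Optional[bytes] = None  # last unsaved speech response
        self.audio_file_library: Dict[str, bytes] = {}
        self.error_message: Optional[str] = None

    def on_generate_icon(
        self, text: str, settings: AudioConfigurationSettings
    ) -> bool:
        """First icon: transmit the request and keep the speech response."""
        response = self._cloud(TextToSpeechRequest(text, settings, self._key))
        if not response.ok:
            # Key not validated: surface an error instead of audio.
            self.error_message = response.error
            self._pending = None
            return False
        self.error_message = None
        self._pending = response.audio
        return True

    def on_save_icon(self, name: str) -> None:
        """Second icon: store the speech response in the audio file library."""
        if self._pending is None:
            raise ValueError("no speech response to save")
        self.audio_file_library[name] = self._pending


def fake_cloud_platform(request: TextToSpeechRequest) -> SpeechResponse:
    """Local stand-in for the cloud platform; returns placeholder bytes."""
    if request.authentication_key != "valid-key":
        return SpeechResponse(ok=False, error="authentication key not validated")
    payload = f"{request.settings.language}|{request.text}".encode("utf-8")
    return SpeechResponse(ok=True, audio=payload)
```

In this sketch the speech response is held in memory until the save handler runs, which mirrors the claim language that storing happens only in response to selecting the second icon. A real deployment would replace the stub with a network call and would write audio files to the library used by the alarm control panel.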

Description

BACKGROUND

Technical Field

The described aspects relate to configuring auto-generated audio voice messages from text using a neural voice network for use in voice alarm notification devices of an alarm system, and to a related method.

Introduction

Aspects of the present disclosure relate generally to programming voice alarm notification devices, and more particularly, to generating audio voice messages from text using a neural voice network.

The video Potter Signal: "Potter's Integrated Voice System" discloses a fire alarm system enabling operators to add customized voice messages generated by neural text-to-speech. It shows a graphical user interface with icons for adding, playing, and deleting voice messages.

SUMMARY

The following presents a simplified summary of one or more aspects in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended to neither identify key or critical elements of all aspects nor delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.

An example aspect includes a method comprising providing a user interface for configuring an alarm control panel to generate audio voice messages, wherein the user interface displays at least a text field for inputting text, a list for selecting audio configuration settings, a first icon for generating text-to-speech, and a second icon for saving. The method further includes obtaining, via the text field in the user interface, text to be converted to an audio voice message. Additionally, the method further includes obtaining, via the list in the user interface, audio configuration settings for generating the audio voice message.
Additionally, the method further includes transmitting, to a cloud platform in response to receiving a selection of the first icon via the user interface, a text-to-speech request comprising the text to be converted and the audio configuration settings. Additionally, the method further includes obtaining, from the cloud platform, a speech response generated using a neural voice network based on the text and the audio configuration settings. Additionally, the method further includes storing the speech response in an audio file library, in response to receiving a selection of the second icon via the user interface, for playback in the alarm control panel.

Another example aspect includes an apparatus comprising one or more memories and one or more processors coupled with the one or more memories and configured, individually or in combination, to perform the following operations. The one or more processors are configured to provide a user interface for configuring an alarm control panel to generate audio voice messages, wherein the user interface displays at least a text field for inputting text, a list for selecting audio configuration settings, a first icon for generating text-to-speech, and a second icon for saving. The one or more processors are further configured to obtain, via the text field in the user interface, text to be converted to an audio voice message. Additionally, the one or more processors are further configured to obtain, via the list in the user interface, audio configuration settings for generating the audio voice message. Additionally, the one or more processors are further configured to transmit, to a cloud platform in response to receiving a selection of the first icon via the user interface, a text-to-speech request comprising the text to be converted and the audio configuration settings.
Additionally, the one or more processors are further configured to obtain, from the cloud platform, a speech response generated using a neural voice network based on the text and the audio configuration settings. Additionally, the one or more processors are further configured to store the speech response in an audio file library, in response to receiving a selection of the second icon via the user interface, for playback in the alarm control panel.

Another example aspect includes an apparatus comprising means for providing a user interface for configuring an alarm control panel to generate audio voice messages, wherein the user interface displays at least a text field for inputting text, a list for selecting audio configuration settings, a first icon for generating text-to-speech, and a second icon for saving. The apparatus further includes means for obtaining, via the text field in the user interface, text to be converted to an audio voice message. Additionally, the apparatus further includes means for obtaining, via the list in the user interface, audio configuration settings for generating the audio voice message. Additionally, the apparatus further includes means for transmitting, to a cloud platform in response to receiving a selection of the first icon via the user interface, a text-to-speech request