EP-4292090-B1 - SYSTEMS FOR PROCESSING VOICE AUDIO TO SEGREGATE PERSONAL HEALTH INFORMATION
Inventors
- HAIRALAH, Sahar Bin
- GUNDUMANE, Aravind
Dates
- Publication Date: 20260506
- Application Date: 20220210
Claims (8)
- A system (100) for processing voice audio comprising a local device (102), the local device (102) comprising: a local speech-to-text transcriber (106) configured to generate voice text (108) based on voice audio (110) spoken by a user (112); a local natural language processor (NLP) (114) configured to extract one or more spoken phrases (116) from the voice text (108); and a machine learning classifier (118) configured to classify the voice audio (110) as either personal health voice audio (120) or non-personal health voice audio (122) based on the one or more spoken phrases (116) and a personal health phrase database (124), wherein the system (100) further comprises a remote personal health data ecosystem (104), the remote personal health data ecosystem (104) comprising: a remote receiver (126) configured to receive the personal health voice audio (120); a remote speech-to-text transcriber (128) configured to generate personal health voice text (130) based on the personal health voice audio (120); a remote NLP (132) configured to extract one or more personal health spoken phrases (134) from the personal health voice text (130); a text response generator (136) configured to generate a text response (138) based on the one or more personal health spoken phrases (134); a text-to-speech translator (140) configured to generate a voice response (142) based on the text response (138); and a remote transmitter (144) configured to wirelessly transmit the voice response (142) to one or more speakers (146) configured to emit the voice response (142).
- The system (100) of claim 1, wherein the remote personal health data ecosystem (104) is further configured to transmit the voice response (142) to the one or more speakers (146) via the Internet (148).
- The system (100) of claim 1, wherein the system (100) further comprises an audio sensor (150) configured to capture the voice audio (110) spoken by the user (112).
- The system (100) of claim 1, wherein the local device (102) is a smart speaker.
- The system (100) of claim 1, wherein the local device (102) further comprises a local transmitter (152) configured to wirelessly transmit the personal health voice audio (120) to the remote personal health data ecosystem (104).
- The system (100) of claim 5, wherein the local transmitter (152) wirelessly transmits the personal health voice audio (120) to the remote personal health data ecosystem (104) via the Internet (148).
- The system (100) of claim 1, wherein the local device (102) further comprises a voice redirector (154) configured to transmit the non-personal health voice audio (122) to a smart home data ecosystem (156).
- The system (100) of claim 1, wherein the personal health phrase database (124) comprises a plurality of oral health phrases (158).
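The local classify-and-route flow recited in claims 1, 5, 7, and 8 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the phrase list, the function names, and the destination labels are hypothetical, the n-gram extractor stands in for the local NLP (114), and a plain set lookup stands in for the machine learning classifier (118), which in a real system would be a trained model. Speech-to-text is assumed to have already run, so the utterance carries its transcribed text.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical stand-in for the personal health phrase database (124).
# These oral health phrases are illustrative and not taken from the patent.
PERSONAL_HEALTH_PHRASES = frozenset({
    "brush my teeth",
    "gum bleeding",
    "tooth pain",
    "brushing session",
})


@dataclass
class Utterance:
    audio: bytes      # captured voice audio (110)
    voice_text: str   # assumed output of the local speech-to-text transcriber (106)


def extract_phrases(voice_text: str, max_words: int = 3) -> set[str]:
    """Stand-in for the local NLP (114): every run of 1..max_words words."""
    words = [w.strip(".,!?").lower() for w in voice_text.split()]
    words = [w for w in words if w]
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_words + 1)
        for i in range(len(words) - n + 1)
    }


def classify(phrases: set[str]) -> str:
    """Stand-in for the classifier (118): a set lookup, not a trained model."""
    if phrases & PERSONAL_HEALTH_PHRASES:
        return "personal_health"
    return "non_personal_health"


def route(utterance: Utterance) -> str:
    """Choose the destination ecosystem for the audio based on its class."""
    label = classify(extract_phrases(utterance.voice_text))
    if label == "personal_health":
        # Would be sent by the local transmitter (152).
        return "personal_health_data_ecosystem"
    # Would be sent by the voice redirector (154).
    return "smart_home_data_ecosystem"
```

Calling `route(Utterance(b"", "How long should I brush my teeth?"))` selects the personal health data ecosystem, while a request such as "Turn off the kitchen lights" falls through to the smart home data ecosystem. Keeping classification on the local device means the audio is assigned a destination before it leaves the device.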
Description
Field of the Disclosure
The present disclosure is directed generally to methods and systems for voice processing to segregate personal health information from other information in a voice-activated device.
Background
The voice-enabled technology landscape continues to grow. Due to this growth, voice-enabled technology has begun to expand into areas related to personal health. For example, oral health is an area where voice-enabled technology may provide value to users by, for instance, coaching users on how they should brush their teeth and providing personalized oral-health-related alerts and recommendations. However, in certain jurisdictions, oral health, as well as other types of personal health information, may be classified similarly to medical data, requiring a degree of privacy and security within a "data ecosystem" (that is, a collection of infrastructure, analytics, and applications used to capture and analyze data) in which this information is processed and stored. Many of the data ecosystems used by voice-activated systems, specifically in the smart home space, lack the requisite security and privacy protections to process and store personal health information. Accordingly, there is a need for a voice processing system to identify, segregate, and securely store personal health information.
As an example, it is proposed in US 2018/330069 A1 to route private/non-private health audio data to different devices in a network such that only the user who is authorized to receive the information contained in the audio data can hear the audio data.
Summary of the Disclosure
The invention is as defined in appended independent claim 1. Preferred embodiments are set forth in the appended dependent claims.
The present disclosure is directed generally to methods and systems for voice processing to segregate personal health information from other information in a voice-activated device. Broadly, the system captures voice audio spoken by a user.
A local speech-to-text transcriber generates voice text based on the captured audio. A local natural language processor (NLP) then extracts one or more spoken phrases from the voice text. Based on the extracted phrases and a personal health phrase database, a machine learning classifier then classifies the voice audio as either personal health voice audio or non-personal health voice audio.
The personal health voice audio is transmitted to a "personal health data ecosystem." As used herein, the term "personal health data ecosystem" generally refers to a collection of infrastructure hardware and software applications used to capture, analyze, and store data related to the personal health of a user (such as personal and family medical histories, hygiene habits, test and laboratory results, vital sign measurements, etc.). In the personal health data ecosystem, the voice audio is processed to generate a voice response conveyed to the user via one or more speakers.
The non-personal health voice audio is transmitted to an alternate data ecosystem, such as a "smart home data ecosystem." As used herein, the term "smart home data ecosystem" generally refers to a collection of infrastructure hardware and software applications used to capture, analyze, and store data related to home automation (such as information related to home entertainment, HVAC, or lighting systems, etc.).
Generally, in one aspect, a system for processing voice audio is provided. The system may include a local device. According to an example, the local device may be a smart speaker. The local device may include a local speech-to-text transcriber. The local speech-to-text transcriber may be configured to generate voice text. The voice text may be generated based on voice audio spoken by a user. The local device may further include a local NLP. The local NLP may be configured to extract one or more spoken phrases from the voice text. The local device may further include a machine learning classifier.
The machine learning classifier may be configured to classify the voice audio as either personal health voice audio or non-personal health voice audio. The voice audio may be classified based on the one or more spoken phrases and a personal health phrase database. According to an example, the personal health phrase database may include a plurality of oral health phrases.
According to an example, the local device may further include a local transmitter. The local transmitter may be configured to wirelessly transmit personal health voice audio to a remote personal health data ecosystem. According to a further example, the local transmitter may wirelessly transmit the personal health voice audio to the remote personal health data ecosystem via the Internet.
According to an example, the local device may further include a voice redirector. The voice redirector may be configured to transmit non-personal health voice audio to a smart home data ecosystem.
According to an example, the system may further include the remote per