US-12623597-B2 - Dynamic outputs for inducing interactions with autonomous vehicles
Abstract
Dynamic outputs for inducing interactions with autonomous vehicles are described. In one or more implementations, a system includes an output device configured to produce audible signals in an environment outside an autonomous vehicle, and at least one processor operatively coupled to the output device. The processor is configured to detect one or more characteristics of the environment, dynamically generate, based on the characteristics, an audio output for inducing human interactions with the autonomous vehicle, and cause the output device to transmit the audio output as the audible signals produced in the environment for facilitating a human interaction with the autonomous vehicle.
Inventors
- Aleks Witko
Assignees
- Applied Electric Vehicles Ltd
Dates
- Publication Date: 2026-05-12
- Application Date: 2024-03-15
Claims (20)
- 1 . A system comprising: an output device configured to produce audible signals in an environment outside an autonomous vehicle; and at least one processor operatively coupled to the output device and configured to: detect one or more characteristics of the environment; dynamically generate, based on the characteristics, an audio output for inducing human interactions with the autonomous vehicle using a machine-learning model to generate a spoken audio portion of the audio output that uses a spoken language detected based on the characteristics; and cause the output device to transmit the audio output including the spoken audio portion as the audible signals produced in the environment for facilitating a human interaction with the autonomous vehicle.
- 2 . The system of claim 1 , wherein the at least one processor is further configured to: detect a change to the characteristics in response to the audio output; and modify the audio output based on the change to produce a new output for further inducing the human interactions with the autonomous vehicle, the change indicating changes in at least one of: relative distance between the autonomous vehicle and a human object; relative direction or orientation between the autonomous vehicle and the human object; or behavior or feedback of the human object.
- 3 . The system of claim 1 , wherein: the at least one processor is further configured to use the machine-learning model to determine different outputs for inducing human interactions with the autonomous vehicle in two or more different construction or mining environments to retrieve construction or mining materials, tools, and safety equipment from an inventory of the autonomous vehicle; and the audio output includes a synthesized message indicating availability of the inventory in at least one of the two or more different construction or mining environments.
- 4 . The system of claim 1 , wherein the at least one processor is further configured to: dynamically generate the audio output to be a first type of audio generated in a first spoken language for a first environment of two or more different environments; and dynamically generate the audio output to be a different type of audio generated in a second spoken language for a second environment of the two or more different environments.
- 5 . The system of claim 1 , wherein the at least one processor is further configured to: determine an inventory of items transported by the autonomous vehicle; and dynamically generate the audio output for inducing human interactions with the autonomous vehicle based further on a change to the inventory that triggers a dynamically generated audio output for indicating feedback about one or more items that are removed from the inventory.
- 6 . The system of claim 1 , wherein the at least one processor is further configured to: determine a location of the environment; and dynamically generate the audio output for inducing human interactions with the autonomous vehicle based further on the location.
- 7 . The system of claim 5 , wherein the at least one processor is further configured to: determine that one or more of the items being transported by the autonomous vehicle are removed from the inventory; and dynamically generate an additional audio output for indicating feedback about the one or more items that are removed from the inventory.
- 8 . The system of claim 1 , wherein the at least one processor is further configured to: receive messages from a large language model configured to dynamically generate text; and synthesize the messages as the audio output for inducing human interactions with the autonomous vehicle.
- 9 . The system of claim 1 , wherein the characteristics detected by the at least one processor include information about at least one of: time of day for the environment; weather detected in the environment; or one or more human objects detected in the environment, including position of the human objects, motion of the human objects, shape of the human objects, and size of the human objects.
- 10 . The system of claim 9 , wherein the information about one or more human objects includes information obtained from a cloud service or platform indicative of one or more of personal preferences, a social media presence, and previous vehicle interactions, and the characteristics detected by the at least one processor include information about one or more inanimate objects detected in the environment, and the inanimate objects comprise at least one of other vehicles, buildings, driving surfaces, sidewalks, signs, awnings, flags, trees, or rocks.
- 11 . The system of claim 1 , further comprising: a sensor system configured to generate sensor data indicative of the characteristics detected by the at least one processor, wherein the sensor system comprises one or more of a camera sensor, a lidar sensor, a radar sensor, an ultrasonic sensor, a location positioning sensor, a temperature sensor, a pressure sensor, and a communication sensor.
- 12 . The system of claim 9 , wherein the at least one processor is further configured to: modify intensity or direction of the audible signals when a timer exceeds a time threshold for further facilitating the human interaction with the autonomous vehicle by directionally controlling the audible signals to be broadcast toward the position of the human objects.
- 13 . The system of claim 1 , wherein the output device comprises a first output device, the system further comprising: a second output device configured to produce visual signals in the environment from a position inside or outside the autonomous vehicle, wherein the at least one processor is operatively coupled to the second output device and further configured to: dynamically generate, based on the characteristics, a visual output for inducing the human interactions with the autonomous vehicle, the visual output including a graphical user interface (GUI) configured to present a kiosk interface for taking possession of an item from the autonomous vehicle when the characteristics indicate a human object is positioned to take the item; and cause the second output device to transmit the visual output as the visual signals produced in the environment for facilitating the human interaction with the autonomous vehicle.
- 14 . The system of claim 13 , wherein the at least one processor is configured to: control the first output device and the second output device to simultaneously transmit the audio output and the visual output in the environment for facilitating the human interaction with the autonomous vehicle, including by producing a light show with the second output device that is based on a rhythm of the audio output produced by the first output device to attract human objects when the characteristics indicate no human objects are approaching the autonomous vehicle.
- 15 . The system of claim 13 , wherein the at least one processor is further configured to: detect gesture or body language from the characteristics indicating a displeased user with the autonomous vehicle; and cause the output device to transmit a modified audio output addressing the displeased user.
- 16 . An autonomous vehicle comprising: a first output device configured to produce audible signals in an environment outside an autonomous vehicle; a second output device configured to produce visual signals in the environment outside the autonomous vehicle; and at least one processor operatively coupled to the first output device and the second output device, and configured to: detect one or more characteristics of the environment; dynamically generate, based on the characteristics, audible outputs and visual outputs for inducing human interactions with the autonomous vehicle using a machine-learning model to generate a spoken audio portion of the audible outputs that uses a spoken language detected based on the characteristics, and wherein the visual outputs include an avatar synchronized to speak the spoken audio portion; and cause the first output device and the second output device to simultaneously transmit the audible outputs and the visual outputs as the audible signals and the visual signals produced in the environment for facilitating a human interaction with the autonomous vehicle.
- 17 . The autonomous vehicle of claim 16 , wherein the at least one processor is further configured to: dispense a digital or printed coupon in response to detecting a change in the characteristics indicating completion of a human interaction with the autonomous vehicle.
- 18 . A non-transitory computer-readable storage medium comprising instructions that, when executed, cause at least one processor of an autonomous vehicle to: detect one or more characteristics of an environment outside the autonomous vehicle; dynamically generate, based on the characteristics, an audio output for inducing human interactions with the autonomous vehicle using a machine-learning model to generate a spoken audio portion of the audio output that uses a spoken language detected based on the characteristics; and cause an output device of the autonomous vehicle to transmit the audio output including the spoken audio portion as audible signals produced in the environment for facilitating a human interaction with the autonomous vehicle.
- 19 . The non-transitory computer-readable storage medium of claim 18 , wherein the computer-readable storage medium is installed in the autonomous vehicle.
- 20 . The non-transitory computer-readable storage medium of claim 18 , wherein the instructions further cause the at least one processor to: process feedback received via one or more human machine interface (HMI) input devices to train the machine-learning model to learn to improve the audio output.
Description
BACKGROUND

Autonomous vehicle technology is rapidly advancing to perform a diverse range of tasks, including policing and security, passenger transport, and goods delivery. For example, a police department may deploy a robotic ground vehicle to help with crowd control, to direct traffic safely through a congested intersection, or to enforce a security perimeter near a crash site. A merchant may send a robotic ground vehicle out to deliver goods (e.g., so-called "last mile delivery") or provide other services to neighborhoods that are far from a merchant's warehouse or store. Despite increasing deployment rates, people may be unaccustomed to sharing roadways and sidewalks with robotic ground vehicles. In addition, many autonomous platforms operate silently due to their fully electric nature, which can mask their presence. Although a pedestrian or vehicle occupant can communicate with drivers of traditional manned vehicles through voice, hand gestures, and body language, they might not know how to signal an autonomous system or make it aware that a human is present. The sudden appearance of a near-silent, autonomous ground vehicle may startle pedestrians or occupants of other vehicles and cause even more uncertainty about how to engage with it. As such, challenges exist in designing effective interfaces that facilitate safe and pleasant human interactions with these autonomous machines.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a non-limiting example environment including a vehicle that generates dynamic outputs for inducing interactions with autonomous vehicles.
FIG. 2 is a block diagram of a non-limiting example of a vehicle system that generates dynamic outputs for inducing interactions with autonomous vehicles.
FIG. 3 is a block diagram of a non-limiting example of a dynamic output system configured to generate dynamic outputs for inducing interactions with autonomous vehicles.
FIG. 4 depicts a non-limiting example scenario of an autonomous ground vehicle that generates dynamic outputs for inducing interactions with humans.
FIG. 5 depicts a procedure for generating dynamic outputs for inducing interactions with autonomous vehicles.

DETAILED DESCRIPTION

Autonomous vehicle technology is progressing at a fast rate, and not just to facilitate driving. Autonomous systems are being deployed to facilitate passenger transportation, as well as to help people perform work and complete everyday tasks. For example, imagine a merchant that deploys an autonomous ground vehicle into a city. The autonomous vehicle may be programmed to travel to different neighborhoods and to make multiple stops throughout the city for delivering merchant goods or services. A first stop for the autonomous vehicle may be at a central park where families are having picnics or enjoying a playground. The autonomous vehicle arrives at the park with storage compartments filled with groceries (e.g., fresh baguettes, artisan cheeses, juicy strawberries) and other items for sale (e.g., towels, sunscreen, beverages, ice) to help make people's park experience more enjoyable.

Although autonomous vehicle deployments like these are becoming more common, not everyone is accustomed to interacting with a robotic ground vehicle. Many autonomous vehicle platforms are fully electric and operate in near silence, which can mask their presence and further impede human-vehicle interactions. Even if a pedestrian or vehicle occupant recognizes a robotic ground vehicle operating nearby, that person may not know how to signal the autonomous vehicle. For example, when the autonomous ground vehicle described above arrives at the central park, its presence may be masked by shrubs, trees, or other environmental features of the park (e.g., noise), and it may fail to gain the attention of customers.
These and various other challenges in an environment can interfere with an autonomous system's ability to implement a human-machine interface (HMI) that facilitates safe and effective human interactions. An autonomous vehicle that has difficulty understanding or responding to human interactions may struggle to help people complete their work or tasks. Implementing an effective HMI that facilitates safe and effective human interactions with autonomous systems can therefore be challenging. In accordance with techniques of this disclosure, dynamic outputs for inducing interactions with autonomous vehicles are described. For example, an autonomous vehicle includes one or more output devices configured to produce audible and/or visual signals in an environment surrounding the vehicle. The output devices may include speakers, displays, lights, and other audio-visual output devices. The autonomous vehicle further includes one or more sensor devices configured to generate sensor data about one or more characteristics of the environment. For example, the sensor data indicates a person is approaching the vehicle, one or more pedestrians are walking past the vehicle
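The basic flow described in this disclosure, in which the system detects characteristics of the environment, dynamically generates a spoken output suited to those characteristics, and emits it through an output device, can be sketched in code. The following is an illustrative sketch only, not an implementation from the patent: the class and function names, the characteristic fields, and the canned messages are hypothetical stand-ins, and a simple template lookup stands in for the machine-learning model the disclosure describes.

```python
from dataclasses import dataclass


@dataclass
class EnvironmentCharacteristics:
    """Illustrative sensor-derived characteristics of the environment."""
    detected_language: str = "en"   # spoken language detected near the vehicle
    humans_nearby: bool = False     # whether any human objects are approaching


def generate_audio_message(ch: EnvironmentCharacteristics) -> str:
    """Dynamically generate a spoken message from detected characteristics.

    A real system would invoke a machine-learning model (e.g., a large
    language model plus text-to-speech); a template lookup stands in here.
    """
    greetings = {
        "en": "Hello! Fresh goods are available here.",
        "fr": "Bonjour ! Des produits frais sont disponibles ici.",
    }
    # Fall back to English when the detected language is unsupported.
    message = greetings.get(ch.detected_language, greetings["en"])
    if not ch.humans_nearby:
        # No approaching humans: pair the audio with an attention-getting
        # visual cue, as in claim 14's rhythm-synchronized light show.
        message += " (paired with light show)"
    return message
```

In a deployed system, the returned text would be synthesized to audio and broadcast by the output device, and the characteristics would be re-sampled after each output so the message can be modified in response to changes (claim 2).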