DE-102026112941-A1 - Method for transforming naturally spoken language into a control command for operating a vehicle function.
Abstract
The invention relates to a method for transforming naturally spoken language into at least one control command (30) designed to operate a vehicle function. A catalog (20) of implausible word sequences for controlling vehicle functions is created, wherein each implausible word sequence comprises at least one word that is permissible as part of a valid voice command for operating a vehicle function. In the naturally spoken language, at least one current word sequence (11) is captured and compared with the catalog (20). If it matches at least one of the implausible word sequences in the catalog (20), it is eliminated or checked.
Inventors
- Kevin Kalineak
Assignees
- Mercedes-Benz Group AG
Dates
- Publication Date: 20260513
- Application Date: 20260327
Claims (4)
- Method for transforming naturally spoken language into at least one control command (30) designed to operate a vehicle function of a vehicle, characterized in that - a catalog (20) of word sequences that are implausible for controlling vehicle functions is created, wherein an implausible word sequence comprises at least one word that is permissible as part of a valid voice instruction for operating a vehicle function, - at least one current word sequence (11) is captured in the naturally spoken language and compared with the catalog (20), and - if the current word sequence (11) matches at least one of the implausible word sequences of the catalog (20), it is eliminated or checked.
- Method according to claim 1, characterized in that, when checking a captured current word sequence (11) that matches at least one implausible word sequence stored in the catalog (20), at least one query and/or a corrected word sequence is generated.
- Method according to one of the preceding claims, characterized in that at least one operating state parameter and/or environmental parameter of the vehicle is included in the comparison of the at least one captured current word sequence (11) with the catalog (20).
- Method according to one of the preceding claims, characterized in that at least one vehicle occupant is acoustically identified and an individually assigned catalog (20) of implausible word sequences is selected and/or continuously adapted.
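The comparison step of claims 1 and 2 can be illustrated with a minimal sketch. It is not the claimed implementation: the names `CatalogEntry`, `Action`, `normalize`, `contains` and `classify`, the catalog entries, and the query text are all assumptions introduced for illustration. Matching is simplified to an exact, contiguous token comparison. The operating state parameters of claim 3 and the occupant-specific catalog of claim 4 are not modeled.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence, Tuple


class Action(Enum):
    EXECUTE = "execute"      # no catalog match: pass on to command generation
    ELIMINATE = "eliminate"  # catalog match: discard the word sequence
    CHECK = "check"          # catalog match: ask the occupant before acting


@dataclass(frozen=True)
class CatalogEntry:
    words: Tuple[str, ...]       # normalized implausible word sequence
    action: Action               # ELIMINATE or CHECK
    query: Optional[str] = None  # hypothetical clarifying question for CHECK


def normalize(text: str) -> Tuple[str, ...]:
    """Lowercase the text and keep only alphabetic tokens."""
    return tuple(re.findall(r"[^\W\d_]+", text.lower()))


def contains(words: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if pattern occurs as a contiguous run inside words."""
    n = len(pattern)
    if n == 0:
        return False
    return any(tuple(words[i:i + n]) == tuple(pattern)
               for i in range(len(words) - n + 1))


def classify(utterance: str, catalog: Sequence[CatalogEntry]):
    """Compare a captured word sequence with the catalog (first match wins)."""
    words = normalize(utterance)
    for entry in catalog:
        if contains(words, entry.words):
            return entry.action, entry.query
    return Action.EXECUTE, None


# Illustrative catalog; the entries and the query text are assumptions.
CATALOG = [
    CatalogEntry(normalize("go to hell"), Action.ELIMINATE),
    CatalogEntry(normalize("go to the devil"), Action.CHECK,
                 "Do you want to start route guidance?"),
]

if __name__ == "__main__":
    for spoken in ("Go to hell!", "Oh, go to the devil!",
                   "Go to Jena Paradies train station!"):
        print(spoken, "->", classify(spoken, CATALOG))
```

In this sketch a word sequence that matches no catalog entry is simply handed on for conventional command generation. A real system would more likely use tolerant matching (for example on recognizer hypotheses) than exact token equality.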
Description
The invention relates to a method for transforming naturally spoken language into at least one control command designed to operate a vehicle function of a vehicle.
Speech-based input methods for vehicles capture the naturally spoken language of a vehicle occupant and generate control commands from it, which can be used to operate and parameterize vehicle functions. To do this, a voice instruction is first extracted from the naturally spoken language; this instruction must adhere to certain predetermined syntactic and semantic rules. A valid (that is, syntactically and semantically correct) voice instruction could, for example, be: "Go to Jena - Paradies train station!" Such a voice instruction is transformed in the conventional manner into a control command that prompts the vehicle's navigation system to calculate a route from the current location to the geoposition assigned to Jena - Paradies station and to determine and output corresponding navigation and driving instructions.
Modern speech-based input methods allow considerable variation in the voice instructions recognized as valid, for example in the form "Go to Paradise Station" or "Take me to paradise!", where ambiguities can be resolved depending on context (for example, by using statistical language models such as large language models, LLMs, and by taking into account the current geoposition of the vehicle). Such methods spare the user from tediously adhering to a strict syntax, such as the use of precisely prescribed keywords for certain vehicle functions. However, they also increase the likelihood that words, phrases, or sentences uttered without any intention to operate the vehicle will be misinterpreted as voice instructions. Such misinterpretation is particularly likely for imperative idiomatic expressions (for example, "Go to hell!"). Naturally spoken language that is misinterpreted as a voice instruction, although there is no intention to control the vehicle, leads to unclear situations when operating vehicle functions.
This can reduce user-friendliness and, in extreme cases, compromise operational and traffic safety. Therefore, there is a need for an improved method for transforming naturally spoken language into control commands, one that checks the plausibility of the operating intention.
The document US 7,437,297 B2 describes a system and a method for processing and executing commands in automated systems, for example for determining, evaluating, or predicting the results of executing incorrectly recorded or misinterpreted user instructions in such automated systems.
The invention is based on the objective of providing an improved method for transforming naturally spoken language into at least one control command designed to operate a vehicle function, in particular a method by which the probability and/or the degree of impact of a misinterpretation of naturally spoken language as a voice instruction is reduced.
The problem is solved according to the invention by a method having the features of claim 1.
Advantageous embodiments of the invention are the subject of the dependent claims.
In a method for transforming naturally spoken language into at least one control command designed to operate a vehicle function, a catalog of implausible word sequences is created. An implausible word sequence includes at least one word that is permissible as part of a valid voice instruction for operating a vehicle function. For example, in the word sequence "Go to the devil!" (literally, "Drive to the executioner!"), the sentence element "Go to" (German: "Fahr' zum") is, when considered in isolation, a syntactically and semantically valid partial word sequence of a voice instruction for formulating a service request to a navigation system. However, the entire word sequence is implausible as such a voice instruction, since on the one hand a destination "executioner" is typically not available, and since, on the other hand, this sequence of words represents a common idiomatic expression whose overall meaning cannot be derived from the individual elements (that is, by literal interpretation). Similarly, the catalog of implausible word sequences can include curses, curse formulas, phrases, or expressives that express an emotional state without expressing an intention to act.
Furthermore, naturally spoken language is continuously recorded, for example using one or more microphones inside the vehicle. At least one current word sequence is captured and compared with the catalog of implausible word sequences. Preferably, current word sequences are captured that are candidates for valid voice instructions (that is, for which there is at least some probability of an operating intention). Natural language processing (NLP) methods are known and available for capturing such current word sequences; these methods are also used in conventional speech-based input methods.
A captured current word sequence (potentially assignable to a valid voice instruction) is compared with the implausible word sequences recorded in the catalog. If there is a match with at least one of these implausibl