KR-20260063186-A - A METHOD AND AN APPARATUS FOR GENERATING A WORKING ROUTE FOR A VEHICLE USING A REINFORCEMENT LEARNING MODEL
Abstract
The present disclosure relates to a method and apparatus for generating a work path of equipment using a reinforcement learning model. According to one embodiment of the present disclosure, a method for generating a work path of equipment using a reinforcement learning model may include: a step of designing a first reward function for a first reinforcement learning model using three-dimensional terrain data of a workplace and at least one previous work path of the workplace; a step of generating at least one initial work path using the first reinforcement learning model to which the first reward function is applied; a step of designing a second reward function by updating the first reward function based on feedback for each of the at least one initial work path; and a step of generating a final work path using a second reinforcement learning model to which the second reward function is applied.
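The two-stage reward design summarized above can be sketched in Python. This is a minimal illustration, not the patent's implementation: the element names (distance, time, fuel, slope, terrain, per claim 2), the `RewardWeights` structure, and the weight-update rule are all assumptions chosen for clarity, since the patent does not disclose a concrete algorithm.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class RewardWeights:
    # One weight per cost element of the first reward function (see claim 2).
    distance: float = 1.0
    time: float = 1.0
    fuel: float = 1.0
    slope: float = 1.0
    terrain: float = 1.0

ELEMENTS = ("distance", "time", "fuel", "slope", "terrain")

def first_reward(features: dict, w: RewardWeights) -> float:
    """First reward function: negative weighted cost of a candidate path
    (lower distance/time/fuel/slope/terrain cost -> higher reward)."""
    return -sum(getattr(w, k) * features[k] for k in ELEMENTS)

def second_reward_weights(initial_paths: list[dict], scores: list[float],
                          w: RewardWeights, lr: float = 0.1) -> RewardWeights:
    """Design the second reward function: boost the weights of elements that
    dominate the highest-scoring initial path (a stand-in for the feedback
    step of claims 3-4; the proportional update rule is an assumption)."""
    best = initial_paths[scores.index(max(scores))]
    total = sum(best[k] for k in ELEMENTS) or 1.0
    return replace(w, **{k: getattr(w, k) + lr * best[k] / total for k in ELEMENTS})
```

For example, scoring one initial path and re-weighting toward it would look like `w2 = second_reward_weights([features], [score], RewardWeights())`; the resulting `w2` then parameterizes the second reinforcement learning model's reward.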
Inventors
- 한상규 (Han Sang-gyu)
Assignees
- 에이치디한국조선해양 주식회사 (HD Korea Shipbuilding & Offshore Engineering Co., Ltd.)
- 에이치디현대인프라코어 주식회사 (HD Hyundai Infracore Co., Ltd.)
- 에이치디건설기계 주식회사 (HD Construction Equipment Co., Ltd.)
Dates
- Publication Date
- 2026-05-07
- Application Date
- 2024-10-30
Claims (10)
- A method for generating a work path of equipment using a reinforcement learning model, the method comprising: a step of designing a first reward function for a first reinforcement learning model using three-dimensional terrain data of a workplace and at least one previous work path of the workplace; a step of generating at least one initial work path using the first reinforcement learning model to which the first reward function is applied; a step of designing a second reward function by updating the first reward function based on feedback for each of the at least one initial work path; and a step of generating a final work path using a second reinforcement learning model to which the second reward function is applied.
- The method of claim 1, wherein the first reward function includes at least one of: a first element for a distance traveled along the path, a second element for a time taken to travel the path, a third element for fuel consumed while traveling the path, a fourth element for a slope of the path, and a fifth element for terrain characteristics of the path.
- The method of claim 1, wherein the designing of the second reward function includes receiving, as the feedback, a work score for each of the at least one initial work path.
- The method of claim 3, wherein the designing of the second reward function includes: calculating at least one element reflected in each of the initial work paths; and designing the second reward function by adjusting a weight for at least one element included in the initial work path having the highest work score.
- The method of claim 3, wherein the work score is calculated using at least one of: a first item regarding operation data of the equipment on the path, a second item regarding fuel consumption on the path, a third item regarding ease of operation on the path, a fourth item regarding a load applied to the equipment on the path, and a fifth item regarding work stability on the path.
- The method of claim 1, wherein the designing of the second reward function includes designing the second reward function by adjusting a weight with which the feedback is reflected in the first reward function.
- The method of claim 1, wherein the generating of the at least one initial work path includes generating the at least one initial work path by inputting initial work information into the first reinforcement learning model.
- The method of claim 1, further comprising a step of controlling the equipment so that the equipment moves along the final work path.
- A computing device comprising: at least one memory; and at least one processor, wherein the at least one processor designs a first reward function for a first reinforcement learning model using three-dimensional terrain data of a workplace and at least one previous work path of the workplace, generates at least one initial work path using the first reinforcement learning model to which the first reward function is applied, designs a second reward function by adjusting a weight of at least one element included in the first reward function based on a work score for each of the at least one initial work path, and generates a final work path using a second reinforcement learning model to which the second reward function is applied.
- A computer-readable recording medium storing a program for executing the method according to claim 1 on a computer.
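The work score of claims 4-5 can be illustrated as a simple weighted aggregate of the five evaluation items, used to select the best initial work path. The item keys, the equal weighting, and the [0, 1] normalization below are illustrative assumptions; the patent does not specify how the score is computed.

```python
# Hypothetical aggregation of the five work-score items of claim 5.
# Each item is assumed pre-normalized to [0, 1], where higher is better.
ITEM_WEIGHTS = {
    "operation_data":    0.2,  # item 1: equipment operation data on the path
    "fuel_consumption":  0.2,  # item 2: fuel consumption on the path
    "ease_of_operation": 0.2,  # item 3: ease of operation on the path
    "equipment_load":    0.2,  # item 4: load applied to the equipment
    "work_stability":    0.2,  # item 5: work stability on the path
}

def work_score(items: dict) -> float:
    """Return a scalar work score in [0, 1] for one initial work path."""
    return sum(ITEM_WEIGHTS[k] * items[k] for k in ITEM_WEIGHTS)

def best_initial_path(paths_items: list[dict]) -> int:
    """Index of the initial work path with the highest work score; the
    elements of this path drive the weight adjustment of claim 4."""
    return max(range(len(paths_items)), key=lambda i: work_score(paths_items[i]))
```

In this sketch, the feedback loop of claim 3 would score every initial path with `work_score` and then re-weight the reward function toward the path returned by `best_initial_path`.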
Description
A method and apparatus for generating a working route for a vehicle using a reinforcement learning model

The present disclosure relates to a method and apparatus for generating a work path of equipment using a reinforcement learning model.

Equipment for work at construction sites has been continuously improved and developed. Such equipment is designed to be operated directly by skilled personnel who board it as operators to perform tasks. However, the industry faces difficulties due to a shortage of skilled workers, and profitability continues to deteriorate owing to safety management issues and rising wages for skilled laborers. Furthermore, it is challenging to ensure uniform construction quality because of variations in the skill levels of individual workers.

Recently, active research has been conducted on automated, unmanned construction/shipbuilding equipment capable of addressing issues such as the shortage of skilled workers, safety management, and profitability. Although the unmanned operation of construction equipment requires autonomous driving, generating work paths for autonomous driving from environmental topographic data alone can be inefficient because the environment of a construction site changes rapidly as work proceeds. Accordingly, there is a need for technology that generates efficient work paths by training a work path generation model on the experience of skilled workers who have worked at construction sites for a long time.

The aforementioned background technology is technical information that the inventor possessed for the derivation of the present invention or acquired during the process of deriving the present invention, and it cannot be regarded as prior art disclosed to the general public before the filing of the present invention.

FIG.
1 is a conceptual diagram illustrating a method for generating a work path of equipment using a reinforcement learning model according to one embodiment. FIG. 2 is a block diagram of a path generation device according to one embodiment. FIG. 3 is an exemplary configuration diagram of a system including a path generation device and an external device according to one embodiment. FIG. 4 is a flowchart illustrating a method for generating a final work path of equipment using a reinforcement learning model according to one embodiment. FIG. 5 is an exemplary diagram for explaining a method of designing a first reward function according to one embodiment. FIG. 6 is an exemplary diagram illustrating a method for generating a second reinforcement learning model by applying a second reward function according to one embodiment. FIG. 7 is an exemplary diagram illustrating a method for generating a final work path using a second reinforcement learning model according to one embodiment.

The advantages and features of the present invention, and the methods for achieving them, will become clear by referring to the embodiments described in detail together with the accompanying drawings. However, the present invention is not limited to the embodiments presented below; it can be implemented in various different forms and should be understood to include all modifications, equivalents, and substitutions that fall within its spirit and scope. The embodiments presented below are provided to make the disclosure of the present invention complete and to fully inform those skilled in the art of the scope of the invention. In describing the present invention, detailed descriptions of related known technologies are omitted where it is determined that such descriptions may obscure the essence of the present invention. The terms used in this application merely describe specific embodiments and are not intended to limit the invention.
The singular expression includes the plural expression unless the context clearly indicates otherwise. In this application, terms such as "comprising" or "having" are intended to specify the existence of the features, numbers, steps, actions, components, parts, or combinations thereof described in the specification, and should be understood as not precluding the existence or addition of one or more other features, numbers, steps, actions, components, parts, or combinations thereof. Some embodiments of the present disclosure may be represented by functional block configurations and various processing steps. Some or all of these functional blocks may be implemented by various numbers of hardware and/or software configurations that perform specific functions. For example, the functional blocks of the present disclosure may be implemented by one or more microprocessors or by circuit configurations for a specific function. Additionally, for example, the functional blocks of the present disclosure may be implemented in various programming or scripting languages. The functional blocks may be implemented as algorithms executed on o