KR-20260064476-A - Method and system for predicting similar precedents and judgment results using dual encoder of deep learning language model

Abstract

According to one embodiment, a system for searching for similar precedents using a deep learning-based language model with a dual encoder structure may include an electronic device that acquires first query data corresponding to a user's query, an artificial intelligence model trained to receive the first query data as input data and to output a specified number of precedents in descending order of similarity to the content of the user's query, and a precedent database that stores precedent data. The artificial intelligence model is a Transformer-based Korean language model and may include a first encoder that receives and encodes the first query data to generate a first embedding vector, and a second encoder that receives first precedent data from the precedent database and encodes it to generate a second embedding vector.

Inventors

홍진솔

Assignees

주식회사 펜스
김경민

Dates

Publication Date: 20260507
Application Date: 20250711

Claims (5)

1. A system for searching for similar precedents using a deep learning-based language model having a dual encoder structure, the system comprising: an electronic device that acquires first query data corresponding to a user's query; an artificial intelligence model trained to receive the first query data as input data and to output a specified number of precedents in descending order of similarity to the content of the user's query; and a precedent database that stores precedent data, wherein the artificial intelligence model is a Transformer-based Korean language model and includes: a first encoder that receives and encodes the first query data to generate a first embedding vector; and a second encoder that receives first precedent data from the precedent database and encodes it to generate a second embedding vector, and wherein, as a result of inputting the first query data into the artificial intelligence model, the system selects a specified number of precedents in descending order of similarity to the first query data.
2. The system of claim 1, wherein the artificial intelligence model predicts the judgment result of a case corresponding to the first query data by considering the judgment results of the selected precedents.
3. The system of claim 1, wherein the artificial intelligence model: collects data on the winning attorneys in the selected precedents; determines and matches the attorney with the highest win rate among the winning attorneys; analyzes the content of the complaint filed by the attorney with the highest win rate; and automatically generates a legal document corresponding to the first query data based on the content of that complaint.
4. The system of claim 1, wherein the artificial intelligence model: performs mean pooling on the first embedding vector and the second embedding vector to unify their dimensions; after the mean pooling, calculates the similarity between the first embedding vector and the second embedding vector; generates a first data set comprising the first query data, second query data corresponding to the first precedent data, and the similarity information; and is trained, with the first data set as input, to receive the first query data as input data and to output a specified number of precedents in descending order of similarity to the content of the user's query.
5. The system of claim 1, wherein the artificial intelligence model is either stored in the electronic device so as to be executed directly on the electronic device, or stored on a cloud server physically separate from the electronic device so as to be executed on the cloud server.
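As an illustration only, the mean pooling, similarity calculation, and top-k selection recited in the claims can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the claims do not name a similarity measure, so cosine similarity is assumed here; the per-token vectors stand in for encoder outputs; and the names `mean_pool`, `cosine_similarity`, and `top_k_precedents` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def mean_pool(token_vectors: Sequence[Sequence[float]]) -> Vector:
    """Average per-token vectors into one fixed-size vector.

    Inputs with different token counts all end up with the same
    dimension, which is what makes them directly comparable.
    """
    if not token_vectors:
        raise ValueError("need at least one token vector")
    count = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(tok[i] for tok in token_vectors) / count for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity (an assumed choice; the source only says 'similarity')."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_precedents(
    query_tokens: Sequence[Sequence[float]],
    precedent_tokens: Dict[str, Sequence[Sequence[float]]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k precedent IDs most similar to the query, best first."""
    query_vec = mean_pool(query_tokens)
    scored = [
        (precedent_id, cosine_similarity(query_vec, mean_pool(tokens)))
        for precedent_id, tokens in precedent_tokens.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy stand-ins for the outputs of the first (query) and second
    # (precedent) encoders; real encoders would produce these vectors.
    query = [[1.0, 0.0], [1.0, 0.0]]
    precedents = {
        "precedent-A": [[1.0, 0.0]],
        "precedent-B": [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],
        "precedent-C": [[1.0, 1.0]],
    }
    for precedent_id, score in top_k_precedents(query, precedents, k=2):
        print(precedent_id, round(score, 3))
```

Each (query, precedent, similarity) triple computed this way corresponds to one record of the first data set described in the claims. In practice the precedent vectors would likely be computed once and cached, since only the query changes between searches.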

Description

Method and system for predicting similar precedents and judgment results using dual encoder of deep learning language model
The present disclosure relates to a method and system for searching for precedents that are highly relevant to a user's question using a deep learning-based language model equipped with a dual encoder. Specifically, the method and system use the dual encoder of a deep learning-based Korean language model to encode the user's question and the precedent data separately, and search for the similar precedents that best match the user's question by measuring the similarity between the question and the precedent data.
Deep learning-based search models commonly use single encoder structures in various natural language processing tasks, particularly those related to search. A single encoder structure transforms input text into high-dimensional embedding vectors through a single encoding process. These structures are generally based on the Transformer architecture.
However, a single encoder structure is limited in how accurately it can capture the relationship between a query and a document and learn their associations. In other words, because a single encoder structure encodes the query and the document independently, it is difficult to effectively learn how the context of the query connects to the document. In particular, the performance degradation of deep learning-based search models is pronounced when processing Korean natural language, which contains a large amount of unstructured text.
FIG. 1 illustrates a case law search system for a user to search for case law using a deep learning-based search model according to one embodiment.
FIG. 2 illustrates a flowchart of operations for a user to search for a case using a case law search system according to one embodiment.
FIG. 3 illustrates a flowchart of the operation of training a deep learning-based search model according to one embodiment.
FIG. 4 illustrates a flowchart that visualizes and explains the operation of training a deep learning-based search model according to one embodiment.
In the description of the drawings, the same or similar reference numerals may be used for identical or similar components.
Hereinafter, various embodiments of the present invention are described with reference to the accompanying drawings. However, this is not intended to limit the present invention to specific embodiments, and the description should be understood to include various modifications, equivalents, and/or alternatives of the embodiments of the present invention.
Embodiments of the present invention are described below with reference to the attached drawings so that those skilled in the art can easily implement them. However, the present invention may be embodied in various different forms and is not limited to the embodiments described herein. Furthermore, in order to clearly explain the present invention, parts unrelated to the explanation have been omitted from the drawings, and similar parts are denoted by similar reference numerals throughout the specification.
Additionally, terms such as “…part,” “…unit,” and “module” used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, software, or a combination of hardware and software. Furthermore, throughout the specification, when a part is described as being "connected" to another part, this includes not only cases where they are "directly connected," but also cases where they are "electrically connected" with other components in between.
Furthermore, when a part is said to "include" a certain component, this means that, unless specifically stated otherwise, it does not exclude other components but may include additional components, and it should be understood as not excluding in advance the existence or addition of one or more other features, numbers, steps, actions, components, parts, or combinations thereof.
FIG. 1 illustrates a case law search system (10) for a user to search for case law using a deep learning-based search model according to one embodiment. Referring to FIG. 1, when a user (11) inputs a question through an electronic device (e.g., a smartphone, tablet PC, or IoT device), the question is input into and processed by an artificial intelligence (AI) model. Data corresponding to an answer optimized for the question is then looked up in the database and provided to the user, so that the user can obtain data corresponding to the question accurately and quickly.
According to one embodiment, the case law search system (10) may include an electronic device of a user (11), an AI-driven server on which an AI model (102) runs, and a database (103) in which various data are stored.
According to one embodiment, the electronic device of the user (11) may include, but is not limited to, a smartphone, a tablet PC, or an IoT device. That is, the electronic device may include any electronic de