KR-20260063528-A - METHOD AND SYSTEM FOR RETRIEVING API, METHOD AND SYSTEM FOR TRAINING API RETRIEVAL MODEL

Abstract

A method and system for training an API search model used for API search are provided. The method may include the steps of: obtaining a query to be used for training; identifying a plurality of subqueries included in the query; constructing training data comprising the query, the plurality of subqueries, and an API (Application Programming Interface) corresponding to each of the plurality of subqueries; and training an API search model by supervised learning using the training data.
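The training-data construction step summarized above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the class names, the `build_example` helper, the sample query, and the API identifiers (`flight.book`, `hotel.reserve`) are all hypothetical. The (subquery, API) pairs are supplied by hand here; claim 6 describes obtaining them with an LLM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubqueryLabel:
    subquery: str  # span of text taken from the query
    api: str       # hypothetical identifier of the API paired with that span


@dataclass(frozen=True)
class TrainingExample:
    query: str
    labels: tuple[SubqueryLabel, ...]


def build_example(query: str, pairs: list[tuple[str, str]]) -> TrainingExample:
    """Build one supervised example from (subquery, api) pairs."""
    labels = []
    for subquery, api in pairs:
        # Claim 1 states that a subquery is composed of part of the query text,
        # so spans that do not occur in the query are rejected.
        if subquery not in query:
            raise ValueError(f"subquery {subquery!r} is not part of the query")
        labels.append(SubqueryLabel(subquery, api))
    return TrainingExample(query, tuple(labels))


# Hypothetical query and labels, for illustration only.
example = build_example(
    "book a flight to Seoul and reserve a hotel",
    [("book a flight to Seoul", "flight.book"), ("reserve a hotel", "hotel.reserve")],
)
```

Each resulting example keeps the full query together with its subquery-to-API pairs, which is the shape of training data the abstract describes.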

Inventors

  • 신재선
  • 김동우
  • 홍혜림
  • 김준이
  • 이민영

Assignees

  • 삼성에스디에스 주식회사

Dates

Publication Date
2026-05-07
Application Date
2024-10-30

Claims (20)

  1. An API search method performed by a computing system, the method comprising: obtaining a first query; and inputting the first query into a pre-trained API (Application Programming Interface) search model and determining, using an output of the API search model, a search target API corresponding to the first query from among a plurality of candidate APIs, wherein the API search model is trained by supervised learning using training data comprising a second query, an API set corresponding to the second query, and subqueries corresponding to each of the APIs included in the API set, and wherein each subquery is composed of part of the text included in the second query.
  2. The API search method of claim 1, wherein the first query includes a plurality of tokens, and the API search model comprises: a first layer that calculates a similarity between a candidate API included in the plurality of candidate APIs and each of the plurality of tokens; and a second layer that calculates a search score of the candidate API for the first query based on the similarity.
  3. The API search method of claim 2, wherein the search score is a sum of weights of the candidate API for each of the plurality of tokens included in the first query, the weights being assigned based on the similarity.
  4. The API search method of claim 3, wherein the plurality of tokens includes a first token and a second token, and when a first similarity between the candidate API and the first token is higher than a second similarity between the candidate API and the second token, a first weight of the candidate API for the first token is higher than a second weight of the candidate API for the second token.
  5. An API search model training method performed by a computing system, the method comprising: obtaining a query to be used for training; identifying a plurality of subqueries included in the query; constructing training data including the query, the plurality of subqueries, and an API (Application Programming Interface) corresponding to each of the plurality of subqueries; and training an API search model by supervised learning using the training data.
  6. The API search model training method of claim 5, wherein identifying the plurality of subqueries included in the query comprises: inputting the query into a Large Language Model (LLM); and determining the API corresponding to each of the plurality of subqueries using information output from the LLM.
  7. The API search model training method of claim 5, wherein the query includes a plurality of tokens, and the API search model comprises: a first layer that calculates a similarity between the API and each of the plurality of tokens; and a second layer that calculates a search score of the API for the query based on the similarity.
  8. The API search model training method of claim 7, wherein the search score is a sum of weights of the API for each of the plurality of tokens included in the query, the weights being assigned based on the similarity.
  9. The API search model training method of claim 8, wherein the plurality of tokens includes a first token and a second token, and when a first similarity between the API and the first token is higher than a second similarity between the API and the second token, a first weight of the API for the first token is higher than a second weight of the API for the second token.
  10. The API search model training method of claim 5, wherein training the API search model by supervised learning using the training data comprises training the API search model, using a loss function, to calculate a search score of the API for the query, wherein the loss function is a predefined function that uses the difference between the search score of the API for the query and the search score of the API for the subquery corresponding to the API.
  11. An API search system comprising: at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform operations, the operations comprising: obtaining a first query; and inputting the first query into a pre-trained API (Application Programming Interface) search model and determining, using an output of the API search model, a search target API corresponding to the first query from among a plurality of candidate APIs, wherein the API search model is trained by supervised learning using training data comprising a second query, an API set corresponding to the second query, and subqueries corresponding to each of the APIs included in the API set, and wherein each subquery is composed of part of the text included in the second query.
  12. The API search system of claim 11, wherein the first query includes a plurality of tokens, and the API search model comprises: a first layer that calculates a similarity between a candidate API included in the plurality of candidate APIs and each of the plurality of tokens; and a second layer that calculates a search score of the candidate API for the first query based on the similarity.
  13. The API search system of claim 12, wherein the search score is a sum of weights of the candidate API for each of the plurality of tokens included in the first query, the weights being assigned based on the similarity.
  14. An API search model training system comprising: at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform operations, the operations comprising: obtaining a query to be used for training; identifying a plurality of subqueries included in the query; constructing training data comprising the query, the plurality of subqueries, and an API (Application Programming Interface) corresponding to each of the plurality of subqueries; and training an API search model by supervised learning using the training data.
  15. The API search model training system of claim 14, wherein identifying the plurality of subqueries included in the query comprises: inputting the query into a Large Language Model (LLM); and determining the API corresponding to each of the plurality of subqueries using information output from the LLM.
  16. The API search model training system of claim 14, wherein the query includes a plurality of tokens, and the API search model comprises: a first layer that calculates a similarity between the API and each of the plurality of tokens; and a second layer that calculates a search score of the API for the query based on the similarity.
  17. The API search model training system of claim 16, wherein the search score is a sum of weights of the API for each of the plurality of tokens included in the query, the weights being assigned based on the similarity.
  18. The API search model training system of claim 14, wherein training the API search model by supervised learning using the training data comprises training the API search model, using a loss function, to calculate a search score of the API for the query, wherein the loss function is a predefined function that uses the difference between the search score of the API for the query and the search score of the API for the subquery corresponding to the API.
  19. A computer program stored on a computer-readable recording medium and, in combination with a computing device, configured to execute: obtaining a first query; and inputting the first query into a pre-trained API (Application Programming Interface) search model and determining, using an output of the API search model, a search target API corresponding to the first query from among a plurality of candidate APIs, wherein the API search model is trained by supervised learning using training data comprising a second query, an API set corresponding to the second query, and subqueries corresponding to each of the APIs included in the API set, and wherein each subquery is composed of part of the text included in the second query.
  20. A computer program stored on a computer-readable recording medium and, in combination with a computing device, configured to execute: obtaining a query to be used for training; identifying a plurality of subqueries included in the query; constructing training data including the query, the plurality of subqueries, and an API (Application Programming Interface) corresponding to each of the plurality of subqueries; and training an API search model by supervised learning using the training data.
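
The two-layer scoring recited in claims 2 to 4 and the loss recited in claim 10 can be sketched as follows. This is a hedged illustration under stated assumptions, not the claimed implementation: cosine similarity, the exponential weight with a `temperature` parameter, and the squared-difference loss are choices made here for concreteness (the claims only require that a higher similarity yields a higher weight and that the loss uses the score difference). The token embeddings and API vectors are toy values.

```python
import math

Vector = list[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def token_similarities(token_vecs: list[Vector], api_vec: Vector) -> list[float]:
    """First layer: similarity between the API and each token."""
    return [cosine(t, api_vec) for t in token_vecs]


def token_weight(similarity: float, temperature: float = 0.1) -> float:
    """Weight that strictly increases with similarity (assumed functional form)."""
    return math.exp(similarity / temperature)


def search_score(token_vecs: list[Vector], api_vec: Vector) -> float:
    """Second layer: sum of the per-token weights of the API."""
    return sum(token_weight(s) for s in token_similarities(token_vecs, api_vec))


def score_gap_loss(query_vecs: list[Vector], subquery_vecs: list[Vector],
                   api_vec: Vector) -> float:
    """Assumed loss: squared difference between the API's score for the full
    query and its score for the subquery labeled with that API."""
    gap = search_score(query_vecs, api_vec) - search_score(subquery_vecs, api_vec)
    return gap * gap


# Toy 2-D token embeddings and API vector (illustrative values only).
EMB = {"book": [1.0, 0.1], "flight": [0.9, 0.0],
       "reserve": [0.1, 1.0], "hotel": [0.0, 0.9]}
flight_api = [1.0, 0.0]

query_vecs = [EMB[t] for t in ["book", "flight", "reserve", "hotel"]]
flight_sub_vecs = [EMB[t] for t in ["book", "flight"]]  # subquery for flight_api

loss = score_gap_loss(query_vecs, flight_sub_vecs, flight_api)
```

Under these assumptions, driving the loss toward zero pushes the API's score for the whole query to come almost entirely from the tokens of its own subquery, so tokens belonging to other subqueries contribute little.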

Description

API retrieval method and system thereof, and API retrieval model training method and system thereof

The present disclosure relates to an API search method and system, and to a method and system for training an API search model. Specifically, the present disclosure relates to a method for searching an information pool for an API corresponding to a query, and to a method for training an API search model for this purpose.

Unlike passages (text passages, images, etc.), APIs (Application Programming Interfaces) are functionally fine-grained, so multiple APIs may be required to execute or process a single query.

In a method that uses an API search model to search an API pool for APIs corresponding to a query, the query is converted into a sentence embedding vector, the similarity between the query embedding vector and the embedding vector of each API is calculated, and APIs with high similarity to the query are extracted based on the calculated similarity. However, because the query is converted into a single semantic vector, multiple APIs with different characteristics must be matched to the same query embedding vector. In other words, for two different APIs to both be extracted for a query, the embedding vectors of the two APIs must each have a high similarity to the same query embedding vector, even though the APIs have different characteristics.

This problem can be addressed by using a Large Language Model (LLM) to decompose a query into multiple subqueries and then using an API search model to extract one API for each subquery. However, because the LLM lacks information about the API pool, it splits the query based on its built-in knowledge without considering similarity to the APIs, which may lower the reliability of the search results.
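
The single-vector limitation described above can be illustrated with a small worked example. This is a toy sketch under stated assumptions: the 2-D embeddings, the API names, and the `retrieve` helper are hypothetical, and the whole-query vector is assumed to lie midway between the two APIs the query needs.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_vec: list[float], api_vecs: dict[str, list[float]],
             k: int) -> list[tuple[str, float]]:
    """Rank APIs by similarity to one query vector and keep the top k."""
    ranked = sorted(
        ((name, cosine(query_vec, vec)) for name, vec in api_vecs.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:k]


# Toy API embeddings (illustrative values only).
api_vecs = {
    "flight.book": [1.0, 0.0],
    "hotel.reserve": [0.0, 1.0],
    "weather.get": [0.7, 0.7],
}

# Assumed single embedding for a query that needs both flight.book and
# hotel.reserve: it sits between the two, so neither similarity can be high.
whole_query = [1.0, 1.0]
single_vector_hits = retrieve(whole_query, api_vecs, k=2)

# With one vector per subquery, each needed API can be matched closely.
flight_hits = retrieve([1.0, 0.0], api_vecs, k=1)
hotel_hits = retrieve([0.0, 1.0], api_vecs, k=1)
```

In this toy setup the whole-query vector has cosine similarity 1/√2 (about 0.707) to each of the two needed APIs, so an unrelated API whose vector happens to lie in between ranks above both, while the per-subquery vectors each retrieve their intended API.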
The LLM-based decomposition approach can also consume significant resources, because an LLM is used for API search and the search is performed repeatedly for each subquery. Therefore, a new approach is required to solve these problems in searching for APIs corresponding to a query.

FIG. 1 is a configuration diagram illustrating an example of a search system to which an API search system according to one embodiment of the present disclosure can be applied.
FIG. 2 is a flowchart illustrating an API search method according to one embodiment of the present disclosure.
FIG. 3 is a flowchart illustrating an example of the overall API search operation of an API search system according to some embodiments of the present disclosure.
FIG. 4 is a configuration diagram illustrating the configuration of an API search model according to some embodiments of the present disclosure.
FIG. 5 is a diagram illustrating a method for calculating a search score of candidate APIs for a query according to some embodiments of the present disclosure.
FIG. 6 is a flowchart illustrating an API search model training method according to one embodiment of the present disclosure.
FIG. 7 is a flowchart illustrating a specific example of an API search model training method according to some embodiments of the present disclosure.
FIG. 8 is a block diagram illustrating an example of a computing device for carrying out some embodiments of the present disclosure.

Preferred embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. The advantages and features of the present disclosure, and the methods for achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. However, the present disclosure is not limited to the embodiments described below and may be implemented in various different forms. The embodiments are provided merely to make the present disclosure complete and to fully inform those skilled in the art of the scope of the invention, and the present disclosure is defined only by the scope of the claims.

To avoid obscuring the concepts of the present disclosure, known components may be omitted or illustrated in the form of block diagrams focusing on the core functions of each component. Throughout the present disclosure, the same components are denoted by the same reference numerals, even when they are shown in different drawings.

Unless otherwise defined, all terms used herein (including technical and scientific terms) have the meaning commonly understood by those skilled in the art to which this disclosure pertains. Furthermore, terms defined in commonly used dictionaries are not to be interpreted in an idealized or excessive manner unless explicitly and specifically defined otherwise. The terms used herein are for describing embodiments and are not intended to limit the invention. In this disclosure, the singular form includes th