
CN-122019211-A - MCP tool calling method and system based on dynamic screening large language model


Abstract

The invention relates to the technical field of natural language data processing and discloses an MCP tool calling method and system based on a dynamic screening large language model. The method comprises: parsing the MCP tool set, extracting each tool's description, and converting it into a static feature vector; constructing a context-aware hybrid query vector from the user's current input and historical conversation; calculating each tool's semantic matching score, historical call success rate, and complexity penalty term, and weighting them to obtain a composite score; dynamically calculating a screening threshold from the distribution of the composite scores and the remaining space in the current context window, and selecting candidate tools accordingly; and adaptively constructing the system prompt from the screening result and sending it to a large language model for reasoning or a conversational response. The invention screens tools dynamically and accurately, mitigates the context-window overload and model distraction caused by injecting the full tool set in the prior art, and improves the tool-calling success rate and the overall response efficiency of the system.
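The scoring steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the formulas are not reproduced in this text, so the blend form in `hybrid_query`, the subtraction of the penalty term, the add-one form of the smoothing, and all default weights are assumptions, and every function and field name (`vector`, `successes`, `calls`, `schema_tokens`) is hypothetical.

```python
import math

def hybrid_query(current, history, alpha=0.7, decay=0.5):
    """Blend the current-input embedding with time-decayed history embeddings.

    `history` is ordered oldest to newest; the round i steps back gets weight
    decay**i. The blend form and the defaults are assumptions, not the claim.
    """
    hist = [0.0] * len(current)
    total = 0.0
    for i, vec in enumerate(reversed(history), start=1):
        w = decay ** i
        total += w
        hist = [h + w * v for h, v in zip(hist, vec)]
    if total == 0:
        return list(current)  # no history: the query is just the current input
    hist = [h / total for h in hist]
    return [alpha * c + (1 - alpha) * h for c, h in zip(current, hist)]

def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def success_rate(successes, calls):
    """Laplace (add-one) smoothed success rate; an unused tool scores 0.5."""
    return (successes + 1) / (calls + 2)

def complexity_penalties(token_lengths):
    """Min-max normalize Schema token lengths into [0, 1]."""
    lo, hi = min(token_lengths), max(token_lengths)
    if hi == lo:
        return [0.0] * len(token_lengths)
    return [(n - lo) / (hi - lo) for n in token_lengths]

def composite_scores(query, tools, w_sem=0.6, w_hist=0.3, w_pen=0.1):
    """Weighted combination per tool; subtracting the penalty is an assumption."""
    pens = complexity_penalties([t["schema_tokens"] for t in tools])
    scores = {}
    for tool, pen in zip(tools, pens):
        sem = cosine(query, tool["vector"])
        hist = success_rate(tool["successes"], tool["calls"])
        scores[tool["name"]] = w_sem * sem + w_hist * hist - w_pen * pen
    return scores
```

With two toy tools, one close to the query with a short Schema and a good call record and one far from it with a long Schema, the first receives the clearly higher composite score. Plain lists stand in for real embedding vectors so that the sketch needs only the standard library.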

Inventors

  • XIE CHAOHAI
  • XIE QI
  • QI DAWEI
  • PENG BO
  • JI ZHOUHANG

Assignees

  • 深圳海云安网络安全技术有限公司

Dates

Publication Date
2026-05-12
Application Date
2026-02-05

Claims (10)

  1. An MCP tool calling method based on a dynamic screening large language model, comprising: parsing the full tool set exposed by the MCP server, extracting the function description and parameter Schema information of each tool, and converting the function description of each tool into a static feature vector using an Embedding model; constructing, from the user's current text input and historical dialogue record and with a time decay factor introduced, a hybrid query vector that incorporates context information; calculating the semantic matching score between the hybrid query vector and the static feature vector of each tool, obtaining the historical call success rate of each tool, calculating the complexity penalty term of each tool's Schema, and calculating the composite score of each tool by weighted addition; calculating the statistical characteristics of the current composite scores of all tools, and calculating a dynamic screening threshold based on those statistical characteristics and the remaining space in the current context window; and, if the candidate tool set is empty, constructing the system prompt without a tool list so as to give a pure dialogue response.
  2. The MCP tool calling method based on the dynamic screening large language model according to claim 1, wherein the hybrid query vector is given by a formula (not reproduced in this text) whose terms are: the hybrid query vector; the vector embedding function; the current text input by the user; the historical dialogue content of a given earlier round; the weight coefficient of the current input; and the time decay factor applied to the historical information.
  3. The MCP tool calling method based on the dynamic screening large language model according to claim 2, wherein the composite score of each tool is given by a formula (not reproduced in this text) whose terms are: the composite score of the tool; the semantic matching score of the tool; the historical call success rate of the tool; the complexity penalty term of the tool; and the semantic term weight, the historical performance term weight, and the complexity penalty term weight, the three weights satisfying a constraint that is likewise not reproduced.
  4. The MCP tool calling method based on the dynamic screening large language model according to claim 3, wherein the semantic matching score of a tool is obtained by computing the cosine similarity between the hybrid query vector and the static feature vector of the tool.
  5. The MCP tool calling method based on the dynamic screening large language model according to claim 3, wherein the historical call success rate of a tool is calculated from the number of successful calls and the total number of calls within a preset time window, with Laplace smoothing applied.
  6. The MCP tool calling method based on the dynamic screening large language model according to claim 3, wherein the complexity penalty term of a tool is calculated by min-max normalization of the Token length of the tool's Schema against the maximum and minimum Schema Token lengths in the current tool set.
  7. The MCP tool calling method based on the dynamic screening large language model according to claim 3, wherein the dynamic screening threshold is given by a formula (not reproduced in this text) whose terms are: the screening threshold; the arithmetic mean of the current composite scores of all tools; the standard deviation of the composite scores of all tools; and a sensitivity adjustment coefficient; and wherein a tool whose composite score meets the screening threshold is selected into the candidate tool set, and a tool whose composite score does not is eliminated.
  8. The MCP tool calling method based on the dynamic screening large language model according to claim 7, wherein the sensitivity adjustment coefficient is given by a formula (not reproduced in this text) whose terms are: the number of remaining available Tokens; a truncation function, whose truncation range is defined in terms of a preset very small positive number; the maximum context window limit of the LLM currently invoked; the total number of Tokens already occupied in the current session; the total number of candidate tools currently taking part in the screening; the average Token length of all tools currently taking part in the screening; and the inverse cumulative distribution function of the standard normal distribution.
  9. The MCP tool calling method based on the dynamic screening large language model according to claim 1, wherein the system prompt is constructed as follows: if the candidate tool set is not empty, the candidate tool set is converted into a JSON description object conforming to the MCP protocol, and the JSON description object, the role definition, the user instruction, and the contextual historical dialogue record are assembled together into the system prompt; if the candidate tool set is empty, the system prompt contains no tool description information.
  10. An MCP tool calling system based on a dynamic screening large language model, which adopts the MCP tool calling method based on a dynamic screening large language model according to any one of claims 1 to 9, characterized by comprising: a feature extraction module, used for parsing the MCP tool set, extracting the function description of each tool, converting it into a static feature vector, storing the static feature vector in a vector database, and acquiring the user's current text input and historical dialogue record in real time; a context fusion module, used for calculating a context-aware hybrid query vector based on the user's current input, the historical dialogue record, and the time decay factor; a scoring module, used for calculating the semantic matching score, historical call success rate, and complexity penalty term of each tool relative to the hybrid query vector, and combining them into a multidimensional composite relevance score for each tool; a dynamic screening module, used for calculating a dynamic screening threshold based on the statistical characteristics of the composite scores of all tools and the remaining space in the current context window, and screening out a candidate tool set according to that threshold; and an execution calling module, used for adaptively constructing a system prompt that contains or omits the streamlined tool list according to the screening result, and calling the large language model for reasoning or a dialogue response.
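The threshold, sensitivity coefficient, and prompt assembly of claims 7 to 9 can be illustrated with a short Python sketch. Because the formulas are not reproduced in this text, the forms used here are assumptions chosen to fit the listed terms: a threshold of mean plus k times standard deviation, and k obtained by applying the inverse standard normal CDF to a clipped estimate of the fraction of tools that will not fit in the remaining window. Function names, the `eps` default, the use of the population standard deviation, and the prompt layout are all hypothetical.

```python
import json
from statistics import NormalDist, fmean, pstdev

def sensitivity(max_context, used_tokens, tool_token_lengths, eps=1e-6):
    """Assumed form: k = Phi^-1(clip(1 - R / (N * L_avg), eps, 1 - eps)).

    R is the remaining token budget; N * L_avg (tool count times average
    tool length) equals the sum of the tool token lengths.
    """
    remaining = max(max_context - used_tokens, 0)
    demand = sum(tool_token_lengths)
    keep_fraction = remaining / demand if demand else 1.0
    p = min(max(1.0 - keep_fraction, eps), 1.0 - eps)  # truncate into (0, 1)
    return NormalDist().inv_cdf(p)

def select_tools(scores, k):
    """Keep tools whose score is at least mean + k * std (assumed form)."""
    if not scores:
        return [], 0.0
    values = list(scores.values())
    threshold = fmean(values) + k * pstdev(values)
    selected = [name for name, s in scores.items() if s >= threshold]
    return selected, threshold

def build_system_prompt(role, selected, tool_specs):
    """Append a JSON tool listing only when the candidate set is non-empty."""
    if not selected:
        return role  # pure dialogue: no tool description at all
    listing = json.dumps([tool_specs[name] for name in selected], indent=2)
    return f"{role}\n\nAvailable tools:\n{listing}"
```

Under these assumptions a nearly full context window pushes k upward, which raises the threshold and admits fewer tools, while an ample window drives k negative and admits most of them. The prompt builder is simplified: claim 9 also assembles the user instruction and the historical dialogue record, which are omitted here.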

Description

MCP tool calling method and system based on dynamic screening large language model

Technical Field

The invention relates to the technical field of natural language data processing, and in particular to an MCP tool calling method and system based on a dynamic screening large language model.

Background

With the rapid development of artificial intelligence technology, the large language model (LLM) has evolved from a pure text generation tool into an agent with autonomous planning and execution capabilities. In the agent application ecosystem, giving the model the ability to perceive its environment and act on the physical world usually requires connecting it to back-end tools or application program interfaces through the Model Context Protocol (MCP). As a standardized interaction protocol, MCP greatly simplifies the integration of LLMs with external tools, allowing agents to handle complex tasks such as weather queries, database retrieval, and financial analysis.

In prior-art architectures, systems typically adopt a full context injection strategy so that the large language model can recognize and invoke these external tools. That is, in every dialogue round, the function descriptions, parameter definitions (Schema), and call examples of all tools registered with the MCP server are written in full into the LLM's context window through the system prompt. The model then reasons over the user's natural-language instruction together with the tool definitions in its context and generates the corresponding function call request. However, as agent application scenarios grow more complex, the number of tools integrated on an MCP server has grown explosively, from a handful of core tools at first to hundreds or even thousands.
Faced with such massive tool libraries, full context injection exposes two disadvantages. On the one hand, the context window of a large language model is limited in length, and Token consumption translates directly into reasoning cost and computational latency. The existing approach fills the context indiscriminately with lengthy, detailed tool descriptions, so tool definitions occupy most of the available Token space. This not only wastes computational resources but also sharply compresses the space left for the user's dialogue history, core task instructions, and chain-of-thought reasoning, limiting the model's ability to handle long texts and tasks with complex logic. On the other hand, presenting the LLM with the entire unscreened tool set, which is large, unordered, and full of interfering information, amounts to asking the model to find useful information in an extremely noisy environment. Given the attention mechanism of large models, when the context contains too much irrelevant content, the model's attention to key instructions is diluted, and it struggles to identify efficiently and accurately the tool best suited to the current user intent, which lowers the success rate and reliability of tool calls. An MCP tool calling method and system based on a dynamic screening large language model is therefore needed to address these problems.

Disclosure of Invention

In view of the problems in the related art, the invention provides an MCP tool calling method based on a dynamic screening large language model to overcome the technical problems of the prior art.
To solve the above technical problems, the invention is realized by the following technical scheme. In a first aspect, an embodiment of the invention provides an MCP tool calling method based on a dynamic screening large language model, which specifically comprises: parsing the full tool set exposed by the MCP server, extracting the function description and parameter Schema information of each tool, and converting the function description of each tool into a static feature vector using a pre-trained Embedding model; acquiring in real time the user's current text input and the historical dialogue record within the historical dialogue window, and constructing, from the current text input and the historical dialogue record and with a time decay factor introduced, a hybrid query vector that incorporates context information; calculating the semantic matching score between the hybrid query vector and the static feature vector of each tool, obtaining the historical call success rate of each tool, and calculating a composite score for each tool's Schema; calculating the statistical characteristics of the current composite scores of all tools, and calculating a dynamic screening threshold based on those statistical characteristics and the remaining space in the current context window, sc