CN-121996696-A - Intelligent question-answering system for digital library

Abstract

The invention provides an intelligent question-answering system for a digital library, comprising a database construction module, a question preprocessing module, and a retrieval ordering module. The database construction module collects and converts book information using book information collection equipment and conversion equipment, and constructs the database using web crawler technology. The question preprocessing module preprocesses questions using a question classification model. The retrieval ordering module matches the preprocessed questions against the data of the digital library by means of a structured index and outputs a retrieval result list ordered from high to low by matching score.

Inventors

  • XU YING

Assignees

  • 北京航天情报与信息研究所

Dates

Publication Date
2026-05-08
Application Date
2025-12-23

Claims (10)

  1. An intelligent question-answering system for a digital library, characterized by comprising a database construction module, a question preprocessing module, and a retrieval ordering module, wherein: the database construction module acquires and converts book information using book information acquisition equipment and conversion equipment, and constructs a database using web crawler technology; the question preprocessing module preprocesses questions using a question classification model; and the retrieval ordering module matches the preprocessed questions against the data of the digital library by means of a structured index, and outputs a retrieval result list ordered from high to low by matching score.
  2. The system of claim 1, wherein the database construction module is specifically configured to: collect question-and-answer information using the open-source crawler framework Scrapy; optimize the database structure; assign weights, through a feature cleaning module and according to the input question-and-answer content, to the basic information of the questioner, the basic information of the answer, and the evaluation information; compute a weighted sum for each answer; and sort the answers from high to low to obtain the best question-answer pair, namely the one with the highest weighted sum.
  3. The system of claim 2, wherein the data content collected by the crawler includes data types, question-and-answer data, questioner attributes, evaluation attributes, and discipline classifications.
  4. The system of claim 3, wherein the data types include feature tags; the question-and-answer data includes questions, question descriptions, question times, answers, and answer times; the questioner attributes include gender, age, and identification number; the evaluation attributes include the number of negative ratings, the acceptance rate, and the number of positive ratings; and the discipline classifications include discipline categories.
  5. The system of claim 4, wherein preprocessing questions using the question classification model comprises: calculating the output h_i of the i-th word after it passes through the hidden layer: [formula not reproduced], where [symbols not reproduced] denote the forward and backward output values, respectively; assigning weights to the words and outputting the following result: [formula not reproduced], where t_i is the weight of the i-th word; and feeding the resulting feature vector [symbol not reproduced] into a classifier to obtain the category result: [formula not reproduced], where g() denotes the classifier function, V_x denotes a weight matrix, [symbol not reproduced] denotes the dropout processing of the classifier, and b_s is the bias vector of the classifier.
  6. The system of claim 5, wherein matching the data of the digital library by means of the structured index comprises: matching the input question against historical questions and calculating the similarity according to the formula sim_content = s(x) · α · sim(Q, question) + β · sim(Q, question), where sim_content is the question similarity, α and β are the first and second calculation parameters, respectively, Q is the input question, and question is a historical question.
  7. The system of claim 6, wherein matching the data of the digital library by means of the structured index further comprises: calculating the similarity of the retrieved answers, denoted sim_eva, for which the adoption rules are: [rules not reproduced]
  8. The system of claim 7, wherein matching the data of the digital library by means of the structured index further comprises a category-matching search, in which the category of the question is determined according to the following formula: [formula not reproduced], where sim_cate is the category similarity, cate_q and cate_2q denote the first and second question classification results, and field denotes the historical question category.
  9. The system of claim 8, wherein the architecture of the cloud-computing-based digital library intelligent question-answering system comprises an interaction layer, an application layer, an analysis layer, a resource layer, and a base layer, wherein: the base layer holds the data needed to construct the intelligent question-answering system, stored in text form; the resource layer stores data resources, including question-and-answer data, feature data, book resources, and a knowledge base; the analysis layer builds the intelligent question-answering system from an intelligent question-answering engine, a library engine, and a search ordering engine, each of which further includes an optimization scheme; the application layer provides intelligent question answering, related question recommendation, book interpretation, and history tracking; and the interaction layer comprises the hardware through which users conduct intelligent question answering, including a Web terminal and a mobile terminal.
  10. The system of claim 9, wherein the Web terminal accesses resources over a network, and the software and data are stored on a server.
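
As an illustration of the scoring in claim 6 and the high-to-low ordering in claim 1, the following is a minimal Python sketch, not the patented implementation. The claim does not define s(x) or say how its two sim(Q, question) terms differ, so the sketch assumes a constant scale factor for s(x) and two different hypothetical measures (word-level and character-bigram Jaccard overlap). The function names and the default values of alpha and beta are likewise illustrative assumptions.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def word_sim(q: str, h: str) -> float:
    """Assumed first measure: overlap of lower-cased word sets."""
    return jaccard(set(q.lower().split()), set(h.lower().split()))


def bigram_sim(q: str, h: str) -> float:
    """Assumed second measure: overlap of character-bigram sets."""
    def grams(s: str) -> set:
        s = s.lower()
        return {s[i:i + 2] for i in range(len(s) - 1)}
    return jaccard(grams(q), grams(h))


def content_similarity(q: str, h: str, alpha: float = 0.6,
                       beta: float = 0.4, scale: float = 1.0) -> float:
    """Follows the printed shape of the formula:
    scale * alpha * sim_1 + beta * sim_2, where `scale` stands in for s(x)
    and multiplies only the first term, as in the claim text."""
    return scale * alpha * word_sim(q, h) + beta * bigram_sim(q, h)


def rank_history(q: str, history: list, **params) -> list:
    """Return (score, historical question) pairs, highest score first."""
    scored = [(content_similarity(q, h, **params), h) for h in history]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    history = ["library opening hours", "how to borrow a book"]
    for score, question in rank_history("how to borrow a book", history):
        print(f"{score:.3f}  {question}")
```

Keeping the two measures as separate functions makes it easy to swap in whatever similarity measures an actual system uses without touching the ranking step.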

Description

Intelligent question-answering system for digital library

Technical Field

The invention relates to the technical field of cloud computing, and in particular to an intelligent question-answering system for a digital library.

Background

Traditional libraries are poorly suited to current social trends because of their large collections, large physical footprint, and difficulty of lookup. Digital libraries emerged in response to this need; they are libraries that use digital technology to process and store books and other documents. Using digital technology, a digital library can store information resources held on different carriers and in different locations, which facilitates querying and dissemination across objects and regions for users. The main functions of a digital library are the processing, storage, retrieval, transmission, and utilization of information resources.

Traditional knowledge acquisition methods return highly redundant information, and users must spend considerable effort and time finding what they need in the results. An intelligent question-answering system can accurately capture the user's intent, understand the user's natural-language question, and return the answer directly, so such systems are attracting growing attention and research. The traditional search engines of digital libraries return answers slowly and with poor accuracy, and cannot meet the needs of today's digital libraries. How to effectively extract the required information and improve the speed and accuracy of returned answers has therefore become an urgent problem to be solved.

Disclosure of Invention

The invention provides an intelligent question-answering system for a digital library, which effectively extracts the required information and improves the speed and accuracy of returned answers.

In a first aspect, an intelligent question-answering system for a digital library is provided, comprising a database construction module, a question preprocessing module, and a retrieval ordering module. The database construction module acquires and converts book information using book information acquisition equipment and conversion equipment, and constructs a database using web crawler technology. The question preprocessing module preprocesses questions using a question classification model. The retrieval ordering module matches the preprocessed questions against the data of the digital library by means of a structured index, and outputs a retrieval result list ordered from high to low by matching score.

In one embodiment, the database construction module is specifically configured to collect question-and-answer information using the open-source crawler framework Scrapy, optimize the database structure, assign weights, through a feature cleaning module and according to the input question-and-answer content, to the basic information of the questioner, the basic information of the answer, and the evaluation information, compute a weighted sum for each answer, and sort the answers from high to low to obtain the best question-answer pair, namely the one with the highest weighted sum.

In one embodiment, the data content collected by the crawler includes data types, question-and-answer data, questioner attributes, evaluation attributes, and discipline classifications.
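
The weighted summation and high-to-low sorting in the embodiment above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the invention: the source does not give the weight values or say how each kind of information is turned into a number, so the three per-answer scores in [0, 1], the weights in WEIGHTS, and all names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    text: str
    asker_score: float       # assumed score derived from the questioner's basic information
    answer_score: float      # assumed score derived from the answer's basic information
    evaluation_score: float  # assumed score derived from evaluation information


# Illustrative weights only; the source does not state the actual values.
WEIGHTS = {"asker": 0.2, "answer": 0.3, "evaluation": 0.5}


def weighted_score(a: Answer, w: dict = WEIGHTS) -> float:
    """Weighted sum over the three kinds of information named in the text."""
    return (w["asker"] * a.asker_score
            + w["answer"] * a.answer_score
            + w["evaluation"] * a.evaluation_score)


def best_answer(answers: list) -> tuple:
    """Sort answers from highest to lowest weighted sum; return (best, ranked)."""
    if not answers:
        raise ValueError("no candidate answers to rank")
    ranked = sorted(answers, key=weighted_score, reverse=True)
    return ranked[0], ranked


if __name__ == "__main__":
    candidates = [
        Answer("A", 0.9, 0.2, 0.1),
        Answer("B", 0.1, 0.5, 0.9),
        Answer("C", 0.5, 0.5, 0.5),
    ]
    best, ranked = best_answer(candidates)
    print("best:", best.text)
    print("order:", [a.text for a in ranked])
```

Passing the weights as a dictionary keeps them adjustable per deployment, which matches the text's description of weights being assigned rather than fixed.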
In one embodiment, the data types include feature tags; the question-and-answer data includes questions, question descriptions, question times, answers, and answer times; the questioner attributes include gender, age, and identification number; the evaluation attributes include the number of negative ratings, the acceptance rate, and the number of positive ratings; and the discipline classifications include discipline categories.

In one embodiment, preprocessing questions using the question classification model includes: calculating the output h_i of the i-th word after it passes through the hidden layer: [formula not reproduced], where [symbols not reproduced] denote the forward and backward output values, respectively; assigning weights to the words and outputting the following result: [formula not reproduced], where t_i is the weight of the i-th word; and feeding the resulting feature vector [symbol not reproduced] into a classifier to obtain the category result: [formula not reproduced], where g() denotes the classifier function, V_x denotes a weight matrix, [symbol not reproduced] denotes the dropout processing of the classifier, and b_s is the bias vector of the classifier.

In one embodiment, matching the data of the digital library by means of the structured index specifically comprises: matching the input question against historical questions and calculating the similarity, wherein the calculation form