CN-122019722-A - Medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service

Abstract

The invention discloses a medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service. The system comprises a medical exclusive duplicate checking module, a discipline adaptation grading recommendation module, a scientific research scene interaction module, a medical data preprocessing module and a dynamic discipline resource module. The modules exchange data and synchronize parameters through an internal standardized data interface; the interface adopts a medical data interaction communication rule built into the system, the data transmission format is a structured format preset by the system, and the data processed by each module are stored in a distributed database. Through the built-in standardized data interface and the unified medical data interaction protocol, medical data preprocessing, duplicate checking, grading recommendation, scientific research scene interaction and dynamic resource updating exchange data and synchronize parameters efficiently, so that a full-process closed-loop service covering document processing, feedback and algorithm optimization is constructed.

Inventors

  • Meng Yanli
  • Cheng Runfen
  • Ran Dongxian
  • Chang Hong

Assignees

  • Tianjin Medical University (天津医科大学)

Dates

Publication Date
20260512
Application Date
20260202

Claims (10)

  1. A medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service, characterized by comprising a medical exclusive duplicate checking module, a discipline adaptation grading recommendation module, a scientific research scene interaction module, a medical data preprocessing module and a dynamic discipline resource module; the data interface adopts a medical data interaction communication rule built into the system, the data transmission format is a structured format preset by the system, and the data processed by each module are stored in a distributed database;
     the medical data preprocessing module receives medical documents input by a user, performs format standardization, specialized term normalization, scientific research element extraction and noise filtering according to a medical term standard library and a clinical trial data specification built into the system, generates structured document data containing research subjects, trial designs, data indicators and conclusion statements, and, after preprocessing is completed, triggers data transmission to the medical exclusive duplicate checking module through the internal data interface;
     the medical exclusive duplicate checking module receives the structured document data, invokes the medical ontology knowledge base and the discipline exclusive feature library of the dynamic discipline resource module, combines medical semantic association analysis with multi-dimensional similarity calculation, performs similarity detection at the full-text level and the scientific research element level, locates duplicated content, associates the discipline classification and evidence level of similar documents, and synchronously pushes the detection results to the discipline adaptation grading recommendation module and the scientific research scene interaction module through the internal data interface;
     the dynamic discipline resource module constructs a four-level medical discipline system in which second-level branches are divided within the framework of the first-level disciplines, the second-level branches are further subdivided by disease subtype, and each disease subtype corresponds to specific research directions; the discipline system covers basic medicine, clinical medicine, preventive medicine, pharmacy, public health and the subdivision directions of each field, and the system establishes a dedicated classification coding rule for the four-level medical discipline system; the dynamic discipline resource module also constructs an exclusive scientific research feature library and a research hotspot graph for each field, the research hotspot graph being updated with the latest data of the discipline field captured in real time; through the fixed data interface, the dynamic discipline resource module provides semantic analysis support data to the medical exclusive duplicate checking module and grading screening support data to the discipline adaptation grading recommendation module, and parameter synchronization and adaptation of the associated modules is triggered automatically when the support data are updated;
     the discipline adaptation grading recommendation module receives the similarity detection result of the medical exclusive duplicate checking module and the user research direction label transmitted by the scientific research scene interaction module, performs multi-dimensional grading screening of documents through a medical scientific research scene adaptation algorithm built into the system, in combination with the four-level medical discipline system and the research hotspot graph of the dynamic discipline resource module, generates a graded recommendation document list, and pushes the list to the scientific research scene interaction module through the internal data interface;
     the scientific research scene interaction module receives user operation instructions, transmits parameter setting instructions to the medical exclusive duplicate checking module and the discipline adaptation grading recommendation module, converts result display requirements into a visual data presentation form, and transmits user research direction labels and interaction behavior data to the discipline adaptation grading recommendation module and the dynamic discipline resource module respectively.
  2. The medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service according to claim 1, wherein, when the medical exclusive duplicate checking module performs multi-dimensional similarity calculation, the following operations are completed in sequence:
     performing medical exclusive word segmentation on the structured document data and extracting core scientific research elements comprising medical terms, clinical trial design elements, the data indicator system, ethics statements and conclusion derivation logic, wherein the word segmentation rule is constructed from the term association relations in the medical ontology knowledge base of the dynamic discipline resource module and is built into the medical exclusive duplicate checking module;
     adopting an attention mechanism semantic model fused with the medical ontology knowledge base, wherein the model is trained on a medical annotated corpus built into the system, the corpus covering Chinese and English documents in basic medicine, clinical medicine, preventive medicine, pharmacy and public health; the model takes the core scientific research elements of the structured document data as input, captures through attention weight distribution the field-specific semantic associations of the core scientific research elements, the similarity features of clinical trial designs and the consistency features of data results, and outputs semantic association feature values of the core scientific research elements;
     constructing a composite similarity calculation model with built-in weighted calculation logic over medical term semantic similarity, clinical trial design matching degree, data indicator overlap and conclusion statement similarity, wherein the weight matrix is a discipline exclusive weight matrix generated from the annotated corpus of the corresponding medical field through statistical training on the contribution of each similarity dimension and stored in the discipline exclusive feature library of the dynamic discipline resource module, and the composite similarity calculation model automatically calls the corresponding weight matrix according to the discipline classification label of the input document and obtains a comprehensive similarity value through weighted calculation;
     comparing, by the composite similarity calculation model, the comprehensive similarity value with a discipline exclusive threshold, wherein the discipline exclusive threshold is derived from quantile statistics of the duplicate checking practice data of each medical branch discipline and is stored in the discipline exclusive feature library of the dynamic discipline resource module; duplicated content is located at three levels, namely full text, chapter and scientific research element; the publishing-journal evidence level, discipline branch classification and data source reliability information of similar documents are associated; and differentiated labeling is performed, the format of which matches the visual presentation form of the scientific research scene interaction module.
  3. The medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service according to claim 1, wherein, when the discipline adaptation grading recommendation module performs grading screening, the following operations are completed in sequence:
     setting medical exclusive grading dimensions comprising discipline branch fit, research hotspot relevance, evidence-based medicine evidence level, clinical trial design fit and user scientific research stage match, each grading dimension having a fixed data mapping to the four-level medical discipline system and the research hotspot graph of the dynamic discipline resource module;
     calculating, through discipline classification code mapping based on the four-level medical discipline system of the dynamic discipline resource module, a fit value between the research direction of the input document and the medical branch discipline, disease subtype and medicine classification, wherein the code mapping rule is constructed from the classification coding rule that the system establishes for the four-level medical discipline system;
     generating, in combination with the research hotspot graph of the dynamic discipline resource module, a relevance value between the topic of the input document and the research hotspots of the discipline field through semantic matching of topic keywords, relevance analysis of research methods and similarity comparison of core conclusions, wherein the judgment rule of each analysis method is built into the discipline adaptation grading recommendation module;
     grading the documents to be recommended by evidence level according to evidence-based medicine evidence grading rules built into the system, and constructing a quality assessment indicator system that integrates document citation frequency, journal impact factor and peer review opinions, wherein the weight distribution of the quality assessment indicator system is generated by training on an evidence-based medicine annotated corpus, and the quality assessment indicator system is stored in the discipline exclusive feature library of the dynamic discipline resource module;
     the discipline adaptation grading recommendation module has a built-in scientific research stage grading weight matrix, which is generated by training on discipline adaptation annotated corpora for different scientific research stages and stored in the discipline exclusive feature library of the dynamic discipline resource module; the module adjusts the weight ratio of each grading dimension according to the user's scientific research stage label, ranks the documents by the weighted result, and generates a graded recommendation document list containing recommendation reasons and discipline adaptation descriptions.
  4. The medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service according to claim 1, wherein, when the medical data preprocessing module performs scientific research element extraction, the following operations are completed in sequence:
     performing format standardization on the input document, unifying text encoding, figure and table labeling rules, reference formats and clinical trial data presentation specifications, wherein the specifications are medical document format specifications built into the system and adapted to the presentation requirements of scientific research content in medical documents;
     removing redundant information, format marks and meaningless characters from the input document based on a general medical lexicon built into the system, and removing administrative statements and author introductions unrelated to the scientific research content through regular-expression matching, wherein the regular-expression rules are constructed from the text structure and wording features of administrative statements and author introductions and are built into the medical data preprocessing module, the administrative statement rules identify text segments containing funding information and acknowledgement statements, and the author introduction rules identify text segments containing affiliation information and contact details;
     invoking the medical term standard library built into the system to normalize disease names, generic drug names and diagnosis and treatment terms in the document, wherein term matching combines exact matching with semantic association matching: exact matching is performed first, and semantic association matching is performed only when no exact match is found, so that the semantic consistency of terms is ensured;
     adopting a named entity recognition and relation extraction model that is trained and optimized on the medical annotated corpus built into the system, wherein the model takes the format-standardized document data as input and extracts the scientific research elements of research type, trial design type, sample size, intervention measures, outcome indicators, statistical methods and conclusion points; the extraction logic is constructed from the positions at which scientific research elements are stated in medical documents and the guiding features of associated keywords; a structured data dictionary is formed after extraction, and the format of the data dictionary is consistent with the storage format of the structured document data.
  5. The medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service according to claim 1, wherein, when the dynamic discipline resource module performs updating and adaptation, the following operations are completed in sequence:
     constructing the exclusive scientific research feature library of each field according to the discipline branches of the four-level medical discipline system, wherein the feature data comprise the common research methods, clinical trial design specifications, core indicator system, key points of ethical review and high-frequency associated terms of each field; the feature data are refined by systematically analyzing highly cited documents, authoritative guidelines and standard operating procedures of each discipline field; and the feature data format is adapted both to the semantic analysis requirement of the medical exclusive duplicate checking module and to the grading screening requirement of the discipline adaptation grading recommendation module;
     connecting the dynamic discipline resource module to a medical field document publishing platform, a scientific research project publishing platform and an academic conference publishing platform, and capturing in real time, through a data capture technique built into the system, the latest published results, project approval directions and guideline updates, wherein an update of the research hotspot graph is triggered when newly added documents of a single discipline on a single platform form a topic cluster, the judgment rule for topic clusters being built into the dynamic discipline resource module; the updated research hotspot graph is synchronized to the discipline adaptation grading recommendation module through a trigger mechanism, and the updated research hotspot graph data are adapted in real time to the grading screening flow of the discipline adaptation grading recommendation module.
  6. The medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service according to claim 3, wherein the discipline adaptation grading recommendation module is provided with a bidirectional optimization mechanism, and the following operations are completed in sequence when the mechanism is executed:
     receiving the user's document viewing duration, download behavior, marking operations, duplicate checking parameter settings and recommendation feedback evaluation data transmitted by the scientific research scene interaction module, and constructing a user scientific research behavior portrait from these data, wherein the portrait comprises the discipline classification, research method and evidence level feature labels of the documents the user interacts with most frequently;
     mining, based on the user scientific research behavior portrait, the user's core research direction, preferred research methods and scientific research stage by clustering the topic keywords of documents whose viewing duration meets deep-reading characteristics, compiling statistics on the research methods of downloaded documents, and matching duplicate checking parameter settings against the corresponding scientific research scene data, wherein the execution rules of the mining methods are built into the discipline adaptation grading recommendation module; and updating the user's exclusive research labels and interest keyword library, the formats of which are consistent with the format of the user research direction label;
     receiving the discipline distribution data of duplicated content and of similar documents transmitted by the medical exclusive duplicate checking module, and feeding these data back into the algorithm model of the grading recommendation strategy, wherein the model automatically adjusts the weight matrix of the grading dimensions and the adjusted weight matrix is stored in the discipline exclusive feature library of the dynamic discipline resource module;
     receiving the interaction behavior data of the user's clicks, downloads, bookmarks and feedback evaluations of recommended documents transmitted by the scientific research scene interaction module, and iteratively optimizing the scientific research scene adaptation algorithm with these data, wherein the optimization is realized by analyzing the correlation between the interaction behavior data and the feature data of the recommended documents, and the optimized algorithm parameters are synchronized to the grading screening flow and stored in an algorithm parameter library of the discipline adaptation grading recommendation module.
  7. The medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service according to claim 1, wherein, when the scientific research scene interaction module performs its functions, the following operations are completed in sequence:
     providing a duplicate checking parameter customization interface comprising a discipline branch weight adjustment option, a duplicate labeling granularity option and a comparison document evidence level range option, wherein the adjustable parameters comprise the weight coefficients of the similarity dimensions, the labeling granularity and the evidence level range of comparison documents, and the parameter adjustment logic has a fixed data association with the similarity calculation flow of the medical exclusive duplicate checking module;
     displaying the duplicate checking result visually in a layout graded by information priority, wherein the displayed content comprises the comprehensive similarity score, three-level duplicated content labeling, an evidence level comparison of similar documents and an association graph of duplicated scientific research elements; core duplicated content and similar documents of high evidence level are displayed first; the color coding adopts a medical document visualization color scheme built into the system; and the color saturation is adjusted dynamically in positive correlation with the degree of duplication;
     providing a multi-dimensional screening interface for recommended documents, wherein the interface allows the user to screen recommended documents by combinations of evidence level, discipline branch, trial design type, publication time and sample size; the screening logic is realized through exact matching and range matching on the field corresponding to each condition; a logical AND operation is applied when multiple conditions are combined; and the screening conditions correspond one-to-one to the grading dimensions of the discipline adaptation grading recommendation module;
     adopting a distributed database architecture to store the full-process data of the user's scientific research, wherein the stored content comprises duplicate checking history, modification tracks of duplicated content, interaction records of recommended documents and parameter setting schemes; data indexes are built on scientific research project numbers so that the full-process data can be traced back by category and exported; and the data storage format is adapted both to the format of the structured document data and to the format of the graded recommendation document list.
  8. The medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service according to claim 2, wherein, when the medical exclusive duplicate checking module performs differentiated labeling of duplicated content, the following operations are completed in sequence:
     full-text level labeling distinguishes similarity levels through color gradients, wherein the correspondence between color gradients and similarity levels is set according to a medical document visual identification rule constructed by the system; the discipline classification, evidence level and data source reliability score of similar documents are displayed at the same time; and the format of the displayed content matches the visual display layout of the scientific research scene interaction module;
     chapter level labeling adds a discipline adaptation identifier beside the title of each duplicated chapter, wherein the label information comprises the core duplicated elements of the chapter, and the identifier format is consistent with the interface display specification of the scientific research scene interaction module;
     scientific research element level labeling targets duplicated clinical trial designs, core data indicators and conclusion derivation logic, wherein the specific comparison details and the points of difference from similar documents are displayed in a popup window; the comparison logic is constructed from the association features of the scientific research elements and is built into the medical exclusive duplicate checking module; and the content of the popup window corresponds to the screening condition fields of the scientific research scene interaction module;
     providing an entry for submitting objections to labels, wherein re-detection is triggered after the user submits an objection, the re-detection executes the complete multi-dimensional similarity calculation flow of the medical exclusive duplicate checking module, and the re-detection result is synchronously updated to the visual display interface of the scientific research scene interaction module.
  9. The medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service according to claim 1, wherein the system is provided with a discipline exclusive iteration mechanism, and the following operations are completed in sequence when the mechanism is executed:
     collecting the user's objections to duplicate checking results and the suitability evaluation data of recommended documents transmitted by the scientific research scene interaction module, and optimizing the semantic analysis model and similarity calculation parameters of the medical exclusive duplicate checking module in combination with industry data on medical term updates, clinical trial specification revisions and guideline version upgrades, wherein the optimization direction is determined from an analysis of the correlation between the feedback problems and the industry update data, and the optimized model and parameters are synchronously stored in the database of the corresponding module;
     updating the exclusive scientific research feature library of each field in the dynamic discipline resource module by supplementing feature data of new discipline branches, judgment criteria for novel research methods and exclusive indicators for rare disease research, wherein the supplementary data are refined by analyzing documents on the development of the disciplines, and the supplemented feature data are synchronously adapted to the similarity calculation flow of the medical exclusive duplicate checking module and the grading screening flow of the discipline adaptation grading recommendation module;
     adjusting, based on the results of data analysis of changes in medical research trends, the scientific research element extraction rules of the medical data preprocessing module, the grading dimension weights of the discipline adaptation grading recommendation module, and the visual presentation and interaction settings of the scientific research scene interaction module, synchronizing the adjusted rules, weights and settings to the corresponding modules, and putting the system into operation after the adjusted parameters of all modules have been synchronized.
  10. The medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service according to claim 7, wherein, when the scientific research scene interaction module performs interface adaptation for medical scientific research scenes, the following operations are completed in sequence:
     providing a standardized interface for connecting to medical field literature platforms, clinical trial registration platforms and scientific research management platforms, supporting direct retrieval of literature data and trial design schemes and their comparison in duplicate checking, wherein the interface data format is consistent with the structured document data format generated by the medical data preprocessing module, and the interface communication rule is consistent with the medical data interaction communication rule built into the system;
     supporting integration with medical paper writing tools and evidence-based medicine analysis tools to embed duplicate checking labels and recommended document citations in those tools, wherein the integrated data transmission format is adapted to the detection result format of the medical exclusive duplicate checking module and to the graded recommendation document list format of the discipline adaptation grading recommendation module, and the integrated interaction logic is consistent with the user operation flow of the scientific research scene interaction module;
     providing export formats for multiple scientific research scenes, wherein the exported content comprises the duplicate checking report, the recommended document list and the duplicated element comparison map; all exported content is standardized; the export formats are common medical research document formats built into the system; the exported content fields correspond to the visual display content and screening condition fields of the scientific research scene interaction module; and the export formats are respectively suited to the data requirements of paper publication, project application and review writing.
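The weighted calculation and threshold comparison described in claim 2 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the dimension keys, the discipline names, the weight values and the nearest-rank quantile rule are hypothetical, since the claims state only that the weights are trained per discipline and that the threshold comes from quantile statistics of past duplicate checking data.

```python
import math

# Hypothetical keys for the four similarity dimensions named in claim 2.
DIMENSIONS = ("term_semantics", "trial_design", "data_indicators", "conclusion")

# Assumed example weights per discipline. The claims say such weights are
# trained from annotated corpora; no concrete values are given in the source.
DISCIPLINE_WEIGHTS = {
    "clinical_medicine": {"term_semantics": 0.2, "trial_design": 0.4,
                          "data_indicators": 0.3, "conclusion": 0.1},
    "basic_medicine": {"term_semantics": 0.4, "trial_design": 0.2,
                       "data_indicators": 0.2, "conclusion": 0.2},
}


def composite_similarity(scores, discipline):
    """Weighted average of per-dimension similarities, each in [0, 1]."""
    weights = DISCIPLINE_WEIGHTS[discipline]
    total = sum(weights[d] for d in DIMENSIONS)
    return sum(weights[d] * scores[d] for d in DIMENSIONS) / total


def quantile_threshold(history, q):
    """Nearest-rank q-quantile of historical similarity values (0 < q <= 1)."""
    if not history or not 0 < q <= 1:
        raise ValueError("need a non-empty history and 0 < q <= 1")
    ordered = sorted(history)
    rank = math.ceil(q * len(ordered))
    return ordered[rank - 1]


def is_flagged(scores, discipline, threshold):
    """True when the composite value reaches the discipline threshold."""
    return composite_similarity(scores, discipline) >= threshold
```

With the same per-dimension scores, a document can be flagged under one discipline's weights and not under another's, which is the practical effect of selecting the weight matrix by discipline classification label.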
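The term normalization order in claim 4 (exact matching first, then a fallback match only when no exact result exists) can be sketched as follows. This is a hedged illustration, not the patented method: the term table is a made-up sample, and the fallback uses string similarity from Python's standard `difflib` as a stand-in for the semantic association matching, which the claims do not specify.

```python
import difflib

# Hypothetical sample of a term standard library: variant -> standard term.
TERM_LIBRARY = {
    "myocardial infarction": "myocardial infarction",
    "heart attack": "myocardial infarction",
    "paracetamol": "paracetamol",
    "acetaminophen": "paracetamol",
}


def normalize_term(term, cutoff=0.8):
    """Return the standard term, or None when nothing matches well enough."""
    key = term.strip().lower()
    # Step 1: exact lookup takes priority.
    if key in TERM_LIBRARY:
        return TERM_LIBRARY[key]
    # Step 2: fallback. String similarity stands in for semantic matching here.
    close = difflib.get_close_matches(key, list(TERM_LIBRARY), n=1, cutoff=cutoff)
    return TERM_LIBRARY[close[0]] if close else None
```

Returning `None` instead of a low-confidence guess keeps unmatched terms visible for later handling; a real system would presumably replace the fallback with an embedding or ontology-based matcher.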

Description

Medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service

Technical Field

The invention relates to the technical field of literature services, and in particular to a medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service.

Background

The number of documents in the medical field keeps growing as scientific research develops, and researchers' demand for duplicate checking and accurate recommendation of documents is increasingly urgent. Most existing document processing systems use a general-purpose architecture and lack a design adapted specifically to the medical field. They have difficulty accurately capturing core discipline features such as the semantic associations of medical terms and differences in clinical trial design, are prone to similarity judgment deviations during duplicate checking, and cannot effectively distinguish duplication from innovation in the core scientific research elements of medical documents. Meanwhile, the algorithm models and parameter settings of existing systems are preset manually and their technical implementation details are not fully disclosed, so a black box problem exists and the latest examination guideline requirements are not met; resource library updates lag behind, and the systems cannot adapt to dynamic changes in the terms, specifications and research hotspots of the medical field. In addition, the functional modules of existing systems cooperate poorly, their data interaction rules are not unified, and the duplicate checking result is disconnected from the recommendation service, so the full-process needs of researchers, from duplicate checking to accurately obtaining suitable documents, are difficult to meet, which limits improvement of medical research efficiency.
Disclosure of Invention

In order to solve the technical problems in the prior art, the embodiment of the invention provides a medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service. The technical scheme is as follows:

In one aspect, a medical literature intelligent duplicate checking and grading recommendation interaction system based on discipline service is provided, comprising a medical exclusive duplicate checking module, a discipline adaptation grading recommendation module, a scientific research scene interaction module, a medical data preprocessing module and a dynamic discipline resource module. The modules exchange data and synchronize parameters through an internal standardized data interface; the data interface adopts a medical data interaction communication rule built into the system, the data transmission format is a structured format preset by the system, and the data processed by each module are stored in a distributed database.
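As a rough illustration of modules exchanging data in a preset structured format over a standardized interface, the following Python sketch wraps a payload in a message envelope and serializes it to JSON. The envelope fields, the module identifiers and the choice of JSON are all assumptions for illustration; the source does not disclose the actual communication rule or transmission format.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical identifiers for the five modules named in the text.
MODULES = {
    "preprocessing", "duplicate_checking", "grading_recommendation",
    "scene_interaction", "discipline_resources",
}


@dataclass
class Message:
    """Assumed envelope for one transfer between two modules."""
    source: str
    target: str
    kind: str      # e.g. "structured_document" or "detection_result"
    payload: dict

    def __post_init__(self):
        # Reject messages addressed from or to an unknown module.
        for name in (self.source, self.target):
            if name not in MODULES:
                raise ValueError(f"unknown module: {name}")


def encode(message):
    """Serialize a message to the assumed wire format (JSON text)."""
    return json.dumps(asdict(message), sort_keys=True)


def decode(text):
    """Rebuild a message from JSON text, re-running the validation."""
    return Message(**json.loads(text))
```

A single envelope type used by every module is one simple way to keep the interaction rules uniform, which is the property the text emphasizes; the payload schema would still need to be fixed per message kind.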
The medical data preprocessing module receives medical documents input by a user, performs format standardization, specialized term normalization, scientific research element extraction and noise filtering according to a medical term standard library and a clinical trial data specification built into the system, generates structured document data containing research subjects, trial designs, data indicators and conclusion statements, and, after preprocessing is completed, triggers data transmission to the medical exclusive duplicate checking module through the internal data interface; the medical exclusive duplicate checking module receives the structured document data, invokes the medical ontology knowledge base and the discipline exclusive feature library of the dynamic discipline resource module, combines medical semantic association analysis with multi-dimensional similarity calculation, performs similarity detection at the full-text level and the scientific research element level, locates duplicated content, associates the discipline classification and evidence level of similar documents, and synchronously pushes the detection results to the discipline adaptation grading recommendation module and the scientific research scene interaction module through the internal data interface; the dynamic discipline resource module constructs a four-level medical discipline system in which second-level branches are divided within the framework of the first-level disciplines, the second-level branches are further subdivided by disease subtype, and each disease subtype corresponds to specific research directions; the discipline system covers basic medicine, clinical medicine, preventive medicine, pharmacy, public health and the subordinate subdivision directions of each field, and the system establishes a dedicated classification coding rule for the four-level medical discipline system; the discipline adaptation grading recommendation module receives a similarity det