CN-121981586-A - Publication content intelligent checking system based on deep learning

Abstract

The invention discloses an intelligent publication content review system based on deep learning, and relates to the field of intelligent publication content review. The system quantifies review results through a comprehensive review confidence formula and defines the pass threshold as a comprehensive confidence C of at least 85 points. This replaces traditional subjective manual judgment, unifies the review criteria applied by different editors and different departments, and can effectively reduce the rate of disputes over review results. Through an edge-cloud collaborative architecture, the system combines real-time response, professional depth and quantified standards. It addresses the efficiency and security pain points of traditional review modes, meets the long-term development needs of the publishing industry through domestic adaptation and closed-loop optimization, and offers both technical innovation and practical business value.

Inventors

  • PAN GUANGTIAN
  • Mo Mingling
  • Ruan Ruitao

Assignees

  • 广西数智出版传媒有限公司

Dates

Publication Date
2026-05-05
Application Date
2025-12-04

Claims (8)

  1. An intelligent publication content review system based on deep learning, comprising an edge review layer, a cloud core review layer, a data support layer and an application interaction layer, characterized in that: the edge review layer is a front-end sensing and real-time interception layer deployed at edge nodes, which processes review tasks close to where they arise using a lightweight deep learning model; the cloud core review layer relies on a locally deployed G440KV2 computing server cluster to run the full-version DeepSeek model and is responsible for the in-depth review tasks that the edge review layer cannot handle; the data support layer provides data storage, knowledge support and data governance services for the edge review layer and the cloud core review layer, and is built on a domestic vector database and a knowledge base management system; and the application interaction layer serves as the interaction entry point between the system and its users, and is adapted to the needs of the different user roles.
  2. The intelligent publication content review system based on deep learning according to claim 1, wherein the edge review layer comprises: a real-time grammar error correction sub-module, which integrates a lightweight DeepSeek-Lite model, identifies basic errors in the publication content as the text is entered (wrongly written characters, misused characters, punctuation errors and misuse of the particles "的", "地" and "得"), supports deployment as a plug-in for mainstream editing software, marks the review results in the original text in real time, and outputs an edge review basic score, with a full score of 100 points, computed from the basic error correction rate and the error recognition accuracy; an edge sensitive content interception sub-module, which has a built-in localized sensitive word library and a leader information reference library, compares sensitive content locally and in real time by means of edge computing, counts the sensitive content interception rate and a second interception-related rate, and outputs an edge review sensitive item score with a full score of 100 points: the score is 100 points when both rates satisfy their thresholds […]; 2 points are deducted for every 0.5 percent decrease in […], and 3 points are deducted for every 0.1 percent increase in […]; when […], the content is directly judged to be a risk item, the review flow is suspended and an administrator alarm is triggered; and a format normalization preprocessing sub-module, which automatically identifies format elements such as heading level, font size and line spacing in the publication content, performs preliminary normalization according to publishing industry standards, synchronizes format problem data to the cloud for in-depth verification, counts the number of serious format errors and the format compliance rate, and outputs an edge review format score with a full score of 100 points: the score is 100 points when the format compliance rate reaches […], and 5 points are deducted for every 1 percent decrease; the number of serious format errors is a non-negative integer.
  3. The intelligent publication content review system based on deep learning according to claim 2, wherein the edge review basic score is determined as follows: the error recognition accuracy is the proportion of the real basic errors recognized by the real-time grammar error correction sub-module out of the total number of basic errors in the publication content, and carries a weight of 60 percent: 60 points are awarded when the accuracy reaches […], 0.6 points are deducted for every 1 percent decrease, and when […] the item scores 0 points, manual review and annotation are forcibly triggered, and the lightweight DeepSeek-Lite model is optimized on the annotated content; the basic error correction rate is the proportion of the correction suggestions for the basic errors identified by the real-time grammar error correction sub-module that are adopted manually and verified to conform to publication specifications, and carries a weight of 40 percent: 40 points are awarded when the rate reaches […], 0.4 points are deducted for every 1 percent decrease, and when […] the item scores 0 points, manual review and annotation are forcibly triggered, and the lightweight DeepSeek-Lite model is optimized on the annotated content; the edge review basic score is then given by: […].
  4. The intelligent publication content review system based on deep learning according to claim 3, wherein the edge review score is obtained from the edge review basic score, the edge review sensitive item score and the edge review format score by the formula: […]; where 0.4, 0.5 and 0.1 are weights set according to the priority of the publishing business, namely sensitive content interception > basic grammar error correction > format preprocessing.
  5. The intelligent publication content review system based on deep learning according to claim 4, wherein the cloud core review layer comprises: a knowledge-element-level matching review sub-module, which builds a knowledge element base of publishing expertise, decomposes the publication content into knowledge elements, converts them into high-dimensional vectors through a vector generation model, and computes the cosine similarity between these vectors and the vectors in the knowledge element base to obtain the knowledge element matching degree, while also counting the knowledge matching pass rate: a knowledge element is judged qualified when its matching degree is greater than or equal to 0.8, and when the matching degree is below 0.8 it is marked as a suspected knowledge error and a correction suggestion is generated automatically; the sub-module outputs a cloud review knowledge item score with a full score of 100 points: 100 points when the knowledge matching pass rate is greater than or equal to 98 percent, with 4 points deducted for every 1 percent decrease; a multimodal content review sub-module, which integrates image recognition, speech recognition and video analysis to review the picture, audio and video content in the publication, including sensitive picture detection, video subtitle error recognition and interception of violating audio information, while also counting the multimodal content compliance rate; the sub-module outputs a cloud review multimodal score with a full score of 100 points: 100 points when the multimodal content compliance rate is greater than or equal to 99 percent, with 5 points deducted for every 1 percent decrease, and if a sensitive risk is detected the score is set directly to 0 and an emergency review process is triggered; and a logic and style convention check sub-module, which checks the content outline, the numbering of charts and formulas, the arrangement of references and the logical continuity of chapters, identifies problems such as discontinuous numbering, repeated headings and logical contradictions, while also counting the logic compliance rate; the sub-module outputs a cloud review logic item score with a full score of 100 points: 100 points when the logic compliance rate is greater than or equal to 97 percent, with 3 points deducted for every 1 percent decrease, and the logic error type is recorded; the cloud review score is obtained from the knowledge item score, the multimodal score and the logic item score by weighted summation, using the formula: […]; where 0.5, 0.3 and 0.2 are weights set according to the priority of review depth, namely expertise matching > multimodal content > logic.
  6. The intelligent publication content review system based on deep learning according to claim 5, wherein the data support layer comprises: a publishing expertise knowledge base sub-module, which comprises an editing and proofreading case library, a laws and regulations library, an ancient poetry library and a terminology library, supports multidimensional knowledge retrieval and dynamic updating, and provides accurate knowledge element data for the calculation of […]; a vector database sub-module, which uses the KBase vector database to store knowledge elements and publication content in vectorized form, and meets the real-time vector matching requirements of the calculation of […]; and a data governance sub-module, which handles the collection, cleaning, annotation and quality inspection of review data, collects the error data related to […] and […] from the edge layer and the verification results of […] and […] from the cloud layer, converts them into structured samples that a model can recognize, supports data desensitization to protect copyright security, and provides high-quality, compliant data input for […] and […].
  7. The intelligent publication content review system based on deep learning according to claim 6, wherein the application interaction layer comprises: a review result display sub-module, which judges whether the publication content passes review based on a comprehensive review confidence formula, the formula being: […]; where […] is the comprehensive review confidence of the publication content, with a value from 0 to 100; […] is the weight coefficient of the edge review, set to 0.3; […] is the weight coefficient of the cloud review, set to 0.7; and […] is the format error penalty coefficient, set to 0.5; a user permission management sub-module, which assigns differentiated permissions to the editor, reviewer and administrator roles based on an RBAC architecture: editors can view the […] value and error details of the manuscripts they handle personally, reviewers can, subject to double re-checking, modify the thresholds of the parameters […], […], […] and […], and administrators can configure the weight coefficients of the formula and manage knowledge base permissions, in line with the copyright protection and data security requirements of the publishing industry; and a review task management sub-module, which supports the creation, assignment, tracking and archiving of review tasks, records for each task the changes in the […] value, the error modification records and the […] matching logs, and associates historical […] value data with error types as samples for fine-tuning the DeepSeek model.
  8. The intelligent publication content review system based on deep learning according to claim 7, wherein manuscripts with […] are classified as qualified and proceed directly to the next publishing stage; manuscripts with […] are classified as pending review, are sorted by […] value from low to high for the allocation of review resources, and have the correction suggestion items with […] in […] highlighted; manuscripts with […] are classified as unqualified, error types are associated automatically and targeted revision guidance is pushed; and specific error positions and revision suggestions are presented as error tables, highlighted marks and visual reports, supporting error tracing.
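
As a rough illustration of the scoring rules in claims 4 and 5, the Python sketch below computes the knowledge-element pass rate by cosine similarity (threshold 0.8), applies the tiered deduction rules of the cloud sub-modules, and forms the weighted edge and cloud scores. The function names, the use of percentages as inputs, linear deduction for fractional percentages, the floor of 0 points, and the pairing of each weight with its score in listed order are all assumptions, since the formulas themselves did not survive in the claim text. Plain Python lists stand in for the vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knowledge_pass_rate(element_vectors, base_vectors, threshold=0.8):
    """Percentage of knowledge elements whose best match reaches the threshold."""
    if not element_vectors:
        return 100.0  # assumption: nothing to check counts as fully compliant
    passed = 0
    for vector in element_vectors:
        best = max((cosine_similarity(vector, b) for b in base_vectors), default=0.0)
        if best >= threshold:
            passed += 1
    return 100.0 * passed / len(element_vectors)


def tiered_score(rate_percent, full_marks_at, points_per_percent):
    """100 points at or above the threshold, then a linear deduction, never below 0."""
    if rate_percent >= full_marks_at:
        return 100.0
    return max(0.0, 100.0 - points_per_percent * (full_marks_at - rate_percent))


def cloud_score(knowledge_rate, multimodal_rate, logic_rate, sensitive_risk=False):
    """Weighted sum of the three cloud item scores (weights 0.5, 0.3, 0.2)."""
    knowledge = tiered_score(knowledge_rate, 98.0, 4.0)
    # A detected sensitive risk sets the multimodal score directly to 0.
    multimodal = 0.0 if sensitive_risk else tiered_score(multimodal_rate, 99.0, 5.0)
    logic = tiered_score(logic_rate, 97.0, 3.0)
    return 0.5 * knowledge + 0.3 * multimodal + 0.2 * logic


def edge_score(basic, sensitive, fmt):
    """Weighted sum of the three edge scores (assumed pairing: 0.4, 0.5, 0.1)."""
    return 0.4 * basic + 0.5 * sensitive + 0.1 * fmt
```

For example, a knowledge pass rate of 96 percent with the other two rates at their thresholds gives a knowledge item score of 92 and a cloud score of about 96 under these assumptions.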

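The comprehensive confidence of claim 7 and the three-way classification of claim 8 might be combined as in the following Python sketch. The formula itself is missing from the claim text, so the form C = 0.3 × edge score + 0.7 × cloud score − 0.5 × (number of serious format errors), clamped to the range 0 to 100, is only an assumption pieced together from the stated coefficients. The pass threshold of 85 comes from the abstract, while the lower threshold `fail_below` is not recoverable from the text and must be supplied by the caller. All function and parameter names are hypothetical.

```python
def comprehensive_confidence(edge, cloud, serious_format_errors,
                             edge_weight=0.3, cloud_weight=0.7, penalty=0.5):
    """Assumed form: weighted edge and cloud scores minus a per-error format penalty."""
    raw = edge_weight * edge + cloud_weight * cloud - penalty * serious_format_errors
    return min(100.0, max(0.0, raw))  # the claim bounds C to the range 0-100


def triage(confidences, fail_below, pass_at=85.0):
    """Split {manuscript_id: C} into qualified, pending and unqualified lists."""
    qualified, pending, unqualified = [], [], []
    for manuscript_id, c in confidences.items():
        if c >= pass_at:
            qualified.append(manuscript_id)
        elif c >= fail_below:
            pending.append(manuscript_id)
        else:
            unqualified.append(manuscript_id)
    # Pending manuscripts are ordered from lowest to highest C for resource allocation.
    pending.sort(key=lambda manuscript_id: confidences[manuscript_id])
    return qualified, pending, unqualified
```

Calling `triage({"a": 90, "b": 70, "c": 50, "d": 65}, fail_below=60)` with an illustrative lower threshold of 60 puts "a" in the qualified list, orders the pending list as "d" then "b", and marks "c" as unqualified.
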
Description

Publication content intelligent checking system based on deep learning

Technical Field

The invention relates to the field of intelligent review of publication content, and in particular to an intelligent publication content review system based on deep learning.

Background

In the digital transformation of the publishing and media industry, publication content review is a core step in guaranteeing publishing quality, and it faces technical challenges on several fronts that conflict with industry needs. Current mainstream review modes fall into two categories, cloud-only review and locally isolated review, and both have clear technical bottlenecks.

First, although cloud-only review can perform in-depth verification thanks to stronger computing power, it suffers from high data transmission latency and a high risk in transmitting sensitive content across networks: manuscripts entered at an editing workstation must be uploaded to the cloud in full, response times often exceed hundreds of milliseconds, and sensitive data such as administrative information and author privacy can easily create compliance risks in transit. In addition, heavy dependence on cloud computing concentrates resource usage, so review efficiency drops sharply at peak times.

Second, locally isolated review relies on a rule engine for basic error recognition and lacks the semantic understanding of a deep learning model. Its recognition accuracy is insufficient for complex problems such as errors of specialist knowledge and violations in multimodal content, and it supports neither dynamic knowledge base updates nor multi-terminal collaboration, which leads to inconsistent review results and insufficient professional depth.

Meanwhile, the publishing industry's need for quantitative evaluation of review work is increasingly urgent. The prior art relies on subjective manual judgment of review effectiveness and lacks a quantitative index system covering the whole process of recognition, correction and verification, so standardized control of review quality is difficult to achieve. In addition, the need for adaptation to domestic computing power and for localized deployment is prominent: traditional review systems that depend on non-domestic hardware and models cannot meet a publishing group's core requirements for data security and independent controllability.

Against this background, combining edge computing with a cloud collaborative architecture, and building an intelligent review system that offers real-time response, professional depth and quantified standards on a domestic computing base and deep learning models, has become a key technical direction for resolving the pain points of current publishing review. It is therefore necessary to provide an intelligent publication review system based on deep learning to solve the above problems.

Disclosure of Invention

The main object of the invention is to provide an intelligent publication content review system based on deep learning that can effectively solve the problems described in the background.

In order to achieve the above object, the technical scheme adopted by the invention is as follows: an intelligent publication content review system based on deep learning comprises an edge review layer, a cloud core review layer, a data support layer and an application interaction layer. The edge review layer is a front-end sensing and real-time interception layer deployed at edge nodes, which processes review tasks close to where they arise using a lightweight deep learning model. The cloud core review layer relies on a locally deployed G440KV2 computing server cluster to run the full-version DeepSeek model and is responsible for the in-depth review tasks that the edge review layer cannot handle. The data support layer provides data storage, knowledge support and data governance services for the edge review layer and the cloud core review layer, and is built on a domestic vector database and a knowledge base management system. The application interaction layer serves as the interaction entry point between the system and its users, and is adapted to the needs of the different user roles. Preferably, the edge review layer specifically comprises: a real-time grammar error correction sub-module, which integrates a lightweight DeepSeek-Lite model, identifies basic errors in the publication content as the text is entered (wrongly written characters, misused characters, punctuation errors and misuse of the particles "的", "地" and "得"), supports deployment as a plug-in for mainstream editing software, marks the review results in the original text in real time, and particu