CN-121996353-A - Nuclear power production data processing system and method based on cloud-native technology

Abstract

The disclosure belongs to the technical field of nuclear power, and particularly relates to a nuclear power production data processing system and method based on cloud-native technology. The disclosed system deploys the NiFi ETL framework on Kubernetes and Docker container virtualization technologies, which simplifies installation and deployment. This makes NiFi convenient to install, deploy and maintain, and NiFi in turn provides powerful, secure and reliable data migration functions. As a result, the manpower required for data migration, backup and similar tasks is reduced, and the accuracy and timeliness of the data are ensured. Data migration can be tailored to different requirements through custom processors, so the system is highly extensible.

Inventors

LI JIE
CHEN WU
WU YAOYAO
LI ZHIANG

Assignees

Research Institute of Nuclear Power Operation (核动力运行研究所)

Dates

Publication Date: 20260508
Application Date: 20260115

Claims (8)

1. A nuclear power production data processing system based on cloud-native technology, characterized by comprising: a NiFi containerized application instance running in a Kubernetes cluster, the instance being deployed with Apache NiFi core services, the core services running in a Java virtual machine and comprising: a flow controller for allocating execution threads to NiFi processors and scheduling data processing resources; a FlowFile repository for storing the current state and metadata of stream files (FlowFiles) in a pluggable, persistent write-ahead log; a content repository for storing, through a pluggable mechanism, the actual content bytes of FlowFiles on a configurable plurality of physical storage paths; a provenance repository for storing, indexing and querying, through a pluggable mechanism, event data of all data processing operations; and a web server for hosting the HTTP-based NiFi command-and-control and graphical data flow design API and for providing a visual interface that supports designing and modifying data flows at runtime; wherein the system is deployed through a PaaS management platform or a Kubernetes resource configuration file, the configuration file defining the computing resources, storage resources and service exposure mode required by the NiFi containerized application instance.
2. The system of claim 1, wherein the NiFi core services support functional extension through custom processors, wherein a custom processor is developed by implementing a specific Java interface or extending an abstract class, is packaged as an extension bundle in the NiFi Archive (NAR) format, and is loaded and takes effect through the extension mechanism of the system.
3. The system of claim 1, wherein the system further comprises: a data delivery module for guaranteeing data delivery through the combination of the persistent write-ahead log of the FlowFile repository and the copy-on-write technique of the content repository; a back pressure module for triggering back pressure when a data processing queue reaches a preset capacity or time threshold, so as to control the data inflow rate; a data provenance module for recording and indexing an event log of each operation in the data flow to form a data processing lineage; a priority queue module for setting one or more priority queues and retrieving data from each queue; and a QoS guarantee module for providing QoS guarantee configuration for a specified data flow.
4. The system of claim 1, further comprising a monitoring and alarm module for monitoring the operational status of the NiFi containerized application instance, the monitored indicators comprising at least: the usage of JVM heap memory and non-heap memory of the node; the storage space utilization of the FlowFile repository, the content repository and the provenance repository; the average processing time of processors; and the number of system daemon threads.
5. The system of claim 1, wherein the system supports a flow template function that allows a user to save a constructed data flow as a template and to publish, share and reuse it through the visual interface.
6. A nuclear power production data processing method based on cloud-native technology, wherein the method is implemented based on the system of any one of claims 1 to 5, the method comprising: Step 100, resource configuration: defining deployment parameters of the NiFi containerized application instance, including the container image source, CPU and memory resource requirements, persistent volume claims and service ports, through a PaaS management platform or by writing a Kubernetes resource configuration file; Step 101, cluster deployment: enabling cluster mode in the configuration, setting cluster node addresses and protocol ports, and configuring a ZooKeeper connection string to realize state management and coordination of the nodes; Step 102, application deployment: submitting the resource configuration file through the Kubernetes API, and creating and running the NiFi containerized application instance in the Kubernetes cluster; Step 103, flow orchestration: designing and arranging, by drag and drop in the web visual interface provided by the NiFi instance, a data flow processing graph composed of a plurality of Processors and Connections; and Step 104, operation and maintenance management: performing horizontal scaling, rolling updates, fault self-healing and dynamic resource adjustment on the running NiFi instance by using the container orchestration capability of Kubernetes.
7. The method according to claim 1, wherein the method further comprises: Step 200, encapsulating the data to be processed into a FlowFile (stream file), wherein the FlowFile comprises an attribute set and a content reference pointer pointing to the storage location of the actual content data; Step 201, when a processor needs to modify the content of the FlowFile, the system creates a copy of the original data in the content repository, performs the modification on the copy, and retains the original data; Step 202, when the modification occurs, the system generates a snapshot of the FlowFile and its processing context and stores it in the provenance repository; and Step 203, writing the processed data into a target data storage system through an output processor.
8. A non-transitory computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the method of claim 6 or 7.
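
The copy-on-write handling recited in steps 200 to 203 of claim 7 can be illustrated with a minimal in-memory sketch. This is not NiFi code and not the applicant's implementation: the class and function names (ContentRepository, ProvenanceRepository, create_flowfile, modify_content, deliver) and the event label are hypothetical, and a Python list stands in for the target data storage system.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, Dict, List, Tuple
import uuid


class ContentRepository:
    """Hypothetical append-only store: existing content is never overwritten."""

    def __init__(self) -> None:
        self._claims: Dict[str, bytes] = {}

    def write(self, data: bytes) -> str:
        claim_id = uuid.uuid4().hex  # new storage location for every write
        self._claims[claim_id] = data
        return claim_id

    def read(self, claim_id: str) -> bytes:
        return self._claims[claim_id]


@dataclass(frozen=True)
class FlowFile:
    attributes: Dict[str, str]  # attribute set (step 200)
    content_claim: str          # pointer to the content location (step 200)


@dataclass
class ProvenanceRepository:
    events: List[Tuple[str, FlowFile]] = field(default_factory=list)

    def record(self, event_type: str, flowfile: FlowFile) -> None:
        # Store an independent snapshot so later changes cannot alter history.
        snapshot = replace(flowfile, attributes=dict(flowfile.attributes))
        self.events.append((event_type, snapshot))


def create_flowfile(data: bytes, attributes: Dict[str, str],
                    content_repo: ContentRepository) -> FlowFile:
    # Step 200: wrap the data as attributes plus a content pointer.
    return FlowFile(dict(attributes), content_repo.write(data))


def modify_content(flowfile: FlowFile, transform: Callable[[bytes], bytes],
                   content_repo: ContentRepository,
                   provenance: ProvenanceRepository) -> FlowFile:
    # Step 201: write the modified bytes to a new claim; the original stays.
    original = content_repo.read(flowfile.content_claim)
    updated = replace(flowfile, content_claim=content_repo.write(transform(original)))
    # Step 202: snapshot the modified FlowFile (event label is illustrative).
    provenance.record("CONTENT_MODIFIED", updated)
    return updated


def deliver(flowfile: FlowFile, content_repo: ContentRepository,
            target: List[bytes]) -> None:
    # Step 203: an output processor writes the content to the target store.
    target.append(content_repo.read(flowfile.content_claim))


repo, prov, target = ContentRepository(), ProvenanceRepository(), []
raw = create_flowfile(b"pump_a,ok", {"filename": "status.csv"}, repo)
upper = modify_content(raw, bytes.upper, repo, prov)
deliver(upper, repo, target)
```

Because the original claim is retained, the pre-modification bytes remain readable through the original FlowFile's pointer, which is the property that the data delivery and provenance functions rely on.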

Description

Nuclear power production data processing system and method based on cloud-native technology

Technical Field

The invention belongs to the technical field of nuclear power, and particularly relates to a nuclear power production data processing system and method based on cloud-native technology.

Background

The informatization and digitalization of the nuclear power industry depend on software services. As the core business of the nuclear power industry expands, the scale of the related services keeps growing and the related data keeps increasing, creating a need to process and use large amounts of data. Because a complex nuclear power business system uses many different data stores, scenarios such as data export, import and migration involve operations such as data formatting, garbage data handling and transfer between different storage tools. Traditional data migration methods consume a large amount of manpower and time, and compromise the accuracy and timeliness of the data.

Disclosure of Invention

In order to overcome the problems in the related art, a nuclear power production data processing system and method based on cloud-native technology are provided.
According to an aspect of the embodiments of the present disclosure, there is provided a nuclear power production data processing system based on cloud-native technology, including: a NiFi containerized application instance running in a Kubernetes cluster, the instance being deployed with Apache NiFi core services, the core services running in a Java virtual machine and comprising: a flow controller for allocating execution threads to NiFi processors and scheduling data processing resources; a FlowFile repository for storing the current state and metadata of stream files (FlowFiles) in a pluggable, persistent write-ahead log; a content repository for storing, through a pluggable mechanism, the actual content bytes of FlowFiles on a configurable plurality of physical storage paths; a provenance repository for storing, indexing and querying, through a pluggable mechanism, event data of all data processing operations; and a web server for hosting the HTTP-based NiFi command-and-control and graphical data flow design API and for providing a visual interface that supports designing and modifying data flows at runtime. The system is deployed through a PaaS management platform or a Kubernetes resource configuration file, the configuration file defining the computing resources, storage resources and service exposure mode required by the NiFi containerized application instance.
In one possible implementation, the NiFi core services support functional extension through custom processors, wherein a custom processor is developed by implementing a specific Java interface or extending an abstract class, is packaged as an extension bundle in the NiFi Archive (NAR) format, and is loaded and takes effect through the extension mechanism of the system.
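As a rough illustration of the Kubernetes resource configuration file described above, the following sketch builds a StatefulSet and a Service as plain Python dictionaries and prints them as JSON. Every concrete value is an assumption that does not come from the disclosure: the function name nifi_manifests, the image reference, the port, the resource sizes, and the repository mount paths under /opt/nifi/nifi-current would all need to be checked against the actual image and cluster.

```python
import json
from typing import Any, Dict, List

# One persistent volume per NiFi repository (names are illustrative).
REPOSITORIES = ("flowfile", "content", "provenance")


def nifi_manifests(name: str = "nifi",
                   image: str = "apache/nifi:latest",  # assumed image reference
                   replicas: int = 1,
                   cpu: str = "2",
                   memory: str = "4Gi",
                   storage: str = "10Gi",
                   web_port: int = 8443) -> List[Dict[str, Any]]:
    """Return a StatefulSet and a Service describing one NiFi deployment."""
    labels = {"app": name}
    container = {
        "name": name,
        "image": image,
        "ports": [{"name": "web", "containerPort": web_port}],
        # Computing resources: CPU and memory requests and limits.
        "resources": {
            "requests": {"cpu": cpu, "memory": memory},
            "limits": {"cpu": cpu, "memory": memory},
        },
        # Assumed mount paths for the three repositories.
        "volumeMounts": [
            {"name": f"{repo}-repo",
             "mountPath": f"/opt/nifi/nifi-current/{repo}_repository"}
            for repo in REPOSITORIES
        ],
    }
    statefulset = {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
            # Storage resources: one persistent volume claim per repository.
            "volumeClaimTemplates": [
                {"metadata": {"name": f"{repo}-repo"},
                 "spec": {"accessModes": ["ReadWriteOnce"],
                          "resources": {"requests": {"storage": storage}}}}
                for repo in REPOSITORIES
            ],
        },
    }
    # Service exposure mode: a Service in front of the web port.
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "selector": labels,
            "ports": [{"name": "web", "port": web_port, "targetPort": "web"}],
        },
    }
    return [statefulset, service]


manifests = nifi_manifests(replicas=3)
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

A StatefulSet is used here only because NiFi keeps its repositories on disk and benefits from stable per-node storage; the disclosure itself does not name a specific workload type.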
In one possible implementation, the system further includes: a data delivery module for guaranteeing data delivery through the combination of the persistent write-ahead log of the FlowFile repository and the copy-on-write technique of the content repository; a back pressure module for triggering back pressure when a data processing queue reaches a preset capacity or time threshold, so as to control the data inflow rate; a data provenance module for recording and indexing an event log of each operation in the data flow to form a data processing lineage; a priority queue module for setting one or more priority queues and retrieving data from each queue, each queue adopting, according to its configuration, any one of first-in-first-out, last-in-first-out and largest-first processing strategies; and a QoS guarantee module for providing QoS guarantee configuration for a specified data flow.
In one possible implementation, the system further includes a monitoring and alarm module for monitoring the operational status of the NiFi containerized application instance, the monitored indicators including at least: the usage of JVM heap memory and non-heap memory of the node; the storage space utilization of the FlowFile repository, the content repository and the provenance repository; the average processing time of processors; and the number of system daemon threads.
In one possible implementation, the system supports a flow template function that allows a user to save a constructed data flow as a template and to publish, share and reuse it through the visual interface.
According to another aspect of the embodiments of the present disclosure, there is provided a nuclear power production data processing method based on cloud-native technology, the method being implemented based on the system of any one of claims 1 to 5, the method