CN-121995003-A - Hydrogen production mode identification method and system based on tree classifier

Abstract

The invention discloses a tree-classifier-based method and system for identifying the hydrogen production mode, belonging to the technical field of hydrogen analysis. The method comprises the following steps: S1, extracting the original component contents of a hydrogen sample; S2, constructing engineered features based on the original component contents, the engineered features comprising component proportions, component entropy, logarithmic transformations, component interaction terms and principal components; and S3, identifying the hydrogen production mode, namely taking the engineered features as input, applying synthetic minority oversampling to alleviate class imbalance, and determining the hydrogen production mode with a tree classifier. With this technical scheme, the hydrogen production mode can be determined rapidly by the tree classifier from the original component contents and engineered features of the hydrogen sample.
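
As an illustration of the feature families named in step S2, the following is a minimal Python sketch, not the patented implementation. It assumes that "component entropy" means the Shannon entropy of the component shares, which the text does not define. The function name `engineer_features`, the `interactions` parameter and the feature key names are hypothetical. Principal component scores are left out because they need a training matrix to fit on.

```python
import math

def engineer_features(sample, interactions=(("O2", "H2O"),)):
    """Build share, entropy, log1p and interaction features for one sample.

    `sample` maps a component name to its measured content (non-negative).
    The pairs in `interactions` are an assumption chosen for illustration.
    """
    total = sum(sample.values())
    feats = {"component_sum": total}

    # Share of each component in the component sum (0 when the sum is 0).
    shares = {k: (v / total if total > 0 else 0.0) for k, v in sample.items()}
    for name, share in shares.items():
        feats[f"{name}_share"] = share

    # Assumed definition: Shannon entropy of the shares; zero shares add nothing.
    feats["composition_entropy"] = -sum(
        p * math.log(p) for p in shares.values() if p > 0
    )

    # log1p compresses the wide dynamic range of trace impurities.
    for name, value in sample.items():
        feats[f"{name}_log1p"] = math.log1p(value)

    # Product interaction terms for the selected component pairs.
    for a, b in interactions:
        feats[f"{a}_x_{b}"] = sample[a] * sample[b]

    # Number of components reported as exactly zero.
    feats["zero_count"] = sum(1 for v in sample.values() if v == 0)
    return feats

features = engineer_features({"H2O": 1.0, "O2": 1.0, "N2": 2.0})
```

Returning a dictionary keyed by feature name keeps the sketch independent of any particular column order, so it can be turned into a row of a feature table afterwards.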

Inventors

  • LIANG XIN
  • Zhan Yuechen
  • NING MIN

Assignees

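  • China Petroleum & Chemical Corporation (中国石油化工股份有限公司)
  • Sinopec Marketing Co., Ltd. (中国石化销售股份有限公司)
  • Sinopec Marketing Co., Ltd., North China Branch (中国石化销售股份有限公司华北分公司)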

Dates

Publication Date
2026-05-08
Application Date
2025-12-24

Claims (10)

  1. A hydrogen production mode identification method based on a tree classifier, characterized by comprising the following steps: S1, extracting the original component contents of a hydrogen sample; S2, constructing engineered features based on the original component contents, wherein the engineered features comprise component proportions, component entropy, logarithmic transformations, component interaction terms and principal components; S3, identifying the hydrogen production mode, namely taking the engineered features as input, applying synthetic minority oversampling to alleviate class imbalance, and determining the hydrogen production mode with a tree classifier.
  2. The hydrogen production mode identification method based on the tree classifier as claimed in claim 1, wherein the original component contents include H2O, O2, total hydrocarbons, He, N2, Ar, CO2, CO, total sulfur, HCHO, HCOOH, total halides, NH3, total non-hydrogen, hydrogen purity and maximum particulate concentration.
  3. The hydrogen production mode identification method based on the tree classifier as claimed in claim 1, wherein S1 comprises: S101, extracting a hydrogen sample; S102, detecting the original component contents in the hydrogen sample; S103, cleaning the original component content data; S104, defining and mapping the target variable.
  4. The hydrogen production mode identification method based on the tree classifier as claimed in claim 1, wherein S2 comprises: S201, constructing 50 engineered features based on the original component contents; S202, combining them with the basic features, so that the engineered feature set contains 66 features in total.
  5. The hydrogen production mode identification method based on the tree classifier as claimed in claim 4, wherein S201 comprises: calculating, based on the original component contents, the component sum, the inert gas sum, the hydrocarbon to carbon dioxide ratio, the oxygen to water ratio, the composition entropy, the zero-value component count and the missing-value count; calculating, for each original component, its proportion of the component sum and its log1p-transformed value, together with product interaction terms between selected components; and normalizing the original component matrix and extracting the first three principal component scores, thereby generating the 50 engineered features.
  6. The hydrogen production mode identification method based on the tree classifier as claimed in claim 4, wherein the basic features include the original component data for water, oxygen, total hydrocarbons, helium, nitrogen, argon, carbon dioxide, carbon monoxide, total sulfur, formaldehyde, formic acid, total halides, ammonia, total non-hydrogen, hydrogen purity and maximum particulate concentration.
  7. The hydrogen production mode identification method based on the tree classifier as claimed in claim 1, wherein the minority class oversampling comprises: selecting the k nearest neighbors of each minority class sample in the feature space, and generating synthetic samples by linear interpolation between the minority class sample and its nearest neighbor samples until the number of minority class samples equals the number of majority class samples, so that the training data set is class-balanced, thereby enhancing the sensitivity of the tree classifier to the minority class and improving overall classification performance.
  8. A hydrogen production mode identification system based on a tree classifier, comprising: a data extraction module for extracting the original component contents of a hydrogen sample; a feature generation module for constructing engineered features based on the original component contents, wherein the engineered features comprise component proportions, component entropy, logarithmic transformations, component interaction terms and principal components; and an identification module for identifying the hydrogen production mode, namely taking the engineered features as input, applying synthetic minority oversampling to alleviate class imbalance, and determining the hydrogen production mode with a tree classifier.
  9. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the hydrogen production mode identification method based on the tree classifier as claimed in any one of claims 1-7.
  10. A computer readable storage medium comprising instructions that, when run on a computer, cause the computer to perform the hydrogen production mode identification method based on the tree classifier as claimed in any one of claims 1-7.
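
The oversampling described in claim 7 (choose a minority sample, pick one of its k nearest minority neighbors, interpolate linearly between the two) can be sketched in plain Python as follows. This is an illustrative sketch, not the applicant's code. The function name `smote_oversample`, its parameters, the squared Euclidean distance and the uniform random interpolation weight are assumptions that follow the commonly used SMOTE procedure.

```python
import random

def smote_oversample(minority, n_target, k=5, seed=0):
    """Return the minority rows plus synthetic rows, `n_target` rows in total.

    `minority` is a list of equal-length numeric feature vectors and
    `n_target` would typically be the size of the majority class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        # Squared Euclidean distance (an assumed choice of metric).
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Indices of the k nearest minority neighbors of each minority sample.
    neighbors = []
    for i, row in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sq_dist(row, minority[j]))
        neighbors.append(others[:k])

    out = [list(r) for r in minority]
    while len(out) < n_target:
        i = rng.randrange(len(minority))   # base sample
        j = rng.choice(neighbors[i])       # one of its nearest neighbors
        gap = rng.random()                 # interpolation weight in [0, 1)
        base, nb = minority[i], minority[j]
        out.append([b + gap * (n - b) for b, n in zip(base, nb)])
    return out

balanced = smote_oversample([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], n_target=10, k=2)
```

Every synthetic row lies on the segment between two real minority samples, so the oversampled class stays inside the region the measured samples already span. In practice the step would be applied to the training split only, so that no synthetic sample leaks into evaluation data.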

Description

Hydrogen production mode identification method and system based on tree classifier

Technical Field

The invention belongs to the technical field of hydrogen analysis, and particularly relates to a hydrogen production mode identification method and system based on a tree classifier.

Background

Hydrogen is a colorless, odorless, flammable gas with the molecular formula H2, in which two hydrogen atoms are joined by a covalent bond. As the lightest and least dense gas in nature, hydrogen has a density of only 0.0899 g/L under standard conditions and is highly diffusive and permeable. These physical and chemical properties give it an important role in industrial manufacturing, energy storage, aerospace and other fields.

Industry currently uses a variety of hydrogen production routes, mainly water electrolysis, reforming of fossil fuels (such as natural gas and coal), purification of industrial by-product gas (such as from the chlor-alkali industry and coke oven gas), biomass hydrogen production, photocatalytic hydrogen production and other novel green hydrogen production technologies. Water electrolysis consumes a large amount of energy but yields high-purity hydrogen and can be coupled with renewable power generation systems, so it is regarded as the most promising clean hydrogen production route. Fossil fuel reforming, with its mature technology and lower production cost, currently dominates the global hydrogen supply.
Notably, with breakthroughs in proton exchange membrane electrolyzer technology and the continuing fall in the cost of renewable power generation such as wind and photovoltaic power, large-scale green hydrogen production is expected to gradually become both economically feasible and environmentally friendly.

In the actual operation of the hydrogen energy industry chain, the source of hydrogen often has to be traced accurately for purposes such as quality control, safety supervision and carbon footprint tracking. For example, a hydrogen refueling station needs to know the production route of the hydrogen it uses; a chemical production process needs to trace the purity and impurity content of its feedstock hydrogen; and carbon emission accounting needs to distinguish the environmental attributes of hydrogen from different sources, such as gray, blue and green hydrogen. These application scenarios all place clear technical requirements on identifying how the hydrogen was originally produced.

Disclosure of Invention

To address the shortcomings of the prior art, the invention provides a hydrogen production mode identification method and system based on a tree classifier, for rapidly and accurately determining the hydrogen production mode from the original component contents and engineered features of a hydrogen sample.
The specific technical scheme adopted by the invention is as follows:

The first object of the present invention is to provide a hydrogen production mode identification method based on a tree classifier, comprising: S1, extracting the original component contents of a hydrogen sample; S2, constructing engineered features based on the original component contents, wherein the engineered features comprise component proportions, component entropy, logarithmic transformations, component interaction terms and principal components; S3, identifying the hydrogen production mode, namely taking the engineered features as input, applying synthetic minority oversampling to alleviate class imbalance, and determining the hydrogen production mode with a tree classifier.

Preferably, the original component contents include H2O, O2, total hydrocarbons, He, N2, Ar, CO2, CO, total sulfur, HCHO, HCOOH, total halides, NH3, total non-hydrogen, hydrogen purity and maximum particulate concentration.

Preferably, S1 comprises: S101, extracting a hydrogen sample; S102, detecting the original component contents in the hydrogen sample; S103, cleaning the original component content data; S104, defining and mapping the target variable.

Preferably, S2 comprises: S201, constructing 50 engineered features based on the original component contents; S202, combining them with the basic features, so that the engineered feature set contains 66 features in total.

Preferably, S201 comprises calculating a component sum, an inert gas sum, a hydrocarbon to carbon dioxide ratio, an oxygen to water ratio, a composition en