CN-115001505-B - Mass spectrum data compression method, system, storage medium and electronic equipment
Abstract
The invention provides a mass spectrum data compression method, system, storage medium and electronic equipment. The method comprises the following steps: collecting mass-to-charge ratio data, ion intensity data and mass spectrum number data of a plurality of mass spectra; sorting the mass-to-charge ratio data and generating a mass-to-charge ratio difference data set from the differences between adjacent mass-to-charge ratio data after sorting; reducing the precision of the ion intensity data to generate an ion intensity data set; generating a corresponding mass spectrum number data set from the mass spectrum number data and the mass-to-charge ratio difference data set; and finally compressing the generated data sets with a conventional compression algorithm. Because adjacent mass-to-charge ratios in a mass spectrum differ only slightly, the mass-to-charge ratios of the plurality of mass spectra are sorted and the original data are replaced by the differences between adjacent mass-to-charge ratios. In addition, high precision of the ion intensity is unnecessary in many applications, so the ion intensity is rounded to reduce its precision, which further reduces the storage required for the whole mass spectrum data.
Inventors
- MA BIN
Assignees
- 上海快序生物科技有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20220531
Claims (10)
- 1. A mass spectrometry data compression method, the method comprising the following steps: step one, collecting mass-to-charge ratio data of a plurality of mass spectra, ion intensity data corresponding to the mass-to-charge ratio data and mass spectrum number data corresponding to the mass-to-charge ratio data; step two, sorting the mass-to-charge ratio data, and generating a mass-to-charge ratio difference data set from the differences between adjacent mass-to-charge ratio data after sorting; step three, generating a corresponding mass spectrum number data set from the mass spectrum number data and the mass-to-charge ratio difference data set; and step four, carrying out data compression on the mass-to-charge ratio difference data set, the ion intensity data set and the mass spectrum number data set through a data compression algorithm.
- 2. The method of claim 1, wherein step two comprises sorting the mass-to-charge ratio data from small to large.
- 3. The method of claim 1, wherein step two comprises retaining two decimal places for the mass-to-charge ratio data.
- 4. The method of claim 1, wherein step two comprises rounding the ion intensity data.
- 5. The method of claim 1, wherein adjacent data in the mass spectrum number data set are stored as difference values.
- 6. The method of claim 5, wherein the difference between adjacent data in the mass spectrum number data set is: W = (k2 - k1 + K) % K, where k1 and k2 are two adjacent data and K is the total number of mass spectra.
- 7. The method of claim 1, wherein the data compression algorithm comprises an LZW algorithm, an LZSS algorithm, an arithmetic coding algorithm, a Huffman coding algorithm and an RLE coding algorithm.
- 8. A mass spectrometry data compression system, the system comprising: a data acquisition module, used for acquiring mass-to-charge ratio data of a plurality of mass spectra, ion intensity data corresponding to the mass-to-charge ratio data and mass spectrum number data corresponding to the mass-to-charge ratio data; a data sorting module, used for sorting the mass-to-charge ratio data; a data difference calculation module, used for calculating the differences between adjacent mass-to-charge ratio data; a data rounding calculation module, used for rounding the ion intensity data; a data integration module, used for generating a mass-to-charge ratio difference data set, an ion intensity data set and a mass spectrum number data set; and a data compression module, used for carrying out data compression on the mass-to-charge ratio difference data set, the ion intensity data set and the mass spectrum number data set through a data compression algorithm.
- 9. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the mass spectrometry data compression method according to any one of claims 1 to 7.
- 10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the mass spectrometry data compression method according to any one of claims 1 to 7.
Description
Mass spectrum data compression method, system, storage medium and electronic equipment
Technical Field
The invention relates to the technical field of mass spectrometry, and in particular to a mass spectrum data compression method, a mass spectrum data compression system, a storage medium and electronic equipment.
Background
High-precision mass spectrometry generates an enormous amount of data: a mass spectrometry analysis can produce 10 GB or more of data per hour. With the development of mass spectrometry technology and the growing number of analysis projects, the volume of mass spectrum data is increasing rapidly, and the storage and transmission of mass spectrum data can become a problem. However, existing mass spectrum data compression relies on conventional data compression algorithms that are not specifically optimized for the characteristics of mass spectrum data, so the mass spectrum data still occupy excessive storage space.
Disclosure of Invention
In view of the above drawbacks of the prior art, an object of the present invention is to provide a mass spectrum data compression method, system, storage medium and electronic equipment, which solve the problem in the prior art that mass spectrum data still occupy too much storage space because compression is not specifically optimized for the characteristics of mass spectrum data.
In order to solve the above technical problem, the invention discloses a mass spectrum data compression method, which comprises the following steps:
Step one, collecting mass-to-charge ratio data of a plurality of mass spectra, ion intensity data corresponding to the mass-to-charge ratio data and mass spectrum number data corresponding to the mass-to-charge ratio data;
Step two, sorting the mass-to-charge ratio data, and generating a mass-to-charge ratio difference data set from the differences between adjacent mass-to-charge ratio data after sorting;
Step three, generating a corresponding mass spectrum number data set from the mass spectrum number data and the mass-to-charge ratio difference data set;
Step four, carrying out data compression on the mass-to-charge ratio difference data set, the ion intensity data set and the mass spectrum number data set through a data compression algorithm.
Further, step two of the mass spectrum data compression method provided by the invention comprises sorting the mass-to-charge ratio data from small to large.
Further, step two of the mass spectrum data compression method provided by the invention comprises retaining two decimal places for the mass-to-charge ratio data.
Further, step two of the mass spectrum data compression method provided by the invention comprises rounding the ion intensity data.
Further, in the mass spectrum data compression method provided by the invention, adjacent data in the mass spectrum number data set are stored as difference values.
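The four steps above can be illustrated with a minimal Python sketch. It is an interpretation under stated assumptions, not the implementation of the invention: the function names (`build_data_sets`, `compress_data_sets`, `decode_peaks`) are hypothetical, spectrum numbers are assumed to be 0-based list positions, every stored value is assumed to fit in an unsigned 32-bit integer, and `zlib` (DEFLATE) stands in for the data compression algorithm, although it is not one of the algorithms named in this document.

```python
import struct
import zlib


def build_data_sets(spectra):
    """Turn a list of spectra into three parallel integer lists.

    spectra: list of spectra, each a list of (mz, intensity) pairs.
    Assumption: spectrum numbers are the 0-based positions in that list.
    """
    # Step one: pool the peaks of all spectra with their spectrum numbers.
    # m/z keeps two decimal places (stored in units of 0.01); the ion
    # intensity is rounded to the nearest integer.
    peaks = [
        (round(mz * 100), round(intensity), number)
        for number, spectrum in enumerate(spectra)
        for mz, intensity in spectrum
    ]
    # Step two: sort by m/z from small to large, then difference neighbours.
    peaks.sort(key=lambda peak: peak[0])
    mz_deltas, previous = [], 0
    for mz_units, _, _ in peaks:
        mz_deltas.append(mz_units - previous)  # first entry is the value itself
        previous = mz_units
    intensities = [peak[1] for peak in peaks]
    # Step three: spectrum numbers in the same order as the sorted peaks.
    numbers = [peak[2] for peak in peaks]
    return mz_deltas, intensities, numbers


def _pack(values):
    # Assumption: every value fits in an unsigned 32-bit integer.
    return struct.pack(f"<{len(values)}I", *values)


def _unpack(blob):
    return list(struct.unpack(f"<{len(blob) // 4}I", blob))


def compress_data_sets(mz_deltas, intensities, numbers):
    # Step four: compress each data set separately (zlib as a stand-in).
    return tuple(zlib.compress(_pack(v)) for v in (mz_deltas, intensities, numbers))


def decode_peaks(compressed):
    mz_deltas, intensities, numbers = (_unpack(zlib.decompress(b)) for b in compressed)
    peaks, mz_units = [], 0
    for delta, intensity, number in zip(mz_deltas, intensities, numbers):
        mz_units += delta  # a running sum undoes the differencing
        peaks.append((mz_units / 100, intensity, number))
    return peaks


spectra = [
    [(100.012, 1500.4), (250.50, 20.6)],  # spectrum 0
    [(100.018, 980.2), (250.49, 33.3)],   # spectrum 1
]
data_sets = build_data_sets(spectra)
print(data_sets)  # ([10001, 1, 15047, 1], [1500, 980, 33, 21], [0, 1, 1, 0])
print(decode_peaks(compress_data_sets(*data_sets)))
```

Note that decoding recovers the quantized values (two-decimal mass-to-charge ratios and integer intensities), not the raw inputs, since the precision reduction in step two is lossy by design.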
Further, in the mass spectrum data compression method provided by the invention, the difference value between adjacent data in the mass spectrum number data set is: W = (k2 - k1 + K) % K, where k1 and k2 are two adjacent data and K is the total number of mass spectra.
Further, the data compression algorithm comprises an LZW algorithm, an LZSS algorithm, an arithmetic coding algorithm, a Huffman coding algorithm and an RLE coding algorithm.
Further, the present invention also provides a mass spectrum data compression system, the system comprising:
- a data acquisition module, used for acquiring mass-to-charge ratio data of a plurality of mass spectra, ion intensity data corresponding to the mass-to-charge ratio data and mass spectrum number data corresponding to the mass-to-charge ratio data;
- a data sorting module, used for sorting the mass-to-charge ratio data;
- a data difference calculation module, used for calculating the differences between adjacent mass-to-charge ratio data;
- a data rounding calculation module, used for rounding the ion intensity data;
- a data integration module, used for generating a mass-to-charge ratio difference data set, an ion intensity data set and a mass spectrum number data set;
- a data compression module, used for carrying out data compression on the mass-to-charge ratio difference data set, the ion intensity data set and the mass spectrum number data set through a data compression algorithm.
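The difference formula W = (k2 - k1 + K) % K for the mass spectrum number data set can be sketched as below. The function names are hypothetical, spectrum numbers are assumed to lie in the range 0 to K - 1, and the first entry is assumed to be differenced against 0, a detail this document does not specify. The sketch shows that the modulus keeps every stored value in the range 0 to K - 1 and that the original numbers can be recovered exactly.

```python
def encode_numbers(numbers, total):
    """Store each spectrum number as W = (k2 - k1 + K) % K.

    numbers: spectrum numbers, assumed to lie in 0 .. total - 1.
    total:   K, the total number of mass spectra.
    """
    encoded, previous = [], 0  # assumption: first entry is differenced against 0
    for current in numbers:
        encoded.append((current - previous + total) % total)
        previous = current
    return encoded


def decode_numbers(encoded, total):
    """Invert encode_numbers with a running sum modulo K."""
    numbers, previous = [], 0
    for w in encoded:
        previous = (previous + w) % total
        numbers.append(previous)
    return numbers


numbers = [0, 1, 2, 3, 0, 1, 2, 3]
encoded = encode_numbers(numbers, total=4)
print(encoded)                           # [0, 1, 1, 1, 1, 1, 1, 1]
print(decode_numbers(encoded, total=4))  # [0, 1, 2, 3, 0, 1, 2, 3]
```

In this example a plain difference would yield -3 where the numbers wrap from 3 back to 0, whereas the modular form yields 1, so the encoded sequence becomes a run of identical small non-negative values.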