CN-122020279-A - Astronomical light curve characterization and classification method based on contrastive learning
Abstract
The invention provides an astronomical light curve characterization and classification method based on contrastive learning, which can automatically analyze time-series light curve data from different sky surveys. The method first preprocesses the raw light curve sequence (for example, by normalization), then feeds the resulting sequence into a Transformer-based representation model and obtains general features shared across surveys through hierarchical contrastive learning at the sample level and the time level. In the fine-tuning stage, the pre-trained model is loaded and a classification head is fine-tuned to perform classification. At the system level, a unified inference interface is provided that automatically loads the model weights and outputs the classification result together with a confidence score. Compared with traditional methods that rely on handcrafted features or a single data source, the method can transfer across sky surveys after a single pre-training run, improves feature consistency under different observation conditions, and scales well, providing a highly reliable scheme for intelligent processing of large-scale time-domain astronomical data.
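The preprocessing stage mentioned above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the three steps detailed in claim 2 (3-sigma outlier removal, separate min-max normalization of time and flux, and mask padding to a common length). The function names, the use of the population standard deviation, the zero pad value, and truncation of over-long sequences are all assumptions made for the example.

```python
import statistics


def sigma_clip(times, fluxes, n_sigma=3.0):
    """Drop points whose flux is more than n_sigma standard deviations from the mean."""
    mean = statistics.fmean(fluxes)
    std = statistics.pstdev(fluxes)  # population std; an assumption of this sketch
    if std == 0.0:
        return list(times), list(fluxes)
    kept = [(t, f) for t, f in zip(times, fluxes) if abs(f - mean) <= n_sigma * std]
    return [t for t, _ in kept], [f for _, f in kept]


def min_max(values):
    """Rescale values linearly to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pad_with_mask(values, length, pad_value=0.0):
    """Truncate or right-pad to a fixed length; mask is 1 for real points, 0 for padding."""
    real = list(values[:length])
    n_pad = length - len(real)
    return real + [pad_value] * n_pad, [1] * len(real) + [0] * n_pad


def preprocess(times, fluxes, length):
    """Clip outliers, normalize time and flux separately, then pad both to `length`."""
    t, f = sigma_clip(times, fluxes)
    t_pad, mask = pad_with_mask(min_max(t), length)
    f_pad, _ = pad_with_mask(min_max(f), length)
    return t_pad, f_pad, mask
```

The returned mask lets a downstream encoder ignore padded positions, so light curves of different lengths from different surveys can share one batch.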
Inventors
- Ding Junyao
- Chen Xiaodian
- Gao Xinyi
- Tang Xiaoyu
- Qi Xinyu
- Xue Guirong
Assignees
- Zhejiang Lab (之江实验室)
- National Astronomical Observatories, Chinese Academy of Sciences (中国科学院国家天文台)
Dates
- Publication Date: 2026-05-12
- Application Date: 2025-12-25
Claims (10)
- 1. An astronomical light curve characterization and classification method based on contrastive learning, characterized by comprising the following steps: acquiring light curve data sets from different astronomical survey telescopes; preprocessing the light curve data sets to generate light curve data in a uniform format; inputting the light curve data into a deep representation model based on a Transformer encoder architecture, extracting consistent time-series features from the light curve data using a self-attention mechanism, and aligning the time-series features under unsupervised conditions using a hierarchical contrastive learning mechanism, so as to obtain general light curve features that can transfer across sky surveys and to complete the pre-training of the deep representation model, wherein the hierarchical contrastive learning mechanism comprises time-level contrastive learning and sample-level contrastive learning; performing supervised fine-tuning on the deep representation model using labeled variable star sample data to perform the variable star classification task; and automatically reading a light curve data file to be processed through a unified inference interface and outputting the target category and a confidence score.
- 2. The astronomical light curve characterization and classification method based on contrastive learning according to claim 1, characterized in that the preprocessing of the light curve data set comprises: first removing outlier observation points according to the 3-sigma criterion; then applying min-max normalization to the time sequence and the flux sequence separately; and then using mask padding to align the differences in light curve sequence length between different survey sources.
- 3. The astronomical light curve characterization and classification method based on contrastive learning according to claim 1, wherein the deep representation model is a deep contrastive learning model based on a Transformer encoder structure.
- 4. The astronomical light curve characterization and classification method based on contrastive learning according to claim 1, wherein the time-level contrastive learning comprises: (a) randomly cropping two partially overlapping subsequences from the same light curve; (b) inputting the two subsequences separately into the deep representation model to obtain the corresponding time-series feature vectors; (c) within the overlapping interval of the two feature sequences, according to the alignment of the time indexes, defining feature vectors at positions with the same time index as positive sample pairs and feature vectors at positions with different time indexes as negative sample pairs; (d) training the deep representation model on the positive sample pairs and the negative sample pairs.
- 5. The astronomical light curve characterization and classification method based on contrastive learning according to claim 1, wherein the sample-level contrastive learning comprises: (a) inputting multiple light curves simultaneously in one training batch; (b) generating, for each light curve, two augmented sequences with a partially overlapping region, and inputting all augmented sequences into the deep representation model to obtain their time-series feature vectors; (c) on the aligned time indexes of the feature sequences, defining the feature vectors of two augmented sequences that come from the same light curve and have the same time index as positive sample pairs; (d) training the deep representation model on the positive sample pairs and the negative sample pairs.
- 6. The astronomical light curve characterization and classification method based on contrastive learning according to claim 1, wherein the unified inference interface supports multi-format data input and automatic model loading; when a new light curve is input, the system automatically performs data preprocessing and classification, and the model inference result outputs both a predicted category and a confidence score.
- 7. The astronomical light curve characterization and classification method based on contrastive learning according to claim 1, characterized in that the method is applied to light curves observed by different sky surveys to achieve transfer learning across telescope data, wherein the transfer is achieved by maintaining consistency of the feature space distribution together with a contrastive loss constraint, and is suitable for variable star identification under different time sampling rates and observation conditions.
- 8. An electronic device comprising a memory and a processor, wherein the memory is coupled to the processor, the memory is configured to store program data, and the processor is configured to execute the program data to implement the astronomical light curve characterization and classification method based on contrastive learning according to any one of claims 1-6.
- 9. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the astronomical light curve characterization and classification method based on contrastive learning according to any one of claims 1-6.
- 10. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the astronomical light curve characterization and classification method based on contrastive learning according to any one of claims 1-6.
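The two contrastive objectives described in claims 4 and 5 can be illustrated with a small InfoNCE-style sketch. It is only a sketch under stated assumptions: the claims do not give a loss formula, so the softmax-over-dot-products form, the function names, and the equal weighting of the two terms are hypothetical. Claim 5 also does not define its negative pairs; the code assumes they are features of other curves in the batch at the same time index. Inputs are per-timestamp feature vectors already restricted to the overlapping, time-aligned interval.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def info_nce(anchor, positive, negatives):
    """-log softmax score of the positive among positive + negatives (log-sum-exp form)."""
    logits = [dot(anchor, positive)] + [dot(anchor, n) for n in negatives]
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


def temporal_loss(view1, view2):
    """Time-level term for one curve: view1[t] and view2[t] share a time index.

    Positive: the same time index in the other view.
    Negatives: every other time index, taken from both views.
    """
    T = len(view1)
    total = 0.0
    for t in range(T):
        negatives = [view2[s] for s in range(T) if s != t]
        negatives += [view1[s] for s in range(T) if s != t]
        total += info_nce(view1[t], view2[t], negatives)
    return total / T


def instance_loss(batch1, batch2):
    """Sample-level term: batch1[i][t] and batch2[i][t] are two views of curve i.

    Positive: the same curve and time index in the other view.
    Negatives (assumed): other curves in the batch at the same time index.
    """
    B, T = len(batch1), len(batch1[0])
    total = 0.0
    for t in range(T):
        for i in range(B):
            negatives = [batch2[j][t] for j in range(B) if j != i]
            negatives += [batch1[j][t] for j in range(B) if j != i]
            total += info_nce(batch1[i][t], batch2[i][t], negatives)
    return total / (B * T)


def hierarchical_loss(batch1, batch2):
    """Equal-weight average of the two terms; the weighting is an assumption."""
    temporal = sum(temporal_loss(a, b) for a, b in zip(batch1, batch2)) / len(batch1)
    return 0.5 * (temporal + instance_loss(batch1, batch2))
```

In practice the feature vectors would come from the Transformer encoder and the loss would be computed with an automatic-differentiation framework; plain lists are used here only to keep the example self-contained.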
Description
Astronomical light curve characterization and classification method based on contrastive learning
Technical Field
The invention belongs to the technical field of artificial intelligence algorithms and astronomical time-domain data processing, and particularly relates to an astronomical light curve characterization and classification method based on contrastive learning.
Background
With the continuous operation of large-scale time-domain sky survey projects (such as the Transiting Exoplanet Survey Satellite, TESS; the Zwicky Transient Facility, ZTF; Gaia, etc.), astronomy has entered a time-domain big data era in which time is a core dimension. Different survey telescopes continuously generate massive numbers of light curves over long observation campaigns, recording how the brightness of celestial objects changes with time. Light curve data contain rich information about physical processes and can be used to identify many types of objects, such as pulsating variables (Cepheids, RR Lyrae, Delta Scuti, and the like), eclipsing binaries, rotational variables, and non-variable stars. However, the observation systems differ significantly in sampling frequency, observation band, signal-to-noise ratio, and data completeness, so the distribution characteristics and temporal structure of their light curves are inconsistent, which poses great challenges for automated classification and feature extraction. Currently, mainstream light curve classification models are trained on manually constructed features (such as period, amplitude, and Fourier parameters) combined with traditional machine learning algorithms (such as random forests and support vector machines).
This approach requires astronomy experts to design features in advance for each variable type, has a high labor cost, and is difficult to scale to massive data volumes. In recent years, deep learning models (particularly convolutional and recurrent networks) have been introduced into light curve analysis and can, to some extent, learn time-series features automatically. However, these deep learning models usually require large-scale labeled data for supervised training, and obtaining variable star labels depends heavily on manual comparison of light curve features, which is costly. In addition, the inconsistency between data from different surveys markedly reduces the generalization performance of a model in cross-telescope scenarios; that is, a model trained on a single data set cannot be directly transferred to another survey task. At present, there is still relatively little research on transfer learning between multiple survey telescopes. Different sky survey systems differ significantly in observation strategy, such as observation interval, observation length, and signal-to-noise ratio, and these factors directly affect the temporal structure and feature distribution of the light curves. Because the time sampling of different surveys is uneven and their time spans differ greatly, directly transferring a deep learning model between data sources often leads to feature mismatch and performance degradation, which is the main technical difficulty of cross-survey learning. Existing work concentrates on light curve analysis for a single telescope or a single sampling rate, and an effective model that achieves unified representation and stable classification under multiple observation conditions is lacking.
With the rapid development of contrastive learning in the field of representation learning, feature alignment models based on unsupervised signals have achieved significant breakthroughs in fields such as images and speech. These models learn efficient data representations by constructing positive and negative sample pairs in the sample space, maximizing the consistency among similar samples and minimizing the similarity among dissimilar samples. However, most existing contrastive learning models originate from fields such as images and speech and are not designed for the long time-series structure, multi-source heterogeneity, and non-uniform sampling of astronomical light curves, so they are difficult to apply directly to the unified representation of multi-survey light curve data. Therefore, there is a need for a light curve representation model that can simultaneously handle the differences between multiple sky surveys, has unsupervised feature learning capability, and can achieve general feature transfer between different data sources. The model should be able to: Under the condition of no label, extract