CN-121997978-A - MOSFET performance prediction method based on contrastive learning and knowledge distillation
Abstract
The invention discloses a method for predicting MOSFET performance based on contrastive learning and knowledge distillation, suitable for predicting the performance degradation of MOSFET devices on vehicle-mounted circuit boards. The method takes the MOSFET on-resistance as input. It first aligns the data with a temperature compensation mechanism and reduces data noise with Kalman filtering, then extracts features by phase space reconstruction, with the reconstruction dimension and time delay determined by grid search. On this basis, a multi-scale Transformer teacher model is constructed within a contrastive learning framework; the multi-scale Transformer comprises a high-time-resolution branch and a low-time-resolution branch, whose features are fused under the constraint of a contrastive loss. Finally, a lightweight student model is constructed by knowledge distillation: the Transformer serves as the teacher model for training a student model with fewer parameters, yielding a lightweight model for predicting the performance of the MOSFET device.
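As an illustration of the phase space reconstruction step named in the abstract, the following is a minimal Python sketch of time-delay embedding with a grid search over the embedding dimension m and the time delay τ. It is not taken from the patent: the function names `delay_embed` and `grid_search` are hypothetical, and the scoring function is left to the caller because the text does not specify how prediction accuracy is evaluated for each parameter combination.

```python
def delay_embed(series, m, tau):
    """Time-delay embedding of a 1-D sequence.

    Row i of the returned phase space matrix is
    [x[i], x[i + tau], ..., x[i + (m - 1) * tau]].
    """
    if m < 1 or tau < 1:
        raise ValueError("m and tau must be positive integers")
    n_rows = len(series) - (m - 1) * tau
    if n_rows <= 0:
        raise ValueError("series is too short for this (m, tau)")
    return [[series[i + j * tau] for j in range(m)] for i in range(n_rows)]


def grid_search(series, m_values, tau_values, score_fn):
    """Return the (m, tau) pair with the lowest score.

    score_fn(matrix, m, tau) is a caller-supplied (hypothetical) function
    that returns a prediction error for the given embedding; lower is better.
    """
    best = None
    for m in m_values:
        for tau in tau_values:
            try:
                matrix = delay_embed(series, m, tau)
            except ValueError:
                continue  # skip combinations the series cannot support
            score = score_fn(matrix, m, tau)
            if best is None or score < best[0]:
                best = (score, m, tau)
    if best is None:
        raise ValueError("no valid (m, tau) combination found")
    return best[1], best[2]
```

In practice `score_fn` would train or evaluate a predictor on each candidate phase space matrix and return its validation error, so that the selected pair reflects prediction accuracy rather than a purely geometric criterion.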
Inventors
- SONG XUDONG
- SONG LINLIN
Assignees
- Dalian Jiaotong University (大连交通大学)
Dates
- Publication Date
- 20260508
- Application Date
- 20260123
Claims (8)
- 1. A MOSFET performance prediction method based on contrastive learning and knowledge distillation, characterized by comprising the following steps: Step one, acquiring raw on-resistance data of the MOSFET device and performing signal preprocessing on the on-resistance data, wherein the preprocessing comprises data alignment and data noise reduction; the data alignment removes the thermal drift component from the data, and the data noise reduction uses Kalman filtering; then performing feature mapping on the preprocessed data, wherein the feature mapping uses phase space reconstruction to map the data to a high-dimensional feature space so as to obtain a phase space matrix; Step two, establishing a teacher model based on a contrastive learning framework and a multi-scale Transformer, wherein the multi-scale Transformer consists of a long-term feature extractor and a short-term feature extractor; the teacher model takes the phase space matrix as input, extracts the long-term and short-term degradation features of the MOSFET through the multi-scale Transformer, and performs feature fusion based on the contrastive learning framework so as to realize MOSFET performance prediction; Step three, constructing a lightweight student model, wherein the lightweighting is realized based on a knowledge distillation framework, and the teacher model guides the student model to complete knowledge transfer, so that the student model retains the ability to predict MOSFET device performance while its parameter count is reduced.
- 2. The method of claim 1, wherein in step one, the signal preprocessing process uses a temperature compensation mechanism to perform data alignment to remove non-fault resistance drift generated by the rise of junction temperature of the device, and the calculation formula of the temperature compensation model is as follows:
- 3. wherein the terms of the formula denote, in order: the true on-resistance value at time t after the thermal drift is eliminated; the raw resistance value acquired at time t; the temperature coefficient of resistance of the MOSFET device; the monitored junction temperature at time t; and the reference temperature. The compensated data are then input to the Kalman filter noise reduction process, which outputs a smoothed degradation sequence.
- 4. The method according to claim 1, wherein in step one, the phase space reconstruction comprises performing a traversal calculation over a preset parameter range by grid search, determining the optimal embedding dimension m and time delay τ according to the influence of each parameter combination on prediction accuracy, and mapping the original data to a high-dimensional phase space according to the optimal embedding dimension m and time delay τ.
- 5. The method of claim 1, wherein in step two, the teacher model is constructed based on a multi-scale Transformer, and the training process is constrained by a multi-scale temporal contrastive loss function. The multi-scale Transformer is composed of a long-term feature extractor and a short-term feature extractor, which extract long-term degradation feature vectors and short-term local degradation feature vectors respectively. Under the contrastive learning framework, the multi-scale Transformer takes the output of the trained long-term feature extractor as the fused degradation feature and inputs it into a prediction network to predict degradation performance. The prediction network is a fully connected neural network. The long-term feature extractor consists of a positional encoding module, a multi-head self-attention module and a fully connected neural network module, wherein the modules are joined by residual connections and followed by normalization. The short-term feature extractor consists of a positional encoding module, a two-dimensional convolutional neural network module and a fully connected neural network module, wherein the modules are joined by residual connections and followed by normalization. The long-term feature extractor and the short-term feature extractor take phase space features at different time resolutions as input and are trained under the contrastive learning framework, wherein the contrastive loss function is defined as follows:
- 6. wherein the terms of the contrastive loss function L denote, in order: the number of training samples; the long-term degradation feature output by the long-term feature extractor; the short-term degradation feature output by the short-term feature extractor; the output of the prediction network; and the target value for the corresponding sample. The first term of L is the cosine similarity of the long-term and short-term features, and the second term is the prediction MSE loss based on the output of the long-term feature extractor.
- 7. The method according to claim 1, wherein in step three, the lightweight algorithm is implemented based on a knowledge distillation framework, the distillation loss function of which is defined as follows:
- 8. wherein the first term is the Kullback-Leibler divergence loss based on soft targets, which constrains the output probability distribution of the student model to approximate the output probability distribution of the teacher model; and the second term is the cosine similarity loss in feature space, which constrains the high-dimensional feature vectors of the student model to remain geometrically consistent with the feature vectors of the teacher model.
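The claim text above names the two terms of the distillation loss, but the formula itself is not legible here. The following Python sketch therefore assumes one common form: a weighted sum of a temperature-softened KL divergence (teacher relative to student) and one minus the cosine similarity of the feature vectors. The weight `alpha`, the softmax `temperature`, the KL direction and all function names are assumptions for illustration, not values or definitions from the patent.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the maximum for stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0.0)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def distillation_loss(student_logits, teacher_logits,
                      student_feat, teacher_feat,
                      temperature=2.0, alpha=0.5):
    """Assumed form: alpha * KL(teacher || student) + (1 - alpha) * (1 - cos)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl_term = kl_divergence(p_teacher, p_student)
    feat_term = 1.0 - cosine_similarity(student_feat, teacher_feat)
    return alpha * kl_term + (1.0 - alpha) * feat_term
```

Under these assumptions the loss is approximately zero when the student reproduces both the teacher's output distribution and the direction of its feature vector, and grows as either diverges. The feature term compares directions only, so the student and teacher feature vectors must have the same length, for example after a projection layer.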
Description
MOSFET performance prediction method based on contrastive learning and knowledge distillation
Technical Field
The invention relates to the field of artificial intelligence, and in particular to a method for predicting the performance degradation of a MOSFET (metal-oxide-semiconductor field-effect transistor) based on contrastive learning and knowledge distillation.
Background
MOSFETs serve as switching elements in vehicle-mounted inverters, DC-DC converters and battery management systems, where they are subjected to mechanical shock, alternating thermal stress and high-power loads over long periods. These working conditions readily cause thermal fatigue cracks in the bonding wires and solder layers inside the device, so that its performance degrades irreversibly. If this degradation process is not monitored, it can lead to circuit failure or even thermal runaway. Predicting MOSFET performance over the full life cycle is therefore important for guaranteeing the operating safety of the whole vehicle. In a vehicle-mounted operating environment, degradation prediction based on on-resistance faces two difficulties: signal interference and feature extraction.
On the one hand, the on-resistance value is easily affected by junction temperature fluctuation, and the acquisition link is easily disturbed by high-voltage electromagnetic pulses, so non-fault thermal drift and random measurement noise coexist in the time series data and the true degradation trajectory is difficult to extract directly. On the other hand, the MOSFET aging process is nonlinear and time-varying, containing both a long-term degradation trend and short-term local fluctuations; conventional data-driven methods struggle to capture this multi-time-scale information in a unified feature space, which impairs the prediction capability of the model under variable working conditions. Although neural networks such as the Transformer improve time series prediction for signals with complex components, their model size and computational overhead are typically large. An automotive-grade control unit has limited computing resources and storage space, so a highly complex prediction model is difficult to deploy on it directly. How to construct a lightweight performance prediction method that has strong anti-interference capability and suits vehicle-mounted applications, while preserving prediction capability and respecting model complexity and engineering deployment requirements, therefore remains a technical problem to be solved.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides a MOSFET device performance prediction method based on contrastive learning and knowledge distillation. The invention constructs a knowledge transfer framework comprising a high-precision teacher model and a lightweight student model. In the data preprocessing stage, it reduces the noise of the original data by Kalman filtering and removes the thermal drift component of the resistance caused by junction temperature rise by means of data alignment.
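The preprocessing described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the compensation uses the common linear temperature model R_raw = R_true · (1 + α · (T_j − T_ref)), which may differ from the patent's own formula, and the Kalman filter is a scalar random-walk filter whose variances `process_var`, `meas_var` and `init_var` are hypothetical tuning values. The function names are likewise illustrative.

```python
def compensate_temperature(r_raw, t_junction, alpha, t_ref=25.0):
    """Remove thermal drift under an assumed linear temperature model.

    Assumes r_raw[t] = r_true[t] * (1 + alpha * (t_junction[t] - t_ref)),
    so the compensated value is r_raw[t] divided by that factor.
    """
    return [r / (1.0 + alpha * (t - t_ref))
            for r, t in zip(r_raw, t_junction)]


def kalman_filter_1d(values, process_var=1e-5, meas_var=1e-2, init_var=1.0):
    """Scalar Kalman filter with a random-walk state model.

    The state is the slowly varying on-resistance; each measurement is
    the state plus white noise of variance meas_var.
    """
    if not values:
        return []
    estimate = values[0]   # initial state estimate
    error_var = init_var   # initial estimate uncertainty
    filtered = []
    for z in values:
        # Predict: the state is assumed constant, uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement by the Kalman gain.
        gain = error_var / (error_var + meas_var)
        estimate += gain * (z - estimate)
        error_var *= (1.0 - gain)
        filtered.append(estimate)
    return filtered
```

A typical call would chain the two steps, for example `kalman_filter_1d(compensate_temperature(r_raw, t_junction, alpha=0.005))`, where the value of `alpha` is device-specific and would normally come from the datasheet or a calibration run.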
The optimal embedding dimension and time delay are then determined by grid search, and the resistance data are mapped to a high-dimensional feature space by phase space reconstruction. Finally, a lightweight student model is constructed by knowledge distillation, so that the student model inherits the prediction accuracy of the teacher model while the parameter count and computational complexity of the model are significantly reduced. The invention adopts the following technical means: a MOSFET device performance prediction method based on contrastive learning and knowledge distillation, comprising the following steps: Acquiring raw on-resistance data of the MOSFET device and performing signal preprocessing on the on-resistance data, wherein the preprocessing comprises data alignment and data noise reduction, the data alignment removes the thermal drift component caused by junction temperature rise, and the data noise reduction uses Kalman filtering; then performing feature mapping on the preprocessed data, wherein the feature mapping uses phase space reconstruction to map the data to a high-dimensional feature space so as to obtain a phase space matrix; Building a teacher model based on a contrastive learning framework and a multi-scale Transformer, wherein the multi-scale Transformer is composed of a long-term feature extractor and a short-term feature extractor. The teacher model takes the phase space matrix as input, extracts the long-term and short-term degradation features of the MOSFET through the multi-scale Transformer, and completes feature fusion under a compar