CN-122024368-A - Method and system for detecting authenticity of embedded optically variable ink paper money for unbalanced small sample
Abstract
The invention discloses an embedded method and system for detecting the authenticity of optically variable ink (OVI) banknotes with small, imbalanced samples. It is used to authenticate OVI banknotes when samples are few and class-imbalanced, and does so through an integrated pipeline of imaging, constrained learning, calibration and fine-tuning, and end-side deployment. The pipeline combines infrared transmission imaging with a templated ROI, a constrained learning and calibration chain that takes TPR ≥ 99.9% as a hard constraint, support-vector pruning that stops at the recall red line, and deployment with LUT acceleration and a self-describing model package. Relying only on infrared transmission images, these measures enable the system to stably meet the targets of TPR ≥ 99.9% and overall accuracy ≥ 99.9% on low-compute devices while achieving the expected real-time processing and power consumption figures.
Inventors
- TAO YUKUN
- Yuan Rongshuai
- ZHANG YONGSHENG
Assignees
- 深圳市倍量科技股份有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2025-12-30
Claims (10)
- 1. An embedded method for detecting the authenticity of optically variable ink (OVI) banknotes with small, imbalanced samples, used to authenticate OVI banknotes when samples are few and class-imbalanced, characterized by comprising the following steps:
  Step 1, data preprocessing stage: acquiring banknote images under infrared transmitted light; after registering the whole banknote, performing geometric normalization, template cropping, radiometric normalization, lightweight denoising and scale unification; and extracting OVI samples from the numeric denomination area in the lower right corner according to templated coordinates;
  Step 2, construction stage of the embedded-constraint-oriented improved SVM model for OVI banknote authenticity detection: building and finalizing the detection model under an SVM framework, compressing the model size into a specified range through model training and model pruning, and outputting a lightweight model that can be deployed directly;
  Step 3, model deployment and end-side inference stage: determining the step size and upper bound of the LUT, constructing a one-dimensional exponential lookup table, reconstructing the distance for each input sample, and finally judging the authenticity of the sample;
  Step 4, result report generation stage: computing the TPR, overall accuracy, throughput and power consumption indicators and generating a banknote authenticity report.
- 2. The embedded method for detecting the authenticity of OVI banknotes with small, imbalanced samples according to claim 1, wherein the specific steps of the data preprocessing stage include:
  Step 1.1: performing four-point localization on the infrared transmission frame, estimating a homography matrix, and perspective-correcting the original image into a unified standard banknote-face coordinate system to obtain a registered standard banknote-face image;
  Step 1.2: for each denomination, pre-calibrating a rectangular template of the lower-right numeric denomination area in the standard coordinate system, and cropping the registered image in a single pass to obtain the OVI infrared transmission ROI;
  Step 1.3: first performing flat-field correction to attenuate slow intensity variation caused by backlight non-uniformity and paper thickness differences, then performing quantile clipping and linear stretching to unify the dynamic range, so that the gray-level contrast of different samples lies in a comparable interval;
  Step 1.4: applying lightweight denoising to suppress random noise and isolated dead pixels, while preserving digit texture and fine edge information as far as possible and avoiding over-smoothing of high-frequency structure that is useful for classification;
  Step 1.5: uniformly scaling the normalized ROI to a preset pixel size and flattening it into a vector of preset length, to suit the memory layout of traditional lightweight classifiers and embedded inference and to serve as the standard input scale for subsequent modeling.
- 3. The embedded method for detecting the authenticity of OVI banknotes with small, imbalanced samples according to claim 1, wherein the specific steps of the construction stage of the embedded-constraint-oriented improved SVM model for OVI banknote authenticity detection include:
  Step 2.1, model building: based on the OVI samples extracted as 1×1024 vectors from the lower-right numeric denomination area using templated coordinates, building an RBF-kernel discrimination model with class cost weights;
  Step 2.2, model training: performing constrained calibration around the hard constraint TPR ≥ 99.9%; locking recall by a β translation derived from an extreme quantile of the genuine-note scores; then jointly optimizing the kernel width and class weight under the red line to minimize false positives; and completing Platt probability calibration and single-side threshold determination to ensure consistent online criteria and stable decisions;
  Step 2.3, model pruning: pruning support vectors without breaking the recall red line or degrading the overall accuracy, compressing the model size into a specified range to meet the end-side compute and storage budget, and outputting a lightweight, directly deployable self-describing model package.
- 4. The embedded method for detecting the authenticity of OVI banknotes with small, imbalanced samples according to claim 3, wherein the specific steps of the model building in step 2.1 include:
  Step 2.1.1: using the intensity vector $x$ of the 1×1024 infrared transmission OVI sample as the only feature input, with zero-mean/unit-variance normalization applied per dimension, as shown in formula (1): $\tilde{x} = (x - \mu)/\sigma$ (1); wherein $x$ is the intensity vector of the original infrared transmission OVI region; $\tilde{x}$ is the normalized infrared transmission feature vector; $\mu$ is the per-dimension mean vector of the intensity features in the training set, i.e. the average infrared transmission intensity computed over a large number of genuine and counterfeit samples; $\sigma$ is the corresponding per-dimension standard deviation vector of the training set;
  Step 2.1.2: the classifier is configured as a radial basis function RBF-SVM, as shown in formulas (2) and (3): $K(\tilde{x}, \tilde{x}_i) = \exp(-\gamma \|\tilde{x} - \tilde{x}_i\|^2)$ (2); $f(\tilde{x}) = \sum_{i=1}^{N_{sv}} \alpha_i y_i K(\tilde{x}, \tilde{x}_i) + b_0$ (3); wherein, in formula (2), $\tilde{x}$ is the normalized OVI infrared transmission feature vector (1×1024) of the banknote under test; $\tilde{x}_i$ is the normalized feature vector of a training sample (typically a support vector); $\|\tilde{x} - \tilde{x}_i\|^2$ is their squared Euclidean distance in feature space, an inverse measure of the similarity of the transmission texture/density distribution of the two OVI regions (smaller means more similar); $\gamma$ is the kernel width parameter controlling how fast similarity decays with distance, where a larger $\gamma$ is more local/strict and a smaller $\gamma$ is smoother/broader; $K(\tilde{x}, \tilde{x}_i)$ is the kernel similarity, i.e. the degree of match between the sample and a given support vector; wherein, in formula (3), $f(\tilde{x})$ is the SVM decision score (a score, not a probability), where a larger score leans more towards Genuine (genuine note, positive class); $\tilde{x}_i$ is the i-th support vector (a representative sample in the training set); $N_{sv}$ is the number of support vectors (which also determines inference time and memory footprint in deployment); $y_i$ is the class label, commonly encoded as Genuine = +1, Counterfeit = -1; $\alpha_i$ is the dual coefficient (Lagrange multiplier), representing the contribution strength of the support vector to the decision surface; $b_0$ is the SVM bias/intercept term; the penalty coefficient of the soft-margin SVM is set to $C$, and the kernel width $\gamma$ is computed as shown in formula (4): $\gamma = s / (d \cdot \sigma_X^2)$ (4); wherein $d$ is the dimension of the input feature; $s$ is a single-parameter scaling factor; $\sigma_X^2$ is the overall variance of the training set before normalization;
  Step 2.1.3: a two-step calibration oriented to $\mathrm{TPR} \ge 99.9\%$, i.e. a hard recall constraint followed by false-positive minimization, is performed on the validation set, with optimal values marked by $*$: (1) decision bias for the recall constraint: collect the genuine-note scores on the validation set and record their 0.1% quantile as $\beta$; the inference stage is shown in formula (5): $f_\beta(\tilde{x}) = f(\tilde{x}) - \beta$ (5), so that the model satisfies $\mathrm{TPR} \ge 99.9\%$ a priori; (2) $s$ and $w$ are solved on the validation set as shown in formula (6): $(s^*, w^*) = \arg\min_{s, w} \mathrm{FPR}_{val}(s, w)$ s.t. $\mathrm{TPR}_{val}(s, w) \ge 99.9\%$ (6); wherein $\mathrm{TPR}_{val}$ is the true positive rate/recall on the validation set; $\mathrm{FPR}_{val}$ is the false positive rate on the validation set, i.e. the rate at which counterfeit notes are judged genuine; $w$ is the class cost weight, i.e. the loss weight used to adjust the misclassification cost of different classes in extremely imbalanced scenarios; the optimized kernel width is then frozen as shown in formula (7): $\gamma^* = s^* / (d \cdot \sigma_X^2)$ (7); wherein $\gamma^*$ is the kernel width parameter finally frozen for deployment; $s^*$ is the optimal scaling factor obtained by constrained optimization;
  Step 2.1.4: probability calibration and single-threshold binary decision: Platt logistic regression calibration maps $f_\beta(\tilde{x})$ to a probability $p$, as shown in formula (8): $p = 1 / (1 + \exp(a f_\beta(\tilde{x}) + b))$ (8); wherein $f_\beta(\tilde{x})$ is the decision score after the recall-constraint translation ($\beta$ translation), which satisfies the TPR ≥ 99.9% recall red line a priori; $a$ and $b$ are the fitted parameters of the Platt calibration (fitted on the validation set), used to map SVM scores to a more interpretable and more stable probability output; a single threshold is solved under the constraint on the validation set as shown in formula (9): $\tau^* = \arg\min_{\tau} \mathrm{FPR}_{val}(\tau)$ s.t. $\mathrm{TPR}_{val}(\tau) \ge 99.9\%$ (9); wherein $\mathrm{FPR}_{val}(\tau)$ is the false positive rate on the validation set; the binary output rule is shown in formula (10): $\hat{y} = \text{Genuine}$ if $p \ge \tau^*$, otherwise $\hat{y} = \text{Counterfeit}$ (10); wherein $\hat{y}$ is the class finally output on the end side; $p$ is the calibrated probability that the note is genuine; $\tau^*$ is the single-side threshold selected under the constraint, which ensures the verifiability and consistency of the decision strategy.
- 5. The embedded method for detecting the authenticity of OVI banknotes with small, imbalanced samples according to claim 3, wherein the specific steps of the model training in step 2.2 include:
  Step 2.2.1, data preparation: the data source comprises OVI samples extracted as 1×1024 vectors from the lower-right numeric denomination area using templated coordinates, together with the class labels Genuine/Counterfeit; the data are divided into three non-overlapping sets, namely a training set, a test set and a validation set; data normalization: all data are normalized as shown in formula (11): $\tilde{x} = (x - \mu_{train}) / \sigma_{train}$ (11); wherein $\mu_{train}$ is the mean of the training set data; $\sigma_{train}$ is the standard deviation of the training set data, held fixed; $x$ is the original input feature vector; $\tilde{x}$ is the normalized input feature vector;
  Step 2.2.2, training the SVM model: (1) grid setting: set candidate values for the penalty coefficient $C$, the kernel width scaling factor $s$, and the class cost weight ratio $w$; (2) training the RBF-SVM with probability output enabled: an RBF-kernel SVM is trained on the training set with the probability output function enabled, using Platt scaling internally for probability estimation, to obtain the support vector set and the bias $b_0$, and the decision score $f(\tilde{x})$ of each sample in the validation set is computed; wherein $f(\tilde{x})$ is the decision score; $\alpha_i$ are the support vector contribution coefficients; $w$ is the class cost weight ratio; (3) recall constraint ($\mathrm{TPR} \ge 99.9\%$): take the score set of all positive samples (genuine notes) in the validation set and compute its 0.1% quantile as $\beta$; translate the decision scores of all samples to obtain the translated scores $f_\beta(\tilde{x}) = f(\tilde{x}) - \beta$, at which point the validation set satisfies $\mathrm{TPR}_{val} \ge 99.9\%$; (4) Platt probability calibration: Platt calibration maps the decision score $f_\beta(\tilde{x})$ to a calibrated probability $p$ with actual probabilistic meaning, so that it better represents the posterior probability that the sample is positive; wherein $f_\beta(\tilde{x})$ is the decision evidence after the recall-constraint translation; $p$ is the calibrated probability that the note is genuine; (5) single-threshold selection: on the calibrated probabilities $p$, for which the constraint is already satisfied, a single threshold $\tau$ is searched on the validation set such that $\mathrm{FPR}_{val}$ is minimal under the constraint $\mathrm{TPR}_{val} \ge 99.9\%$; (6) determining the optimal combination: from all parameter combinations $(C, s, w)$, select the combination that satisfies $\mathrm{TPR}_{val} \ge 99.9\%$ with minimal $\mathrm{FPR}_{val}$, and freeze the resulting optimal parameters to form the final production model;
  Step 2.2.3, testing and evaluation: testing on the test set, reusing the parameters obtained in the training stage to output results, and recording the inference latency.
- 6. The embedded method for detecting the authenticity of OVI banknotes with small, imbalanced samples according to claim 3, wherein the specific steps of the model pruning in step 2.3 include:
  Step 2.3.1, targets and constraints: with the decision chain and input specification unchanged, the support vector set is size-constrained and sparsified, with the following targets and constraints: constraint: $\mathrm{TPR}_{val} \ge 99.9\%$ is satisfied on the validation set; target: minimize $\mathrm{FPR}_{val}$ under the recall red line constraint and ensure that the overall accuracy does not degrade, while limiting the number of support vectors $N_{sv}$ to meet the end-side latency and storage budget;
  Step 2.3.2, pruning and output: for each support vector in the current model, evaluate its marginal contribution, sort by marginal contribution, and remove the support vectors with the lowest contribution first (the smaller the contribution, the earlier the vector is pruned); after pruning, re-evaluate on the validation set and compare against the set constraints and targets; if they are met, continue pruning, otherwise roll back and reduce the pruning step size; stop when $N_{sv}$ reaches its specified limit or when further pruning would violate the TPR red line, and output the self-describing model package after pruning.
- 7. The embedded method for detecting the authenticity of OVI banknotes with small, imbalanced samples according to claim 1, wherein the specific steps of the model deployment and end-side inference stage include:
  Step 3.1, model export: exporting the parameters required for end-side inference from the trained model, including the support vector matrix, the coefficient vector, the statistics and calibration parameters, and the squared norm of each support vector;
  Step 3.2, end-side inference: constructing a lookup table and replacing the exponential computation in the RBF kernel with LUT + linear interpolation, to accelerate inference, reduce power consumption and make latency more predictable.
- 8. The embedded method for detecting the authenticity of OVI banknotes with small, imbalanced samples according to claim 7, wherein the specific steps of the model export in step 3.1 include:
  Step 3.1.1: exporting the statistics and calibration parameters of the trained model;
  Step 3.1.2: the structural scale of the model, including the input dimension $d$ and the number of support vectors $N_{sv}$;
  Step 3.1.3: the model core, including: the support vector matrix; the coefficient vector $c$, wherein $c_i = \alpha_i y_i$ is the effective weight of the i-th support vector; and the pre-stored squared norm $\|\tilde{x}_i\|^2$ of each support vector.
- 9. The embedded method for detecting the authenticity of OVI banknotes with small, imbalanced samples according to claim 8, wherein the specific steps of the end-side inference in step 3.2 include:
  Step 3.2.1: after the model is loaded, the step size $\Delta$ and the upper bound $t_{max}$ of the LUT are determined once on the device;
  Step 3.2.2: computing the table length as shown in formula (12): $M = \lceil t_{max} / \Delta \rceil$ (12); wherein $M$ is the maximum index of the LUT; $t_{max}$ is the maximum effective range of the exponent argument $t$; $\Delta$ is the discrete sampling interval (resolution) of the lookup table;
  Step 3.2.3: constructing the one-dimensional exponential lookup table as shown in formula (13): $\mathrm{LUT}[k] = \exp(-k\Delta),\ k = 0, 1, \dots, M$ (13); the lookup table holds $M + 1$ entries, which determines its storage in bytes; wherein $M$ is the maximum index of the LUT, $k$ is the table index, and $\Delta$ is the sampling interval;
  Step 3.2.4: for each support vector $\tilde{x}_i$, reconstructing the distance from the pre-stored squared norms to obtain $t_i = \gamma^* (\|\tilde{x}\|^2 + \|\tilde{x}_i\|^2 - 2\,\tilde{x}^\top \tilde{x}_i)$, where $\tilde{x}$ is the normalized input sample and $\gamma^*$ is the frozen kernel width, and restricting $t_i$ to the legal interval $t_i \in [0, t_{max}]$;
  Step 3.2.5: completing table lookup and interpolation as shown in formulas (14), (15) and (16): $k_i = \lfloor t_i / \Delta \rfloor$ (14); $\lambda_i = t_i / \Delta - k_i$ (15); $K_i = (1 - \lambda_i)\,\mathrm{LUT}[k_i] + \lambda_i\,\mathrm{LUT}[k_i + 1]$ (16); wherein $\lambda_i$ is the linear interpolation coefficient; $K_i$ is the RBF kernel value (similarity);
  Step 3.2.6: accumulation, i.e. the weighted sum of the contributions of all support vectors, as shown in formula (17): $S = \sum_{i=1}^{N_{sv}} c_i K_i$ (17); wherein $S$ is the weighted accumulation of all support vector contributions; $c_i$ is the effective weight of the i-th support vector; $K_i$ is the kernel similarity between the i-th support vector and the current sample;
  Step 3.2.7: in the final decision, adding the bias and the recall translation, and judging authenticity using the Platt probability.
- 10. An embedded system for detecting the authenticity of OVI banknotes with small, imbalanced samples, for implementing the method according to any one of claims 1 to 9, comprising:
  a data labeling and preprocessing module, used for acquiring banknote images under infrared transmitted light; after registering the whole banknote, performing geometric normalization, template cropping, radiometric normalization, lightweight denoising and scale unification; and extracting OVI samples from the lower-right numeric denomination area according to templated coordinates;
  a construction module for the embedded-constraint-oriented improved SVM model for OVI banknote authenticity detection, used for building and finalizing the detection model under an SVM framework, compressing the model size into a specified range through model training and model pruning, and outputting a lightweight model that can be deployed directly;
  a model deployment and end-side inference module, used for replacing the exponential computation in the RBF kernel with LUT + linear interpolation to accelerate inference, reduce power consumption and make latency more predictable;
  and a result report generation module, used for computing the TPR, overall accuracy, throughput and power consumption indicators and generating a banknote authenticity report.
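The recall-constraint translation described in claims 4 and 5 (taking a low quantile of the genuine-note validation scores as β and shifting every score by it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `recall_shift` and `tpr_fpr`, the synthetic Gaussian scores, and the choice of an order statistic instead of an interpolated quantile are all assumptions made here so that the TPR bound holds exactly on the validation set.

```python
import numpy as np

def recall_shift(genuine_scores: np.ndarray, q: float = 0.001) -> float:
    """Return beta, an order statistic at or just below the q-quantile.

    At most floor(q * n) genuine scores lie strictly below beta, so the rule
    'score - beta >= 0' keeps the validation TPR at or above 1 - q.
    (Assumption: an order statistic is used rather than an interpolated quantile.)
    """
    ordered = np.sort(genuine_scores)
    k = int(np.floor(q * len(ordered)))  # genuine samples allowed below beta
    return float(ordered[k])

def tpr_fpr(scores: np.ndarray, labels: np.ndarray, beta: float) -> tuple[float, float]:
    """TPR and FPR of 'genuine if score - beta >= 0' (labels: +1 genuine, -1 counterfeit)."""
    predicted_genuine = (scores - beta) >= 0.0
    tpr = float(np.mean(predicted_genuine[labels == 1]))
    fpr = float(np.mean(predicted_genuine[labels == -1]))
    return tpr, fpr

# Synthetic, purely illustrative validation scores (not data from the patent):
# 4500 genuine notes and 60 counterfeits, mimicking a strongly imbalanced set.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(2.0, 1.0, 4500), rng.normal(-3.0, 1.0, 60)])
labels = np.concatenate([np.ones(4500, dtype=int), -np.ones(60, dtype=int)])

beta = recall_shift(scores[labels == 1])
tpr, fpr = tpr_fpr(scores, labels, beta)
print(f"beta={beta:.3f}  TPR={tpr:.4%}  FPR={fpr:.4%}")
```

Because β is fixed first, any later search over kernel width, class weight or probability threshold only has to reduce the false positive rate; the recall red line is already satisfied on the validation set by construction.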
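The LUT-based end-side inference of claim 9 (exponential lookup table, distance reconstruction from pre-stored squared norms, linear interpolation, weighted accumulation, then bias, recall translation and Platt probability) might look roughly like the sketch below. All names (`build_lut`, `lut_exp`, `svm_score_lut`, `platt_probability`), the step size 1/64, the upper bound 16, the Platt sign convention, the 0.5 threshold and the random model parameters are hypothetical choices for illustration; the patent text does not state them.

```python
import numpy as np

def build_lut(t_max: float, delta: float) -> np.ndarray:
    """Table of exp(-k * delta) for k = 0..M, with M = ceil(t_max / delta)."""
    m = int(np.ceil(t_max / delta))
    return np.exp(-delta * np.arange(m + 1))

def lut_exp(t, lut: np.ndarray, delta: float, t_max: float):
    """Approximate exp(-t) by table lookup plus linear interpolation."""
    m = len(lut) - 1
    t = np.clip(np.asarray(t, dtype=np.float64), 0.0, t_max)  # legal interval [0, t_max]
    pos = t / delta
    k = np.minimum(pos.astype(np.int64), m - 1)  # left index; keeps k + 1 inside the table
    lam = pos - k                                # interpolation coefficient in [0, 1]
    return (1.0 - lam) * lut[k] + lam * lut[k + 1]

def svm_score_lut(x, sv, coef, sv_sqnorm, gamma, b0, beta, lut, delta, t_max) -> float:
    """Shifted RBF-SVM score using pre-stored squared norms and the LUT kernel."""
    d2 = x @ x + sv_sqnorm - 2.0 * (sv @ x)      # ||x - x_i||^2 without forming differences
    k_vals = lut_exp(gamma * d2, lut, delta, t_max)
    return float(coef @ k_vals + b0 - beta)      # weighted sum + bias - recall translation

def platt_probability(score: float, a: float, b: float) -> float:
    """Platt mapping; the sign convention 1 / (1 + exp(a * score + b)) is assumed."""
    return float(1.0 / (1.0 + np.exp(a * score + b)))

# Hypothetical model parameters, for illustration only.
rng = np.random.default_rng(1)
d, n_sv = 1024, 20
sv = rng.standard_normal((n_sv, d))              # support vector matrix
coef = rng.uniform(-1.0, 1.0, n_sv)              # effective weights alpha_i * y_i
x = rng.standard_normal(d)                       # normalized input sample
delta, t_max = 1.0 / 64.0, 16.0                  # assumed LUT step size and upper bound
lut = build_lut(t_max, delta)

score = svm_score_lut(x, sv, coef, np.sum(sv * sv, axis=1), 1.0 / d, 0.1, 0.05, lut, delta, t_max)
p = platt_probability(score, a=-2.0, b=0.0)
print("Genuine" if p >= 0.5 else "Counterfeit", round(p, 4))
```

Pre-storing the squared norms reduces the per-vector work to one dot product, and the table removes every call to the exponential function, which is what makes latency on a small MCU easier to bound; the table resolution trades memory against interpolation error.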
Description
Method and system for detecting authenticity of embedded optically variable ink paper money for unbalanced small sample
Technical Field
The invention relates to the technical field of computer vision and inspection, and in particular to an embedded method and system for detecting the authenticity of optically variable ink banknotes with small, imbalanced samples.
Background
Detection based on Random Forest (RF) and detection based on Support Vector Machine (SVM) are two common banknote authenticity detection methods.
The random-forest-based method is a statistical learning technique that performs ensemble discrimination using multiple randomized decision trees. Its core idea is to draw bootstrap samples of the training data and to randomly select a feature subset at each split node, which keeps the base learners diverse; at prediction time, samples are classified by majority vote or mean probability. Thanks to the variance-suppression effect of multi-tree ensembling, RF is reasonably robust to noise and feature correlation and suits high-dimensional but relatively sparsely structured representation vectors. The method does not depend on a deep network but relies on statistical aggregation over many models to produce stable output. However, in scenarios with extreme class imbalance, leaf-node probabilities are biased, and simple thresholding struggles to keep false positives low under a very high recall constraint (such as 99.9%), so an additional cost-sensitive or calibration mechanism is needed.
SVM-based banknote authenticity detection is a discriminative statistical learning method built around margin maximization.
The basic idea is to extract a vectorized representation (either flattened pixels or a small set of texture/morphology statistics) from the normalized banknote target area, map the samples into a high-dimensional feature space via the kernel trick, and learn the hyperplane that maximizes the margin between the two classes; at prediction time, a small number of support vectors determine the sign and magnitude of the decision function (which can be turned into a score through probability calibration). The method does not depend on a deep network, generalizes well with small samples and high-dimensional features, and can handle nonlinear boundaries through the choice of kernel function (such as RBF), so it is widely used as a baseline in bill and anti-counterfeiting detection. However, in a real financial environment, genuine notes far outnumber counterfeit notes, so the SVM decision boundary is often pulled by the majority class (genuine notes), counterfeit notes are easily misjudged as genuine, and the ultra-low false-positive requirement of financial-grade applications is hard to meet. In addition, the RBF-SVM inference stage must evaluate an exponential kernel function between each test sample and every support vector, which imposes a heavy compute and energy burden on embedded devices and makes end-side real-time requirements hard to meet.
The prior art has the following disadvantages:
1) Metric distortion under extreme sample imbalance.
In real banknote inspection scenarios the sample distribution is extremely imbalanced: genuine notes far outnumber counterfeit notes, and counterfeit samples are extremely scarce. Random forests, decision trees and conventional binary classifiers (including SVM) usually train to minimize overall loss or to optimize voting, and do not build in the hard recall constraint "TPR ≥ 99.9%". Meanwhile, the minority class has high statistical variance, and probabilities and thresholds lack stable calibration, so the decision boundary is pulled by the majority class and the threshold is highly sensitive to distribution drift. As a result, ultra-high recall and sufficiently low false positives are hard to achieve simultaneously in an online environment, and an overall accuracy of no less than 99.9% cannot be reached stably.
2) Poor fit for embedded resources. On the embedded side, the conventional methods adapt poorly to the available resources: a random forest often needs hundreds of trees for variance suppression, so model size and memory footprint grow linearly with the number of trees, and traversing all trees at inference accumulates latency and power consumption; a single deep tree is smaller but prone to overfitting and sensitive to noise; tree traversal involves many branch jumps and non-contiguous memory accesses, which significantly lowers pipeline efficiency on small-core CPUs/MCUs; and mechanisms for end-side model size, memory and latency budgets and for quantization and compression are lacking, so th