CN-122020122-A - Dynamic mixed curvature manifold-based Clifford multi-vector semantic processing method and device, electronic equipment and storage medium


Abstract

The invention provides a Clifford multi-vector semantic processing method and device based on a dynamic mixed curvature manifold, electronic equipment and a storage medium, and relates to the fields of natural language processing, deep learning and neural network computing. The method comprises: mapping an input sequence into a Clifford algebra multi-vector representation; generating mixed curvature weights for three subspaces (spherical, hyperbolic and Euclidean) and normalizing them continuously; projecting the multi-vector onto each subspace according to the weights; in each subspace, forming an attention score from the scalar part of the geometric product and a continuous nonlinear function transformation of the magnitude of the coefficient vector of the bivector part; obtaining attention weights by Softmax normalization; performing weighted aggregation of the value multi-vectors to obtain the subspace outputs; mapping the subspace outputs to the tangent space by logarithmic mapping; and fusing them by weight to output a semantic processing result, wherein the mapping and stabilizing mechanisms are in continuously differentiable form and contain no conditional branches based on threshold comparison.

Inventors

JI YANDA
ZHANG HONGMENG
GENG HAO
YANG HAO

Assignees

Nanjing University of Aeronautics and Astronautics (南京航空航天大学)

Dates

Publication Date: 20260512
Application Date: 20260203

Claims (10)

1. A Clifford multi-vector semantic processing method based on dynamic mixed curvature manifold, characterized by comprising the following steps: S1, receiving an input sequence, and mapping the input sequence into a multi-vector feature representation in a Clifford algebra space, wherein the multi-vector comprises a scalar part, a vector part and a bivector part; S2, generating a mixed curvature distribution parameter through a curvature prediction network based on the multi-vector feature representation or the context feature thereof, wherein the mixed curvature distribution parameter comprises a positive curvature weight, a negative curvature weight and a zero curvature weight, wherein the positive curvature weight corresponds to a spherical subspace, the negative curvature weight corresponds to a hyperbolic subspace and the zero curvature weight corresponds to a Euclidean subspace; S3, based on the mixed curvature distribution parameters, respectively projecting the multi-vector feature representation to the spherical subspace, the hyperbolic subspace and the Euclidean subspace to obtain a triplet subspace feature representation, and using the triplet subspace feature representation as a manifold representation for subsequent attention calculation, wherein the manifold representation is a parallel triplet representation; S4, respectively constructing a query multi-vector, a key multi-vector and a value multi-vector in the spherical subspace, the hyperbolic subspace and the Euclidean subspace, and respectively calculating an attention score and a subspace attention output by using a geometric product operator; S5, mapping the attention outputs of the spherical subspace, the hyperbolic subspace and the Euclidean subspace back to the tangent space through logarithmic mapping respectively, and obtaining a fused context representation through weighted fusion according to the mixed curvature distribution parameters; S6, inputting the fused context representation into a task head to output a semantic processing result.
2. The method of claim 1, wherein step S2 further comprises normalizing the positive curvature weight, the negative curvature weight and the zero curvature weight by Softmax, temperature Softmax or LogSumExp to obtain a weight vector, wherein the weight vector satisfies $w_i \ge 0$ and $\sum_i w_i = 1$.
3. The Clifford multi-vector semantic processing method based on dynamic mixed curvature manifold of claim 1, wherein the curvature prediction network comprises a feature transformation sub-network and a weight output layer, wherein the weight output layer outputs an un-normalized curvature score vector $u = [u_-, u_0, u_+]$ and applies Softmax normalization, temperature Softmax normalization or LogSumExp normalization to the curvature score vector to obtain the weight vector $w = [w_-, w_0, w_+]$, and wherein the curvature prediction network generates the weights per token, per attention head, or per network layer.
4. The method according to claim 1, wherein in step S3, the projection process includes applying, to the vector portion of the multi-vector or a learnable linear combination thereof, an exponential/logarithmic mapping constraint matched to the curvature parameter, while the scalar portion and the bivector portion keep Euclidean coordinates or use a homomorphic continuous mapping.
5. The method according to claim 1, wherein in step S4, the attention score comprises the scalar component of the geometric product result and an outer product scalar term, the outer product scalar term being obtained by computing the magnitude of the coefficient vector of the bivector part of the geometric product result and applying a continuous nonlinear function transformation to that magnitude, wherein the scalar component and the outer product scalar term are linearly combined to form an attention score matrix, which is normalized by Softmax to generate the attention weights.
6. The Clifford multi-vector semantic processing method based on dynamic mixed curvature manifold of claim 1, wherein in step S6, the semantic processing result comprises: a retrieval score, a re-ranking score, or a generation conditioning context for a retrieval-augmented generation (RAG) system; inference results, closed-loop consistency labels or confidence levels for knowledge-graph inference or closed-loop consistency determination; category probabilities for text classification; the next-token probability distribution for sequence generation; a ranking score for ranking or recommendation.
7. The Clifford multi-vector semantic processing method based on dynamic mixed curvature product manifold of claim 1, wherein the components of different grades of the multi-vector are stored as contiguous memory blocks in a multi-channel real tensor; the scalar component and the bivector component of the geometric product are computed in a vectorized parallel manner in one or more fused operators that cover at least the attention scoring and weighted aggregation operations, wherein the fused operators run on a GPU, NPU or TPU; the fused operator caches intermediate multi-vector results in on-chip shared memory and/or registers and performs intra-block reduction to reduce global memory round trips; the fused operator completes the geometric product calculation, grade projection and bivector magnitude calculation in the same kernel, and further completes, in the same kernel, the continuous nonlinear transformation of the outer product scalar term, the linear combination of the scalar component and the outer product scalar term, temperature scaling and additive mask biasing to generate the attention score; and at least one of the intermediate multi-vector results and the intermediate attention scores is not written back to global memory or external memory.
8. A multi-vector semantic processing device comprising: a multi-vector embedding module for receiving an input sequence and mapping the input sequence into a multi-vector feature representation in a Clifford algebra space, wherein the multi-vector comprises a scalar part, a vector part and a bivector part; a curvature prediction module for generating mixed curvature distribution parameters through a curvature prediction network, wherein the mixed curvature distribution parameters comprise a positive curvature weight, a negative curvature weight and a zero curvature weight, the positive curvature weight corresponds to a spherical subspace, the negative curvature weight corresponds to a hyperbolic subspace, and the zero curvature weight corresponds to a Euclidean subspace; a manifold mapping module for projecting the multi-vector feature representation to the spherical subspace, the hyperbolic subspace and the Euclidean subspace respectively to obtain a triplet subspace feature representation and using the triplet subspace feature representation as a manifold representation for subsequent attention calculation, wherein the manifold representation is a parallel triplet representation; a geometric algebra attention module for respectively constructing a query multi-vector, a key multi-vector and a value multi-vector in the spherical subspace, the hyperbolic subspace and the Euclidean subspace, and respectively calculating an attention score and a subspace attention output by using a geometric product operator; a tangent space fusion module for mapping the attention outputs of the spherical subspace, the hyperbolic subspace and the Euclidean subspace back to the tangent space through logarithmic mapping respectively, and performing weighted fusion according to the mixed curvature distribution parameters to obtain a fused context representation; and an output module for inputting the fused context representation into a task head so as to output a semantic processing result.
9. An electronic device comprising a processor and a memory, the memory storing a computer program, characterized in that the computer program, when executed on the processor, causes the electronic device to perform the Clifford multi-vector semantic processing method based on dynamic mixed curvature manifold of any of claims 1 to 7.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed on a processor, implements the Clifford multi-vector semantic processing method based on dynamic mixed curvature product manifold according to any of claims 1 to 7.
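
By way of illustration only, the attention score of claim 5 can be sketched in Python for the smallest algebra that contains a scalar part, a vector part and a bivector part, namely Cl(2,0) with components ordered (scalar, e1, e2, e12). The choice of Cl(2,0), the function names, the tanh nonlinearity, the coefficients alpha and beta and the smoothing constant eps are assumptions made for this sketch; the claims do not fix any of them.

```python
import math

def geometric_product(a, b):
    """Geometric product in Cl(2,0); a and b are (scalar, e1, e2, e12) tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 + a1 * b1 + a2 * b2 - a3 * b3,  # scalar part (e12 squares to -1)
        a0 * b1 + a1 * b0 - a2 * b3 + a3 * b2,  # e1 coefficient
        a0 * b2 + a2 * b0 + a1 * b3 - a3 * b1,  # e2 coefficient
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,  # e12 (bivector) coefficient
    )

def attention_score(q, k, alpha=1.0, beta=0.5, eps=1e-6):
    """Linear combination of the scalar component and an outer product scalar term.

    alpha, beta, eps and tanh are hypothetical choices for illustration.
    """
    prod = geometric_product(q, k)
    scalar_term = prod[0]
    # Cl(2,0) has one bivector coefficient; the smoothed square root keeps the
    # magnitude differentiable at zero without a threshold branch.
    bivector_magnitude = math.sqrt(prod[3] ** 2 + eps)
    outer_term = math.tanh(bivector_magnitude)
    return alpha * scalar_term + beta * outer_term

def softmax(scores):
    """Softmax with the usual max subtraction for numerical range."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Return (attention weights, weighted aggregate of the value multi-vectors)."""
    weights = softmax([attention_score(query, k) for k in keys])
    output = tuple(
        sum(w * v[c] for w, v in zip(weights, values)) for c in range(4)
    )
    return weights, output

# Usage: a query along e1 attends to keys along e1 and e2.
query = (0.0, 1.0, 0.0, 0.0)
keys = [(0.0, 1.0, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0)]
weights, output = attend(query, keys, keys)
```

In this sketch the key parallel to the query contributes through the scalar term, while the orthogonal key contributes only through the bivector magnitude, which is the kind of relationship a plain dot product would score as zero.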

Description

Dynamic mixed curvature manifold-based Clifford multi-vector semantic processing method and device, electronic equipment and storage medium

Technical Field

The invention relates to the fields of natural language processing, deep learning and neural network computing, in particular to a Clifford multi-vector semantic processing method, device, electronic equipment and storage medium based on dynamic mixed curvature manifold.

Background

In recent years, neural network models based on attention mechanisms have been widely used in natural language processing and multimodal learning. To improve a model's ability to capture hierarchical structures, long-range dependencies and knowledge relationships, some solutions embed semantic vectors into non-Euclidean spaces (e.g., hyperbolic or spherical spaces). However, existing solutions generally have the following problems:
(1) The topology prior is limited to a single type. Using only one kind of curvature space (e.g., only negative curvature or only positive curvature) makes it difficult to characterize at the same time the hierarchical, cyclic and parallel/tiled structures that are ubiquitous in language.
(2) Numerical stability relies on hard threshold branching. Some schemes introduce threshold-based conditional branches when calculating distance, normalization or mapping (for example, an alternative formula is adopted when the norm is smaller than a threshold), so that the calculation graph is discontinuous, parallel execution efficiency is reduced, and training is prone to instability.
(3) Operator expressive power is limited. Attention scoring often uses scalar operators such as dot products or distances, which makes it difficult to express higher-order geometric relationships (e.g., the directed area or subspace spanned by two semantic vectors, or asymmetric relationships between them), thereby limiting the capacity for complex inference.
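
The difference between the threshold-branch style described in problem (2) and a branch-free alternative can be illustrated with a short Python sketch. Both functions and the smoothing constant eps are hypothetical and are used only to make the contrast concrete; the smoothed-norm form shown is one common branch-free choice and is not taken from the source.

```python
import math

def normalize_hard(x, threshold=1e-6):
    """Threshold-branch style: switches to an alternative formula for small norms."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm < threshold:  # the output jumps when the norm crosses the threshold
        return [0.0 for _ in x]
    return [c / norm for c in x]

def normalize_smooth(x, eps=1e-6):
    """Branch-free style: one formula for all inputs, differentiable at zero."""
    norm = math.sqrt(sum(c * c for c in x) + eps * eps)  # never exactly zero
    return [c / norm for c in x]

# Just below and just above the threshold, the hard version jumps from 0 to 1,
# while the smooth version changes gradually.
below, above = [5e-7, 0.0], [2e-6, 0.0]
hard_jump = (normalize_hard(below)[0], normalize_hard(above)[0])
smooth_pair = (normalize_smooth(below)[0], normalize_smooth(above)[0])
```

The smooth variant executes the same instructions for every input, which also avoids divergent control flow when many elements are processed in parallel.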
Therefore, a new semantic processing method is needed that can adaptively select among multiple geometric topologies and express relationships with stronger geometric operators while remaining differentiable and hardware-friendly.

Disclosure of Invention

In view of the defects of the prior art, the invention provides a Clifford multi-vector semantic processing method, a device, electronic equipment and a storage medium based on dynamic mixed curvature manifold, to solve at least the following problems of the prior art: the semantic embedding space has a single topology, so that hierarchical, periodic and parallel structures cannot be modeled at the same time; numerical stability depends on threshold branches, which leads to a discontinuous calculation graph, low hardware efficiency and unstable training; and attention operators are mainly scalar similarities, which makes it difficult to express high-order geometric relationships and asymmetric semantic interactions.
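
The adaptive and differentiable selection among geometries can be illustrated by a minimal Python sketch of the temperature Softmax normalization of curvature scores and the weighted fusion of tangent-space outputs recited in the claims. The function names, the ordering (hyperbolic, Euclidean, spherical) and the example numbers are assumptions for illustration; in the method itself the un-normalized scores would come from the curvature prediction network.

```python
import math

def curvature_weights(scores, temperature=1.0):
    """Temperature Softmax over (u_minus, u_zero, u_plus), computed via LogSumExp."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    # Each weight is positive and the weights sum to 1.
    return [math.exp(s - log_sum_exp) for s in scaled]

def fuse(weights, tangent_outputs):
    """Weighted sum of the three tangent-space outputs, one per subspace."""
    dim = len(tangent_outputs[0])
    return [
        sum(w * out[d] for w, out in zip(weights, tangent_outputs))
        for d in range(dim)
    ]

# Usage with hypothetical scores favouring the hyperbolic subspace.
w_soft = curvature_weights((1.0, 0.0, 0.0), temperature=1.0)
w_sharp = curvature_weights((1.0, 0.0, 0.0), temperature=0.1)
fused = fuse([0.5, 0.25, 0.25], [[2.0, 0.0], [0.0, 4.0], [4.0, 4.0]])
```

A lower temperature concentrates the weight on the highest-scoring subspace while every weight stays strictly positive, so the selection remains soft and gradients reach all three branches.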
To achieve the above object, a first aspect of the present invention provides a Clifford multi-vector semantic processing method based on dynamic mixed curvature manifold, comprising the steps of:
Step S1, acquiring a feature representation of an input sequence, and mapping the feature representation into a multi-vector feature representation in a Clifford algebra space, wherein the multi-vector feature representation at least comprises a scalar part, a vector part and a bivector part;
Step S2, generating a mixed curvature distribution parameter through a curvature prediction network based on the multi-vector feature representation or the context feature thereof, wherein the mixed curvature distribution parameter at least comprises a positive curvature weight, a negative curvature weight and a zero curvature weight, and obtaining a weight vector through Softmax normalization, temperature Softmax normalization or LogSumExp normalization processing, wherein the weight vector satisfies $w_i \ge 0$ and $\sum_i w_i = 1$, the positive curvature weight corresponds to a spherical subspace, the negative curvature weight corresponds to a hyperbolic subspace, and the zero curvature weight corresponds to a Euclidean subspace;
Step S3, based on the mixed curvature distribution parameters, respectively projecting the multi-vector feature representation to the spherical subspace, the hyperbolic subspace and the Euclidean subspace to obtain a triplet subspace feature representation $(X_-, X_0, X_+)$, and using the triplet subspace feature representation as a manifold representation for subsequent attention calculation, wherein the manifold representation is a parallel triplet representation;
Step S4, respectively constructing a query multi-vector, a key multi-vector and a value multi-vector in the spherical subspace, the hyperbolic subspace and the Euclidean subspace, and respectively calculating an attention score and a subspace attention output by using