CN-121052937-B - System for stock selection based on a distributed high-efficiency index formula

Abstract

The invention provides a system for stock selection based on a distributed high-efficiency index formula, comprising a front-end proxy service module, a plurality of shard calculation service modules and a data source service module. The front-end proxy service module receives a user request containing a sector factor and an index formula, obtains the reference stock pool list for stock selection according to the sector factor, and divides that list into a plurality of shards by taking each stock's incremental internal code modulo the total number of shards, generating a corresponding shard calculation request for each shard. An internal shard calculation node management module then distributes each request to the corresponding shard calculation service module using an adaptive consistent hash algorithm keyed on the shard identifier. By combining a front-end proxy with shard computing, in an architecture based on MapReduce-style task distribution and result aggregation, the system improves the real-time performance and concurrent processing capacity of stock selection. The adaptive consistent hash strategy dynamically adjusts the distribution of virtual nodes and physical nodes, which solves the problem of unbalanced sharding and supports elastic scaling of nodes.
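
As a non-limiting illustration, the modulo-based sharding and result aggregation summarized above may be sketched in Python as follows. The function names (build_internal_codes, shard_pool, aggregate) are hypothetical and do not come from the disclosure, and the sketch assumes that incremental internal codes are consecutive integers assigned in ascending stock-code order.

```python
from collections import defaultdict


def build_internal_codes(market_stock_codes):
    """Map each stock code to an incremental internal code.

    Assumption: codes are consecutive integers assigned in ascending
    stock-code order over the whole-market stock pool.
    """
    return {code: i for i, code in enumerate(sorted(market_stock_codes))}


def shard_pool(reference_pool, internal_codes, total_shards):
    """Split a reference stock pool into sub stock pools keyed by shard id.

    shard id = incremental internal code modulo the total number of shards.
    """
    shards = defaultdict(list)
    for stock_code in reference_pool:
        shard_id = internal_codes[stock_code] % total_shards
        shards[shard_id].append(stock_code)
    return dict(shards)


def aggregate(shard_results):
    """Merge the screening results returned by every shard into one list."""
    merged = []
    for result in shard_results:
        merged.extend(result)
    return sorted(merged)
```

Because the internal codes are dense integers, the modulo step spreads any sufficiently large stock pool almost evenly across shards; for example, shard_pool(pool, build_internal_codes(market), 3) yields at most three sub stock pools, one per shard identifier.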

Inventors

GE CHUNGUANG
YU QINGFENG
HE HAO
Ji Diriha

Assignees

上海大智慧信息科技有限公司

Dates

Publication Date: 20260505
Application Date: 20251106

Claims (7)

1. A system for stock selection based on a distributed high-efficiency index formula, characterized by comprising a front-end proxy service module, a plurality of shard calculation service modules and a data source service module;
the front-end proxy service module is used for receiving a user request, the user request comprising a sector factor to be calculated and an index formula; the front-end proxy service module obtains the corresponding stock pool list according to the sector factor as the reference stock pool list for stock selection, traverses the reference stock pool list, performs a modulo operation on the incremental internal code of each stock and the total number of shards to determine the shard to which each stock belongs, and generates a plurality of shard calculation requests, each shard calculation request comprising the sub stock pool list corresponding to one shard and a shard identifier;
the process by which the front-end proxy service module generates the plurality of shard calculation requests comprises the following steps:
step a1, the front-end proxy service module maintains an incremental internal code mapping table for the whole-market A-share stock pool, the mapping table storing the correspondence between stock codes and incremental internal codes;
step a2, the front-end proxy service module subscribes to change notifications for the whole-market A-share stock pool list and dynamically maintains the incremental internal codes in the mapping table in stock-code order;
step a3, the front-end proxy service module traverses the reference stock pool list and, for each stock in the list, performs a modulo operation on the incremental internal code of the stock and the total number of shards to determine the shard identifier to which the stock belongs;
step a4, the front-end proxy service module traverses all shard identifiers and gathers the stocks belonging to the same shard identifier to form the sub stock pool list corresponding to each shard identifier;
the shard calculation service module is used for receiving the shard calculation request, obtaining stock information parameters from the data source service module through a predefined application program interface, evaluating the index formula for the stocks in the sub stock pool list, and returning the stock screening results to the front-end proxy service module;
the data source service module is used for providing the stock information parameters to the shard calculation service module;
the front-end proxy service module aggregates the stock screening results returned by the plurality of shard calculation service modules to generate a final stock selection result;
the front-end proxy service module distributes the plurality of shard calculation requests in an adaptive consistent hash manner according to the shard identifier through an internal shard calculation node management module, the process comprising the following steps:
step b1, the front-end proxy service module traverses all shard identifiers and constructs a shard calculation request for each shard identifier, the shard calculation request comprising the core fields of the user request, the shard identifier and the corresponding sub stock pool list;
step b2, the front-end proxy service module submits the plurality of shard calculation requests in parallel to the shard calculation node management module through an internally maintained thread pool;
step b3, the shard calculation node management module performs route forwarding for each shard calculation request through an internally maintained scheduler, the route forwarding comprising:
step b3.1, the scheduler maintains a hash ring onto which all shard calculation service nodes and their attached virtual nodes are mapped;
step b3.2, for each received shard calculation request, the scheduler calculates a hash value from the shard identifier in the request and determines a target shard calculation service node on the hash ring based on the hash value;
step b3.3, the scheduler forwards the shard calculation request to the target shard calculation service node;
and the shard calculation node management module dynamically adjusts the number of physical nodes participating in the hash ring, or the distribution of virtual nodes, in an adaptive consistent hash manner according to the real-time load state of each shard calculation service node, wherein the adaptive consistent hash manner comprises an adaptive consistent hash capacity-expansion node algorithm and an adaptive consistent hash capacity-reduction node algorithm.
2. The system for stock selection based on a distributed high-efficiency index formula according to claim 1, wherein the adaptive consistent hash capacity-expansion node algorithm is DCHEN(S(t), Θ), wherein:
DCHEN is the adaptive consistent hash capacity-expansion node processing function;
S(t) = […] is the real-time node state set, in which […] represents the current CPU utilization of the i-th shard calculation service node and […] represents the current memory (Mem) utilization of the i-th shard calculation service node;
Θ = { […], […], […], start, end, […], […] } is the set of service system configuration parameters, in which: […] is the upper Mem threshold; […] is the upper CPU threshold; […] is the default number of virtual nodes added when a physical node is expanded; […] is the step value for adjusting the virtual node fluctuation rate during capacity reduction; start is the start time of DCHEN function execution, defaulting to the market opening time; end is the end time of DCHEN function execution, defaulting to the market closing time; […] is the maximum number of service nodes in the shard calculation node cluster;
[…] is the function that expands physical nodes on global overload; […] is the function that reduces the virtual nodes of a single overloaded node; […] is the function that records the state data of each node to disk;
[…] is the adaptive node capacity-expansion factor, defined as: […], calculated by traversing all node services and taking the proportion whose CPU or Mem utilization exceeds the threshold, wherein […];
the adaptive consistent hash capacity-expansion node algorithm comprises the following steps:
step C1, the shard calculation node management module in the front-end proxy service module asynchronously executes the DCHEN(S(t), Θ) function on a schedule between the system parameters start and end, calculates the value of […] in real time according to S(t), and performs the timed judgment and processing of the capacity-expansion flow;
step C2, when […] is not less than 0.5, triggering the […] node expansion operation;
step C3, when […], triggering […] to perform the virtual node capacity-reduction operation of the designated node;
step C4, when the conditions of neither step C2 nor step C3 are satisfied, executing the […] function to perform the node state storage process, recording the state of each shard calculation node to disk by time point.
3. The system for stock selection based on a distributed high-efficiency index formula according to claim 2, wherein in step C2 the […] node expansion operation comprises the following substeps:
step C2.1, the shard calculation node management module in the front-end proxy service module notifies the node operation through a service bus, performs the node expansion operation through the K8S API, and completes registration of the new shard calculation node service;
step C2.2, after the new physical node […] has been automatically expanded, the […] virtual nodes […] of physical node […] and physical node […] itself are added to the hash ring;
step C2.3, when a new shard calculation request arrives, the shard data around physical node […] and virtual nodes […] is automatically reassigned, while shard data not allocated to physical node […] and virtual nodes […] continues to fall to its previous node.
4. The system for stock selection based on a distributed high-efficiency index formula according to claim 2, wherein in step C3 the […] operation of performing virtual node capacity reduction on the designated node comprises the following substeps:
step C3.1, the shard calculation node management module in the front-end proxy service module finds, by traversal, the list […] of all nodes exceeding the threshold;
step C3.2, for the list […] of nodes exceeding the threshold, the number of virtual nodes is adaptively reduced, the new number of virtual nodes of each node in the list being calculated according to the capacity-reduction function […];
step C3.3, when a new shard calculation request arrives, the shard data of the nodes in the list […] exceeding the threshold is automatically reassigned.
5. The system for stock selection based on a distributed high-efficiency index formula according to claim 1, wherein the adaptive consistent hash capacity-reduction node algorithm is DCHSN([…], Θ), wherein:
DCHSN is the adaptive consistent hash capacity-reduction node processing function;
[…] is the set of states of all nodes during the most recent trading period, […] = { S(t) | t ∈ { […], ..., […] } }; S(t) = […] is the node state set at a designated time point;
Θ = { […], […], […], α, β, start, end, collectTime } is the set of service system configuration parameters, in which: […] is the minimum number of service nodes in the shard calculation node cluster; […] is the step value for adjusting the virtual node fluctuation rate during virtual node capacity expansion; […] is the lower CPU threshold; […] is the lower Mem threshold; α is the weight of CPU in the capacity-reduction calculation; β is the weight of Mem in the capacity-reduction calculation; start is the start time for collecting the node state set, defaulting to the market opening time; end is the end time for collecting the node state set, defaulting to the market closing time; collectTime is the time point at which the DCHSN function is executed on a schedule;
[…] is the adaptive node capacity-reduction factor, defined as: […], calculated by traversing all node services and taking the proportion whose CPU and Mem utilization is below the threshold;
[…] is the function that reduces physical nodes on global low load; […] is the function that expands the virtual nodes of a single low-load node;
the adaptive consistent hash capacity-reduction node algorithm comprises the following steps:
step D1, the shard calculation node management module in the front-end proxy service module asynchronously executes the DCHSN([…], Θ) function on a schedule according to the system parameter collectTime, calculates the value of […] for the current day according to […], and performs the timed judgment and processing of the capacity-reduction flow;
step D2, when […], triggering the […] node reduction operation;
step D3, when […], triggering […] to perform the virtual node capacity-expansion operation of the designated node.
6. The system for stock selection based on a distributed high-efficiency index formula according to claim 5, wherein in step D2 the […] node reduction operation comprises the following substeps:
step D2.1, when executing the node reduction function, the shard calculation node management module in the front-end proxy service module traverses the average state value of each node in […] and finds the node […] with the lowest utilization, calculated as: […] = […], wherein […] represents the CPU utilization of the j-th shard calculation service node at time t; […] represents the Mem utilization of the j-th shard calculation service node at time t; and […] is the total number of time points collected in the set […]; the target node with the lowest average CPU and Mem utilization is calculated according to this formula;
step D2.2, the internal node management module of the front-end proxy service module notifies the node cleanup operation through a service bus and performs the node reduction operation through the K8S API;
step D2.3, after node […] has been cleaned up successfully, the virtual nodes […] associated with node […] and node […] itself are removed from the hash ring;
step D2.4, when a new shard calculation request arrives, the shard data around node […] and virtual nodes […] is automatically reassigned and transferred to other nodes for processing.
7. The system for stock selection based on a distributed high-efficiency index formula according to claim 6, wherein in step D3 the […] operation of performing virtual node capacity expansion on the designated node comprises the following substeps:
step D3.1, the shard calculation node management module in the front-end proxy service module finds, by traversal, the list […] of all nodes below the threshold;
step D3.2, for the list […] of nodes below the threshold, the number of virtual nodes is adaptively expanded, the new number of virtual nodes of each node in the list being calculated according to the capacity-expansion function […];
step D3.3, when a new shard calculation request arrives, new data is automatically reassigned to the nodes in the list […] below the threshold.
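
As a non-limiting illustration of the hash-ring routing and load-based decisions recited in the claims, a minimal Python sketch is given below. All names (HashRing, overload_ratio, lowest_load_node) are hypothetical. The formulas for the capacity-expansion factor and for the lowest-load node are not legible in this text, so the sketch assumes a simple proportion of overloaded nodes and a weighted average of CPU and Mem utilization; the MD5 hash and the default of eight virtual nodes are likewise assumptions.

```python
import bisect
import hashlib


def _hash(key):
    # Assumption: any stable hash works; MD5 is used only for illustration.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent hash ring holding physical nodes and their virtual nodes."""

    def __init__(self):
        self._points = []  # sorted hash positions on the ring
        self._owner = {}   # hash position -> physical node name

    def add_node(self, node, virtual_nodes=8):
        # The physical node itself and each virtual node get a ring position.
        labels = [node] + [f"{node}#vn{i}" for i in range(virtual_nodes)]
        for label in labels:
            point = _hash(label)
            if point not in self._owner:
                bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node):
        # Drop the node and all of its virtual nodes from the ring.
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def route(self, shard_id):
        # Walk clockwise from the shard's hash to the next ring position.
        if not self._points:
            raise LookupError("hash ring is empty")
        idx = bisect.bisect_right(self._points, _hash(str(shard_id)))
        return self._owner[self._points[idx % len(self._points)]]


def overload_ratio(states, cpu_max, mem_max):
    """Proportion of nodes whose CPU or Mem utilization exceeds its threshold.

    states maps node name -> (cpu, mem). Assumed form of the expansion factor.
    """
    if not states:
        return 0.0
    over = sum(1 for cpu, mem in states.values() if cpu > cpu_max or mem > mem_max)
    return over / len(states)


def lowest_load_node(history, alpha=0.5, beta=0.5):
    """Node with the lowest weighted average utilization over the period.

    history maps node name -> non-empty list of (cpu, mem) samples.
    The alpha/beta weighting is an assumed form of the claimed formula.
    """
    def average(samples):
        return sum(alpha * cpu + beta * mem for cpu, mem in samples) / len(samples)

    return min(history, key=lambda node: average(history[node]))
```

In this sketch, a node expansion would be triggered when overload_ratio(...) is not less than 0.5, after which add_node places the new node and its virtual nodes on the ring; only shards whose ring position now precedes one of the new positions move, and all other shards keep their previous node.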

Description

System for stock selection based on a distributed high-efficiency index formula

Technical Field

The invention relates to the technical field of financial data processing, and in particular to a system for stock selection based on a distributed high-efficiency index formula.

Background

In the field of financial investment, as the stock market keeps growing and trading data becomes more complex, rapidly and accurately screening targets that meet specific indexes from a very large number of stocks has become a core demand of investors and financial institutions. Traditional stock selection systems mostly use a monolithic service architecture, which has the following drawbacks. First, computing efficiency is low: when faced with a large stock pool or a complex index formula, a single computing node cannot bear the computing pressure, so stock selection takes too long to meet the real-time requirements of the application scenario. Second, the scalability of a monolithic architecture is clearly limited: it cannot be scaled out in parallel, so service congestion occurs very easily under high concurrency and degrades the user experience. In addition, although some traditional stock selection systems attempt to adopt a distributed architecture, they still rely on the traditional consistent hash algorithm for stock sharding. Because the number of stocks in the whole market is limited, that algorithm struggles to produce balanced shards on such a small data set, which ultimately leaves some computing nodes overloaded while the resources of other nodes sit idle.
A search of the patent literature found the invention patent with publication number CN118519780A, which discloses a distributed quantitative strategy development platform based on cloud computing. The platform consists of a factor development and analysis module, a machine learning model integration module, an asset portfolio strategy development and analysis optimization module, a distributed cloud computing engine module and a management module. It offers high reliability and scalability, and combines a microservice architecture and containerization technology on top of distributed cloud computing to support high-concurrency data processing and rapid execution of complex algorithms. Its functions include factor development and analysis, machine learning model development, analysis and integration, asset portfolio strategy development and analysis optimization, distributed computing, and performance assessment, and it can effectively improve the development efficiency and execution performance of asset investment strategies. That patent does not solve the technical problems of low calculation efficiency, unbalanced sharding and weak customization, cannot decouple index calculation from the architecture, and struggles to meet the demand for efficient stock selection. In summary, in view of the above problems in the prior art, developing a system for stock selection based on a distributed high-efficiency index formula is an urgent task.

Disclosure of Invention

In view of the defects in the prior art, the object of the invention is to provide a system for stock selection based on a distributed high-efficiency index formula.
The invention provides a system for stock selection based on a distributed high-efficiency index formula, comprising a front-end proxy service module, a plurality of shard calculation service modules and a data source service module; the front-end proxy service module is used for receiving a user request, the user request comprising a sector factor to be calculated and an index formula; the front-end proxy service module obtains the corresponding stock pool list according to the sector factor as the reference stock pool list for stock selection, traverses the reference stock pool list, performs a modulo operation on the incremental internal code of each stock and the total number of shards to determine the shard to which each stock belongs, and generates a plurality of shard calculation requests, each shard calculation request comprising the sub stock pool list corresponding to one shard and a shard identifier; The system comprises a data source service module, a fragmentation computing service module, a pre-set index formula computing function and a pre-set agent service module, wherein the data source servic