CN-121255468-B - Method and system for dynamically optimizing CPU scheduling based on thread affinity
Abstract
The invention relates to a method and a system for dynamically optimizing CPU scheduling based on thread affinity, belonging to the technical field of computer operating system performance optimization. The system builds an affinity thread map of the target thread in real time through eBPF, analyzes the CPU residency distribution of the affinity threads, and integrates the affinity information into the CPU selection, task enqueuing, and dispatching logic, so that the target thread and its affinity threads are scheduled on the same core or the same NUMA node. This reduces cross-core overhead and improves the execution efficiency of the target thread.
Inventors
- DAI HONGJUN
- LI ZHUOHANG
- LI BING
- ZHAI MINGJIE
Assignees
- Shandong University (山东大学)
Dates
- Publication Date
- 2026-05-08
- Application Date
- 2025-10-31
Claims (4)
- 1. A method for dynamically optimizing CPU scheduling based on thread affinity, characterized by comprising the following steps: (1) Tracking in real time the affinity relation of the target thread and the CPU affinity data of its affinity threads, specifically: (11) Constructing an affinity thread map that stores the mapping between the target thread and its affinity threads in an LRU hash table, wherein the key is the target thread PID and the value is a struct affinity_node containing the affinity thread PID, the wake-up frequency, and the last cooperation timestamp; recording, through a wake-up function, how often the target thread and other threads wake each other, and marking a thread as a high-affinity thread of the target thread when its wake-up frequency exceeds a preset threshold; (12) Counting in real time, through the cpu_affinity_map, the CPU cores on which each high-affinity thread has executed in its last 10 runs, and taking the core with the highest frequency as the resident CPU, thereby completing the acquisition of the CPU affinity data; (13) Synchronizing the affinity thread map and the resident CPU information to a scheduling decision module through a ring buffer; (2) Converting the affinity information into a scheduling strategy, specifically constructing a scheduling decision module that converts the affinity information into a scheduling strategy as follows: (21) When the target thread is awakened, querying the LRU hash table and the cpu_affinity_map to obtain the resident CPU core of the high-affinity thread; if the resident CPU core is idle and lies within the CPU permission mask of the target thread, dispatching the target thread directly to the resident CPU core; if the resident CPU core is occupied, selecting an idle CPU core on the same NUMA node as the resident CPU, so as to avoid dispatching across NUMA nodes; (22) Enqueuing the target thread to the local DSQ of the CPU; (23) If the eBPF tracking module fails or no affinity thread exists, automatically falling back to the default scheduling logic of the scheduler, thereby guaranteeing system stability; (3) Performing target thread configuration, state monitoring, and parameter adjustment.
- 2. The method for dynamically optimizing CPU scheduling based on thread affinity according to claim 1, wherein in step (22), the target thread is preferentially enqueued to the local DSQ of the CPU on which the affinity thread resides; if the target thread is associated with a plurality of high-affinity threads, the local DSQ of the CPU hosting the highest number of high-affinity threads is selected; when enqueuing, the virtual time weight of the target thread is updated synchronously, and when the target thread and the affinity thread share the same core, the time slice ratio is increased by a preset proportion.
- 3. The method for dynamically optimizing CPU scheduling based on thread affinity according to claim 2, wherein step (3) comprises the following: Target thread configuration, namely supporting the user in specifying, via the command line, the PID of the target thread to be optimized, and writing the target PID into the target_pids map of eBPF; State monitoring, namely reading the eBPF ring buffer affinity_stats and displaying in real time the affinity thread list of the target thread, the resident CPUs of the affinity threads, and the scheduling-core information of the target thread; And parameter adjustment, namely supporting dynamic adjustment of the affinity thread threshold and the NUMA node priority parameters to adapt to different application scenarios.
- 4. A system for dynamically optimizing CPU scheduling based on thread affinity, applied to the method for dynamically optimizing CPU scheduling based on thread affinity of claim 1, comprising: a kernel-mode eBPF module for tracking in real time the affinity relation of the target thread and the CPU affinity data of its affinity threads; a scheduling module for converting the affinity information into a scheduling strategy; and a user management module for target thread configuration, state monitoring, and parameter adjustment.
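The affinity-tracking data structures of claim 1, steps (11)-(12), can be illustrated with a minimal user-space C sketch. This is not the patented eBPF implementation (which would use an LRU hash map and a BPF ring buffer in kernel space); it is a hypothetical simulation of the same bookkeeping, with the wake-up threshold and CPU counts chosen arbitrarily for illustration:

```c
#include <assert.h>
#include <string.h>

#define WAKEUP_THRESHOLD 5   /* hypothetical preset threshold from step (11) */
#define CPU_HISTORY 10       /* last 10 execution CPUs, per step (12) */
#define NR_CPUS 8            /* assumed core count for this sketch */

/* Mirrors the struct affinity_node value described in claim 1, step (11). */
struct affinity_node {
    int waker_pid;           /* PID of the affinity thread */
    unsigned wakeup_count;   /* wake-up frequency */
    unsigned long last_ts;   /* last cooperation timestamp */
};

/* Ring of the last CPU_HISTORY cores a thread ran on (cpu_affinity_map analogue). */
struct cpu_history {
    int cpus[CPU_HISTORY];
    int idx;
};

static void record_run(struct cpu_history *h, int cpu)
{
    h->cpus[h->idx] = cpu;
    h->idx = (h->idx + 1) % CPU_HISTORY;
}

/* Resident CPU = most frequent core in the recent history, per step (12). */
static int resident_cpu(const struct cpu_history *h)
{
    int counts[NR_CPUS] = {0}, best = 0;
    for (int i = 0; i < CPU_HISTORY; i++)
        counts[h->cpus[i]]++;
    for (int c = 1; c < NR_CPUS; c++)
        if (counts[c] > counts[best])
            best = c;
    return best;
}

/* A thread becomes "high affinity" once its wake-up count passes the threshold. */
static int is_high_affinity(const struct affinity_node *n)
{
    return n->wakeup_count > WAKEUP_THRESHOLD;
}
```

In the kernel-side version described by the claims, `record_run` would be driven by scheduler tracepoints and the results published to user space through the `affinity_stats` ring buffer.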
Description
Method and system for dynamically optimizing CPU scheduling based on thread affinity

Technical Field

The invention relates to a method and a system for dynamically optimizing CPU scheduling based on thread affinity, belonging to the technical field of computer operating system performance optimization.

Background

In a multi-task operating system, the CPU scheduling efficiency of a thread directly determines program execution performance, and the existing scheduling technology has the following key problems:

Static affinity limitation: a traditional scheduler (such as CFS) depends on static CPU affinity configuration (such as taskset) and cannot respond to dynamic cooperative relationships that arise at run time (such as frequently mutually waking thread pairs and threads cooperating through a shared buffer). Cooperative threads are therefore easily scattered across different CPU cores, causing cross-core cache invalidation, increasing data migration overhead, and reducing target thread performance.

Disconnect between the scheduling strategy and cooperative demand: although the existing extensible scheduling framework (sched_ext) supports custom scheduling logic, it mainly implements fairness or simple ordering based on virtual time (vtime) or FIFO, and does not optimize scheduling decisions by taking into account the actual affinity relationships among threads (such as wake-up frequency and degree of resource sharing).

Missing linkage between affinity tracking and scheduling: existing eBPF tools (such as the waker program) can trace thread wake-up relationships, but they are used only for data acquisition and display; the affinity data is not fed back to the scheduler to drive dynamic adjustment of the scheduling strategy, so target thread performance cannot be directly improved.
Therefore, a technical scheme is needed that can capture the dynamic affinity relations of threads in real time and link them deeply with CPU scheduling, so as to solve the performance loss caused by cross-core scheduling of cooperative threads.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a method and a system for dynamically optimizing CPU scheduling based on thread affinity. An affinity thread map of the target thread is constructed in real time (based on indicators such as wake-up frequency and cooperation frequency), the CPU residency distribution of the affinity threads is analyzed, and the affinity information is integrated into the CPU selection, task enqueuing, and dispatching logic based on the sched_ext scheduling framework, realizing "same core / same NUMA node" scheduling of the target thread and its affinity threads, reducing cross-core overhead and improving the execution efficiency of the target thread.

Term interpretation:

1. BPF (Berkeley Packet Filter): BPF appeared in 1992 and was originally designed for efficient filtering of network packets. It screens useful data by executing simple rules in kernel space, avoiding copying unnecessary data to user space, and is the underlying foundation of tools such as tcpdump. However, its functionality is limited to network filtering, and it relies on a simple, interpreted instruction set, which limits its performance and expressiveness.

2. eBPF (extended Berkeley Packet Filter): eBPF is a sweeping extension of classic BPF that emerged after 2014. It goes beyond the single network-filtering scenario to become a general-purpose kernel-space execution engine: it extends the instruction set to support complex logic (such as loops and function calls), introduces a verifier to ensure program safety, improves performance by JIT-compiling code to machine code, and enables kernel/user-space data interaction through BPF maps. It is widely applied in performance analysis, security monitoring, network optimization, and other fields, and is a flexible and efficient kernel programming framework in modern Linux systems.

3. DSQ (Dispatch Queue): a core component of the Linux kernel sched_ext extensible scheduling framework, used mainly as a temporary buffer that holds tasks to be scheduled before they are finally dispatched to a CPU for execution, while also supporting task ordering and dispatch logic according to different strategies.

4. sched_ext: an extensible BPF scheduling class introduced in version 6.12 of the Linux kernel. Unlike a traditional scheduler, it allows scheduling behavior to be defined dynamically by a set of BPF programs, so that a custom scheduling strategy can be implemented without modifying kernel source code, greatly improving scheduling flexibility.

5. NUMA (Non-Uniform Memory Access): a memory architecture designed for multiprocessor systems.