CN-122027618-A - Data transmission method, device, equipment, medium and product

Abstract

The present disclosure relates to the field of computer technologies, and in particular to a data transmission method, apparatus, device, medium, and product. In the disclosure, a target data transmission task is received from an FTP client; load distribution weights of candidate data proxy nodes are determined based on node performance information and node usage information of a plurality of candidate data proxy nodes obtained from a registry; a target data proxy node is determined based on the load distribution weights and connection number information of the candidate data proxy nodes; and the target data transmission task is sent to the target data proxy node. The load utilization of each node is determined from the node performance information and node usage information of the proxy nodes, and the load distribution weight of each candidate data proxy node is determined from it. Whether each node has spare resources can therefore be judged accurately, and combining this with the connection number information avoids distributing all tasks to the nodes that previously had spare resources. The efficiency of data transmission can thereby be improved.

Inventors

  • HUANG XINAN

Assignees

  • China Mobile (Suzhou) Software Technology Co., Ltd. (中移(苏州)软件技术有限公司)
  • China Mobile Communications Group Co., Ltd. (中国移动通信集团有限公司)

Dates

Publication Date
2026-05-12
Application Date
2026-01-29

Claims (10)

  1. A data transmission method, applied to an FTP server node, comprising: receiving a target data transmission task from an FTP client; determining a load distribution weight of each candidate data proxy node based on node performance information and node usage information of a plurality of candidate data proxy nodes obtained from a registry; determining a target data proxy node based on the load distribution weights and connection number information of the candidate data proxy nodes; and sending the target data transmission task to the target data proxy node.
  2. The method of claim 1, wherein the determining a load distribution weight of each candidate data proxy node based on node performance information and node usage information of a plurality of candidate data proxy nodes obtained from a registry comprises: determining, for any one of the candidate data proxy nodes, a first node parameter of the candidate data proxy node based on the node performance information, wherein the first node parameter characterizes a maximum processing capability of the candidate data proxy node; determining a second node parameter of the candidate data proxy node based on the node usage information, a node warm-up time and a service run time difference, wherein the second node parameter characterizes an actual processing capability of the candidate data proxy node; determining a load distribution weight influence factor based on the node usage information; and determining the load distribution weight of each candidate data proxy node based on the first node parameter, the second node parameter and the load distribution weight influence factor.
  3. The method of claim 2, wherein the node performance information comprises at least one of CPU processing speed, memory size, network bandwidth and disk read/write rate; and the node usage information comprises at least one of CPU usage rate, memory usage rate, network usage rate and disk read/write usage rate.
  4. The method of claim 3, wherein the determining, for any one of the candidate data proxy nodes, a first node parameter of the candidate data proxy node based on the node performance information comprises: multiplying each of the CPU processing speed, the memory size, the network bandwidth and the disk read/write rate by its corresponding weight to obtain a plurality of performance parameters; and calculating the sum of the plurality of performance parameters to obtain the first node parameter.
  5. The method of claim 3, wherein the determining a second node parameter of the candidate data proxy node based on the node usage information, the node warm-up time and the service run time difference comprises: when the service run time difference is greater than or equal to the node warm-up time, multiplying each of the CPU usage rate, the memory usage rate, the network usage rate and the disk read/write usage rate by its corresponding weight to obtain a plurality of usage parameters, and calculating the sum of the plurality of usage parameters to obtain the second node parameter, wherein the service run time difference is the difference between the current time and the service start time of the candidate data proxy node, and the node warm-up time is the time taken for the candidate data proxy node to run to a first state; and when the service run time difference is less than the node warm-up time, multiplying each of the CPU usage rate, the memory usage rate, the network usage rate and the disk read/write usage rate by its corresponding weight to obtain a plurality of usage parameters, and calculating the second node parameter based on the sum of the plurality of usage parameters, the node warm-up time and the service run time difference.
  6. The method of claim 3, wherein the determining a load distribution weight influence factor based on the node usage information comprises: multiplying each of the CPU usage rate, the memory usage rate, the network usage rate and the disk read/write usage rate by its corresponding weight to obtain a plurality of usage parameters; determining the maximum value among the CPU usage rate, the memory usage rate, the network usage rate and the disk read/write usage rate; and determining the load distribution weight influence factor based on the maximum value and the plurality of usage parameters.
  7. The method of claim 1, wherein the determining a target data proxy node based on the load distribution weights and connection number information of the candidate data proxy nodes comprises: for any first node among the candidate data proxy nodes, while a second node is a node other than the last node among the plurality of candidate data proxy nodes, comparing, in sequence, the product of the connection number information of the first node and the load distribution weight of the second node with the product of the connection number information of the second node and the load distribution weight of the first node; updating the first node and/or the second node based on the comparison result; and, once the second node is the last node among the candidate data proxy nodes, taking the current first node as the target data proxy node.
  8. The method of claim 1, further comprising: before the FTP server node executes an FTP-type instruction, executing a current working directory changing instruction, wherein the current working directory changing instruction is used to change a target namespace of the FTP-type instruction, and the target namespace is a non-initialized namespace.
  9. A data transmission method, applied to a target data proxy node determined based on load distribution weights and connection number information of candidate data proxy nodes, the method comprising: receiving a target data transmission task from an FTP server node; and sending the target data transmission task to a Hadoop Distributed File System (HDFS).
  10. A data transmission apparatus, applied to an FTP server node, the apparatus comprising: a first receiving module, configured to receive a target data transmission task from an FTP client; a first determining module, configured to determine a load distribution weight of each candidate data proxy node based on node performance information and node usage information of a plurality of candidate data proxy nodes obtained from a registry; a second determining module, configured to determine a target data proxy node based on the load distribution weights and connection number information of the candidate data proxy nodes; and a first sending module, configured to send the target data transmission task to the target data proxy node.
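
The pairwise comparison in claim 7 can be illustrated with a short Python sketch. This is not the claimed implementation: the claim does not state which outcome of the comparison replaces the first node, so the sketch assumes that the node with the smaller ratio of connections to weight is preferred, which is how weighted least-connection scheduling is commonly done. The names `ProxyNode` and `select_target` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ProxyNode:
    name: str
    weight: float      # load distribution weight, assumed to be positive
    connections: int   # current number of connections on the node


def select_target(nodes: Sequence[ProxyNode]) -> ProxyNode:
    """Walk the candidates once, keeping the preferred node as `first`."""
    if not nodes:
        raise ValueError("no candidate data proxy nodes")
    first = nodes[0]
    for second in nodes[1:]:
        # ASSUMPTION: prefer the node with fewer connections per unit weight.
        # first.connections / first.weight > second.connections / second.weight
        # is checked by cross-multiplication, as in the claim, which avoids
        # a division.
        if first.connections * second.weight > second.connections * first.weight:
            first = second
    return first


candidates = [
    ProxyNode("proxy-a", weight=2.0, connections=10),
    ProxyNode("proxy-b", weight=4.0, connections=12),
    ProxyNode("proxy-c", weight=1.0, connections=2),
]
target = select_target(candidates)
```

With these made-up figures the connections-to-weight ratios are 5, 3 and 2, so the sketch settles on `proxy-c`. Comparing cross products rather than ratios keeps the comparison well defined even if a weight is very small.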

Description

Data transmission method, device, equipment, medium and product
Technical Field
The disclosure belongs to the technical field of computers, and in particular relates to a data transmission method, device, equipment, medium and product.
Background
The Hadoop Distributed File System (HDFS) is highly fault tolerant, provides high-throughput data access, and is well suited to applications on large-scale data sets. The hdfs-over-ftp component is an FTP (File Transfer Protocol) server built on HDFS, that is, a tool for accessing HDFS through the FTP protocol, so that users can upload files to and download files from HDFS simply. The hdfs-over-ftp component comprises FTP server nodes and dataproxy (data proxy) nodes, and an FTP server node can issue a data transmission task to a suitable dataproxy node for execution.
In the related art, when a dataproxy node is selected, the dataproxy node with the smallest load, as judged from the node's own performance, is usually chosen to execute the data transmission task. However, in such a data transmission method the load is judged by a single index, and load balancing cannot be achieved throughout the server start-up process. The efficiency of data transmission may therefore be reduced.
Disclosure of Invention
In view of the shortcomings described in the background, embodiments of the present disclosure provide a data transmission method, device, equipment, medium and product that can improve the efficiency of data transmission.
In a first aspect, an embodiment of the present disclosure provides a data transmission method, applied to an FTP server node, the method including: receiving a target data transmission task from an FTP client; determining a load distribution weight of each candidate data proxy node based on node performance information and node usage information of a plurality of candidate data proxy nodes obtained from a registry; determining a target data proxy node based on the load distribution weights and connection number information of the candidate data proxy nodes; and sending the target data transmission task to the target data proxy node.
Optionally, the determining a load distribution weight of each candidate data proxy node based on node performance information and node usage information of a plurality of candidate data proxy nodes obtained from a registry includes: determining, for any one of the candidate data proxy nodes, a first node parameter of the candidate data proxy node based on the node performance information, wherein the first node parameter characterizes a maximum processing capability of the candidate data proxy node; determining a second node parameter of the candidate data proxy node based on the node usage information, a node warm-up time and a service run time difference, wherein the second node parameter characterizes an actual processing capability of the candidate data proxy node; determining a load distribution weight influence factor based on the node usage information; and determining the load distribution weight of each candidate data proxy node based on the first node parameter, the second node parameter and the load distribution weight influence factor.
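The first and second node parameters described above can be sketched in Python as weighted sums over the four resource dimensions. This is a minimal illustration under stated assumptions, not the disclosed formula: the dictionary keys, the function names and the example figures are hypothetical, and the linear scaling used while the node is still warming up is an assumption, since the text only says that this case depends on the sum, the node warm-up time and the service run time difference. The text does not say how the two parameters and the influence factor combine into the final weight, so the sketch stops before that step.

```python
RESOURCES = ("cpu", "memory", "network", "disk")


def weighted_sum(values, weights):
    """Multiply each resource value by its weight and add the products."""
    return sum(values[key] * weights[key] for key in RESOURCES)


def first_node_parameter(performance, weights):
    """Weighted sum of the node performance information (maximum capability)."""
    return weighted_sum(performance, weights)


def second_node_parameter(usage, weights, warmup_seconds, uptime_seconds):
    """Weighted sum of the node usage information, adjusted during warm-up."""
    total = weighted_sum(usage, weights)
    if uptime_seconds >= warmup_seconds:
        return total
    # ASSUMPTION: scale linearly with the fraction of the warm-up time
    # that has elapsed. The source does not give this formula.
    return total * uptime_seconds / warmup_seconds


weights = {"cpu": 0.25, "memory": 0.25, "network": 0.25, "disk": 0.25}
performance = {"cpu": 3.0, "memory": 16.0, "network": 10.0, "disk": 2.0}
usage = {"cpu": 0.5, "memory": 0.25, "network": 0.75, "disk": 0.5}

first = first_node_parameter(performance, weights)
warm = second_node_parameter(usage, weights, warmup_seconds=60, uptime_seconds=60)
cold = second_node_parameter(usage, weights, warmup_seconds=60, uptime_seconds=30)
```

In practice the performance values would need to be normalized to comparable units before weighting; the raw figures here are placeholders chosen only to keep the arithmetic easy to follow.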
Optionally, the node performance information comprises at least one of CPU processing speed, memory size, network bandwidth and disk read/write rate; and the node usage information comprises at least one of CPU usage rate, memory usage rate, network usage rate and disk read/write usage rate.
Optionally, the determining, for any one of the candidate data proxy nodes, a first node parameter of the candidate data proxy node based on the node performance information includes: multiplying each of the CPU processing speed, the memory size, the network bandwidth and the disk read/write rate by its corresponding weight to obtain a plurality of performance parameters; and calculating the sum of the plurality of performance parameters to obtain the first node parameter.
Optionally, the determining a second node parameter of the candidate data proxy node based on the node usage information, the node warm-up time and the service run time difference includes: when the service run time difference is greater than or equal to the node warm-up time, multiplying each of the CPU usage rate, the memory usage rate, the network usage rate and the disk read/write usage rate by its corresponding weight to obtain a plurality of usage parameters, and calculating the sum of the plurality of usage parameters to obtain the second node parameter, wherein the service run time difference is the differen