CN-121979444-A - Data writing method and related device
Abstract
The application discloses a data writing method and a related device. The data writing method comprises: receiving a write request for data to be written, wherein the write request comprises a first node identifier of the data to be written, and the first node identifier identifies the data generating node of the data to be written; in response to the write request, determining, from among the candidate batch writing nodes, a target batch writing node corresponding to the first node identifier; and adding the data to be written to a to-be-written queue of the target batch writing node, so that the target batch writing node writes the to-be-written queue to a persistent storage medium when a batch writing condition is met. This scheme avoids out-of-order writing of different data to be written from the same data generating node.
Inventors
- CAI WANJIN
- ZHANG JINGLEI
- GUO ZHENBIN
Assignees
- 杭州华橙软件技术有限公司
Dates
- Publication Date: 20260505
- Application Date: 20251225
Claims (13)
- 1. A data writing method, comprising: receiving a write request for data to be written, wherein the write request comprises a first node identifier of the data to be written, and the first node identifier is used for identifying a data generating node of the data to be written; in response to the write request, determining a target batch writing node corresponding to the first node identifier from among candidate batch writing nodes; and adding the data to be written to a to-be-written queue of the target batch writing node, so that the target batch writing node writes the to-be-written queue to a persistent storage medium when a batch writing condition is met.
- 2. The method of claim 1, wherein the determining a target batch writing node corresponding to the first node identifier from among candidate batch writing nodes comprises: mapping the first node identifier to obtain a batch writing value; selecting, from the batch writing ranges of the candidate batch writing nodes, the batch writing range to which the batch writing value belongs as a target batch writing range; and taking the candidate batch writing node corresponding to the target batch writing range as the target batch writing node.
- 3. The method of claim 2, wherein the batch writing value is a hash value and the batch writing range is a hash value range; and/or the mapping the first node identifier to obtain a batch writing value comprises: combining a service identifier of the data to be written with the first node identifier to obtain a combined identifier, wherein the service identifier is used for identifying the service to which the data to be written belongs; and mapping the combined identifier to obtain the batch writing value.
- 4. The method of claim 1, wherein the batch writing condition is that the time elapsed since the earliest-added data to be written joined the to-be-written queue reaches a first duration threshold, or that the quantity of data to be written in the to-be-written queue reaches a quantity threshold.
- 5. The method of claim 1, wherein the adding the data to be written to a to-be-written queue of the target batch writing node, so that the target batch writing node writes the to-be-written queue to a persistent storage medium when a batch writing condition is met, comprises: adding the data to be written to the to-be-written queue corresponding to the first node identifier in the target batch writing node, so that the target batch writing node writes the to-be-written queue corresponding to the first node identifier to the persistent storage medium when the batch writing condition is met; and/or the method further comprises: in response to the write request, writing the data to be written to a volatile storage medium in real time.
- 6. The method of claim 5, further comprising, after the writing the data to be written to a volatile storage medium in real time: judging whether the data to be written meets a deletion condition, wherein the deletion condition is a first deletion condition or a second deletion condition, the first deletion condition is that the data to be written has been written to the persistent storage medium by the target batch writing node, the second deletion condition is that the time elapsed since the data to be written was written reaches a second duration threshold, and the second duration threshold is greater than or equal to the first duration threshold defined by the batch writing condition; and deleting the data to be written from the volatile storage medium in response to the deletion condition being met.
- 7. The method of claim 5, wherein the volatile storage medium comprises a number of candidate volatile storage clusters, and the writing the data to be written to a volatile storage medium in real time comprises: determining a target cluster address corresponding to the first node identifier from among candidate cluster addresses; and writing the data to be written to the target volatile storage cluster pointed to by the target cluster address.
- 8. The method of claim 7, wherein the determining a target cluster address corresponding to the first node identifier from among candidate cluster addresses comprises: determining a target data partition corresponding to the first node identifier from among candidate data partitions; and determining the target cluster address corresponding to the target data partition from among the candidate cluster addresses.
- 9. The method of claim 5, wherein the method further comprises: receiving a query request for data to be queried; querying the volatile storage medium for the data to be queried; in response to the data to be queried being found in the volatile storage medium, taking the data to be queried as a data query result; and in response to the data to be queried not being found in the volatile storage medium, searching the persistent storage medium for the data to be queried and taking the data to be queried as the data query result.
- 10. The method of claim 9, wherein the volatile storage medium comprises a number of candidate volatile storage clusters, the query request comprises a second node identifier of the data to be queried, and the querying the volatile storage medium for the data to be queried comprises: determining the candidate cluster address corresponding to the second node identifier from among candidate cluster addresses; and querying the data to be queried from the candidate volatile storage cluster pointed to by the candidate cluster address corresponding to the second node identifier.
- 11. A data write management system comprising a data generating node, a data write management system, and a data storage system, the data storage system comprising a persistent storage medium, and the data write management system being configured to perform the method of any one of claims 1-10.
- 12. An electronic device comprising a memory and a processor, the processor being configured to execute program instructions stored in the memory to implement the method of any one of claims 1-10.
- 13. A computer readable storage medium or program product, characterized in that program instructions or a computer program are stored thereon which, when executed by a processor, implement the method of any one of claims 1-10.
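The volatile-storage path in claims 5 to 9 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the class `TieredStore`, the function `target_cluster_address`, the use of in-memory dictionaries for both storage tiers, the CRC32-based partitioning, and the assumption that each record has a unique key are all hypothetical choices made for illustration.

```python
import time
import zlib


def target_cluster_address(node_id, partition_addresses):
    """Map a node identifier to a data partition, then to a cluster address.

    Assumption: partition i is served by partition_addresses[i], and the
    partition is chosen by CRC32 modulo the partition count.
    """
    partition = zlib.crc32(node_id.encode("utf-8")) % len(partition_addresses)
    return partition_addresses[partition]


class TieredStore:
    """Hypothetical two-tier store: a volatile tier in front of a persistent tier."""

    def __init__(self, second_threshold_s=5.0):
        # Second duration threshold; per claim 6 it should be greater than
        # or equal to the batch node's first duration threshold.
        self.second_threshold_s = second_threshold_s
        self.volatile = {}    # key -> (write_time, value)
        self.persistent = {}  # stand-in for the persistent storage medium

    def write_realtime(self, key, value, now=None):
        """Write to the volatile tier as soon as the write request arrives."""
        now = time.monotonic() if now is None else now
        self.volatile[key] = (now, value)

    def on_persisted(self, key, value):
        """First deletion condition: the batch node has persisted the record."""
        self.persistent[key] = value
        self.volatile.pop(key, None)

    def evict_expired(self, now=None):
        """Second deletion condition: the record has outlived the threshold."""
        now = time.monotonic() if now is None else now
        expired = [key for key, (written, _) in self.volatile.items()
                   if now - written >= self.second_threshold_s]
        for key in expired:
            del self.volatile[key]
        return expired

    def query(self, key):
        """Look in the volatile tier first, then fall back to the persistent tier."""
        hit = self.volatile.get(key)
        if hit is not None:
            return hit[1]
        return self.persistent.get(key)
```

In this sketch a record stays readable between the real-time write and the batch write, because `query` checks the volatile tier before the persistent tier.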
Description
Data writing method and related device
Technical Field
The present application relates to the field of data management technologies, and in particular to a data writing method and a related device.
Background
A data write management system can write data to a persistent storage medium through persistent storage middleware. Persistent storage middleware is software, such as MongoDB (a document database) or ES (a search and analytics engine). Persistent storage media are hardware, such as hard disks and magnetic disks. Persistent storage middleware supports batch write operations (Bulk). In essence, a batch write operation combines multiple discrete data write operations into one efficient, continuous data write operation.
In the related art, a data writing method that implements batch writing can generally be described as follows: the data to be written of a data generating node is sent to a batch writing cluster, and the batch writing cluster writes the received data to be written to the persistent storage medium when a batch writing condition is satisfied.
Through long-term research, the inventors found that such data writing methods in the related art suffer from out-of-order writing of data from the same data generating node. That is, the order in which the data generating node sends its different data to be written to the batch writing cluster is inconsistent with the order in which the batch writing cluster finally writes that data to the persistent storage medium. This disorder not only affects the query accuracy of the persistent storage medium, but also causes anomalies in the services of the data generating node that depend on the data to be written.
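The disorder described above can be reproduced with a small Python simulation. The source does not say how the related art distributes data within the batch writing cluster, so round-robin distribution and a count-based flush are assumptions here, and the function name `simulate_round_robin` is hypothetical.

```python
from itertools import cycle


def simulate_round_robin(records, node_count=2, batch_size=2):
    """Send records to batch nodes in turn; each node flushes independently.

    Returns the order in which records reach the (simulated) persistent
    storage medium.
    """
    queues = [[] for _ in range(node_count)]
    persisted = []
    for record, index in zip(records, cycle(range(node_count))):
        queues[index].append(record)
        # Each node flushes on its own once its queue is full.
        if len(queues[index]) >= batch_size:
            persisted.extend(queues[index])
            queues[index].clear()
    return persisted


print(simulate_round_robin([1, 2, 3, 4]))                # [1, 3, 2, 4]
print(simulate_round_robin([1, 2, 3, 4], node_count=1))  # [1, 2, 3, 4]
```

With two nodes, records 1 and 3 are persisted before record 2 even though all four came from one producer in order. With a single node the order is preserved.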
Disclosure of Invention
The application provides a data writing method and a related device, which can solve the problem of out-of-order writing of data from the same data generating node in related-art data writing methods that implement batch write operations.
The application provides a data writing method, comprising: receiving a write request for data to be written, wherein the write request comprises a first node identifier of the data to be written, and the first node identifier is used for identifying a data generating node of the data to be written; in response to the write request, determining a target batch writing node corresponding to the first node identifier from among candidate batch writing nodes; and adding the data to be written to a to-be-written queue of the target batch writing node, so that the target batch writing node writes the to-be-written queue to a persistent storage medium when a batch writing condition is met.
The application provides a data write management system, which comprises a data generating node, a data write management system, and a data storage system, wherein the data storage system comprises a persistent storage medium, and the data write management system is configured to perform the above method.
The application provides an electronic device comprising a memory and a processor, the processor being configured to execute program instructions stored in the memory to implement the above method.
The application provides a computer readable storage medium having stored thereon program instructions which, when executed by a processor, implement the above method.
The application provides a computer program product comprising a computer program which, when executed by a processor, implements the above method.
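The core method can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the class names `BatchWriteNode` and `WriteRouter`, the SHA-256 hash truncated to 32 bits, the equal-width hash ranges, and the plain list standing in for the persistent storage medium are all hypothetical. The hash-range routing follows the approach of claims 2 and 3, and the two flush thresholds follow claim 4.

```python
import hashlib
import time
from collections import deque


class BatchWriteNode:
    """Hypothetical batch writing node holding one to-be-written queue."""

    def __init__(self, name, sink, max_items=3, max_wait_s=1.0):
        self.name = name
        self.sink = sink              # stand-in for the persistent storage medium
        self.max_items = max_items    # quantity threshold
        self.max_wait_s = max_wait_s  # first duration threshold
        self.queue = deque()          # entries are (enqueue_time, record)

    def add(self, record, now=None):
        now = time.monotonic() if now is None else now
        self.queue.append((now, record))
        self.flush_if_due(now)

    def flush_if_due(self, now=None):
        """Write the whole queue in arrival order if either threshold is reached."""
        now = time.monotonic() if now is None else now
        if not self.queue:
            return False
        oldest_age = now - self.queue[0][0]
        if len(self.queue) >= self.max_items or oldest_age >= self.max_wait_s:
            self.sink.extend(record for _, record in self.queue)
            self.queue.clear()
            return True
        return False


class WriteRouter:
    """Routes each write request to the node whose hash range covers its key."""

    HASH_SPACE = 2 ** 32  # assumed size of the batch writing value space

    def __init__(self, nodes):
        self.nodes = nodes

    def batch_write_value(self, node_id, service_id=""):
        # Combine the service identifier with the node identifier, then hash.
        key = f"{service_id}:{node_id}".encode("utf-8")
        return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")

    def target_node(self, node_id, service_id=""):
        # Equal-width contiguous ranges: node i owns the i-th slice of the space.
        value = self.batch_write_value(node_id, service_id)
        return self.nodes[value * len(self.nodes) // self.HASH_SPACE]

    def handle_write(self, node_id, record, service_id="", now=None):
        node = self.target_node(node_id, service_id)
        node.add((node_id, record), now)
        return node
```

Because `target_node` depends only on the identifiers, every write request from one data generating node lands in the same queue, and that queue is flushed in arrival order. In a real deployment the time-based condition would also need a periodic call to `flush_if_due`, which this sketch leaves to the caller.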
According to this scheme, the write request carries the first node identifier of the data generating node of the data to be written. After the write request is received, the target batch writing node corresponding to the first node identifier is determined from the candidate batch writing nodes, the data to be written is added to the to-be-written queue of the target batch writing node, and the target batch writing node writes the to-be-written queue to the persistent storage medium when the batch writing condition is met. Because the target batch writing node is determined from the first node identifier, write requests for different data to be written that carry the same node identifier (the first node identifier) consistently resolve to the same candidate batch writing node as the target batch writing node. Since different data to be written with the same node identifier is generated by the same data generating node, the different data to be written generated by the same data generating node is guaranteed to be sent to the same target batch writing node, the target batch writing node executes the writing of the different data to be written of the same data generating node through batch writing operation, the batch writing asynchronous writing condition of the different candidate batch writing no