JP-7854838-B2 - Data acquisition systems, methods, and programs
Inventors
- 伊藤 大輔
- 齊藤 信一郎
- 羽原 拓哉
Assignees
- 株式会社日立製作所
Dates
- Publication Date
- 20260507
- Application Date
- 20220401
Claims (11)
- A data collection system comprising: a plurality of data receiving units that share the processing of receiving data from a data source; a plurality of data transfer units that are provided corresponding to the respective data receiving units and transfer the data received by the data receiving units; and a data storage unit that receives and stores the data transferred from the plurality of data transfer units, wherein the plurality of data receiving units perform time management synchronized with each other, assign to the received data a timestamp representing a time based on the time management, and transmit to the data transfer unit the data and correspondence information that associates identification information of the data with the timestamp; the data transfer unit receives the data and the correspondence information from the data receiving unit, temporarily stores the correspondence information, transfers the data and the correspondence information to the data storage unit, and, after confirming that the data has been stored in the data storage unit, releases the temporary storage of the correspondence information; and when the data transfer unit experiences a failure after receiving the data and the correspondence information and then resumes operation from the failure, the data transfer unit sends a response prompting retransmission of the data to the data source via the data receiving unit, retains the correspondence information, and uses the timestamp in the correspondence information for the retransmitted data.
- The data collection system according to claim 1, wherein the data storage unit uses the time indicated by the timestamp in the correspondence information as the creation time of the data and stores the data in association with the creation time.
- The data collection system according to claim 1, wherein the data transfer unit temporarily stores the data together with the correspondence information, sends a response indicating that the data has been stored to the data source via the data receiving unit, transfers the correspondence information and the data to the data storage unit, and, after confirming that the data has been stored in the data storage unit, releases the temporary storage of the correspondence information and the data.
- The data collection system according to claim 1, wherein the plurality of data receiving units perform the time management synchronized with each other using the Global Positioning System or the Precision Time Protocol.
- The data collection system according to claim 1, wherein, upon resuming operation from the failure, if the correspondence information was being transferred to the data storage unit, the data transfer unit requests the data storage unit to delete the data corresponding to the identification information in the correspondence information and, after confirming that the data storage unit has deleted the data, sends the response prompting retransmission of the data.
- A data collection method in a data collection system comprising: a plurality of data receiving units that share the processing of receiving data from a data source; a plurality of data transfer units that are provided corresponding to the respective data receiving units and transfer the data received by the data receiving units; and a data storage unit that receives and stores the data transferred from the plurality of data transfer units, wherein the plurality of data receiving units perform time management synchronized with each other, assign to the received data a timestamp representing a time based on the time management, and transmit to the data transfer unit the data and correspondence information that associates identification information of the data with the timestamp; the data transfer unit receives the data and the correspondence information from the data receiving unit, temporarily stores the correspondence information, transfers the data and the correspondence information to the data storage unit, and, after confirming that the data has been stored in the data storage unit, releases the temporary storage of the correspondence information; and when the data transfer unit experiences a failure after receiving the data and the correspondence information and then resumes operation from the failure, the data transfer unit sends a response prompting retransmission of the data to the data source via the data receiving unit, retains the correspondence information, and uses the timestamp in the correspondence information for the retransmitted data.
- The data collection method according to claim 6, wherein the data storage unit uses the time indicated by the timestamp in the correspondence information as the creation time of the data and stores the data in association with the creation time.
- The data collection method according to claim 6, wherein, when the data transfer unit resumes operation from the failure, if the correspondence information was being transferred to the data storage unit, the data transfer unit requests the data storage unit to delete the data corresponding to the identification information in the correspondence information and, after confirming that the data storage unit has deleted the data, sends the response prompting retransmission of the data.
- A data collection program for causing a computer to operate as a data collection system having: a plurality of data receiving units that share the processing of receiving data from a data source; a plurality of data transfer units that are provided corresponding to the respective data receiving units and transfer the data received by the data receiving units; and a data storage unit that receives and stores the data transferred from the plurality of data transfer units, the program causing the computer to execute processing in which: the plurality of data receiving units perform time management synchronized with each other, assign to the received data a timestamp representing a time based on the time management, and transmit to the data transfer unit the data and correspondence information that associates identification information of the data with the timestamp; the data transfer unit receives the data and the correspondence information from the data receiving unit, temporarily stores the correspondence information, transfers the data and the correspondence information to the data storage unit, and, after confirming that the data has been stored in the data storage unit, releases the temporary storage of the correspondence information; and when the data transfer unit experiences a failure after receiving the data and the correspondence information and then resumes operation from the failure, the data transfer unit sends a response prompting retransmission of the data to the data source via the data receiving unit, retains the correspondence information, and uses the timestamp in the correspondence information for the retransmitted data.
- The data collection program according to claim 9, causing the computer to execute processing in which the data storage unit uses the time indicated by the timestamp in the correspondence information as the creation time of the data and stores the data in association with the creation time.
- The data collection program according to claim 9, causing the computer to execute processing in which, when the data transfer unit resumes operation from the failure, if the correspondence information was being transferred to the data storage unit, the data transfer unit requests the data storage unit to delete the data corresponding to the identification information in the correspondence information and, after confirming that the data storage unit has deleted the data, sends the response prompting retransmission of the data.
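The recovery behavior recited in the claims can be illustrated with a minimal sketch. This is not the implementation of the invention; it only shows one way the retained correspondence information could let retransmitted data keep its original timestamp. The class names `DataStorage` and `DataTransferUnit`, the method names, the integer timestamps, and the `fail_before_transfer` flag are all hypothetical. An in-memory dictionary stands in for temporary storage that is assumed to survive a failure. As a simplification, the sketch requests deletion for every pending item on resumption, whereas the dependent claims do so only when a transfer was in progress.

```python
from dataclasses import dataclass, field


@dataclass
class DataStorage:
    """Hypothetical stand-in for the data storage unit."""
    records: dict = field(default_factory=dict)  # data_id -> (creation_time, payload)

    def store(self, data_id, timestamp, payload):
        # The timestamp in the correspondence information becomes the creation time.
        self.records[data_id] = (timestamp, payload)
        return True  # confirmation that the data has been stored

    def delete(self, data_id):
        self.records.pop(data_id, None)
        return True  # confirmation that the data has been deleted


@dataclass
class DataTransferUnit:
    """Hypothetical stand-in for one data transfer unit."""
    storage: DataStorage
    # Temporarily stored correspondence information: data_id -> timestamp.
    # Assumed to survive a failure; a plain dict stands in for that here.
    pending: dict = field(default_factory=dict)

    def receive(self, data_id, timestamp, payload, fail_before_transfer=False):
        # Keep an already-held timestamp so that retransmitted data reuses it.
        timestamp = self.pending.setdefault(data_id, timestamp)
        if fail_before_transfer:
            raise RuntimeError("simulated failure of the data transfer unit")
        if self.storage.store(data_id, timestamp, payload):
            del self.pending[data_id]  # release the temporary storage

    def resume(self):
        # Request deletion of possibly partial data, then return the IDs
        # for which retransmission should be requested from the data source.
        to_retransmit = []
        for data_id in list(self.pending):
            if self.storage.delete(data_id):
                to_retransmit.append(data_id)
        return to_retransmit


# Usage: a failure occurs before transfer, then the data is retransmitted.
storage = DataStorage()
unit = DataTransferUnit(storage)
try:
    unit.receive("file-1", 1000, b"abc", fail_before_transfer=True)
except RuntimeError:
    pass
for data_id in unit.resume():
    # A receiving unit would assign a newer timestamp (2000) on retransmission.
    unit.receive(data_id, 2000, b"abc")
print(storage.records["file-1"][0])  # the retained timestamp, 1000
```

Because the stored creation time comes from the retained correspondence information rather than from the retransmission, the ordering established at first reception is preserved across the failure.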
Description
This disclosure relates to technology for collecting data.

Data hubs are gaining attention as a way to link data across multiple information systems. Data hubs are characterized by low data loss and high scalability. In recent years, the areas in which data hubs are applied have expanded, and the data they handle has consequently become more diverse. For example, data hubs are now required not only to collect numerical information output from devices but also to transfer files. File transfer often requires that order be guaranteed between pieces of data. Furthermore, information systems that perform batch processing output large amounts of data to the data hub at once.

Patent Document 1 discloses a technology for guaranteeing the order of data. The data processing system disclosed in Patent Document 1 consists of a router connected to a network, a load balancing server connected to the router, multiple receiving servers connected to the load balancing server, and a processing server connected to the multiple receiving servers. The router assigns a timestamp to each incoming packet. For a transaction transmitted in multiple packets with the same identifier, the receiving server assigns the latest of the timestamps assigned to those packets as the transaction reception time. The processing server takes the transactions received from the multiple receiving servers for which a preset maximum delay time has elapsed, sorts them by their assigned transaction reception times, and processes them.

Patent Document 1: Japanese Patent Publication No. 2012-129857

- Figure 1 is a block diagram of the information collection system of this embodiment.
- Figure 2 is a sequence diagram illustrating the processing in the information collection system shown in Figure 1 when no failure occurs in the data receiving unit or the data transfer unit.
- Figure 3 is a flowchart illustrating the details of the processing in the data receiving unit shown in Figure 2.
- Figure 4 is a flowchart illustrating the details of the processing in the data transfer unit shown in Figure 2.
- Figure 5 shows an example of the configuration of the work-in-progress status management database shown in Figure 1.
- Figure 6 shows an example of the configuration of the data storage database shown in Figure 1.
- Figure 7 is a sequence diagram illustrating the processing in the information collection system shown in Figure 1 when a failure occurs in the data receiving unit during data collection.
- Figure 8 is a sequence diagram illustrating the processing in the information collection system shown in Figure 1 when a failure occurs in the data receiving unit after data has been transferred from the data receiving unit to the data transfer unit.
- Figure 9 is a sequence diagram illustrating the processing in the information collection system shown in Figure 1 when a failure occurs in the data transfer unit before data is transferred from the data transfer unit to the data storage unit.
- Figure 10 is a sequence diagram illustrating the processing in the information collection system shown in Figure 1 when a failure occurs in the data transfer unit while data is being transferred from the data transfer unit to the data storage unit.
- Figure 11 is a flowchart illustrating the details of the processing in the data transfer unit shown in Figures 9 and 10.
- Figure 12 is a sequence diagram illustrating another example of the processing in the information collection system shown in Figure 1 when no failure occurs in the data receiving unit or the data transfer unit.
- Figure 13 is a flowchart illustrating the details of the processing in the data transfer unit shown in Figure 12.
- Figure 14 shows an example of the configuration of the work-in-progress status management database shown in Figure 1.
- Figure 15 is a sequence diagram illustrating another example of the processing in the information collection system shown in Figure 1 when a failure occurs in the data transfer unit before data is transferred from the data transfer unit to the data storage unit.
- Figure 16 is a flowchart illustrating the details of the processing in the data transfer unit shown in Figure 15.

Embodiments of the present invention will be described below with reference to the drawings.

Figure 1 is a block diagram showing the information collection system of this embodiment. As shown in Figure 1, the data acquisition system 100 of this embodiment is a system that transfers data transmitted from data sources 101 and 111 for data utilization 106, and includes a load balancing unit 102, data receiving units 103 and 113, data transfer units 104 and 114, and a data storage unit 105. Note that the number of data receiving units 103 and 113 and the number of data transfer units 104 and 114 are not limited to two each. The load balancing unit 102 distributes the data transmitted from data sources 101 and 111 among data receiving units 103 and 113 using a predetermined load balancing method, such as round-robin, and transmits it to them. Note that the data distribution method is not limited to load balancing, and the distributed data may sometimes be referred to as chunks. The data receiving units 103 and 113 are respo