CN-122027712-A - Multi-protocol data grabbing method and system
Abstract
The invention discloses a multi-protocol data grabbing method and system and relates to the technical field of data grabbing. The method is applied to a multi-protocol data grabbing system comprising a client and a server that is communicatively connected to the client through a plurality of network protocols. In the method, the client sends a data grabbing request to the server through one of the network protocols; the server calls a corresponding processor to grab data based on a request type carried in the data grabbing request; the server converts the grabbed data into a unified intermediate format; and the server sends the data converted into the unified intermediate format to the client through one of the network protocols. The disclosed method and system can provide a cross-language and cross-protocol data grabbing service.
Inventors
- CHEN XU
- YANG QIJUN
- LEI YU
Assignees
- 成都堃升科技有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260128
Claims (10)
- 1. A multi-protocol data grabbing method, applied to a multi-protocol data grabbing system comprising a client and a server that is communicatively connected to the client through a plurality of network protocols, characterized in that the method comprises the following steps: the client sends a data grabbing request to the server through one of the plurality of network protocols; the server calls a corresponding processor to perform data grabbing based on a request type carried in the data grabbing request; the server converts the grabbed data into a unified intermediate format; and the server sends the data converted into the unified intermediate format to the client through one of the plurality of network protocols, so that the client restores the data converted into the unified intermediate format.
- 2. The multi-protocol data grabbing method according to claim 1, wherein the data grabbing request includes the request type, data parameters of the data to be grabbed, and identity verification information of the client, and before the server calls a corresponding processor to perform data grabbing based on the request type carried in the data grabbing request, the method further comprises: the server performs identity verification based on the identity verification information; correspondingly, the server calling a corresponding processor to perform data grabbing based on the request type carried in the data grabbing request comprises: after the identity verification is passed, the server calls the processor corresponding to the request type to perform data grabbing based on the data parameters, wherein the data parameters comprise the data grabbing location, information about the data to be grabbed, and the grabbing mode.
- 3. The multi-protocol data grabbing method according to claim 2, wherein the request type is a database query request, an HTTP request, or a file download request.
- 4. The multi-protocol data grabbing method according to claim 1, wherein after the server converts the grabbed data into a unified intermediate format, the method further comprises: the server compresses and encrypts the data converted into the unified intermediate format; correspondingly, the server sending the data converted into the unified intermediate format to the client through one of the plurality of network protocols comprises: the server sends the data that has been converted into the unified intermediate format and subjected to compression and encryption processing to the client through one of the plurality of network protocols.
- 5. The multi-protocol data grabbing method of claim 1, wherein the method further comprises: the server records the state of the grabbing task in the data grabbing process and the performance index which is related to the grabbing task in the data grabbing process.
- 6. The multi-protocol data grabbing method according to claim 1, wherein the server sending the data converted into the unified intermediate format to the client through one of the plurality of network protocols comprises: the server sends the data converted into the unified intermediate format to the client in a streaming mode through one of the plurality of network protocols.
- 7. The multi-protocol data grabbing method according to claim 1, characterized in that before the client sends a data grabbing request to the server through one of the plurality of network protocols, the method further comprises: the server registers the data operation types supported by the server.
- 8. A multi-protocol data grabbing system, characterized by comprising a client and a server that is communicatively connected to the client through a plurality of network protocols, wherein the client is configured to send a data grabbing request to the server through one of the plurality of network protocols; and the server is configured to: call a corresponding processor to perform data grabbing based on a request type carried in the data grabbing request; convert the grabbed data into a unified intermediate format; and send the data converted into the unified intermediate format to the client through one of the plurality of network protocols, so that the client restores the data converted into the unified intermediate format.
- 9. The multi-protocol data grabbing system according to claim 8, wherein the data grabbing request includes the request type, data parameters of the data to be grabbed, and identity verification information of the client, and the server is further configured to perform identity verification based on the identity verification information before calling a corresponding processor to perform data grabbing based on the request type carried in the data grabbing request; correspondingly, when calling the corresponding processor to perform data grabbing based on the request type carried in the data grabbing request, the server is specifically configured to: after the identity verification is passed, call the processor corresponding to the request type to perform data grabbing based on the data parameters, wherein the data parameters comprise the data grabbing location, information about the data to be grabbed, and the grabbing mode.
- 10. The multi-protocol data grabbing system according to claim 9, wherein the request type is a database query request, an HTTP request, or a file download request.
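The interplay of claims 2, 3, and 7 (register the supported operation types, verify identity, then call the processor that matches the request type) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `GrabServer` class, its `register`, `sign`, and `handle` methods, the dictionary layout of the request, and the HMAC-based credential scheme are assumptions chosen only to make the control flow concrete. The processors themselves are left to the caller and would be stubs in any demonstration.

```python
import hmac
from typing import Any, Callable, Dict

# A processor takes the request's data parameters and returns grabbed data.
Handler = Callable[[Dict[str, Any]], Any]


class GrabServer:
    """Hypothetical server-side dispatcher; all names are illustrative."""

    def __init__(self, shared_secret: bytes) -> None:
        self._secret = shared_secret
        self._handlers: Dict[str, Handler] = {}

    def register(self, request_type: str, handler: Handler) -> None:
        # Claim 7: the server registers the operation types it supports.
        self._handlers[request_type] = handler

    def sign(self, client_id: str) -> str:
        # Assumed credential scheme: HMAC-SHA256 over the client id.
        return hmac.new(self._secret, client_id.encode(), "sha256").hexdigest()

    def handle(self, request: Dict[str, Any]) -> Any:
        # Claim 2: identity verification happens before any processor runs.
        auth = request.get("auth", {})
        expected = self.sign(str(auth.get("client_id", "")))
        supplied = str(auth.get("token", ""))
        if not hmac.compare_digest(expected.encode(), supplied.encode()):
            raise PermissionError("identity verification failed")
        # Claim 3: the request type selects the processor.
        handler = self._handlers.get(request["type"])
        if handler is None:
            raise ValueError(f"unsupported request type: {request['type']}")
        return handler(request["params"])


if __name__ == "__main__":
    server = GrabServer(b"demo-secret")
    # Stub processors standing in for real database, HTTP, and file access.
    server.register("db_query", lambda p: [{"sql": p["sql"]}])
    server.register("http", lambda p: "GET " + p["url"])
    server.register("file_download", lambda p: p["path"].encode())
    request = {
        "type": "http",
        "params": {"url": "https://example.invalid/data"},
        "auth": {"client_id": "c1", "token": server.sign("c1")},
    }
    print(server.handle(request))
```

Keeping the registry keyed by request type means a new data source can be supported by registering one more processor, without touching the verification or dispatch logic.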
Description
Multi-protocol data grabbing method and system

Technical Field

The invention belongs to the technical field of data grabbing, and particularly relates to a multi-protocol data grabbing method and system.

Background

With the advent of the information age, data has become an important asset for businesses, which need to acquire data from a variety of sources, including internal systems, external websites, and databases, for data analysis, business decisions, and product development. Data grabbing techniques have emerged to help businesses automatically acquire data from different data sources and integrate it into a data staging platform or other data analysis platform.

In the prior art, data grabbing methods generally depend on a specific language or protocol, such as a RESTful API (Representational State Transfer Application Programming Interface), JDBC (Java Database Connectivity), or FTP (File Transfer Protocol). However, these methods have the following drawbacks:

Strong language binding: traditional data grabbing methods are typically bound to a specific programming language, such as Java or Python. This limits the cross-language calling capability of the data grabbing service and makes it difficult to meet diversified data acquisition requirements.

Non-uniform protocols: different data sources may use different communication protocols, such as HTTP, FTP, and database protocols. This makes integrating the data grabbing service complex, because a different grabbing module must be developed for each protocol.

Poor extensibility: traditional data grabbing methods have difficulty supporting new data sources or protocols, such as social media data and streaming data. This limits the range of data an enterprise can acquire and prevents it from fully utilizing emerging data resources.
Because of these drawbacks, existing data grabbing schemes can hardly provide a cross-language and cross-protocol data grabbing service. Therefore, how to provide an effective solution that implements a cross-language and cross-protocol data grabbing service has become a problem to be solved in the prior art.

Disclosure of Invention

The present invention aims to provide a multi-protocol data grabbing method and system to solve the above problems in the prior art.

To achieve the above purpose, the present invention adopts the following technical scheme:

In a first aspect, the present invention provides a multi-protocol data grabbing method, applied to a multi-protocol data grabbing system that includes a client and a server communicatively connected to the client through a plurality of network protocols. The method includes: the client sends a data grabbing request to the server through one of the plurality of network protocols; the server calls a corresponding processor to perform data grabbing based on a request type carried in the data grabbing request; the server converts the grabbed data into a unified intermediate format; and the server sends the data converted into the unified intermediate format to the client through one of the plurality of network protocols.
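The conversion and restoration steps of the first aspect can be sketched as follows. The source does not say what the unified intermediate format is; this sketch assumes a UTF-8 JSON envelope purely for illustration, and the function names `to_intermediate` and `from_intermediate` as well as the envelope fields are hypothetical. The point it shows is that results of different kinds (query rows, an HTTP body, raw file bytes) reduce to one byte string that any of the network protocols could carry and that a client in any language could decode.

```python
import base64
import json
from typing import Any, Dict


def to_intermediate(request_type: str, data: Any) -> bytes:
    """Server side: wrap grabbed data in an assumed JSON envelope."""
    if isinstance(data, (bytes, bytearray)):
        # Raw bytes (for example a downloaded file) are not valid JSON,
        # so they are carried as base64 text.
        body = {
            "encoding": "base64",
            "value": base64.b64encode(bytes(data)).decode("ascii"),
        }
    else:
        # Rows, strings, and numbers must already be JSON-serializable.
        # Note that tuples come back as lists after a round trip.
        body = {"encoding": "json", "value": data}
    envelope = {"request_type": request_type, "body": body}
    return json.dumps(envelope, ensure_ascii=False).encode("utf-8")


def from_intermediate(payload: bytes) -> Dict[str, Any]:
    """Client side: restore the data from the envelope."""
    envelope = json.loads(payload.decode("utf-8"))
    body = envelope["body"]
    if body["encoding"] == "base64":
        data = base64.b64decode(body["value"])
    else:
        data = body["value"]
    return {"request_type": envelope["request_type"], "data": data}


if __name__ == "__main__":
    rows = [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]
    wire = to_intermediate("db_query", rows)
    print(from_intermediate(wire))
```

A text-based envelope is only one possible choice; a binary schema format would serve the same role, since the method requires only that the client can restore what the server converted.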
In one possible design, the data grabbing request includes the request type, data parameters of the data to be grabbed, and identity verification information of the client, and before the server calls a corresponding processor to perform data grabbing based on the request type carried in the data grabbing request, the method further includes: the server performs identity verification based on the identity verification information. Correspondingly, the server calling a corresponding processor to perform data grabbing based on the request type carried in the data grabbing request includes: after the identity verification is passed, the server calls the processor corresponding to the request type to perform data grabbing based on the data parameters, wherein the data parameters comprise the data grabbing location, information about the data to be grabbed, and the grabbing mode.

In one possible design, the request type is a database query request, an HTTP request, or a file download request.

In one possible design, after the server converts the grabbed data into a unified intermediate format, the method further includes: the server compresses and encrypts the data converted into the unified intermediate format. Correspondingly, the server sending the data converted into the unified intermediate format to the client through one of the plurality of network protocols includes: the server sends the data that has been converted into the unified intermediate format and subjected to compression and encryption processing to the client th