KR-20260063223-A - Apparatus for matching URLs at high speed and method thereof
Abstract
A high-speed URL matching device and a method thereof are disclosed. The present invention parses an input URL to distinguish host and path information and registers a host map and a URL hash map. When a search target URL is received, it attempts URL matching against the registered host map and URL hash map, giving priority to the longest path based on path length, and then applies a policy (Action) according to the matching result. This can increase the matching speed.
Inventors
- 김용훈
- 정원용
- 문신일
- 최간호
Assignees
- (주) 시스메이트
Dates
- Publication Date: 2026-05-07
- Application Date: 2024-10-30
Claims (12)
- A URL matching method using a URL matching device, comprising: a step of parsing an input URL to separate host and path information and calculating the depth of the path; a step of registering a host map for the host information of the input URL; a step of registering, if the input URL includes path information, a URL hash map for that path; a step of receiving a search target URL, matching its host against the registered host map, and, if the host matches, attempting path matching in the URL hash map of the corresponding path, with priority given to the longest path based on path length; and a step of applying a policy (Action) based on the matching result.
- The URL matching method of claim 1, wherein the step of registering a host map determines how the host map is registered based on whether the input URL contains only host information or both host and path information, and on whether the host of the input URL overlaps with an already registered host.
- The URL matching method of claim 1, wherein the step of registering a host map manages path URL count information for the host to prevent duplicate URL registration.
- The URL matching method of claim 1, wherein the step of registering a URL hash map finds the URL hash map corresponding to the URL step (Step or Depth) of the input URL and registers the URL in the found URL hash map.
- The URL matching method of claim 1, wherein the step of registering a URL hash map prevents memory waste by pre-allocating the size of a URL hash map buffer according to the length of the input URL.
- The URL matching method of claim 1, wherein the step of matching with priority given to the longest path comprises: a step of checking the path URL count value of the host map when the host matches in the host map; a step of transmitting the policy of the host map when the path URL count value of the host map is 0; and a step of searching the URL hash map to check for a path match when the path URL count value of the host map is greater than 0.
- The URL matching method of claim 1, wherein the step of matching with priority given to the longest path attempts matching against progressively shorter paths, one step at a time, when the longest-path matching attempt fails during the URL matching process.
- The URL matching method of claim 1, wherein the step of matching with priority given to the longest path comprises: a step of attempting path matching at the URL step of the longest path using the hash value of the URL with the longest path and, if the match succeeds, transmitting the policy of the URL hash map corresponding to the longest path; a step of repeatedly attempting path matching by descending the URL steps until a match succeeds when matching of the longest path fails; a step of transmitting, when a match succeeds at a predetermined URL step, the policy of the URL hash map of that URL step; and a step of finally transmitting the policy of the host map if path matching fails in the URL hash map.
- A URL matching device comprising: a URL parsing unit that parses an input URL to distinguish host and path information and calculates the depth of the path; a host map register that registers a host map for the host information of the input URL; a URL hash map register that registers a URL hash map for a path when the input URL includes path information; a URL matching unit that, upon receiving a search target URL, matches its host against the registered host map and, if the host matches, attempts path matching in the URL hash map of the corresponding path, with priority given to the longest path based on path length; and a policy application unit that applies a policy (Action) based on the matching result.
- The URL matching device of claim 9, wherein the URL matching unit checks the path URL count value of the host map when the host matches in the host map, transmits the policy of the host map when the path URL count value of the host map is 0, and searches the URL hash map to check for a path match when the path URL count value of the host map is greater than 0.
- The URL matching device of claim 9, wherein the URL matching unit attempts matching against progressively shorter paths, one step at a time, when the longest-path matching attempt fails during the URL matching process.
- The URL matching device of claim 9, wherein the URL matching unit attempts path matching at the URL step of the longest path using the hash value of the URL with the longest path and, if the match succeeds, transmits the policy of the URL hash map corresponding to the longest path; if matching of the longest path fails, repeatedly attempts path matching by descending the URL steps until a match succeeds and, when a match succeeds at a predetermined URL step, transmits the policy of the URL hash map of that URL step; and transmits the policy of the host map when path matching fails in the URL hash map.
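For illustration only, the registration and longest-path-first matching flow recited in the claims above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation. The names `parse_url`, `url_hash`, `HostEntry`, and `UrlMatcher` are hypothetical, policies are represented as plain strings, and a URL is assumed to be a host optionally followed by "/"-separated path steps (any leading scheme is stripped). CRC32 is used because the description names it as an example hash algorithm. Comparing the stored URL on a hash hit is an added safeguard that the source does not describe, and a hash collision between two different URLs at the same step is not handled beyond that check.

```python
import zlib
from dataclasses import dataclass, field
from typing import Optional


def parse_url(url: str) -> tuple[str, list[str]]:
    """Split 'host/a/b' into (host, ['a', 'b']); the depth is len(segments)."""
    url = url.split("://", 1)[-1]          # assumption: drop a scheme if present
    host, _, path = url.partition("/")
    return host, [s for s in path.split("/") if s]


def url_hash(host: str, segments: list[str]) -> int:
    """CRC32 over the combined host and path."""
    return zlib.crc32("/".join([host, *segments]).encode("utf-8"))


@dataclass
class HostEntry:
    policy: Optional[str] = None   # policy registered for the bare host, if any
    path_url_count: int = 0        # number of path URLs registered under the host


@dataclass
class UrlMatcher:
    host_map: dict[str, HostEntry] = field(default_factory=dict)
    # step_maps[depth][hash] -> (registered URL, policy): one hash map per URL step
    step_maps: dict[int, dict[int, tuple[str, str]]] = field(default_factory=dict)

    def register(self, url: str, policy: str) -> None:
        host, segments = parse_url(url)
        entry = self.host_map.setdefault(host, HostEntry())
        if not segments:                   # host-only URL: policy lives in the host map
            entry.policy = policy
            return
        bucket = self.step_maps.setdefault(len(segments), {})
        h = url_hash(host, segments)
        if h not in bucket:                # count each registered path URL once
            entry.path_url_count += 1
        bucket[h] = ("/".join([host, *segments]), policy)

    def match(self, url: str) -> Optional[str]:
        host, segments = parse_url(url)
        entry = self.host_map.get(host)
        if entry is None:
            return None                    # host not registered
        if entry.path_url_count == 0:
            return entry.policy            # no path URLs: host map policy applies
        # Try the longest path first, then descend one URL step at a time.
        for depth in range(len(segments), 0, -1):
            prefix = segments[:depth]
            hit = self.step_maps.get(depth, {}).get(url_hash(host, prefix))
            if hit is not None and hit[0] == "/".join([host, *prefix]):
                return hit[1]
        return entry.policy                # no path matched: fall back to the host map


matcher = UrlMatcher()
matcher.register("www.naver.com", "bypass")
matcher.register("www.naver.com/video", "notify")
matcher.register("www.naver.com/video/stream", "block")

print(matcher.match("www.naver.com/video/stream/player"))  # block  (Step 2 match)
print(matcher.match("www.naver.com/video/clip"))           # notify (Step 1 match)
print(matcher.match("www.naver.com/news"))                 # bypass (host map policy)
```

In this sketch the path URL count lets a host with no registered paths return its policy without any hash lookup, and the number of hash lookups for a search target URL is at most its number of path steps.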
Description
High-speed URL matching apparatus and method thereof
The present invention relates to a URL search technology, and more specifically, to a technology capable of searching for URLs at high speed using a URL map.
A URL (Uniform Resource Locator) is an address that specifies the location of a resource (a web page, image, file, etc.) on the web. A URL consists of several components, which are used to identify and access a specific resource. The components of a URL include the scheme, host, and path. The scheme is the beginning of a URL and defines the protocol to be used. Examples include http, https, and ftp. The host is the domain name or IP address of the server where a resource is located, for example, www.example.com. The path specifies the location of a specific resource within the server, for example, /path/to/resource. Users can request resources by entering a URL when accessing a web page, URLs can be used to call API endpoints in communication between a server and a client, and URLs can also be used to create links to other pages or files.
FIG. 1 is a diagram illustrating the configuration of a URL matching device according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the detailed configuration of the control unit of FIG. 1 according to an embodiment of the present invention.
FIGS. 3 and 4 are diagrams illustrating an example of registering a host map according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating an example of URL hash map registration according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating the flow of a URL matching method according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating a URL search process according to the URL step of a search target URL according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating an example of a URL search process for a search target URL according to an embodiment of the present invention.
The advantages and features of the present invention, and the methods for achieving them, will become clear by referring to the embodiments described below in detail together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and can be implemented in various different forms. These embodiments are provided merely to ensure that the disclosure of the present invention is complete and to fully inform those skilled in the art of the scope of the invention, and the present invention is defined only by the scope of the claims. Throughout the specification, the same reference numerals refer to the same components.
In describing the embodiments of the present invention, if it is determined that a detailed description of known functions or configurations may unnecessarily obscure the essence of the invention, such detailed description will be omitted. Furthermore, the terms described below are defined in consideration of their functions in the embodiments of the present invention, and they may vary depending on the intentions or conventions of the user or operator. Therefore, their definitions should be based on the content throughout this specification.
Hereinafter, embodiments of the present invention will be described in detail with reference to the attached drawings. However, the embodiments of the present invention exemplified below may be modified in various different forms, and the scope of the present invention is not limited to the embodiments described below. The embodiments of the present invention are provided to explain the present invention more completely to those skilled in the art to which this invention pertains.
Before describing the present invention, terms used in the specification are defined. A URL consists of a host and a path.
The host is the domain name, and the path represents the location of a resource. For example, if the URL is www.naver.com/video/stream/player, the host is www.naver.com and the path is /video/stream/player. A URL can consist of only a host or of both a host and a path.
A URL Step or Depth is a unit value of a URL path. The number of steps is counted using the “/” separator. For example, www.naver.com/video is Step 1 (1st level), and www.naver.com/video/stream is Step 2 (2nd level).
A hash is a value generated by combining the host and path of a URL. A hash algorithm, such as CRC32, is used to generate a unique hash value.
A Policy (Action) is the processing to be applied to a URL when the input search target URL matches. Examples of such processing include blocking, bypassing, control, redirection, and notification.
A buffer pool is a collection of unused buffers (memory space where URLs are stored).
FIG. 1 is a diagram illustrating the configuration of a URL matching device according to one embodiment of the present invention. Referring to FIG. 1, the URL matching device (1) may be a computing device. The URL matching device (1) may be impl