JP-7854735-B2 - Document search device
Inventors
- 遠野 裕平
Assignees
- 弁理士法人きさ特許商標事務所
Dates
- Publication Date
- 20260507
- Application Date
- 20240930
Claims (6)
- A document search device comprising: an interface display unit that displays, on a display device, a screen including a first field that displays a search target containing multiple sentences; a storage unit; a source text processing unit that stores source text specified by a user in the storage unit and outputs the source text to the first field as the search target; a query acquisition unit that acquires, as a query, a string that consists of multiple words included in the search target displayed in the first field and that is selected by the user in the first field; and a search unit that searches the source text for sentences similar to the query, based on the similarity between the query and the multiple sentences contained in the source text, and outputs the search results to the first field, wherein, after the search unit has performed the search, the query acquisition unit acquires, as the query, a string selected by the user in the first field from the search results output to the first field, and wherein, after the query acquisition unit has acquired the string included in the search results as the query, the search unit searches the source text for sentences similar to the query, based on the similarity between the multiple sentences contained in the source text and the query acquired from the string included in the search results, and outputs the search results to the first field.
- The document search device according to claim 1, wherein the search unit records the queries used for the search as a query log.
- The document search device according to claim 1 or 2, wherein the query acquisition unit reads a query description file containing multiple strings and acquires the multiple strings as multiple queries, and the search unit repeatedly performs searches based on the multiple queries acquired by the query acquisition unit.
- The document search device according to claim 3, further comprising a result storage unit that outputs multiple result files as a result of the repeated searches, wherein each of the multiple result files has, as its file name, the corresponding one of the multiple strings used as the multiple queries.
- The document search device according to claim 1 or 2, wherein the search unit outputs to the screen, as the search results, the multiple sentences contained in the source text sorted in descending order of similarity.
- The document search device according to claim 1 or 2, further comprising a conversion unit that removes hiragana and punctuation from the string selected by the user and replaces each consecutive block of the same character type with a regular expression representing that character type, thereby converting a portion of the string selected by the user into a formal expression for pattern matching and obtaining a query that includes the formal expression, wherein the search unit searches the source text for strings that match the query containing the formal expression for pattern matching.
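The conversion recited in claim 6 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the patented implementation: the character-type table `CLASS_REGEX`, the Unicode ranges, the helper names `char_type` and `to_pattern`, and the choice to merge runs that become adjacent once hiragana and punctuation are removed.

```python
import re
import unicodedata
from itertools import groupby

# Hypothetical character-type table: type name -> regex class for one character.
CLASS_REGEX = {
    "kanji": r"[\u4e00-\u9fff]",
    "katakana": r"[\u30a0-\u30ff]",
    "digit": r"[0-9]",
    "latin": r"[A-Za-z]",
}
HIRAGANA = re.compile(r"[\u3041-\u309f]")


def char_type(ch: str) -> str:
    """Classify one character (assumed classification, not from the source)."""
    if HIRAGANA.fullmatch(ch):
        return "hiragana"
    if unicodedata.category(ch).startswith("P"):
        return "punct"
    for name, pattern in CLASS_REGEX.items():
        if re.fullmatch(pattern, ch):
            return name
    return "other"


def to_pattern(selected: str) -> str:
    """Drop hiragana and punctuation, then turn each run of one character
    type into '<class>+'. Characters of unknown type are kept literally."""
    kept = [ch for ch in selected if char_type(ch) not in ("hiragana", "punct")]
    parts = []
    for kind, run in groupby(kept, key=char_type):
        if kind in CLASS_REGEX:
            parts.append(CLASS_REGEX[kind] + "+")
        else:
            parts.append(re.escape("".join(run)))
    return "".join(parts)
```

Under these assumptions, `to_pattern("制御部5")` produces a kanji run followed by a digit run, so the resulting pattern also matches a string such as `記憶部6` when passed to `re.search`.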
Description
This disclosure relates to a document search device for searching documents.
When drafting or translating patent specifications, keyword searches are sometimes used to check how a word in the specification is used elsewhere in the same specification. Patent Document 1 discloses a device that performs a keyword search on a target patent specification using specified words and stores link information between those words and the locations in the specification where they appear.
Patent Document 1: Japanese Patent Publication No. 2002-149704
The drawings are described as follows.
- A diagram showing the configuration of a document search system according to Embodiment 1.
- A hardware configuration diagram showing a document search device according to Embodiment 1.
- A schematic diagram showing the screen displayed by the document search software according to Embodiment 1.
- A flowchart showing the document search method performed by the document search device according to Embodiment 1.
- A flowchart showing the flow of the search process performed by the document search device according to Embodiment 1.
- A figure showing how the search target, saved as the source text, is displayed in the first field of the screen in Embodiment 1.
- A figure showing how a portion of the search target displayed in the first field has been selected by the user in Embodiment 1.
- A figure showing the first field of the screen displaying the search results and the second field displaying the query in Embodiment 1.
- A figure showing how a portion of the search results displayed in the first field has been selected by the user in Embodiment 1.
- A figure showing the first field of the screen displaying the search results and the second field displaying the query in Embodiment 1.
- A figure showing how a portion of the search results displayed in the first field has been selected by the user in Embodiment 1.
- A figure showing how the query is displayed in the second field of the screen in Embodiment 1.
- A figure showing the first field of the screen displaying the search results and the second field displaying the query in Embodiment 1.
Embodiment 1
Figure 1 is a diagram showing the configuration of a document search system 1 according to Embodiment 1. The document search system 1 includes, for example, a document search device 2, an input device 3, and a display device 4. The document search system 1 searches documents for strings that meet certain conditions. Hereinafter, searching a document for strings that meet certain conditions is referred to as "document search" or "document retrieval."
The documents to be searched are, for example, patent specifications for filing patent applications. However, they are not limited to patent specifications, as long as they contain text that includes multiple sentences. Here, a "sentence" is a series of words or strings separated by any delimiter. In the following explanation, a period is used as an example of the delimiter, but the delimiter is not limited to this. Strings that form clauses, such as "when" and "in the case of," may also be treated as delimiters. In that case, part of a compound sentence can be treated in the same way as a "sentence."
The document search device 2 is, for example, a PC terminal owned by a user who performs document searches. Software for performing document searches is installed on the document search device 2. Hereinafter, this software is referred to as the document search software. The document search software includes a document search program containing instructions to be executed by the computer, and configuration data for controlling the document search program. The document search device 2 has a control unit 5 and a storage unit 6.
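The definition of a "sentence" as text separated by a configurable delimiter, together with the similarity-ordered output recited in the claims, can be sketched in Python as follows. This section does not state how similarity is computed, so `difflib.SequenceMatcher` is used here purely as a stand-in measure. The function names `split_sentences` and `rank_by_similarity` and the default delimiter set are likewise hypothetical.

```python
import re
from difflib import SequenceMatcher


def split_sentences(text: str, delimiters=("。", ".")) -> list[str]:
    """Split text into 'sentences' at any of the given delimiters.
    Clause markers could be added to `delimiters` to split compound sentences."""
    pattern = "|".join(re.escape(d) for d in delimiters)
    return [part.strip() for part in re.split(pattern, text) if part.strip()]


def rank_by_similarity(source_text: str, query: str) -> list[tuple[float, str]]:
    """Score every sentence against the query and return (score, sentence)
    pairs in descending order of score. The score is an assumed stand-in."""
    sentences = split_sentences(source_text)
    scored = [(SequenceMatcher(None, query, s).ratio(), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    source = (
        "The control unit has a search unit. "
        "The storage unit stores the source text. "
        "The display shows a screen."
    )
    for score, sentence in rank_by_similarity(source, "storage unit stores source text"):
        print(f"{score:.3f}  {sentence}")
```

Because the ranked sentences are plain strings, a string taken from one result list can be passed back in as the next query, which mirrors the repeated search from search results described in claim 1.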
The control unit 5 has, as functional units, an interface display unit 11, a query acquisition unit 12, a conversion unit 13, a search unit 14, a source text processing unit 15, a result storage unit 16, and a query log operation unit 17. The storage unit 6 stores data related to the processing of the document search software: source text 21, a query log 22, a query description file 23, and a result file 24. The source text 21, query log 22, query description file 23, and result file 24 are described later together with the functional units of the control unit 5.
The input device 3 is a device, such as a mouse or keyboard, that is connected to the document search device 2 and used to input data into the document search device 2. The display device 4 is a display or similar device that is connected to the document search device 2 and displays a screen for operating the document search software based on instructions from the document search device 2.
Figure 2 is a hardware configuration diagram showing the document search device 2 according to Embodiment 1. The document search device 2 is composed of a processor 41 and a memory 42