CN-116665222-B - Image information extraction method and device

Abstract

An embodiment of the invention discloses an image information extraction method and device. In the method, after an image to be identified and configuration information are obtained, target basic extractors are called according to the configuration information to extract information from the image to be identified, so as to determine each target text in the image to be identified. The configuration information represents calling requirements for at least one target basic extractor, and each target basic extractor is used for extracting target text of a corresponding information type. The method can reduce the cost and improve the efficiency of image information extraction.

Inventors

ZHAO FEI
XIA WEI

Assignees

Rajax Network Technology (Shanghai) Co., Ltd. (拉扎斯网络科技（上海）有限公司)

Dates

Publication Date: 20260505
Application Date: 20230523

Claims (14)

1. An image information extraction method, characterized in that the method comprises: acquiring an image to be identified; acquiring configuration information, wherein the configuration information is used for representing calling requirements for at least one target basic extractor, each target basic extractor is used for extracting target text of a corresponding information type, each target basic extractor is preset with a uniquely corresponding keyword or named entity type, and the configuration information comprises start text content and end text content; and calling each target basic extractor to extract information from the image to be identified so as to determine each target text in the image to be identified; wherein the calling each target basic extractor to extract information from the image to be identified comprises: locating a start region and an end region among the text regions to be identified according to the start text content and the end text content, and defining a screening range based on the spatial position relation between the start region and the end region in the image; determining a text region that is within the screening range in the image to be identified and contains the corresponding keyword or a named entity of the corresponding type as a target text region to be identified; calling the target basic extractor to extract information from the target text region to be identified; in response to the target text not being detected in the text content of the target text region to be identified, determining a text region to be identified that matches the target text region to be identified; determining a corrected text region to be identified in the image to be identified, wherein the corrected text region to be identified comprises the target text region to be identified and the matched text region to be identified; and calling the corresponding target basic extractor to extract information from the corrected text region to be identified.
2. The method of claim 1, wherein determining a target text region to be identified corresponding to each of the target basic extractors according to the text content in the text regions to be identified comprises: in response to the configuration information not including screening range information, determining, according to the text content, the target text region to be identified corresponding to each target basic extractor among the text regions to be identified.
3. The method of claim 1, wherein the determining at least one text region to be identified in the image to be identified and the text content in each text region to be identified comprises: determining at least one text region to be identified in the image to be identified; and performing text recognition on each text region to be identified to determine the text content in each text region to be identified.
4. The method of claim 3, wherein before performing text recognition on each text region to be identified to determine the text content in each text region to be identified, the method further comprises: determining the text direction of each text region to be identified; and for each text region to be identified, rotating the text region to be identified according to the text direction so that the text direction of the text region to be identified is horizontal.
5. The method of claim 1, wherein determining a target text region to be identified corresponding to each of the target basic extractors according to the text content in the text regions to be identified comprises: for each target basic extractor, determining a text region to be identified whose text content contains the corresponding keyword as the target text region to be identified corresponding to the target basic extractor.
6. The method of claim 1, wherein determining a target text region to be identified corresponding to each of the target basic extractors according to the text content in the text regions to be identified comprises: for each target basic extractor, determining a text region to be identified whose text content contains a named entity of the corresponding type as the target text region to be identified corresponding to the target basic extractor.
7. The method of claim 5, wherein the calling the corresponding target basic extractor to extract information from the target text region to be identified comprises: acquiring text features of the target text corresponding to the keyword; acquiring the position relation between the keyword and the target text; and detecting the target text corresponding to the keyword in the text content of the text region to be identified according to the text features and/or the position relation.
8. The method of claim 1, wherein the determining a text region to be identified that matches the target text region to be identified comprises: determining the text region to be identified that matches the target text region to be identified according to target position information and target distance information; wherein the target position information is position information between the target text region to be identified and each text region to be identified, and the target distance information is distance information between the target text region to be identified and each text region to be identified.
9. The method of any one of claims 1-8, wherein after determining each target text in the image to be identified, the method further comprises: outputting each target text.
10. The method of any one of claims 1-8, wherein the configuration information is further used for representing calling requirements for at least one target global image preprocessor; and before calling each target basic extractor to extract information from the image to be identified, the method further comprises: calling the target global image preprocessor to perform image preprocessing on the image to be identified according to the configuration information.
11. The method of any one of claims 1-8, further comprising: in response to receiving an information extraction request, acquiring the image to be identified and the configuration information according to the information extraction request.
12. An image information extraction apparatus, characterized in that the apparatus comprises: an image acquisition unit, used for acquiring an image to be identified; a configuration information acquisition unit, used for acquiring configuration information, wherein the configuration information is used for representing calling requirements for at least one target basic extractor, each target basic extractor is used for extracting target text of a corresponding information type, each target basic extractor is preset with a uniquely corresponding keyword or named entity type, the configuration information comprises screening range information, and the screening range information comprises start text content and end text content; and an extraction unit, used for calling each target basic extractor to extract information from the image to be identified so as to determine each target text in the image to be identified; wherein the extraction unit is further configured to: locate a start region and an end region among the text regions to be identified according to the start text content and the end text content, and define a screening range based on the spatial position relation between the start region and the end region in the image; determine a text region that is within the screening range in the image to be identified and contains the corresponding keyword or a named entity of the corresponding type as a target text region to be identified; call the target basic extractor to extract information from the target text region to be identified; in response to the target text not being detected in the text content of the target text region to be identified, determine a text region to be identified that matches the target text region to be identified; determine a corrected text region to be identified in the image to be identified, wherein the corrected text region to be identified comprises the target text region to be identified and the matched text region to be identified; and call the corresponding target basic extractor to extract information from the corrected text region to be identified.
13. A computer readable storage medium, on which computer program instructions are stored, which computer program instructions, when executed by a processor, implement the method of any one of claims 1-11.
14. An electronic device, the device comprising: a memory for storing one or more computer program instructions; a processor, the one or more computer program instructions being executed by the processor to implement the method of any of claims 1-11.
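
The region-selection steps recited in claims 1 and 8 can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `TextRegion` type, the helper functions, and the sample data. The sketch also rests on two loudly stated assumptions: the screening range is modelled as the vertical band spanned by the start and end regions, and the "matched" region is taken to be the nearest region on the same row to the right of the target. The claims leave both choices open.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TextRegion:
    """Hypothetical text region: an axis-aligned box plus its recognized text."""
    x0: float
    y0: float
    x1: float
    y1: float
    text: str

    @property
    def center_y(self) -> float:
        return (self.y0 + self.y1) / 2


def find_region(regions: List[TextRegion], content: str) -> Optional[TextRegion]:
    """Return the first region whose text contains `content`, or None."""
    return next((r for r in regions if content in r.text), None)


def screening_range(regions: List[TextRegion], start_text: str,
                    end_text: str) -> Optional[Tuple[float, float]]:
    """Assumption: the range is the vertical band spanned by the start and end regions."""
    start, end = find_region(regions, start_text), find_region(regions, end_text)
    if start is None or end is None:
        return None
    return (min(start.y0, end.y0), max(start.y1, end.y1))


def target_regions(regions: List[TextRegion], band: Tuple[float, float],
                   keyword: str) -> List[TextRegion]:
    """Regions inside the band whose text contains the extractor's keyword."""
    low, high = band
    return [r for r in regions if low <= r.center_y <= high and keyword in r.text]


def match_region(target: TextRegion, regions: List[TextRegion]) -> Optional[TextRegion]:
    """Assumption: the match is the nearest region on the same row, to the right."""
    candidates = [
        r for r in regions
        if r is not target
        and target.y0 <= r.center_y <= target.y1  # position: same row
        and r.x0 >= target.x1                     # position: to the right
    ]
    # Distance: smallest horizontal gap wins; None when there is no candidate.
    return min(candidates, key=lambda r: r.x0 - target.x1, default=None)


def corrected_region(target: TextRegion, matched: TextRegion) -> TextRegion:
    """Union box covering both regions, with their texts joined."""
    return TextRegion(
        min(target.x0, matched.x0), min(target.y0, matched.y0),
        max(target.x1, matched.x1), max(target.y1, matched.y1),
        f"{target.text} {matched.text}",
    )


# Made-up sample layout: the keyword and its value sit in separate regions.
regions = [
    TextRegion(0, 0, 100, 10, "Invoice"),
    TextRegion(0, 20, 40, 30, "Buyer"),         # start text content
    TextRegion(0, 40, 40, 50, "Amount:"),       # keyword without a value
    TextRegion(50, 40, 90, 50, "128.00"),       # value in a neighbouring region
    TextRegion(0, 60, 40, 70, "Seller"),        # end text content
    TextRegion(0, 80, 40, 90, "Amount: 5.00"),  # outside the screening range
]
band = screening_range(regions, "Buyer", "Seller")
targets = target_regions(regions, band, "Amount")
matched = match_region(targets[0], regions)
corrected = corrected_region(targets[0], matched)
print(band, corrected.text)
```

In this sketch a basic extractor that found no amount in the target region alone would be rerun on `corrected.text`, which now carries both the keyword and its value.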

Description

Image information extraction method and device

Technical Field

The invention relates to the field of information technology, and in particular to an image information extraction method and device.

Background

Key information extraction (KIE) currently has a wide range of application scenarios. Specifically, a key information extraction task extracts specific text from an image in a structured manner; information in any format in the image, such as form information, bill information or certificate information, can be extracted in this way.

In the prior art, key information extraction is usually implemented by template matching or with a trained model. However, summarizing templates and training models are both costly and time-consuming. Moreover, a summarized template or a trained model can generally be applied only to the scenario it was built for and cannot be shared across different scenarios, so image information extraction in the prior art is costly and inefficient.

Disclosure of Invention

In view of this, embodiments of the present invention provide an image information extraction method and apparatus, so as to reduce the cost and improve the efficiency of image information extraction.

In a first aspect, an embodiment of the present invention provides an image information extraction method, including: acquiring an image to be identified; acquiring configuration information, wherein the configuration information is used for representing calling requirements for at least one target basic extractor, and each target basic extractor is used for extracting target text of a corresponding information type; and calling each target basic extractor to extract information from the image to be identified so as to determine each target text in the image to be identified.

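The configuration-driven calling described in the first aspect can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the registry `BASIC_EXTRACTORS`, the extractor functions, the `"extractors"` configuration key, and the regular expressions are all hypothetical, and text recognition is assumed to have already produced a list of strings from the image to be identified.

```python
import re
from typing import Callable, Dict, List, Optional

# Hypothetical signature: a basic extractor takes recognized text and
# returns the target text of its information type, or None if absent.
Extractor = Callable[[str], Optional[str]]


def extract_phone(text: str) -> Optional[str]:
    """Illustrative extractor for phone numbers shaped like 138-0000-1234."""
    found = re.search(r"\b\d{3}-\d{4}-\d{4}\b", text)
    return found.group(0) if found else None


def extract_amount(text: str) -> Optional[str]:
    """Illustrative extractor for amounts with two decimal places."""
    found = re.search(r"\d+\.\d{2}", text)
    return found.group(0) if found else None


# Hypothetical registry: one reusable basic extractor per information type.
BASIC_EXTRACTORS: Dict[str, Extractor] = {
    "phone": extract_phone,
    "amount": extract_amount,
}


def extract_information(texts: List[str], config: dict) -> Dict[str, Optional[str]]:
    """Call only the extractors named in the configuration information."""
    results: Dict[str, Optional[str]] = {}
    for name in config["extractors"]:
        extractor = BASIC_EXTRACTORS[name]
        # Keep the first hit across the recognized text regions, else None.
        results[name] = next(
            (hit for hit in map(extractor, texts) if hit is not None), None
        )
    return results


# Made-up recognized text standing in for an image to be identified.
recognized = ["Tel: 138-0000-1234", "Total 128.00", "Date 2023-05-23"]
amount_only = extract_information(recognized, {"extractors": ["amount"]})
both = extract_information(recognized, {"extractors": ["phone", "amount"]})
print(amount_only, both)
```

Under these assumptions, supporting a new scenario means changing only the configuration passed in, while the extractors themselves are reused unchanged, which is the cost and efficiency argument the text makes.
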
In a second aspect, an embodiment of the present invention provides an image information extraction apparatus, including: an image acquisition unit, used for acquiring an image to be identified; a configuration information acquisition unit, used for acquiring configuration information, wherein the configuration information is used for representing calling requirements for at least one target basic extractor, and each target basic extractor is used for extracting target text of a corresponding information type; and an extraction unit, used for calling each target basic extractor to extract information from the image to be identified so as to determine each target text in the image to be identified.

In a third aspect, an embodiment of the present invention provides a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method according to the first aspect.

In a fourth aspect, an embodiment of the present invention provides an electronic device, including: a memory for storing one or more computer program instructions; and a processor, the one or more computer program instructions being executed by the processor to implement the method according to the first aspect.

According to the image information extraction method of the embodiments of the present invention, after the image to be identified and the configuration information are obtained, the target basic extractors are called according to the configuration information to extract information from the image to be identified, so as to determine each target text in the image to be identified. The configuration information is used for representing calling requirements for at least one target basic extractor, and each target basic extractor is used for extracting target text of a corresponding information type. In this way, the cost of image information extraction can be reduced and its efficiency improved.

Drawings

The above and other objects, features and advantages of the present invention will become more apparent from the following description of embodiments of the present invention with reference to the accompanying drawings, in which:

FIG. 1 is a schematic diagram of an application system of an image information extraction method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a system framework of an image information extraction method according to an embodiment of the present invention;
FIG. 3 is a flowchart of an image information extraction method according to an embodiment of the present invention;
FIG. 4 is a flowchart of a target text extraction method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an image to be identified according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an image to be identified according to an