KR-20260063625-A - METHOD AND SYSTEM FOR REGENERATING TITLE FOR CONTENT BASED ON LARGE LANGUAGE MODEL
Abstract
A method and system for regenerating a title for content based on a large language model are disclosed. A title regeneration method according to one embodiment may include the steps of: identifying a first title of a first content in a content database in which content exposed through at least one first service is registered; generating a prompt that reflects a predefined title feature and the first title of the first content according to a content exposure surface of a second service different from the at least one first service; inputting the prompt into a title improvement model based on a large language model to generate a second title of the first content for exposure on the content exposure surface of the second service; and storing the second title of the first content in an inference result database.
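The four steps in the abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the implementation disclosed in the patent: the databases are plain dictionaries, the title features in `TITLE_FEATURES` and the surface name `recommendation_feed` are hypothetical, and `fake_model` is a stand-in for the large language model so that the sketch runs without any external service.

```python
from typing import Callable, Dict, List

# Hypothetical title features for one exposure surface (wording is illustrative).
TITLE_FEATURES: Dict[str, List[str]] = {
    "recommendation_feed": [
        "Remove flowery language.",
        "Do not repeat the same word.",
        "Do not simply list keywords.",
    ],
}


def build_prompt(first_title: str, surface: str) -> str:
    """Combine the surface's predefined title features with the first title."""
    rules = "\n".join(f"- {rule}" for rule in TITLE_FEATURES[surface])
    return (
        "Rewrite the title below so that it follows these rules:\n"
        f"{rules}\n"
        f"Original title: {first_title}\n"
        "Improved title:"
    )


def regenerate_title(
    content_id: str,
    content_db: Dict[str, str],
    surface: str,
    model: Callable[[str], str],
    result_db: Dict[str, str],
) -> str:
    first_title = content_db[content_id]         # step 1: identify the first title
    prompt = build_prompt(first_title, surface)  # step 2: generate the prompt
    second_title = model(prompt).strip()         # step 3: generate the second title
    result_db[content_id] = second_title         # step 4: store the result
    return second_title


def fake_model(prompt: str) -> str:
    """Stand-in for the LLM (assumption): drops repeated words from the original title."""
    original = prompt.split("Original title: ")[1].split("\n")[0]
    seen, words = set(), []
    for word in original.split():
        if word.lower() not in seen:
            seen.add(word.lower())
            words.append(word)
    return " ".join(words)
```

Passing the model as a callable keeps the sketch independent of any particular model API; a real deployment would replace `fake_model` with a call to the title improvement model.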
Inventors
- 가순원
- 박보연
- 박대철
- 손보경
- 박태림
- 강재욱
Assignees
- 네이버 주식회사
Dates
- Publication Date
- 20260507
- Application Date
- 20241030
Claims (18)
- A title regeneration method for a computer device comprising at least one processor, the method comprising: identifying, by the at least one processor, a first title of a first content in a content database in which content exposed through at least one first service is registered; generating, by the at least one processor, a prompt that reflects a predefined title feature and the first title of the first content according to a content exposure surface of a second service different from the at least one first service; generating, by the at least one processor, a second title of the first content for exposure on the content exposure surface of the second service by inputting the prompt into a title improvement model based on a large language model; and storing, by the at least one processor, the second title of the first content in an inference result database.
- The title regeneration method of claim 1, wherein the second service includes a content recommendation service that recommends user-customized content to a user, and the title feature is defined according to the content exposure surface of the content recommendation service.
- The title regeneration method of claim 1, wherein the title feature includes at least one of: (1) removal of flowery language, (2) prohibition of repetition of the same word, (3) prohibition of repetition of synonyms, (4) prohibition of simple keyword listing, (5) addition of context-based separators, and (6) separation of proper nouns using separators.
- The title regeneration method of claim 1, wherein the title feature includes at least one of: (1) conversion of keywords, (2) use of implied keywords, (3) retention of personal names, and (4) retention of keywords indicating that multiple items are grouped and explained.
- The title regeneration method of claim 1, wherein the title feature includes at least one of: (1) removal of emojis based on their relevance to the title, (2) retention of emojis representing at least one of mood, atmosphere, and context, and (3) prohibition of repeating the same emoji more than a preset number of times.
- The title regeneration method of claim 1, wherein identifying the first title of the first content comprises identifying, as the first title of the first content, the title of a content selected based on at least one of: the length of the title, whether the title falls under a case requiring improvement, and whether the meaning of the title can be understood by the title improvement model.
- The title regeneration method of claim 6, wherein the case requiring improvement includes at least one of: a case in which the same keyword is repeated, a case in which the title has no spaces, and a case in which the title contains flowery language.
- The title regeneration method of claim 6, wherein whether the meaning can be understood includes whether the title consists of any one of: a sequence of emojis, a sequence of special characters, a sequence of consonants or vowels, alien language, a sequence of meaningless numbers, and a sequence of meaningless foreign-language text.
- The title regeneration method of claim 1, wherein the storing comprises storing the second title of the first content in the inference result database depending on whether a major defect is detected in the second title of the first content.
- The title regeneration method of claim 9, wherein the storing comprises detecting, as the major defect, at least one of: a first situation in which the same token is repeated in the second title more than a preset number of times; a second situation in which the second title contains an empty value; a third situation in which the second title contains content of the prompt; a fourth situation in which the meaning of the second title cannot be understood; a fifth situation in which the second title contains processing content or guidance text; a sixth situation in which the second title contains a new context not given in the first title; and a seventh situation in which the second title contains more than a preset number of personal-information masking symbols.
- The title regeneration method of claim 10, wherein major defects of the first, second, third, and seventh situations are detected based on preset rules, and major defects of the fourth, fifth, and sixth situations are detected using a large language model-based filter.
- The title regeneration method of claim 10, wherein the fourth situation is detected based on an occurrence rate of major defects, the occurrence rate being calculated from the results of detecting major defects with a large language model-based filter for each of the titles randomly sampled from the inference result database.
- A computer program stored on a computer-readable recording medium to be combined with a computer device to execute the method of any one of claims 1 to 12 on the computer device.
- A title regeneration system implemented by at least one computer device, wherein the at least one computer device includes at least one processor implemented to execute computer-readable instructions, and the at least one processor is configured to: identify a first title of a first content in a content database in which content exposed through at least one first service is registered; generate a prompt that reflects a predefined title feature and the first title of the first content according to a content exposure surface of a second service different from the at least one first service; input the prompt into a title improvement model based on a large language model to generate a second title of the first content for exposure on the content exposure surface of the second service; and store the second title of the first content in an inference result database.
- The title regeneration system of claim 14, wherein the second service includes a content recommendation service that recommends user-customized content to a user, and the title feature is defined according to the content exposure surface of the content recommendation service.
- The title regeneration system of claim 14, wherein the title feature includes at least one of: (1) removal of flowery language, (2) prohibition of repetition of the same word, (3) prohibition of repetition of synonyms, (4) prohibition of simple keyword listing, (5) addition of context-based separators, and (6) separation of proper nouns using separators.
- The title regeneration system of claim 14, wherein the title feature includes at least one of: (1) conversion of keywords, (2) use of implied keywords, (3) retention of personal names, and (4) retention of keywords indicating that multiple items are grouped and explained.
- The title regeneration system of claim 14, wherein the title feature includes at least one of: (1) removal of emojis based on their relevance to the title, (2) retention of emojis representing at least one of mood, atmosphere, and context, and (3) prohibition of repeating the same emoji more than a preset number of times.
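Claims 9 to 11 describe storing the second title only when no major defect is found, with the first, second, third, and seventh situations detected by preset rules. The sketch below illustrates one way such rule-based checks might look. It is an assumption-laden example rather than the claimed detector: the thresholds `MAX_TOKEN_REPEATS` and `MAX_MASK_SYMBOLS`, the masking character `*`, the minimum leaked-line length, and the defect labels are all hypothetical, and the situations handled by a large language model-based filter (fourth to sixth) are not covered.

```python
from collections import Counter
from typing import Dict, List

MAX_TOKEN_REPEATS = 3   # hypothetical preset number for situation 1
MAX_MASK_SYMBOLS = 2    # hypothetical preset number for situation 7
MASK_SYMBOL = "*"       # hypothetical personal-information masking character
MIN_LEAK_LENGTH = 8     # hypothetical: ignore very short prompt lines


def rule_based_defects(second_title: str, prompt: str) -> List[str]:
    """Return labels of the rule-detectable major defects found in a title."""
    tokens = second_title.split()
    # Situation 2: the generated title is empty (or whitespace only).
    if not tokens:
        return ["empty_value"]

    defects: List[str] = []
    # Situation 1: the same token appears more than the preset number of times.
    counts = Counter(token.lower() for token in tokens)
    if max(counts.values()) > MAX_TOKEN_REPEATS:
        defects.append("token_repetition")
    # Situation 3: a line of the prompt was copied into the title.
    prompt_lines = [line.strip() for line in prompt.splitlines()]
    if any(len(line) >= MIN_LEAK_LENGTH and line in second_title
           for line in prompt_lines):
        defects.append("prompt_leak")
    # Situation 7: too many personal-information masking symbols.
    if second_title.count(MASK_SYMBOL) > MAX_MASK_SYMBOLS:
        defects.append("masking_symbols")
    return defects


def store_if_clean(content_id: str, second_title: str, prompt: str,
                   result_db: Dict[str, str]) -> bool:
    """Store the title in the result database only if no rule-based defect is found."""
    if rule_based_defects(second_title, prompt):
        return False
    result_db[content_id] = second_title
    return True
```

Keeping the cheap rule checks separate from any model-based filter means most defective outputs can be rejected without a second model call.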
Description
Method and System for Regenerating Titles for Content Based on Large Language Model
The following description concerns a method and system for regenerating titles for content based on a large language model.
The titles of content exposed online are written by the content creators. For example, a creator may write a title containing various keywords that people search for frequently so that the content is easily found by search engines. In certain services, however, using titles written for search engine optimization (SEO) as they are results in a poor user experience. For instance, in services that automatically recommend personalized content, SEO-optimized titles are difficult for users to take in quickly, which lowers selection rates.
[Prior Art Document] Korean Patent Publication No. 10-2019-0037056
FIG. 1 is a diagram illustrating an example of a network environment according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating an example of a computer device according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating an example of a title regeneration system according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating an example of input filtering of a title improvement model in an embodiment of the present invention.
FIG. 5 is a diagram illustrating an example of output filtering of a title improvement model in an embodiment of the present invention.
FIGS. 6 and 7 are diagrams illustrating an example of a process for detecting defects in an improved title using a major defect detector in an embodiment of the present invention.
FIG. 8 is a diagram illustrating an example of the service application process of a title improvement model in an embodiment of the present invention.
FIG. 9 is a flowchart illustrating an example of a title regeneration method according to an embodiment of the present invention.
Hereinafter, embodiments will be described in detail with reference to the attached drawings.
A title regeneration system according to embodiments of the present invention may be implemented by at least one computer device. In this case, a computer program according to an embodiment of the present invention may be installed and run on the computer device, and the computer device may perform a title regeneration method according to embodiments of the present invention under the control of the running computer program. The computer program may be stored on a computer-readable recording medium to be combined with the computer device to execute the title regeneration method on the computer device.
FIG. 1 is a diagram illustrating an example of a network environment according to an embodiment of the present invention. The network environment of FIG. 1 illustrates an example including a plurality of electronic devices (110, 120, 130, 140), a plurality of servers (150, 160), and a network (170). FIG. 1 is an example for explaining the invention, and the number of electronic devices or servers is not limited to that shown in FIG. 1. Furthermore, the network environment of FIG. 1 is merely one example of the environments applicable to the present embodiments, and the environments applicable to the present embodiments are not limited to the network environment of FIG. 1.
The plurality of electronic devices (110, 120, 130, 140) may be fixed terminals or mobile terminals implemented as computer devices. Examples of the plurality of electronic devices (110, 120, 130, 140) include smartphones, mobile phones, navigation systems, computers, laptops, digital broadcasting terminals, PDAs (Personal Digital Assistants), PMPs (Portable Multimedia Players), tablet PCs, etc. For example, FIG. 1 shows the shape of a smartphone as an example of an electronic device (110), but in embodiments of the present invention, the electronic device (110) may in practice refer to one of various physical computer devices capable of communicating with other electronic devices (120, 130, 140) and/or servers (150, 160) via a network (170) using a wireless or wired communication method.
The communication method is not limited and may include not only communication methods utilizing communication networks (e.g., mobile communication networks, wired internet, wireless internet, broadcasting networks) that the network (170) may include, but also short-range wireless communication between devices. For example, the network (170) may include any one or more networks such as a PAN (personal area network), LAN (local area network), CAN (campus area network), MAN (metropolitan area network), WAN (wide area network), BBN (broadband network), and the Internet. Additionally, the network (170) may include any one or more network topologies such as a bus network, a star network, a ring network, a mesh network, a star-bus network, and a tree or hierarchical network, but is not limited thereto.
Each of the servers (150, 160) may be implem