CN-115994273-B - Community message classification method, device, equipment and storage medium
Abstract
The invention discloses a community message classification method, device, equipment and storage medium. The method comprises: obtaining at least one message to be corrected, wherein each message to be corrected comprises a pre-correction message category and a post-correction message category, the pre-correction message category is output by a community message classification model, and the community message classification model is trained on historical messages in a historical message data set; selecting target correction messages from the at least one message to be corrected by comparing the pre-correction and post-correction message categories across the messages to be corrected; generating at least one candidate similar category group; updating the historical message data set according to whether the target correction messages in the candidate similar category group are consistent with the historical messages in the historical message data set; and training the community message classification model on the updated historical message data set, so that the updated model is used for message classification. Embodiments of the invention improve the classification accuracy of community messages.
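The selection step can be illustrated with a minimal Python sketch. It assumes one reading of the claims: messages to be corrected are grouped by their (pre-correction, post-correction) category pair, and a group is kept when it holds at least a preset number of messages. The names `CorrectedMessage`, `select_target_corrections` and `min_group_size` are hypothetical, and "at least N messages" is an assumed form of the threshold condition, which the patent text does not spell out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CorrectedMessage:
    text: str
    pre_category: str   # category output by the classification model
    post_category: str  # category assigned by the correction


def select_target_corrections(messages, min_group_size=2):
    """Keep messages whose (pre, post) category pair occurs often enough.

    min_group_size is a hypothetical stand-in for the preset message
    quantity threshold; "at least N" is an assumed reading.
    """
    groups = defaultdict(list)
    for msg in messages:
        # Messages sharing both categories form one candidate data set.
        groups[(msg.pre_category, msg.post_category)].append(msg)

    targets = []
    for members in groups.values():
        if len(members) >= min_group_size:
            targets.extend(members)
    return targets
```

Under this reading, a correction that occurs only once is treated as too weak a signal to change the training data, while repeated corrections of the same kind are passed on to the grouping step.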
Inventors
- LI HONGYONG
- HUANG GUAN
- CHEN MAI
Assignees
- 深圳十方融海科技有限公司
Dates
- Publication Date: 20260505
- Application Date: 20230131
Claims (8)
- 1. A community message classification method, comprising: obtaining at least one message to be corrected, wherein the message to be corrected comprises a pre-correction message category and a post-correction message category, the pre-correction message category is obtained from the output of a community message classification model, and the community message classification model is obtained by training on historical messages in a historical message data set; selecting a target correction message from the at least one message to be corrected according to a comparison of the pre-correction message categories and the post-correction message categories among the messages to be corrected; generating at least one candidate similar category group, wherein the target correction messages in the same candidate similar category group are close to one another in message similarity; updating the historical message data set according to whether the target correction messages in the candidate similar category group are consistent with the historical messages in the historical message data set; and training the community message classification model with the updated historical message data set, so as to update the community message classification model used for message classification; wherein selecting the target correction message from the at least one message to be corrected according to the comparison of the pre-correction message categories and the post-correction message categories among the messages to be corrected comprises: combining the messages to be corrected that share the same pre-correction message category and the same post-correction message category, to obtain at least one candidate message data set; determining a candidate message data set that meets a second preset message quantity threshold condition as a target message data set; and determining the messages to be corrected in the target message data set as target correction messages; wherein updating the historical message data set according to whether the target correction messages in the candidate similar category group are consistent with the historical messages in the historical message data set comprises: if the candidate similar category group contains a target correction message that is the same as a historical message in the historical message data set, updating the message category of the corresponding historical message in the historical message data set with the message category of that target correction message; and if the candidate similar category group contains a target correction message that differs from the historical messages in the historical message data set, adding that target correction message and its corresponding message category to the historical message data set as historical message data.
- 2. The method of claim 1, further comprising, after generating the at least one candidate similar category group: determining, among the candidate similar category groups, a candidate similar category group that meets a first preset message quantity threshold condition as a target similar category group; determining, in the historical message data set, at least one historical message to be corrected whose message category is the same as the pre-correction message category of the target correction messages in the target similar category group; determining a target historical message according to the similarity between the target correction messages in the target similar category group and each historical message to be corrected; and updating the message category of the target historical message with the post-correction message category of the target correction messages, so as to update the historical message data set.
- 3. The method of claim 2, wherein determining the target historical message according to the similarity between the target correction messages in the target similar category group and each historical message to be corrected comprises: determining a message vector value corresponding to each target correction message in the target similar category group; determining a target average vector value corresponding to the target similar category group according to the message vector values; determining a historical vector value corresponding to each historical message to be corrected; and determining the target historical message according to the target average vector value and each historical vector value.
- 4. The method of claim 1, wherein generating the at least one candidate similar category group comprises: vectorizing each target correction message to obtain a target vector value corresponding to each target correction message; and determining the similarity between the target correction messages according to their target vector values, and generating the at least one candidate similar category group according to the similarity between the target correction messages.
- 5. The method of claim 4, wherein determining the similarity between the target correction messages according to their target vector values and generating the at least one candidate similar category group according to that similarity comprises: selecting any two of the target correction messages as a first message and a second message; determining a first target similarity between the first message and the second message according to a first vector value of the first message and a second vector value of the second message; if the first target similarity meets a preset similarity threshold condition, placing the first message and the second message in the same candidate similar category group, and determining a first average vector value of the first vector value and the second vector value; traversing the other target correction messages in sequence, and determining a second target similarity among the first message, the second message and a traversed third message according to the first average vector value and a third vector value of the third message, wherein the other target correction messages are the target correction messages other than the first message and the second message; and if the second target similarity meets the preset similarity threshold condition, adding the third message to the candidate similar category group containing the first message and the second message, until the traversal is finished, to obtain at least one candidate similar category group of messages with similar categories.
- 6. A community message classification apparatus, comprising: a to-be-corrected message acquisition module, configured to obtain at least one message to be corrected, wherein the message to be corrected comprises a pre-correction message category and a post-correction message category; a target correction message selecting module, configured to select a target correction message from the at least one message to be corrected according to a comparison of the pre-correction message categories and the post-correction message categories among the messages to be corrected; a category group generation module, configured to generate at least one candidate similar category group, wherein the target correction messages in the same candidate similar category group are close to one another in message similarity; a data set updating module, configured to update the historical message data set according to whether the target correction messages in the candidate similar category group are consistent with the historical messages in the historical message data set; and a message classification module, configured to train the community message classification model with the updated historical message data set, so as to update the community message classification model used for message classification; wherein the target correction message selecting module comprises: a candidate message data set determining unit, configured to combine the messages to be corrected that share the same pre-correction message category and the same post-correction message category, to obtain at least one candidate message data set; a target message data set determining unit, configured to determine, as a target message data set, a candidate message data set that meets a second preset message quantity threshold condition; and a target correction message determining unit, configured to determine the messages to be corrected in the target message data set as target correction messages; wherein the data set updating module comprises: a message category updating unit, configured to, if the candidate similar category group contains a target correction message that is the same as a historical message in the historical message data set, update the message category of the corresponding historical message in the historical message data set with the message category of that target correction message; and a message adding unit, configured to, if the candidate similar category group contains a target correction message that differs from the historical messages in the historical message data set, add that target correction message and its corresponding message category to the historical message data set as historical message data.
- 7. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor, so that the at least one processor can perform the community message classification method of any one of claims 1-5.
- 8. A computer-readable storage medium storing computer instructions that, when executed, cause a processor to implement the community message classification method of any one of claims 1-5.
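The grouping procedure of claims 4 and 5 can be sketched in Python as a greedy pass over message vectors. This is a minimal illustration under stated assumptions, not the patented implementation: cosine similarity, the 0.9 threshold, and recomputing the group average after every added member are all assumptions, since the claims only require a "preset similarity threshold condition" and an average vector value. The function names are hypothetical, and the vectorization step is left out.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Element-wise average of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def group_similar(vectors, threshold=0.9):
    """Return lists of indices; each list is one candidate similar group.

    A message joins the first group whose average vector is similar
    enough; otherwise it starts a new group of its own.
    """
    groups = []  # each entry: {"members": [indices], "mean": [floats]}
    for index, vector in enumerate(vectors):
        for group in groups:
            if cosine_similarity(group["mean"], vector) >= threshold:
                group["members"].append(index)
                # Assumption: refresh the average after each new member.
                group["mean"] = mean_vector(
                    [vectors[i] for i in group["members"]]
                )
                break
        else:
            groups.append({"members": [index], "mean": list(vector)})
    return [group["members"] for group in groups]
```

For a group with a single member the average equals that member's vector, so the first comparison reduces to the pairwise similarity between the first and second messages, as in claim 5. Groups that stay at one member could then be filtered out by a message quantity threshold such as the one in claim 2.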
Description
Community message classification method, device, equipment and storage medium
Technical Field
The present invention relates to the field of data processing, and in particular to a community message classification method, apparatus, device, and storage medium.
Background
In community management, a community staff member is required to follow up on the messages of a whole community of users at the same time. When many users are served per unit time and the message volume is large, user messages need to be labeled and then classified and displayed by label, so that community staff can understand users more quickly and reply or market to them in a targeted way.
Existing community message classification approaches include manual labeling, which is inefficient and has high labor costs. They also include automatic classification by a pre-trained message classification model; however, as the number and variety of messages grow, the classification capability of the model degrades noticeably, and considerable labor must be invested to maintain and update the model in order to preserve its classification accuracy.
Disclosure of Invention
The invention provides a community message classification method, device, equipment and storage medium, which enable automatic maintenance of the classification model used to classify community messages, reduce labor costs, and improve the model's classification accuracy on community messages.
According to an aspect of the present invention, there is provided a community message classification method, the method comprising: obtaining at least one message to be corrected, wherein the message to be corrected comprises a pre-correction message category and a post-correction message category, the pre-correction message category is obtained from the output of a community message classification model, and the community message classification model is obtained by training on historical messages in a historical message data set; selecting a target correction message from the at least one message to be corrected according to a comparison of the pre-correction message categories and the post-correction message categories among the messages to be corrected; generating at least one candidate similar category group, wherein the target correction messages in the same candidate similar category group are close to one another in message similarity; updating the historical message data set according to whether the target correction messages in the candidate similar category group are consistent with the historical messages in the historical message data set; and training the community message classification model with the updated historical message data set, so as to update the community message classification model used for message classification.
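The data set update step can be sketched as follows. This minimal Python illustration assumes that the historical message data set is a mapping from message text to category and that "the same message" means an exact text match; the patent text fixes neither. The function name `update_history` is hypothetical, and retraining the model on the result is not shown.

```python
def update_history(history, group):
    """Apply one candidate similar category group to the historical data.

    history: dict mapping message text -> category (assumed layout).
    group:   iterable of (text, corrected_category) pairs.
    Returns the new data set plus the relabeled and the added texts.
    """
    updated = dict(history)  # leave the caller's data set untouched
    relabeled, added = [], []
    for text, corrected_category in group:
        if text in updated:
            # Same message already in history: overwrite its category.
            updated[text] = corrected_category
            relabeled.append(text)
        else:
            # Unseen message: add it with its corrected category.
            updated[text] = corrected_category
            added.append(text)
    return updated, relabeled, added
```

Returning a copy rather than mutating the input keeps the previous data set available, which can help when comparing a retrained model against the one it replaces.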
According to another aspect of the present invention, there is provided a community message classification apparatus, comprising: a to-be-corrected message acquisition module, configured to obtain at least one message to be corrected, wherein the message to be corrected comprises a pre-correction message category and a post-correction message category; a target correction message selecting module, configured to select a target correction message from the at least one message to be corrected according to a comparison of the pre-correction message categories and the post-correction message categories among the messages to be corrected; a category group generation module, configured to generate at least one candidate similar category group, wherein the target correction messages in the same candidate similar category group are close to one another in message similarity; a data set updating module, configured to update the historical message data set according to whether the target correction messages in the candidate similar category group are consistent with the historical messages in the historical message data set; and a message classification module, configured to train the community message classification model with the updated historical message data set, so as to update the community message classification model used for message classification.
According to another aspect of the present invention, there is provided an electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the community message classifi