
CN-121979977-A - Multi-mode session data analysis method and system based on time axis fusion analysis

CN121979977A

Abstract

The invention relates to the technical field of session data analysis and discloses a multi-modal session data analysis method and system based on time-axis fusion analysis. Multi-modal session data are obtained, where each modality's session data has a corresponding modal session time axis. The method comprises: calibrating all modal session time axes according to a preset anti-interference fitting algorithm to generate a unified wall-clock time axis for the multi-modal session data; identifying key information nodes in the multi-modal session data based on the unified wall-clock time axis; calculating the emotion vector corresponding to each moment in the multi-modal session data to obtain an emotion intensity curve of the multi-modal session data; and generating an interactive emotion map of the multi-modal session data according to all the key information nodes and the emotion intensity curve. Implementing the invention can therefore improve the traceability of analysis results while improving the accuracy and objectivity of multi-modal session data analysis.

Inventors

  • LI WEI

Assignees

  • 广州悦数信息科技有限公司 (Guangzhou Yueshu Information Technology Co., Ltd.)

Dates

Publication Date
2026-05-05
Application Date
2025-12-29

Claims (10)

  1. A multi-modal session data analysis method based on time-axis fusion analysis, the method comprising: acquiring multi-modal session data, wherein each modality's session data in the multi-modal session data has a corresponding modal session time axis; calibrating all modal session time axes according to a preset anti-interference fitting algorithm, and generating a unified wall-clock time axis for the multi-modal session data; identifying, based on the unified wall-clock time axis, key information nodes in the multi-modal session data, wherein each key information node has a corresponding characteristic identifier; calculating, based on the unified wall-clock time axis, the emotion vector corresponding to each moment in the multi-modal session data to obtain an emotion intensity curve of the multi-modal session data; and generating an interactive emotion map of the multi-modal session data according to all the key information nodes and the emotion intensity curve, wherein the interactive emotion map comprises the emotion change trend and the key-information distribution of the multi-modal session data.
  2. The method for analyzing multi-modal session data based on time-axis fusion analysis according to claim 1, wherein calibrating all the modal session time axes according to a preset anti-interference fitting algorithm and generating a unified wall-clock time axis of the multi-modal session data comprises: for each modality's session data, extracting an event anchor point set of that modality's session data; matching the event anchor point sets of all modalities to obtain sets of associated event anchor point pairs between different modalities' session data; generating a global mapping relation between each modal session time axis in the multi-modal session data and a standard reference wall-clock time axis according to the associated event anchor point pair sets and the preset anti-interference fitting algorithm; and generating the unified wall-clock time axis of the multi-modal session data according to the global mapping relations.
  3. The method for multi-modal session data analysis based on time-axis fusion analysis of claim 2, wherein before generating the unified wall-clock time axis of the multi-modal session data according to the global mapping relation, the method further comprises: calculating a fitting residual for each associated event anchor point pair in the associated event anchor point pair set according to the global mapping relation; analyzing the distribution characteristics of the fitting residuals; judging whether a characteristic mutation target point exists in the fitting-residual distribution characteristics; if the characteristic mutation target point exists, determining a calibration reference modal session time axis among all the modal session time axes according to the characteristic mutation target point, dividing the calibration reference modal session time axis into a corresponding number of time periods according to the characteristic mutation point, and, for each time period, generating a local mapping relation corresponding to that time period based on all the associated event anchor points contained in the time period and the preset anti-interference fitting algorithm, then generating the unified wall-clock time axis of the multi-modal session data according to all the local mapping relations; and if the characteristic mutation target point does not exist, triggering execution of the operation of generating the unified wall-clock time axis of the multi-modal session data according to the global mapping relation.
  4. The multi-modal session data analysis method based on time-axis fusion analysis according to any one of claims 1-3, wherein identifying key information nodes in the multi-modal session data based on the unified wall-clock time axis includes: extracting a semantic clause sequence from the multi-modal session data based on the unified wall-clock time axis; extracting the text content characteristics and context association characteristics of each semantic clause in the semantic clause sequence; and matching key information nodes for the semantic clauses based on the text content characteristics and the context association characteristics.
  5. The method for analyzing multi-modal session data based on time-axis fusion analysis according to claim 4, wherein calculating, based on the unified wall-clock time axis, the emotion vector corresponding to each moment in the multi-modal session data to obtain an emotion intensity curve of the multi-modal session data includes: for each semantic clause, calculating an emotion vector of the semantic clause, wherein the emotion vector represents the intensity distribution of the corresponding semantic clause across different emotion dimensions; mapping the emotion vector of each semantic clause to the moment corresponding to that semantic clause on the unified wall-clock time axis to generate a discrete emotion vector sequence; and performing temporal smoothing on the discrete emotion vector sequence to generate the emotion intensity curve of the multi-modal session data.
  6. The method for analyzing multi-modal session data based on time-axis fusion analysis according to claim 4, wherein generating the interactive emotion map of the multi-modal session data according to all the key information nodes and the emotion intensity curve includes: for each key information node, matching, on the unified wall-clock time axis, the key information node with the emotion vector of the emotion intensity curve at the same moment; calculating, based on the context association characteristics of the key information node, a context consistency metric for the key information node, wherein the context consistency metric represents the degree of topical consistency between the semantic clause corresponding to the key information node and adjacent semantic clauses; calculating a fusion priority value for the key information node based on the characteristic identifier of the key information node, the matched emotion vector, and the context consistency metric, wherein the fusion priority value represents the importance of the corresponding key information node in the multi-modal session data; generating, on the time axis of the emotion intensity curve, corresponding interactable marks according to the fusion priority values of all the key information nodes, wherein each interactable mark is associated with the characteristic identifier of the corresponding key information node and serves as an interaction interface for triggering access to that key information node; and integrating the emotion intensity curve with the interactable marks associated with the characteristic identifiers of all the key information nodes to generate the interactive emotion map of the multi-modal session data.
  7. A multi-modal session data analysis system based on time-axis fusion analysis, the system comprising: an acquisition module, configured to acquire multi-modal session data, wherein each modality's session data in the multi-modal session data has a corresponding modal session time axis; a calibration module, configured to calibrate all the modal session time axes according to a preset anti-interference fitting algorithm and generate a unified wall-clock time axis of the multi-modal session data; an identification module, configured to identify key information nodes in the multi-modal session data based on the unified wall-clock time axis, wherein each key information node has a corresponding characteristic identifier; a computing module, configured to compute, based on the unified wall-clock time axis, the emotion vector corresponding to each moment in the multi-modal session data to obtain an emotion intensity curve of the multi-modal session data; and a generation module, configured to generate an interactive emotion map of the multi-modal session data according to all the key information nodes and the emotion intensity curve, wherein the interactive emotion map comprises the emotion change trend and the key-information distribution of the multi-modal session data.
  8. The system for analyzing multi-modal session data based on time-axis fusion analysis of claim 7, wherein the specific manner in which the calibration module calibrates all the modal session time axes according to a preset anti-interference fitting algorithm and generates a unified wall-clock time axis of the multi-modal session data comprises: for each modality's session data, extracting an event anchor point set of that modality's session data; matching the event anchor point sets of all modalities to obtain sets of associated event anchor point pairs between different modalities' session data; generating a global mapping relation between each modal session time axis in the multi-modal session data and a standard reference wall-clock time axis according to the associated event anchor point pair sets and the preset anti-interference fitting algorithm; and generating the unified wall-clock time axis of the multi-modal session data according to the global mapping relations.
  9. A multi-modal session data analysis system based on time-axis fusion analysis, the system comprising: a memory storing executable program code; and a processor coupled to the memory, wherein the processor invokes the executable program code stored in the memory to perform the multi-modal session data analysis method based on time-axis fusion analysis according to any one of claims 1-6.
  10. A computer storage medium storing computer instructions which, when invoked, are operable to perform the multi-modal session data analysis method based on time-axis fusion analysis according to any one of claims 1-6.
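The mapping-and-smoothing step of claim 5 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `Clause` type, the choice of two emotion dimensions, and the centered moving-average smoother are all assumptions made for the example (the patent only specifies "temporal smoothing").

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Clause:
    t: float                       # clause moment on the unified wall-clock axis (seconds)
    emotion: Tuple[float, float]   # hypothetical (valence, arousal) intensity vector

def emotion_curve(clauses: List[Clause], window: int = 3) -> List[Tuple[float, float, float]]:
    """Map per-clause emotion vectors onto the unified time axis (a discrete
    emotion vector sequence) and smooth with a centered moving average."""
    seq = sorted(clauses, key=lambda c: c.t)
    half = window // 2
    curve = []
    for i, c in enumerate(seq):
        nbrs = seq[max(0, i - half): i + half + 1]   # clipped window at the ends
        v = sum(n.emotion[0] for n in nbrs) / len(nbrs)
        a = sum(n.emotion[1] for n in nbrs) / len(nbrs)
        curve.append((c.t, v, a))
    return curve
```

Each output point carries the smoothed intensity per emotion dimension at a clause's moment; a real system would likely resample onto a regular grid before plotting the curve.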

Description

Multi-mode session data analysis method and system based on time axis fusion analysis

Technical Field

The invention relates to the technical field of session data analysis, and in particular to a multi-modal session data analysis method and system based on time-axis fusion analysis.

Background

In current session data analysis, particularly in user interviews, market research, and customer-service quality inspection, it is common to rely on analysts to manually process session records. The traditional approach focuses on qualitative interpretation of text content: the text is obtained by speech transcription, and an analyst reads the transcripts one by one, qualitatively extracting key information based on personal experience and subjectively describing the emotional fluctuations over the course of the conversation. In practice, this approach has several notable defects. First, manual annotation is inefficient and cannot cope with large volumes of data. Second, different analysts apply different subjective standards, so the definition of key information is fuzzy, analysis results have poor consistency and repeatability, and results cannot be compared effectively across projects. Third, emotion analysis mostly consists of scattered qualitative descriptions and lacks quantitative expression on a continuous time axis, making it difficult to accurately track the trajectory and amplitude of emotional change and their dynamic correlation with key events. A technical scheme that improves the traceability of analysis results while improving the accuracy and objectivity of multi-modal session data analysis is therefore particularly important.
Disclosure of Invention

The invention provides a multi-modal session data analysis method and system based on time-axis fusion analysis, which can improve the traceability of analysis results while improving the accuracy and objectivity of multi-modal session data analysis. To solve the above technical problem, a first aspect of the invention discloses a multi-modal session data analysis method based on time-axis fusion analysis, the method comprising: acquiring multi-modal session data, wherein each modality's session data in the multi-modal session data has a corresponding modal session time axis; calibrating all modal session time axes according to a preset anti-interference fitting algorithm, and generating a unified wall-clock time axis for the multi-modal session data; identifying, based on the unified wall-clock time axis, key information nodes in the multi-modal session data, wherein each key information node has a corresponding characteristic identifier; calculating, based on the unified wall-clock time axis, the emotion vector corresponding to each moment in the multi-modal session data to obtain an emotion intensity curve of the multi-modal session data; and generating an interactive emotion map of the multi-modal session data according to all the key information nodes and the emotion intensity curve, wherein the interactive emotion map comprises the emotion change trend and the key-information distribution of the multi-modal session data.
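The calibration step above maps each modality's local timestamps onto a standard reference wall-clock axis from matched anchor pairs. The patent does not name its "anti-interference fitting algorithm"; as one plausible outlier-robust choice, the sketch below fits a linear mapping wall = a*modal + b with a Theil-Sen style median-of-slopes estimator, which tolerates a minority of mismatched anchor pairs. The function name and the linear (offset plus drift) model are assumptions.

```python
from itertools import combinations
from statistics import median

def robust_time_mapping(anchors):
    """Fit wall ~ a*modal + b from (modal_time, wall_time) anchor pairs.

    Theil-Sen estimation: the slope is the median over all pairwise slopes,
    so a few bad ("interfering") anchor pairs do not pull the fit, unlike
    ordinary least squares. Returns the global mapping coefficients (a, b).
    """
    slopes = [(w2 - w1) / (m2 - m1)
              for (m1, w1), (m2, w2) in combinations(anchors, 2)
              if m2 != m1]
    a = median(slopes)
    b = median(w - a * m for m, w in anchors)   # robust intercept
    return a, b
```

With the mapping in hand, every event timestamp in that modality can be projected as `a * t + b` onto the unified wall-clock axis.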
As an optional implementation, in the first aspect of the invention, calibrating all the modal session time axes according to a preset anti-interference fitting algorithm and generating the unified wall-clock time axis of the multi-modal session data includes: for each modality's session data, extracting an event anchor point set of that modality's session data; matching the event anchor point sets of all modalities to obtain sets of associated event anchor point pairs between different modalities' session data; generating a global mapping relation between each modal session time axis in the multi-modal session data and a standard reference wall-clock time axis according to the associated event anchor point pair sets and the preset anti-interference fitting algorithm; and generating the unified wall-clock time axis of the multi-modal session data according to the global mapping relations. As an optional implementation, in the first aspect of the invention, before generating the unified wall-clock time axis of the multi-modal session data according to the global mapping relation, the method further includes: calculating a fitting residual for each associated event anchor point pair in the associated event anchor point pair set according to the global mapping relation; analyzing the distribution characteristics of the fitting residuals; judging whether a characteristic mutation target point exists in the fitting-residual distribution characteristics; and, if the characteristic mutation target point exists, determining a calibration reference modal session time axis among all the modal session time axes accord
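The residual-based "characteristic mutation" check can be illustrated with a toy changepoint test. Anchors are scored against the global mapping; if consecutive residuals jump abruptly (e.g. a clock was reset mid-session), the anchors are split there so a local mapping can be fit per segment. The simple jump threshold stands in for the patent's unspecified distribution analysis, and the function name and threshold value are assumptions.

```python
def split_on_residual_jump(anchors, a, b, jump=5.0):
    """Compute fitting residuals under the global mapping wall ~ a*modal + b
    and look for an abrupt jump between consecutive residuals.

    Returns (left_segment, right_segment) at the first detected mutation
    point, or None when no mutation exists and the global mapping suffices.
    """
    pts = sorted(anchors)
    resid = [w - (a * m + b) for m, w in pts]   # per-anchor fitting residuals
    for i in range(1, len(resid)):
        if abs(resid[i] - resid[i - 1]) > jump:  # characteristic mutation found
            return pts[:i], pts[i:]
    return None
```

Each returned segment would then be fed back through the robust fit to produce its local mapping relation, and the unified axis assembled piecewise from those local mappings.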