RU-2861594-C1 - METHOD AND SYSTEM FOR AUTONOMOUS CONDUCTING OF RESEARCH BASED ON AUTOMATED ANALYSIS OF SOCIAL MEDIA AND MASS MEDIA DATA
Abstract
FIELD: computing. SUBSTANCE: carrying out the steps of receiving a user request in natural language and analysing its complexity to determine a research strategy; using a request processing module: extracting key entities and the context of the request; generating an expanded set of search terms, including synonyms and related concepts; converting the request into a plurality of sub-requests for different types of data sources; by means of a data collection module in a streaming mode, collecting data from multiple sources of social media and news publications in accordance with the sub-requests; by means of an analytical core, iteratively: analysing incoming data to determine thematic patterns and sentiment; assessing the sufficiency of the collected information; when data is insufficient, automatically generating additional search queries; formulating and testing hypotheses through targeted additional data collection; generating a structured report with links to supporting sources. EFFECT: increasing the completeness and accuracy of the analysis of social media and mass media, accelerating data processing, reducing the probability of erroneous conclusions and ensuring the scalability of the system for processing large volumes of data. 14 cl, 3 dwg
Inventors
- Ostrovskii Grigorii Igorevich
- Arushanov Khachatur Renuarovich
Dates
- Publication Date
- 20260506
- Application Date
- 20250709
Claims (20)
- 1. A system for automated analysis of data from social networks and mass media (MMA), containing:
- a data collection module configured to connect to multiple data sources, including at least one social network and at least one news source, and configured to retrieve data based on input search queries in a streaming mode with gradual data loading;
- a request processing module associated with the data collection module, configured to:
- • transform a user's natural language query into a set of search instructions for different types of data sources;
- • automatically generate synonyms, related terms and spelling variations for key query entities;
- • dynamically reformulate search queries based on intermediate analysis results;
- an analytical core with artificial intelligence, containing at least one trained neural network for semantic analysis of text data, associated with the data collection module and the request processing module, wherein the analytical core is configured to:
- • iteratively assess the sufficiency of the collected data according to the criteria of saturation of new thematic categories and stabilization of tonal indicators;
- • autonomously plan the sequence of analysis stages with the possibility of dynamically adjusting the plan based on intermediate results;
- • independently generate hypotheses and test them through additional targeted data collection;
- • generate conclusions with links to supporting sources;
- an interface module associated with the analytical core;
- a control and orchestration module capable of autonomously controlling the execution of a request by coordinating the operation of all modules and maintaining a single data exchange protocol;
- a knowledge base and data warehouse designed to store collected data and intermediate results and to log all completed operations to ensure transparency of the analysis process.
- 2. The system according to claim 1, wherein the data collection module is configured to operate in a streaming mode with gradual loading of data.
- 3. The system of claim 1, wherein the analytical core is configured to begin processing without waiting for all data to be collected.
- 4. The system of claim 1, wherein the interface module is configured to provide results to the user in the form of structured reports in natural language with visualizations and citations from original sources.
- 5. The system according to claim 1, in which the interface module is configured to allow the user to ask clarifying questions based on the result obtained.
- 6. The system of claim 1, wherein the analytical core is configured to generate a sequence of analysis stages based on the content of the request, wherein each subsequent stage is initiated automatically depending on the results of the previous stage.
- 7. The system of claim 1, wherein the request processing module is configured for contextual adaptation by automatically adding relevant terms based on a thematic analysis of the query.
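Illustrative sketch (not part of the claims). Claims 1 to 3 describe an analytical core that processes data as it streams in and iteratively assesses sufficiency by the saturation of new thematic categories and the stabilization of tonal indicators. The claims do not specify how these criteria are computed, so the Python sketch below is only one possible reading. The names Message, SufficiencyMonitor and collect_until_sufficient, the thresholds, and the window size are all assumptions made for illustration, and the topic and sentiment values are assumed to be assigned upstream by the semantic analysis model.

```python
from collections import deque
from typing import Iterable, NamedTuple, Sequence, Tuple


class Message(NamedTuple):
    topic: str        # thematic category, assumed to be assigned upstream
    sentiment: float  # tonal score in [-1.0, 1.0], assumed to be assigned upstream


class SufficiencyMonitor:
    """Hypothetical sufficiency check: topic saturation plus sentiment stabilization."""

    def __init__(self, max_new_topic_ratio: float = 0.05,
                 max_sentiment_shift: float = 0.02, window: int = 3) -> None:
        # All three parameters are illustrative assumptions, not values from the patent.
        self.max_new_topic_ratio = max_new_topic_ratio
        self.max_sentiment_shift = max_sentiment_shift
        self.seen_topics: set = set()
        self.count = 0
        self.sentiment_sum = 0.0
        # One (new_topic_ratio, mean_sentiment_shift) pair per recent batch.
        self.history: deque = deque(maxlen=window)

    def update(self, batch: Sequence[Message]) -> bool:
        """Fold one batch into the running statistics and report sufficiency."""
        if not batch:
            return self.is_sufficient()
        prev_mean = self.sentiment_sum / self.count if self.count else None
        batch_topics = {m.topic for m in batch}
        # Share of this batch's topics that were never seen before.
        new_ratio = len(batch_topics - self.seen_topics) / len(batch_topics)
        self.seen_topics |= batch_topics
        self.count += len(batch)
        self.sentiment_sum += sum(m.sentiment for m in batch)
        mean = self.sentiment_sum / self.count
        # The first batch has no previous mean, so it can never count as stable.
        shift = float("inf") if prev_mean is None else abs(mean - prev_mean)
        self.history.append((new_ratio, shift))
        return self.is_sufficient()

    def is_sufficient(self) -> bool:
        """True once every batch in a full window met both thresholds."""
        return len(self.history) == self.history.maxlen and all(
            ratio <= self.max_new_topic_ratio and shift <= self.max_sentiment_shift
            for ratio, shift in self.history
        )


def collect_until_sufficient(stream: Iterable[Sequence[Message]],
                             monitor: SufficiencyMonitor) -> Tuple[int, bool]:
    """Consume batches lazily and stop as soon as the monitor reports sufficiency."""
    consumed = 0
    for batch in stream:
        consumed += 1
        if monitor.update(batch):
            return consumed, True
    return consumed, False
```

Because the stream is consumed lazily, analysis starts on the first batch rather than after collection completes, and batches remaining after the stopping point are never requested. A result of False for the second element corresponds to the case in which, according to the abstract, additional search queries would be generated.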
Description
TECHNICAL FIELD
This group of inventions relates to computer technologies used in social media monitoring and analytics, where large volumes of user messages, comments, news articles, and other content require intelligent processing. The group of inventions is designed for automated analysis of social media and mass media content.
BACKGROUND
Traditional social media analysis involves collecting mentions, calculating metrics (likes, shares, etc.), and manually analyzing trends. Artificial intelligence (AI) takes this task to a new level: using machine learning and natural language processing, it not only collects statistics but also uncovers hidden trends and insights. Unlike superficial monitoring, such as counting mentions, this approach captures the essence of discussions and audience sentiment. Modern models, including generative AI, can quickly process a stream of data and detect key themes, sentiment, and relationships, providing a more complete picture of opinions across different communities.
Automated social media and news analysis solutions are developing rapidly, with both products from major AI labs and specialized platforms already available on the market.
OpenAI Deep Research is a tool from OpenAI designed for deep, multi-step internet research. According to OpenAI, Deep Research can reduce analysis tasks that previously took weeks to hours through a combination of automation and advanced reasoning engines. Unlike standard chatbots, it does not limit itself to a single response: the system breaks complex queries into sequential steps, actively searches online for information, repeatedly analyzes the documents found, and synthesizes them into a detailed report, displaying its reasoning process. However, this service is currently in limited release and is available for a fee of approximately $200 per month with a limit of 100 queries.
Furthermore, Deep Research is designed primarily for English-language sources and focuses on general internet research; it does not specialize in social media or local media data.
Google Gemini Deep Research is an agent integrated into the Google Gemini ecosystem. Google markets it as a feature that gives the bot greater autonomy in performing tasks. The user asks a question in natural language, after which the agent generates a multi-step research plan, presents it for user approval, and then autonomously conducts a deep analysis of web content. Google Deep Research is integrated with the Gemini subscription and is already available globally, but currently primarily for English-language queries. Because the user reviews the plan before it is executed, the risk of the agent misinterpreting the query is reduced. Gemini Deep Research covers a wide range of topics but is not specifically designed for social media analysis; it relies on publicly available web content. Therefore, in the social media sphere, it may face limited coverage, as closed communities requiring API access and local platforms are less accessible.
Brand Analytics (BrandGPT) is a specialized social media monitoring platform popular in Russia and the CIS, which recently received its own AI module, BrandGPT. From the outset, Brand Analytics has provided the ability to collect data from social networks and online media and to analyze it in a structured way. The system covers key sources: major social networks, blogs, forums, review sites, Telegram channels, and news sites. Traditionally, the platform provides metrics and visualization tools: it automatically detects the language and sentiment of messages, calculates reach, highlights the most popular posts, creates trend charts, and allows for filtering and categorization of mentions. However, the interpretation of results and the drawing of conclusions remain the responsibility of the user.
BrandGPT is a built-in virtual assistant designed to speed up data analysis for the user. BrandGPT is trained on Brand Analytics' own data and is directly connected to the accumulated message database. The model specializes in social media, understands industry terminology, and supports each summary with examples of specific messages so that its conclusions are transparent. However, there are limitations: BrandGPT does not operate completely autonomously, but rather assists the user. Furthermore, BrandGPT currently supports only the analysis of Russian-language messages.
Other social monitoring tools with AI elements exist on the global market, such as Brandwatch, Sprout Social, Meltwater, and Talkwalker. Their common concept is to collect brand mentions and perform sentiment analysis using big data. Modern versions of these products use AI to identify sentiment, trending topics, and engagement metrics. However, they typically focus more on dashboards and statistics than on generating standalone analytical reports. Users must still interpret the graphs and derive insights themselves. Furthermore, the depth of understanding in traditional systems is limited by a set of rules or train