CN-121984797-A - Conference summary generation system, method, device and equipment
Abstract
The application discloses a conference summary generation system, method, device and equipment. Smart glasses collect a conference voice data stream and send it to a client, which forwards it to a server. The server converts the voice data stream into a text stream and returns it to the client, which displays it. Meanwhile, the smart glasses receive a shooting instruction from the user, capture image data of a conference segment and send it to the client, which displays the image data at the corresponding position in the text stream. After receiving a conference summary that the server generates from the transcribed text, the client acquires the target conference summary item corresponding to the image shooting time, inserts the image data into the conference summary as the image data for that item, and displays the conference summary. This approach keeps voice-based conference summaries lightweight while solving the problem that such summaries are hard to understand later because they lack pictures.
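As an illustration of displaying image data at the corresponding text stream position, the following is a minimal sketch rather than the applicant's implementation. It assumes that each displayed text segment carries a start time on the same clock as the image shooting time; the names `TextSegment` and `attach_image` and the sample values are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextSegment:
    start: float  # assumed: seconds since the meeting began
    text: str
    images: List[str] = field(default_factory=list)


def attach_image(segments: List[TextSegment], shot_time: float, image_ref: str) -> int:
    """Attach image_ref to the segment being spoken at shot_time; return its index."""
    if not segments:
        raise ValueError("no text segments received yet")
    starts = [s.start for s in segments]  # segments are assumed sorted by start time
    # Last segment whose start is <= shot_time; clamp to 0 for shots before the first segment.
    idx = max(bisect_right(starts, shot_time) - 1, 0)
    segments[idx].images.append(image_ref)
    return idx


segments = [
    TextSegment(0.0, "Opening remarks"),
    TextSegment(42.5, "Q3 roadmap"),
    TextSegment(90.0, "Action items"),
]
attach_image(segments, 60.0, "photo_001.jpg")  # lands on the "Q3 roadmap" segment
```

Because only the photos or clips the user chooses to shoot are transmitted, the lookup stays a cheap timestamp search instead of a scan over a full meeting video.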
Inventors
- TANG XIAOJUN
- CHEN JIANPING
- CHEN LUJUN
Assignees
- 上海千问智联人工智能科技有限责任公司
Dates
- Publication Date
- 20260505
- Application Date
- 20251230
Claims (14)
- 1. A conference summary generation system, comprising: smart glasses, configured to collect a conference voice data stream, send the voice data stream to a client, receive a shooting instruction, capture image data of a conference segment, and send the image data to the client; the client, configured to receive the voice data stream and send it to a server; receive a text stream corresponding to the voice data stream from the server and display the text stream; receive the image data and display it at the corresponding text position according to the image shooting time; receive a conference summary provided by the server; acquire a target conference summary item corresponding to the image shooting time; insert the image data into the conference summary as the image data corresponding to the target conference summary item; and display the conference summary; and the server, configured to receive the voice data stream, convert it into a text stream and provide the text stream to the client, and generate a conference summary from the transcribed text and provide it to the client.
- 2. A conference summary generation system, comprising: smart glasses, configured to collect conference voice data, receive a shooting instruction, capture image data of a conference segment, record the image shooting time, and send the voice data, the image data and the image shooting time to a client; the client, configured to receive the voice data, the image data and the image shooting time, send the voice data to a server, receive a conference summary provided by the server, acquire a target conference summary item corresponding to the image shooting time, and insert the image data into the conference summary as the image data corresponding to the target conference summary item; and the server, configured to receive the voice data, convert the voice data into text, generate a conference summary from the transcribed text, and provide the conference summary to the client.
- 3. A conference summary generation system, comprising: smart glasses, configured to collect conference voice data, receive a shooting instruction, capture image data of a conference segment, record the image shooting time, and send the voice data, the image data and the image shooting time to a server; and the server, configured to receive the voice data, the image data and the image shooting time, convert the voice data into text, generate a conference summary from the text, acquire a target conference summary item corresponding to the image shooting time, and insert the image data into the conference summary as the image data corresponding to the target conference summary item.
- 4. A method for generating a conference summary, comprising: collecting a conference voice data stream and sending the voice data stream to a client; receiving a shooting instruction; and capturing image data of a conference segment and sending the image data to the client.
- 5. The method of claim 4, wherein the shooting instruction comprises a shooting start instruction and a shooting end instruction; and the receiving a shooting instruction and capturing image data of a conference segment comprises: acquiring the shooting start instruction and capturing video data of the conference segment; and acquiring the shooting end instruction and stopping the capture of video data.
- 6. The method of claim 5, wherein the shooting start instruction comes from a video start-shooting operation key of the smart glasses, and the shooting end instruction comes from a video end-shooting operation key of the smart glasses.
- 7. The method of claim 4, wherein the shooting instruction comprises a shooting start instruction; and the receiving a shooting instruction and capturing image data of a conference segment comprises: acquiring the shooting start instruction and capturing photo data of the conference segment.
- 8. The method of claim 7, wherein the shooting start instruction comes from a photo shooting operation key of the smart glasses.
- 9. A method for generating a conference summary, comprising: receiving a conference voice data stream sent by smart glasses and sending the voice data stream to a server; receiving a text stream corresponding to the voice data stream from the server and displaying the text stream; receiving image data of a conference segment sent by the smart glasses and displaying the image data at the corresponding text position according to the image shooting time; receiving a conference summary provided by the server; acquiring a target conference summary item corresponding to the image shooting time; inserting the image data into the conference summary as the image data corresponding to the target conference summary item; and displaying the conference summary.
- 10. A method for generating a conference summary, comprising: collecting conference voice data; receiving a shooting instruction, capturing image data of a conference segment, and recording the image shooting time; and sending the voice data, the image data and the image shooting time.
- 11. A method for generating a conference summary, comprising: receiving conference voice data, image data of a conference segment and an image shooting time sent by smart glasses; sending the voice data to a server and receiving a conference summary provided by the server; acquiring a target conference summary item corresponding to the image shooting time; inserting the image data into the conference summary as the image data corresponding to the target conference summary item; and displaying the conference summary.
- 12. A method for generating a conference summary, comprising: receiving conference voice data, image data of a conference segment and an image shooting time; converting the voice data into text; generating a conference summary from the text; acquiring a target conference summary item corresponding to the image shooting time; and inserting the image data into the conference summary as the image data corresponding to the target conference summary item.
- 13. A conference summary generation apparatus, comprising: a voice transmission unit, configured to receive a conference voice data stream sent by smart glasses and send the voice data stream to a server; a text display unit, configured to receive a text stream corresponding to the voice data stream from the server and display the text stream; an image processing unit, configured to receive image data of a conference segment sent by the smart glasses and display the image data at the corresponding text position according to the image shooting time; and a conference summary processing unit, configured to receive a conference summary provided by the server, acquire a target conference summary item corresponding to the image shooting time, insert the image data into the conference summary as the image data corresponding to the target conference summary item, and display the conference summary.
- 14. An electronic device, comprising: a processor; and a memory for storing a program implementing the method according to any one of claims 4 to 12, wherein, after the device is powered on, the processor runs the program to perform the method.
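The step shared by claims 9, 11 and 12, acquiring the target conference summary item for an image shooting time and inserting the image into it, can be sketched as follows. The claims do not say how an item is matched to a time, so this sketch assumes that each summary item records the time span of the transcript it was generated from, and it falls back to the nearest span when no span contains the shooting time. `SummaryItem`, `find_target_item` and `insert_images` are hypothetical names, not taken from the application.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SummaryItem:
    title: str
    start: float  # assumed: start of the transcript span behind this item, in seconds
    end: float    # assumed: end of that span (exclusive)
    images: List[str] = field(default_factory=list)


def find_target_item(items: List[SummaryItem], shot_time: float) -> Optional[SummaryItem]:
    """Return the item whose span contains shot_time, else the item with the nearest span."""
    for item in items:
        if item.start <= shot_time < item.end:
            return item
    if not items:
        return None
    # Fallback (an assumption): pick the item whose span boundary is closest in time.
    return min(items, key=lambda it: min(abs(shot_time - it.start), abs(shot_time - it.end)))


def insert_images(items: List[SummaryItem], shots: List[Tuple[float, str]]) -> List[SummaryItem]:
    """Insert each (shot_time, image_ref) pair into its target summary item."""
    for shot_time, image_ref in shots:
        target = find_target_item(items, shot_time)
        if target is not None:
            target.images.append(image_ref)
    return items


summary = [SummaryItem("Budget review", 0, 300), SummaryItem("Roadmap", 300, 900)]
insert_images(summary, [(120, "slide_a.jpg"), (450, "demo_clip.mp4")])
```

The same function could run on the client (claims 9 and 11) or on the server (claim 12), since it needs only the summary items and the recorded shooting times.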
Description
Conference summary generation system, method, device and equipment
Technical Field
The application relates to the technical field of data processing, and in particular to a conference summary generation system, a conference summary generation method, a conference summary generation device and electronic equipment.
Background
Conferences are an important way of organizing collaboration and are increasingly used in daily life and office work. Recording conference content often requires dedicated staff to collect and organize it, which is costly and inefficient. With the continuing development of voice technology, techniques have emerged that record the conference with a device, organize the conference content by means of speech-to-text, and automatically generate a conference summary. In generating such a summary, the speech is converted into text that serves as the content of the conference summary. However, for important conference content, the corresponding pictures still need to be recorded; including them in the conference summary significantly improves its readability. At present, a typical way of generating a conference summary that includes pictures of important conference content is to collect the conference audio together with video of the complete shared conference content and, when generating the summary, extract from the complete video the frames containing the important content corresponding to each conference summary item.
However, the applicant has found that the existing scheme has at least the following problem: because the complete conference video including the shared content must be transmitted, and pictures of important conference content must be filtered out of a complete video mixed with a large amount of invalid information, more network traffic is consumed, the efficiency of generating the conference summary suffers, and the requirement of generating a conference summary in real time cannot be met.
Disclosure of Invention
The application provides a conference summary generation system to solve the problem of low conference summary generation efficiency in the prior art. The application further provides a conference summary generation device and electronic equipment.
The application provides a conference summary generation system, comprising:
Smart glasses, configured to collect a conference voice data stream, send the voice data stream to a client, receive a shooting instruction, capture image data of a conference segment, and send the image data to the client;
The client, configured to receive the voice data stream and send it to a server; receive a text stream corresponding to the voice data stream from the server and display the text stream; receive the image data and display it at the corresponding text position according to the image shooting time; receive a conference summary provided by the server; acquire a target conference summary item corresponding to the image shooting time; insert the image data into the conference summary as the image data corresponding to the target conference summary item; and display the conference summary;
The server, configured to receive the voice data stream, convert it into a text stream and provide the text stream to the client, and generate a conference summary from the transcribed text and provide it to the client.
The application provides a conference summary generation system, comprising:
Smart glasses, configured to collect conference voice data, receive a shooting instruction, capture image data of a conference segment, record the image shooting time, and send the voice data, the image data and the image shooting time to a client;
The client, configured to receive the voice data, the image data and the image shooting time, send the voice data to a server, receive a conference summary provided by the server, acquire a target conference summary item corresponding to the image shooting time, and insert the image data into the conference summary as the image data corresponding to the target conference summary item;
The server, configured to receive the voice data, convert the voice data into text, generate a conference summary from the transcribed text, and provide the conference summary to the client.
The application provides a conference summary generation system, comprising:
Smart glasses, configured to collect conference voice data, receive a shooting instruction, capture image data of a conference segment, record the image shooting time, and send the voice