CN-121979855-A - Method, device, computer equipment, medium and product for searching object position
Abstract
The embodiments of the application disclose a method, a device, computer equipment, a medium and a product for retrieving the position of an article. The method comprises: in response to a trigger operation of a user on a central control screen, waking up the central control screen, receiving a voice instruction containing article placement information input by the user, converting the voice instruction into an audio file and associating a timestamp with it; controlling a camera to acquire image data containing an article placement scene; parsing the audio file to extract keywords, and performing object recognition on the image data to obtain article category and position information; performing multi-modal association matching on the keywords, the article category and the position information to determine and record the storage position of a target article; storing the audio file, the image data, the keywords and the storage position of the target article in a database in association with one another; and receiving an article query instruction input by the user on the central control screen, searching the database based on the article query instruction, and outputting a retrieval result associated with the storage position of the target article.
Inventors
- WANG YONGFEI
- TANG JIE
- CHEN DAOYUAN
- FU JIAJUN
Assignees
- 珠海格力电器股份有限公司
- 珠海联云科技有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20251219
Claims (12)
- 1. A method of recording and retrieving a location of an article, the method comprising: in response to a trigger operation of a user on a central control screen, waking up the central control screen and receiving a voice instruction containing article placement information input by the user, converting the voice instruction into an audio file and associating a timestamp with the audio file; controlling a camera to acquire image data containing an article placement scene; parsing the audio file to extract keywords, and performing object recognition on the image data to obtain article category and position information; performing multi-modal association matching on the keywords, the article category and the position information to determine and record a storage position of a target article; storing the audio file, the image data, the keywords and the storage position of the target article in a database in association with one another; and receiving an article query instruction input by the user on the central control screen, searching the database based on the article query instruction, and outputting a retrieval result associated with the storage position of the target article.
- 2. The method of claim 1, wherein waking up the central control screen in response to the trigger operation of the user on the central control screen, receiving the voice instruction containing article placement information input by the user, converting the voice instruction into an audio file and associating a timestamp comprises: receiving a wake-up signal sent by the user through a voice instruction or a physical action, and activating a recording function of the voice central control screen; collecting, through an audio input device of the voice central control screen, a voice signal in which the user describes the name and placement position of an article; converting the voice signal into an audio file in a digital format, and adding to the audio file a timestamp representing the recording time; and, when the voice signal does not contain preset key intention information, prompting the user through an audio output device of the voice central control screen to supply the key intention information.
- 3. The method of claim 1, wherein controlling the camera to acquire image data containing the article placement scene comprises: when the voice central control screen is woken up, controlling the camera to capture a first panoramic image with a first preset parameter; while the voice instruction is being received, dynamically adjusting the angle and focal length of the camera according to the content of the voice instruction; capturing, with an adjusted second preset parameter, a second local image containing the placement area of the target article; and storing the first panoramic image and the second local image together as the image data.
- 4. The method of claim 1, wherein parsing the audio file to extract keywords and performing object recognition on the image data to obtain article category and position information comprises: performing voice recognition on the audio file to convert the voice content into text information; extracting, from the text information, keywords representing an article attribute and a position attribute; uploading the image data to a preset recognition server, and identifying a plurality of articles in the image data through an object detection model; and generating, for each identified article, position information that includes a category label and a coordinate position in the image.
- 5. The method of claim 1, wherein performing multi-modal association matching on the keywords, the article category and the position information to determine and record the storage position of the target article comprises: performing similarity matching between the article name in the keywords and the article categories; when one article is successfully matched by similarity between the article name and the article category, marking the coordinate position of that article in the image data as the storage position of the target article; when a plurality of articles are successfully matched by similarity with the article category, matching the position description in the keywords against the position information; and when one article is successfully matched between the position description in the keywords and the position information, marking the coordinate position of that article in the image data as the storage position of the target article.
- 6. The method of claim 5, wherein the method further comprises: when a plurality of articles are successfully matched between the position description in the keywords and the position information, marking a plurality of candidate positions in the image data for the user to select and confirm; and determining the storage position of the target article according to the user's selection among the plurality of candidate positions.
- 7. The method of claim 1, wherein storing the audio file, the image data, the keywords and the storage position of the target article in a database in association with one another comprises: establishing an association index among the audio file, the image data, the keywords and the storage position of the target article; storing the association index and the corresponding file data in a local database or a cloud database; performing intelligent summarization on the audio file, extracting key information fragments and generating summary text or summary audio; and receiving a custom label added by the user for the target article, and binding and storing the custom label with the association index.
- 8. The method of claim 1, wherein receiving the article query instruction input by the user, searching the database based on the article query instruction, and outputting the retrieval result associated with the storage position of the target article comprises: receiving the article query instruction input by the user by voice or by touch; identifying query keywords in the article query instruction, and performing fuzzy-matching retrieval in the association index of the database; retrieving the image data and the storage position of the target article that are successfully matched with the query keywords; and displaying, on the voice central control screen, a visual card containing the image data and a position mark, and playing the associated audio file or summary information through an audio output device.
- 9. A device for retrieving a location of an article, the device comprising: a recording module, configured to, in response to a trigger operation of a user on a central control screen, wake up the central control screen, receive a voice instruction containing article placement information input by the user, convert the voice instruction into an audio file and associate a timestamp with the audio file; control a camera to acquire image data containing an article placement scene; parse the audio file to extract keywords, and perform object recognition on the image data to obtain article category and position information; perform multi-modal association matching on the keywords, the article category and the position information to determine and record a storage position of a target article; and store the audio file, the image data, the keywords and the storage position of the target article in a database in association with one another; and a retrieval module, configured to receive an article query instruction input by the user on the central control screen, search the database based on the article query instruction, and output a retrieval result associated with the storage position of the target article.
- 10. A computer device comprising a memory and a processor, the memory storing a computer program executable on the processor, characterized in that the processor, when executing the program, implements the steps of the method of retrieving the location of an item as claimed in any one of claims 1 to 8.
- 11. A computer readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, realizes the steps in the method of retrieving the position of an item according to any one of claims 1 to 8.
- 12. A computer program product comprising a non-transitory computer readable storage medium storing a computer program which, when read and executed by a computer, implements the steps of the method of retrieving a location of an item according to any one of claims 1 to 8.
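The matching cascade recited in claims 5 and 6 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: the `Detection` record, the coarse `region` label standing in for the position information, the use of `difflib.SequenceMatcher` as the similarity measure, the 0.8 threshold, and the status strings. The claims also do not say what happens when several articles match by name but none matches the position description; the sketch reports that case as "unresolved".

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Tuple


@dataclass
class Detection:
    """One detected article (hypothetical record layout)."""
    category: str                    # category label from the object detector
    region: str                      # assumed coarse position label, e.g. "table"
    box: Tuple[int, int, int, int]   # (x, y, width, height) in image pixels


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """String similarity test; the measure and threshold are assumptions."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def match_storage_location(item_name: str, position_desc: str,
                           detections: List[Detection]):
    """Return (status, candidates) following the cascade of claims 5 and 6."""
    # Step 1: match the spoken article name against detected categories.
    by_name = [d for d in detections if similar(item_name, d.category)]
    if not by_name:
        return "not_found", []
    if len(by_name) == 1:
        return "matched", by_name

    # Step 2: several articles share the name, so use the position description.
    by_pos = [d for d in by_name if similar(position_desc, d.region)]
    if len(by_pos) == 1:
        return "matched", by_pos
    if len(by_pos) > 1:
        # Claim 6: mark the candidates and let the user pick one.
        return "needs_confirmation", by_pos

    # Not covered by the claims: name matched, position did not.
    return "unresolved", by_name
```

With detections for keys on a table, keys on a sofa and a remote on a sofa, the query ("keys", "sofa") resolves to the second detection, while adding a second set of keys on the sofa would return both as candidates for user confirmation.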
Description
Method, device, computer equipment, medium and product for searching object position
Technical Field
The present application relates to, but is not limited to, the field of application services, and in particular to a method, apparatus, computer device, medium, and product for retrieving a location of an article.
Background
With the rapid development of smart home and voice interaction technology, users increasingly expect voice assistants to help manage articles in daily life. Currently, users often lack an effective means of recording where they place items (e.g., keys, remote controls), which makes the items hard to find later. Although some voice assistants in the prior art have voice memo or voice command recording functions, these functions can only record text information input by a user through voice recognition, cannot be directly related to the actual placement position of an article, and cannot provide intuitive audio or visual prompts when the user needs them. Furthermore, conventional voice assistants lack efficient retrieval mechanisms for historical audio, resulting in inefficient information retrieval. Thus, the prior art has significant shortcomings in item placement recording, voice interaction guidance, and audio retrieval.
Disclosure of Invention
In view of this, embodiments of the present application at least provide a method, an apparatus, a computer device, a medium, and a product for retrieving a location of an article.
The technical scheme of the embodiments of the application is implemented as follows: In one aspect, an embodiment of the present application provides a method for retrieving a location of an article, where the method includes: in response to a trigger operation of a user on a central control screen, waking up the central control screen and receiving a voice instruction containing article placement information input by the user, converting the voice instruction into an audio file and associating a timestamp with the audio file; controlling a camera to acquire image data containing an article placement scene; parsing the audio file to extract keywords, and performing object recognition on the image data to obtain article category and position information; performing multi-modal association matching on the keywords, the article category and the position information to determine and record the storage position of a target article; storing the audio file, the image data, the keywords and the storage position of the target article in a database in association with one another; and receiving an article query instruction input by the user on the central control screen, searching the database based on the article query instruction, and outputting a retrieval result associated with the storage position of the target article.
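The last two steps of the method above (storing the record in association, then searching it from a query) can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the `placements` table layout, the function names `open_db`, `record_placement` and `find_item`, the use of SQLite, and fuzzy matching through `difflib.get_close_matches` with a 0.6 cutoff are all hypothetical choices. Audio and image data are represented only by file paths.

```python
import difflib
import json
import sqlite3
import time


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (assumed) association table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS placements (
               id INTEGER PRIMARY KEY,
               recorded_at REAL NOT NULL,   -- timestamp tied to the audio file
               audio_path TEXT NOT NULL,
               image_path TEXT NOT NULL,
               keywords TEXT NOT NULL,      -- JSON list of extracted keywords
               box TEXT NOT NULL            -- JSON [x, y, w, h] storage position
           )"""
    )
    return conn


def record_placement(conn, audio_path, image_path, keywords, box,
                     recorded_at=None) -> int:
    """Store audio, image, keywords and storage position as one linked row."""
    ts = time.time() if recorded_at is None else recorded_at
    cur = conn.execute(
        "INSERT INTO placements (recorded_at, audio_path, image_path, keywords, box)"
        " VALUES (?, ?, ?, ?, ?)",
        (ts, audio_path, image_path, json.dumps(keywords), json.dumps(box)),
    )
    conn.commit()
    return cur.lastrowid


def find_item(conn, query: str, cutoff: float = 0.6):
    """Return the most recent record whose keywords fuzzily match the query."""
    rows = conn.execute(
        "SELECT recorded_at, audio_path, image_path, keywords, box"
        " FROM placements ORDER BY recorded_at DESC"
    )
    for recorded_at, audio_path, image_path, keywords, box in rows:
        candidates = [k.lower() for k in json.loads(keywords)]
        if difflib.get_close_matches(query.lower(), candidates, n=1, cutoff=cutoff):
            return {
                "recorded_at": recorded_at,
                "audio_path": audio_path,
                "image_path": image_path,
                "box": json.loads(box),
            }
    return None
```

Returning the newest matching row reflects the assumption that the latest recorded placement is the one the user wants; a real system could instead return all matches and let the screen display them as cards.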
In another aspect, an embodiment of the present application provides a device for retrieving a location of an article, where the device includes: a recording module, configured to, in response to a trigger operation of a user on a central control screen, wake up the central control screen, receive a voice instruction containing article placement information input by the user, convert the voice instruction into an audio file and associate a timestamp with the audio file; control a camera to acquire image data containing an article placement scene; parse the audio file to extract keywords, and perform object recognition on the image data to obtain article category and position information; perform multi-modal association matching on the keywords, the article category and the position information to determine and record the storage position of a target article; and store the audio file, the image data, the keywords and the storage position of the target article in a database in association with one another; and a retrieval module, configured to receive an article query instruction input by the user on the central control screen, search the database based on the article query instruction, and output a retrieval result associated with the storage position of the target article.
In yet another aspect, an embodiment of the present application provides a computer device including a memory and a processor, where the memory stores a computer program executable on the processor, and where the processor, when executing the program, implements some or all of the steps of the method for retrieving a location of an article described above.
In yet another aspect, embodiments of the present application provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs some or all of the steps of the method for retrieving a location of an article described above.
In yet another aspect, embodiments of the present application provide a computer program comprising computer readable code which, when run in a computer device, causes a processor in the computer device to perform some or all of the steps of a method f