US-20260128024-A1 - MUSIC VISUALIZATION BY TRANSLATING MUSIC CUES TO VISUAL CUES
Abstract
A method and system for visualizing music using a perceptually conformal translation system are provided. A music source file is input into a processor configured to identify music perception audio cues within the music and translate those cues into visual cues, with rules for spatial arrangement, ultimately generating a visual representation of the music on a display device, time synchronized with the audio. The translation system produces an enhanced experience of music perception for the user listening to the music while viewing the display. The rules for spatial arrangement may allocate visual cues translated from music source components and music representation components to different parts of the display, thereby using spatial information to enhance music perception. The display may be enhanced by including video of the music performance or a music video associated with that performance.
Inventors
- John Fargo LATHROP
- Fred Jay CUMMINS
Assignees
- NEW RESONANCE, LLC
Dates
- Publication Date
- 20260507
- Application Date
- 20251230
Claims (20)
- 1-69. (canceled)
- 70. A computer-implemented method of presenting a visualization of a piece of music on a visual display, the method comprising: (a) providing a translation system, wherein the translation system comprises: a. a set of selected psychoacoustic cues, each psychoacoustic cue representing a distinct characteristic of the piece of music specific to music perception, b. an assignment of each selected psychoacoustic cue, or a rule-based combination of psychoacoustic cues, to a visual cue to provide a one-to-one correspondence between each selected psychoacoustic cue, or rule-based combination of psychoacoustic cues, and each visual cue, the assignment including rules for the spatial arrangement of visual cues on the visual display, wherein the assignment specifically aids in music perception, and wherein the assignment accounts for the complexity, structure, and tempo of the piece of music and the size and resolution of the visual display, c. where each visual cue specifically and individually represents a psychoacoustic cue and is different from a visual inference of that psychoacoustic cue based only on visual depiction of the basic audio cues of the notes involved in the psychoacoustic cue, those basic audio cues comprising pitches, times of onset and duration, amplitudes over time, and any note-specific embellishments, and d. where the translation system provides the ability to allocate visual cues associated with different representations of the music to different parts of the visual display, those representations including musical movements, phrases, chords, and notes, which representations can appear and disappear in the course of a musical piece, those parts of the visual display including spatial sections and layers; (b) extracting the selected psychoacoustic cues from the piece of music and translating, as a function of the translation system, the extracted psychoacoustic cues to corresponding visual cues and their spatial arrangement; and (c) presenting the visualization of the piece of music on the visual display by causing display of the visual cues on the visual display as the piece of music is being played, so that one or more persons sees the corresponding visual cues time synchronized with the piece of music as they hear the piece of music.
- 71. The computer-implemented method of claim 70, wherein representations can include pitch range and time displayed.
- 72. The computer-implemented method of claim 70, wherein representations can include time-streaming and non-time-streaming formats, including those two formats appearing on the same display.
- 73. The computer-implemented method of claim 70, wherein each psychoacoustic cue involves multiple notes, as opposed to cues describing individual notes.
- 74. The computer-implemented method of claim 70, wherein the translation system performs a translation of music in audio form into a visually perceived version of that music that is perceptually conformal, such that the visually perceived version visually represents music that is heard at a perceptual level, specifically to aid in music perception.
- 75. The computer-implemented method of claim 70 further comprising assembling a number of translation systems, and enabling the user to select among those translation systems, based on the type of music or other descriptors of music to be visualized, or on the preference of the user, then applying the selected translation system as specified in claim 70, specifically to aid in music perception.
- 76. The computer-implemented method of claim 75, wherein the translation system is selected or identified in a process that combines the translation system selection responses of more than one user, for example taking the form of Internet postings or surveys of users' preferred translation systems or their resulting visualizations, then combining those or simply picking the most popular responses.
- 77. The computer-implemented method of claim 70 further enabling the user to enter his or her own translation system, based on the type of music or other descriptors of music to be visualized, or on the preference of the user, then applying the entered translation system as specified in claim 70, specifically to aid in music perception.
- 78. The computer-implemented method of claim 70 further comprising assembling a number of translation systems, and enabling the user to select a translation system selection algorithm, based on the type of music or other descriptors of music to be visualized, or on the preference of the user, then applying the selected algorithm to select a translation system to then apply as specified in claim 70, specifically to aid in music perception.
- 79. The computer-implemented method of claim 78, wherein the translation system selection algorithm is selected or identified in a process that combines the translation system selection algorithm selection responses of more than one user, for example taking the form of Internet postings or surveys of users' preferred translation system selection algorithms or their resulting visualizations, then combining those or simply picking the most popular responses.
- 80. The computer-implemented method of claim 70 further comprising assembling a number of translation systems, and enabling the user to enter a translation system selection algorithm, based on the type of music or other descriptors of music to be visualized, or on the preference of the user, then applying the entered algorithm to select a translation system to then apply as specified in claim 70, specifically to aid in music perception.
- 81. The computer-implemented method of claim 70 further comprising accepting from a user inputs that cause generation of a music visualization track characterizing an audio music track, the visualization track and audio music track stored as a time synchronized pair of tracks, then upon user request providing the visualization time synchronized to the music.
- 82. The computer-implemented method of claim 70 further comprising providing to a user, for a piece of music selected by the user, a psychoacoustic cue track or equivalent data file characterizing the music, then responding to the selection by that user of that music, and the selection by that user of a translation system that maps those psychoacoustic cues to visual cues, upon those user selections providing the resulting visualization time synchronized to the music.
- 83. The computer-implemented method of claim 70, wherein the assignment of psychoacoustic cues to visual cues is adjusted to account for one or more of the complexity, structure, and tempo of the piece of music and the size and resolution of the visual display, the adjustments made to account for basic features of the piece of music and in response to changes in the music as the piece of music is played, as called for, specifically to aid in music perception, such adjustments made either by algorithms or by user choice, or by a combination of both, wherein such adjustments include the complete appearance or disappearance of allocations of visual cues to different parts of the visual display, and the complete appearance or disappearance of videos.
- 84. The computer-implemented method of claim 70 further comprising: applying signal cancellation to enhance analysis of a time sample of the piece of music by cancelling out of the time sample specific features that have been identified from one or more previous time samples.
- 85. The computer-implemented method of claim 70 further comprising: employing machine learning to recognize relationships and patterns in music source data that enable improvement in the detection and extraction of psychoacoustic cues, wherein such cues are specific to music perception, whether or not the relationships and patterns have been recognized in the music literature.
- 86. The computer-implemented method of claim 70, wherein the display of the visual cues occurs time synchronized with the audio presentation of the piece of music as it is being played, the synchronization achieved by delaying the audio presentation by a time delay approximately equal to the time taken to perform the operations set forth in claim 70.
- 87. The computer-implemented method of claim 70, wherein a time sequence of one or more of the selected psychoacoustic cues is represented as a time-streaming sequence of corresponding visual cues on the visual display, specifically to enhance music perception, whereby the time-streaming sequence comprises generating visual cues that first appear, at the time the corresponding audio cues first appear, at one part of the visual display, and shift along the display on a path toward a vanishing point, line, or lines on the display, that shift monotonic with time.
- 88. The computer-implemented method of claim 70, wherein the one-to-one correspondence further comprises one or both of two correspondences: orthogonal correspondence between two orthogonally related psychoacoustic cues and the two corresponding visual cues, wherein the two corresponding visual cues are also orthogonally related to each other; and ordinal correspondence for a psychoacoustic cue as applied to two music entities, so that the ordinal relationship between the psychoacoustic cue values for the two music entities is preserved in the relationship between the two corresponding visual cue values for the two entities.
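As a concrete, purely hypothetical illustration of the kind of translation system recited in claim 70 above, the following Python sketch maps named psychoacoustic cues to visual cues carrying spatial-arrangement attributes (layer and display region). All cue names, shapes, and regions are invented for illustration and are not drawn from the specification.

```python
# Hypothetical sketch of a claim-70-style translation system: a one-to-one
# assignment of selected psychoacoustic cues to visual cues, where each
# visual cue carries spatial-arrangement rules (display layer and region).
from dataclasses import dataclass

@dataclass
class VisualCue:
    shape: str    # e.g. "ribbon", "halo" (invented examples)
    layer: int    # display layer (a spatial-arrangement rule)
    region: str   # display section, e.g. "upper", "lower"

# One-to-one assignment of psychoacoustic cues to visual cues.
TRANSLATION_SYSTEM = {
    "melodic_contour":  VisualCue("ribbon", layer=2, region="upper"),
    "harmonic_tension": VisualCue("halo",   layer=1, region="center"),
    "rhythmic_pulse":   VisualCue("pulse",  layer=0, region="lower"),
}

def translate(extracted_cues):
    """Map each extracted cue name to its visual cue and placement."""
    return [(name, TRANSLATION_SYSTEM[name]) for name in extracted_cues
            if name in TRANSLATION_SYSTEM]

# Cues extracted from a time sample become placed visual cues:
frames = translate(["rhythmic_pulse", "melodic_contour"])
```

The dictionary enforces the one-to-one correspondence; a production system would of course derive the cue list from audio analysis rather than receive it as strings.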
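A toy sketch of the signal-cancellation idea in claim 84, under the simplifying assumption that a previously identified feature can be represented as a per-bin magnitude profile and subtracted from the current time sample; a real system would operate on spectra, not four-element lists.

```python
# Toy model of claim-84-style signal cancellation: subtract features
# identified in earlier time samples (here, a sustained tone's magnitude
# profile) from the current sample so that new events stand out.
def cancel_known_features(current_sample, known_feature):
    """Subtract a previously identified feature, flooring at zero."""
    return [max(c - k, 0.0) for c, k in zip(current_sample, known_feature)]

sustained_tone = [0.5, 0.5, 0.5, 0.5]   # identified from previous samples
current        = [0.5, 0.9, 0.5, 0.5]   # same tone plus a new onset
residual = cancel_known_features(current, sustained_tone)
# the residual isolates the new onset in the second bin
```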
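The delay-based synchronization of claim 86 amounts to simple arithmetic: hold the audio back by roughly the time the cue-extraction and rendering pipeline takes, so visual cues and sound reach the user together. The latency figure below is invented.

```python
# Claim-86-style synchronization sketch: the audio start time is shifted
# by the measured processing latency of the visualization pipeline.
def delayed_audio_start(requested_start_s, processing_latency_s):
    """Return the audio start time, shifted by the processing latency."""
    return requested_start_s + processing_latency_s

# If extraction and rendering take about 120 ms:
start_s = delayed_audio_start(0.0, 0.12)
```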
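The time-streaming behavior of claim 87 (a cue appearing when its audio counterpart sounds, then moving monotonically toward a vanishing point) can be sketched as a clamped linear interpolation; the geometry and timing values are hypothetical.

```python
# Sketch of a claim-87-style time-streaming path: a visual cue appears at
# an entry point at its onset time and moves monotonically toward a
# vanishing point as time passes. Coordinates here are arbitrary units.
def cue_position(t_now, t_onset, travel_time, entry, vanish):
    """Linearly interpolate from entry point toward the vanishing point."""
    progress = min(max((t_now - t_onset) / travel_time, 0.0), 1.0)
    x = entry[0] + progress * (vanish[0] - entry[0])
    y = entry[1] + progress * (vanish[1] - entry[1])
    return (x, y)

# Halfway through its travel, the cue sits midway along the path:
pos = cue_position(t_now=1.0, t_onset=0.0, travel_time=2.0,
                   entry=(0.0, 0.0), vanish=(100.0, 50.0))
```

Clamping `progress` to [0, 1] keeps the shift monotonic with time and stops the cue at the vanishing point.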
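The ordinal-correspondence property of claim 88 can be checked mechanically: sort the cue values and verify that the mapped visual values never decrease. The brightness-from-tension mapping below is an invented example, not the specification's.

```python
# Checks the claim-88 ordinal-correspondence property: if music entity A
# has a larger psychoacoustic cue value than entity B, A's visual cue
# value must be at least as large.
def preserves_ordinal(cue_values, visual_map):
    ordered = sorted(cue_values)
    mapped = [visual_map(v) for v in ordered]
    return all(a <= b for a, b in zip(mapped, mapped[1:]))

# Brightness grows monotonically with harmonic tension, so order survives:
ok = preserves_ordinal([0.2, 0.8, 0.5], lambda tension: 55 + 200 * tension)
```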
Description
CLAIM OF PRIORITY
This application is a continuation-in-part of and claims priority to U.S. patent application Ser. No. 19/025,813, filed on Jan. 16, 2025, which application is a continuation of U.S. patent application Ser. No. 17/209,037, filed Mar. 22, 2021, now U.S. Pat. No. 12,230,239, which application is a continuation of U.S. utility application Ser. No. 16/074,077, filed Jul. 30, 2018, now U.S. Pat. No. 10,978,033, which application claims the benefit of priority to PCT/US2017/016756 filed on Feb. 6, 2017 and also claims the benefit of priority to U.S. provisional application Ser. No. 62/292,193, filed Feb. 5, 2016, which application is incorporated herein by reference in its entirety.
COPYRIGHT AUTHORIZATION
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. The foregoing authorization refers to the image of FIG. 1 showing the notice ©istock.com/fizkes (2015). Such image is licensed by the applicant from the copyright owner.
TECHNICAL FIELD
The technology described herein generally relates to the visualization of music, in particular the translation, or mapping, of music into a corresponding visual form that can be displayed on a screen.
BACKGROUND
Music is a rich and varied artform: the mere division of musical structure into the broad categories of melody, rhythm, and harmony does not do justice to the full complexity of the musical experience. Such broad categories are overly simplistic as a way to explain and capture a person's reactions and impressions when listening to a piece of music. Consequently, there have been many attempts to reinforce the effects of a piece of music on a listener by deriving a visual accompaniment.
A person's sight and hearing are the two primary senses for appreciating artistic creations. However, while it is not difficult to embellish a person's experience of a visual event by adding musical accompaniment, and there are many ways to do that, the opposite, positively augmenting a listener's experience of music by adding effective imagery, has posed challenges. Many techniques have been developed to accomplish visual renditions of music. Most music visualization systems are based on the division of an audio signal into certain of its constituent frequency bands, followed by the translation of the information from those frequency bands into a visualizable form. The earliest attempts to do this were very simple, and converted music into arrays of colored lights, where the colors of the lights correlated with various frequencies in the music and the lights were turned on and off as and when the frequencies were heard. Examples of such approaches are described in U.S. Pat. No. 1,790,903 to Craig, U.S. Pat. No. 3,851,332 to Dougherty, U.S. Pat. No. 4,928,568 to Snavely, and U.S. Pat. No. 3,228,278 to Wortman. Ultimately, such devices, which were often referred to as "color organs" (a now generally accepted term for a device that represents sound and/or accompanies music in a visual medium), could not adequately represent the full texture of a piece of music. Attempts were made to capture other aspects of musical form, such as variations in amplitude, as well as to attempt a more continuously variable display than was possible with discrete lights. For example: U.S. Pat. No. 4,645,319 to Fekete describes a system in which projectors driven by a color organ reflect the spectral content of an audio source; U.S. Pat. No. 3,241,419 to Gracey describes processing of audio frequency signals to produce an undulating light image pattern on a display; U.S. Pat. No. 3,806,873 to Brady relates to an audio-to-video translating system that includes a time shift feature allowing visual representation of audio signal duration; U.S. Pat. No. 4,614,942 to Molinaro describes a visual sound device in which amplitude variations within an audio signal are translated into a varying visual amplitude output on a display; U.S. Pat. No. 4,394,656 to Goettsche describes a real-time light-modulated sound display in which the audio signal spectrum is visually displayed according to the discrete frequency bands in the spectrum; and U.S. Pat. No. 4,440,059 to Hunter describes a color organ employing a system of voltage controlled oscillators to selectively illuminate LED lights along a pair of orthogonal axes. One of the first attempts to visualize music electronically is described in U.S. Pat. No. 4,081,829 to Brown, which presents an apparatus that connects an audio source to a color television and provides a visual representation of the audio signal on the television screen. The representation was dynamic insofar as the image on the display varies with respect to shape, color, and luminance, dep