AudioRadar: A metaphorical visualization for the navigation of large music collections
Otmar Hilliges, Phillip Holzer, René Klüber, Andreas Butz
Ludwig-Maximilians-Universität München
AudioRadar – An Introduction
• AudioRadar is a new interface to visualize, browse, and organize music collections.
• AudioRadar is based on the similarity of songs.
• AudioRadar visualizes similarity by proximity.
Vancouver, 07/24/2006
Music Similarity
“‘The Blues’ might be Rose’s crowning career achievement: It’s an epic combination of mid-period Stevie Wonder, early Elton John, and side two of ‘In Through the Out Door’.”
How do we explain music?
• Music is very complex and difficult to explain.
• Similarity is a very common metric:
  • Sounds just like…
  • Is a mixture between…
  • Reminds you of…
• Similarity enables us to get a feeling for the music without actually hearing it.
But – How do we consume digital music?
• Music collections are increasing in size (1,000 to >10,000 songs).
• Current player software relies on metadata for organization.
• Browsing music collections degrades to scrolling through endless lists.
• Large collections require better navigation mechanisms.
Implications - Statistics
Average collection size: 3,542
Largest collection: 50,458
Active songs (80% of plays): 23%
Songs never played: 64%
Study: Paul Lamere, Sun Microsystems. Data courtesy of iPod Registry.
Implications on Collection Navigation
• Meta information is assigned to music rather than derived from it.
• Artist, title, etc. give little information on how a song sounds.
• Classification into genres is troublesome.
Similarity Based Browsing of Music Collections
AudioRadar – Our Approach
• We don’t rely on metadata.
• We especially don’t rely on genres.
• We don’t rely on lists and textual information.
AudioRadar – Our Approach
• We derive a set of meaningful descriptive features from the audio stream.
• We visualize music collections based on similarity/proximity.
AudioRadar – The Metaphor
• We use a radar as visual metaphor.
• The currently playing song is the centroid.
• Similar songs are grouped around the centroid in the near vicinity.
• The more similar a song, the closer it is placed to the center.
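A minimal sketch of the "more similar means closer to the center" rule, assuming each song is described by a numeric feature vector and that Euclidean distance stands in for dissimilarity. The slides do not state which distance measure AudioRadar uses; the function name `radar_radii` and the sample values are hypothetical.

```python
import math

def radar_radii(current, others, max_radius=1.0):
    """Map each song to a distance from the radar center.

    current: feature vector of the currently playing song (the center).
    others:  dict of song name -> feature vector.
    Euclidean distance is an assumption, not AudioRadar's documented metric.
    """
    distances = {name: math.dist(current, vec) for name, vec in others.items()}
    farthest = max(distances.values(), default=0.0)
    if farthest == 0.0:
        # Every song is identical to the current one: all sit at the center.
        return {name: 0.0 for name in distances}
    # Scale so the least similar song lands on the radar's outer ring.
    return {name: max_radius * d / farthest for name, d in distances.items()}

# Hypothetical feature vectors with values in [0, 1].
current_song = (0.8, 0.3, 0.5, 0.6)
collection = {
    "song_a": (0.7, 0.3, 0.5, 0.6),  # close to the current song
    "song_b": (0.1, 0.9, 0.2, 0.1),  # far from the current song
}
radii = radar_radii(current_song, collection)
```

Normalizing by the farthest song keeps every item inside the radar regardless of how spread out the collection is; a fixed scale would be an equally plausible choice.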
AudioRadar – The Metaphor
Interface Understandability
• For users to understand the radar interface, two things are most important:
  • The measured similarity must be as close as possible to the subjectively perceived similarity.
  • The songs must be placed
    • correctly
    • meaningfully
Automatic Audio Analysis and Placement Strategies
Automatic Audio Analysis
• We extract a set of descriptive features from the audio stream:
  • Tempo
  • Tonality
  • Harmony
  • Rhythm patterns
Dimensions
• We calculate a four-dimensional vector space:
  • Fast vs. Slow
  • Melodic vs. Rhythmic
  • Clean vs. Rough
  • Calm vs. Turbulent
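One way to hold a point in this four-dimensional space is a small record type. The sketch below assumes every dimension is scaled to [0, 1], with 0 meaning the first pole of each pair and 1 the second; that scaling and the names `SongFeatures`, `tempo`, `rhythm`, `roughness`, and `turbulence` are illustrative and not taken from the slides.

```python
from dataclasses import dataclass, astuple

@dataclass(frozen=True)
class SongFeatures:
    """One song as a point in the four-dimensional space (assumed [0, 1] scale)."""
    tempo: float       # slow (0) .. fast (1)
    rhythm: float      # melodic (0) .. rhythmic (1)
    roughness: float   # clean (0) .. rough (1)
    turbulence: float  # calm (0) .. turbulent (1)

    def __post_init__(self):
        names = ("tempo", "rhythm", "roughness", "turbulence")
        for name, value in zip(names, astuple(self)):
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")

    def as_vector(self):
        """Return the four values as a plain tuple for distance computations."""
        return astuple(self)

# Hypothetical values for a fast, melodic, clean, fairly calm song.
example = SongFeatures(tempo=0.9, rhythm=0.2, roughness=0.1, turbulence=0.4)
```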
Placement Strategies
• Different strategies are possible to calculate proximity and placement on the radar.
• Choosing the right strategy is crucial for the understanding of the songs’ relationships.
Dimensionality Problem
• General problem of displaying a high-dimensional space on a 2D screen.
• In our case: 4D space <-> 2D display.
• Desired: no loss of expressivity in the visualization.
Naïve Approach
• The easiest method that is still correct is to omit 2 dimensions.
• The position of items on the 2D plane can be calculated directly from their values in the original space.
• This leads to information loss.
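A small sketch of why omitting two dimensions loses information, assuming songs are plain four-value tuples. The function name `drop_dimensions` and the sample vectors are hypothetical.

```python
def drop_dimensions(vector, keep=(0, 1)):
    """Return the 2D point made of the two kept coordinates.

    keep: indices of the two dimensions that survive; the rest are discarded.
    """
    x_index, y_index = keep
    return (vector[x_index], vector[y_index])

# Two songs that differ only in the discarded dimensions.
song_a = (0.8, 0.3, 0.1, 0.9)
song_b = (0.8, 0.3, 0.9, 0.1)

point_a = drop_dimensions(song_a)
point_b = drop_dimensions(song_b)
# Both songs land on the same 2D point although they differ in 4D.
```

Choosing a different pair via `keep` only moves the problem: whichever two dimensions are dropped no longer influence the placement at all.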
Placement Strategies I
• Another approach is to find a projection from 4D to 2D.
• Projection onto a 2D Cartesian coordinate system.
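The slide does not say which projection is used. As one possible illustration, the sketch below assigns each dimension an axis direction in the plane and sums the contributions. The axis angles, the [0, 1] value range, and the name `project_to_plane` are all assumptions.

```python
import math

# Assumed layout: one axis per dimension, spread over half a turn so that
# the opposite end of each axis stands for the opposite pole of the pair.
AXIS_ANGLES_DEG = (90.0, 0.0, 45.0, 135.0)

def project_to_plane(vector):
    """Linearly project a 4D point (values in [0, 1]) onto (x, y).

    Each value is first centered to [-1, 1], so 0.5 means "neutral" and
    contributes nothing along its axis.
    """
    centered = [2.0 * v - 1.0 for v in vector]
    x = sum(c * math.cos(math.radians(a)) for c, a in zip(centered, AXIS_ANGLES_DEG))
    y = sum(c * math.sin(math.radians(a)) for c, a in zip(centered, AXIS_ANGLES_DEG))
    return (x, y)

# A song that is extreme on the first dimension and neutral on the others
# ends up straight along that dimension's axis.
top_point = project_to_plane((1.0, 0.5, 0.5, 0.5))
```

Unlike dropping dimensions, every value influences the position here, but any linear map from 4D to 2D still sends some distinct songs to the same point.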
Placement Strategies II
• Maximum value placement.
• Meets subjective similarity measurement better.
• Leads to visual clutter.
Placement Strategies III
• The sector is chosen based on the maximum value.
• To avoid visual clutter, we compute an offset using the second-highest value.
• This placement matches subjective similarity perception even if inexact.
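A hedged sketch of this strategy, assuming four sectors of 90 degrees (one per dimension), values in [0, 1], and an angular offset inside the sector proportional to the second-highest value. The slides do not specify the sector layout or how the offset is applied, and the name `sector_placement` is hypothetical.

```python
import math

SECTOR_WIDTH = 90.0  # degrees; four sectors, one per dimension (assumed)

def sector_placement(vector, radius):
    """Place a song by its dominant dimension, shifted by the runner-up.

    vector: four values in [0, 1].
    radius: distance from the center, e.g. derived from dissimilarity
            to the currently playing song.
    Returns (sector_index, angle_in_degrees, (x, y)).
    """
    ranked = sorted(range(len(vector)), key=lambda i: vector[i], reverse=True)
    top, second = ranked[0], ranked[1]
    sector_start = top * SECTOR_WIDTH
    # The runner-up value spreads songs across the sector instead of
    # stacking them on a single line, which reduces clutter.
    angle = sector_start + vector[second] * SECTOR_WIDTH
    x = radius * math.cos(math.radians(angle))
    y = radius * math.sin(math.radians(angle))
    return top, angle, (x, y)

# Two hypothetical songs with the same dominant dimension (index 1).
placement_a = sector_placement((0.2, 0.9, 0.5, 0.1), radius=1.0)
placement_b = sector_placement((0.2, 0.9, 0.1, 0.1), radius=1.0)
```

Both songs fall into the same sector, but their different runner-up values give them different angles, so they no longer overlap.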
Mood Based Playlist Generation
Playlist Generation
• Standard playlists are containers for a set of artists, genres, or decades.
• We want to listen to music that fits our mood.
• We might not know how a song, artist, or genre actually sounds.
Mood based playlist generation
Conclusion and Future Work
Conclusion
• Similarity in music is a very human concept.
• We created the first functional player fully relying on this concept.
• We found and applied a coherent visual metaphor to display music similarity.
• We extended the concept into mood based playlist generation.
Issues and Future Work
• Feature extraction algorithms are very basic and produce faulty results.
• The dimensions clean vs. rough and turbulent vs. calm are problematic.
• Playlist generation could be improved, e.g. by drawing a border around regions of interest.
• We want to explore fuzzy search methods for music retrieval.
Any Questions? Thank You!