



  1. Music Information Retrieval
     Andreas Rauber
     Department of Software Technology and Interactive Systems
     Vienna University of Technology
     http://www.ifs.tuwien.ac.at/~andi
     http://www.ifs.tuwien.ac.at/mir

     Lead-in
     Who am I?
     • Vienna University of Technology, http://www.tuwien.ac.at
       • Faculty of Computer Science, http://www.cs.tuwien.ac.at
         – Department of Software Technology and Interactive Systems, http://www.isis.tuwien.ac.at
           » Software and Information Engineering Group, http://www.ifs.tuwien.ac.at
             - Andreas Rauber, http://www.ifs.tuwien.ac.at/~andi
               Machine Learning, Neural Networks, Text Mining, Digital Libraries, Music Retrieval, Digital Preservation

     Lead-in
     Activities
     • Audio Feature Extraction
     • Music Classification
     • PlaySOM: Organisation of Music Archives
     • PocketSOM: Browsing Music on Mobile Devices
     • 3D Worlds for Music
     • Audio Segmentation
     • Chord Detection
     • Blind Source Separation
     • Text and Music (Lyrics, Bio, ...)

     Lead-in
     Who else is MIR@ifs?
     • Thomas Lidy
     • Robert Neumayer
     • Rudolf Mayer
     • Jakob Frank
     • Other members: Veronika Zenz, Peter Hlavac, Ewald Peiszer, Andreas Scharf, Andrei Grecu & others
     • Former members: Markus Frühwirth, Elias Pampalk, Stefan Leitich, David Laister, Doris Baum & others

     Chorus
     • Lead-in
     • Chorus
     • Verse 1: Music-IR
     • Verse 2: Audio Features
     • Verse 3: Classification and Benchmarking
     • Verse 4: Clustering & Browsing
     • Verse 5: Some other applications
     • Fade-out

     Music IR – Music?
     What is „Music“?
     • Music, of course!
       – Audio: wav, au, mp3, ...
       – Symbolic: MIDI, mod, ...
       – Scores: Scan, MusicXML
     • Text
       – Song lyrics
       – Artist biographies
       – Websites: Fanpages, Album Reviews, Genre descriptions
     • Community data
       – Playlists
       – Market basket
       – Band evolution
     • Video/Images
       – Album covers
       – Music videos
     (www.samplesmith.com, www.westminster.gov.uk)

  2. Music IR – Music?
     Music - Sound
     • Sound as acoustic wave
     • Characterized by the properties of waves (frequency/wavelength, amplitude)
     • Frequency: pitch
       – Humans can hear approx. 20 Hz-20 kHz
       – speech: 200 Hz-8 kHz
     • Amplitude: Loudness
       – measured as pressure in micropascal (µPa)
       – hearing threshold: approx. 20 µPa
       – logarithmic decibel scale

     Music IR – Music?
     Music - Sound - Loudness
     (http://www.phys.unsw.edu.au/jw/hearing.html)

     Source of sound                            sound pressure    sound pressure level
                                                (pascal)          (dB re 20 µPa)
     immediate soft tissue damage               50000             approx. 185
     threshold of pain                          100               134
     hearing damage during short-term effect    20                approx. 120
     jet engine, 100 m distant                  6–200             110–140
     jack hammer, 1 m distant / discotheque     2                 approx. 100
     hearing damage during long-term effect     0.6               approx. 85
     major road, 10 m distant                   0.2–0.6           80–90
     passenger car, 10 m distant                0.02–0.2          60–80
     TV set at home level, 1 m distant          0.02              ca. 60
     normal talking, 1 m distant                0.002–0.02        40–60
     very calm room                             0.0002–0.0006     20–30
     leaves noise, calm breathing               0.00006           10
     auditory threshold at 2 kHz                0.00002           0

     Music IR – Music?
     Music - Sound
     • Different file formats for storing sound:
       – lossless formats
         • WAV (may hold compressed audio, but usually lossless PCM)
         • FLAC, Shorten, Monkey's Audio, ATRAC Advanced Lossless, Apple Lossless, WMA Lossless, TTA
       – lossy formats
         • MP3
         • ATRAC
         • AAC
         • Ogg Vorbis
         • WMA
         • ...

     Music IR – Music?
     Music - Sound
     • Nyquist sampling theorem: Exact reconstruction of a continuous-time baseband signal from its samples is possible if the signal is bandlimited and the sampling frequency is greater than twice the signal bandwidth.
     • Half the sampling frequency is the Nyquist frequency, i.e. a signal with a specific frequency must be sampled with more than twice that frequency for reconstruction.
     • More on sound, sound pressure, hearing thresholds, etc. later when we talk about feature extraction from sound.

     Music IR – Music?
     Music - Sound - PCM
     • PCM: Pulse Code Modulation
     • Digital representation of an analog signal where the magnitude of the signal is sampled regularly at uniform intervals, then quantized to a series of symbols
     • Used in WAV, CD recordings, ...
     • Quantization error: choosing a discrete value near the analog signal for each sample
     • Any frequency above or equal to 1/2 sampling frequency is lost

     Music IR – Music?
     Music - Sound - MP3
     • Actually: MPEG-1 Audio Layer 3
     • Developed by a group around Fraunhofer, Thomson, AT&T Bell Labs; several patent issues pending
     • Lossy compression, based on psycho-acoustic models
       – differential encoding of stereo signal (lossless)
       – focus on audible frequencies
       – masking effects
       – adaptive bit-depth encoding
       – quantization and Huffman encoding
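The loudness table pairs each sound pressure in pascal with a level in dB re 20 µPa. The two columns are linked by the standard sound pressure level formula L = 20 · log10(p / p0) with p0 = 20 µPa. The sketch below recomputes a few rows of the table; the function name `spl_db` and the constant `P_REF` are illustrative choices, not taken from the slides.

```python
import math

# Reference pressure: 20 µPa, the hearing threshold named on the slide.
P_REF = 20e-6  # pascal


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB re 20 µPa for a pressure given in pascal."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_pa / P_REF)


# A few rows from the loudness table (pressure in pascal).
rows = [
    ("auditory threshold at 2 kHz", 0.00002),
    ("TV set at home level, 1 m distant", 0.02),
    ("jack hammer, 1 m distant / discotheque", 2.0),
    ("threshold of pain", 100.0),
]

for source, pressure in rows:
    print(f"{source}: {spl_db(pressure):.1f} dB")
```

Because the scale is logarithmic, every factor of 10 in pressure adds 20 dB, which is why the table spans nine orders of magnitude in pascal but only about 185 dB.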

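The PCM and Nyquist slides make two claims that are easy to check numerically: quantization picks a discrete value near each analog sample, and a frequency above half the sampling frequency cannot be recovered. The following is a minimal sketch under stated assumptions: samples are normalized to [-1.0, 1.0], the quantizer is a simple symmetric 16-bit rounding, and all function names (`pcm_quantize`, `sample_sine`, `alias_frequency`) are hypothetical helpers for illustration.

```python
import math


def pcm_quantize(x: float, bits: int = 16) -> int:
    """Round a sample in [-1.0, 1.0] to the nearest signed integer code."""
    max_code = 2 ** (bits - 1) - 1           # 32767 for 16 bits
    clipped = max(-1.0, min(1.0, x))         # clip out-of-range input
    return round(clipped * max_code)


def sample_sine(freq_hz: float, sample_rate_hz: float, n: int) -> list[float]:
    """Sample a unit-amplitude sine at uniform intervals (n samples)."""
    return [math.sin(2.0 * math.pi * freq_hz * k / sample_rate_hz)
            for k in range(n)]


def alias_frequency(freq_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a sampled sine of freq_hz appears after sampling."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


# Quantization error: the code is close to, but not exactly, the analog value.
code = pcm_quantize(0.3)
print("code:", code, "error:", abs(code / 32767 - 0.3))

# Aliasing: at 8 kHz sampling, a 5 kHz sine is indistinguishable from a
# (sign-flipped) 3 kHz sine, because 5 kHz lies above the 4 kHz Nyquist limit.
above = sample_sine(5000.0, 8000.0, 16)
below = sample_sine(3000.0, 8000.0, 16)
print("alias of 5 kHz at 8 kHz:", alias_frequency(5000.0, 8000.0))
print("samples match up to sign:",
      all(abs(a + b) < 1e-9 for a, b in zip(above, below)))
```

The quantization error of this rounding scheme is bounded by half a code step, which is why more bits per sample give a lower noise floor.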
  3. Music IR – Music?
     Music - Sound - MP3
     • ID3-Tags
     • Added later on to allow embedding of metadata
     • ID3v1: 30 chars per entry, few standard fields
     • ID3v2.4: UTF-8 support, tags at beginning of file
     • Used by search engines

     Music IR – Music?
     What is „Music“?
     • Music, of course!
       – Audio: wav, au, mp3, ...
       – Symbolic: MIDI, mod, ...
       – Scores: Scan, MusicXML
     • Text
       – Song lyrics
       – Artist biographies
       – Websites: Fanpages, Album Reviews, Genre descriptions
     • Community data
       – Playlists
       – Market basket
       – Band evolution
     • Video/Images
       – Album covers
       – Music videos
     (www.samplesmith.com, www.westminster.gov.uk)

     Music IR – Music?
     Musical Instrument Digital Interface - MIDI
     • Symbolic Music File Format
     • Dave Smith, proposed in 1981
     • MIDI specification 1.0 in 1983
     • Interacting with keyboard produces messages
       – Note-On, Aftertouch, and Note-Off
       – 128 note pitches (note numbers 0-127)
     • Sequence of control commands

     Music IR – Music?
     Musical Instrument Digital Interface - MIDI
     • Some MIDI examples (from: http://www.borg.com/~jglatt/files/midifile.htm)
       – Orchestral: Bach: Brandenburg Concerto 4
       – Orchestral: Star Trek Theme: Next Generation
       – Classic: Beethoven: Für Elise
       – 1950's Rock&Roll: Bill Haley: Rock Around the Clock
       – 1950's Rock&Roll: Jerry Lee Lewis: Great Balls of Fire
       – Pop: Elton John: Don't Let the Sun Go Down
       – Pop: Phil Collins: Another Day in Paradise
       – Heavy Metal: Queen: Another One Bites the Dust
       – Heavy Metal: Van Halen: Jump

     Music IR – Music?
     MOD
     • Similar to MIDI, but
     • stores audio samples together with control instructions
     • should sound the same on every player
     • a.k.a. tracker modules (the first module-creating program was Soundtracker, created by Karsten Obarski in 1987)

     Music IR – Music?
     MOD
     • Some examples (from http://modarchive.org)
       – Classical: Dark Castle (Part 1)
       – Classical: Canon in D
       – Classical: Beethoven: Für Elise
       – Guitar: Sweet Lorraine
       – Latin: Heart and Soul
       – Techno: 10KBlur
       – Disco: Rob Hubbard
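The ID3 slide notes that ID3v1 allows 30 characters per entry and only a few standard fields. That follows from its fixed layout: a 128-byte block at the end of the file, starting with the marker `TAG`, followed by title, artist and album (30 bytes each), year (4 bytes), comment (30 bytes) and a one-byte genre index. The sketch below reads that layout from raw bytes. It assumes ISO-8859-1 text and ignores the ID3v1.1 track-number extension; `parse_id3v1` and `make_id3v1` are hypothetical helper names, and the demo tag values are made up for illustration.

```python
from typing import Optional


def parse_id3v1(data: bytes) -> Optional[dict]:
    """Return the ID3v1 fields found in the last 128 bytes, or None."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":                    # marker that opens an ID3v1 block
        return None

    def text(raw: bytes) -> str:
        # Fields are padded with NUL bytes (sometimes spaces); cut them off.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre": tag[127],                   # index into the genre list
    }


def make_id3v1(title: str, artist: str, album: str, year: str,
               comment: str = "", genre: int = 255) -> bytes:
    """Build a 128-byte ID3v1 block (demo helper, truncates long fields)."""
    def field(value: str, size: int) -> bytes:
        return value.encode("latin-1")[:size].ljust(size, b"\x00")

    return (b"TAG" + field(title, 30) + field(artist, 30) + field(album, 30)
            + field(year, 4) + field(comment, 30) + bytes([genre]))


# Hypothetical file content: some placeholder audio bytes plus a trailing tag.
fake_file = b"\x00" * 64 + make_id3v1("Für Elise", "Beethoven", "Demo Album", "2000")
print(parse_id3v1(fake_file))
```

The fixed field widths explain why long titles are cut off in ID3v1, and why ID3v2 moved to variable-length frames at the beginning of the file.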

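The MIDI slide says that interacting with a keyboard produces Note-On and Note-Off messages. On the wire, each such channel message is three bytes: a status byte (0x90 for Note-On, 0x80 for Note-Off, with the channel number in the low four bits), a note number, and a velocity, the latter two each limited to 7 bits. The sketch below builds these messages; the function names are illustrative, and the pitch conversion assumes the common convention of equal temperament with note 69 tuned to 440 Hz, which the slides do not state.

```python
def _check(channel: int, note: int, velocity: int) -> None:
    """Validate the value ranges of a MIDI channel voice message."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    if not 0 <= note <= 127:
        raise ValueError("note must be 0..127")
    if not 0 <= velocity <= 127:
        raise ValueError("velocity must be 0..127")


def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Three-byte Note-On message: status 0x9n, note number, velocity."""
    _check(channel, note, velocity)
    return bytes([0x90 | channel, note, velocity])


def note_off(channel: int, note: int, velocity: int = 0) -> bytes:
    """Three-byte Note-Off message: status 0x8n, note number, velocity."""
    _check(channel, note, velocity)
    return bytes([0x80 | channel, note, velocity])


def note_to_hz(note: int) -> float:
    """Pitch of a note number, assuming 12-tone equal temperament, A4 = 69."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


# Press and release middle C (note 60) on the first channel.
sequence = [note_on(0, 60, 100), note_off(0, 60)]
print([message.hex() for message in sequence])
print(f"note 60 is about {note_to_hz(60):.2f} Hz")
```

A MIDI file is essentially a timed sequence of such commands, which is why it stays small compared to sampled audio and why the sound depends on the synthesizer that plays it back.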