8. Audio databases

About digital audio:
• Advent of digital audio CD in 1983.
• Order of magnitude improvement in overall sound quality and signal-to-noise ratio over the best analog systems.
• Wide bandwidth required in on-line transmission.

Converting an analog signal into digital form:
• Linear Pulse Code Modulation (PCM)
• Two-stage process:
  (a) Sampling: observing the signal amplitude at certain time intervals; typical sampling frequencies: 16-48 kHz
  (b) Quantization: discrete scale for observed amplitudes, typically 16 bits per sample → 65536 possible values.
• Audio CD: 16-bit samples at 44.1 kHz rate, with two (stereo) channels: 2 × 16 × 44 100 ≈ 1.4 Mbits per second

MMDB-8 J. Teuhola 2012 184
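
The two PCM stages and the Audio CD bitrate arithmetic can be illustrated with a minimal sketch. The function names, the 440 Hz test tone and the 1 ms duration are assumptions chosen for illustration; they do not come from the slides.

```python
import math

def pcm_encode(signal, duration_s, sample_rate_hz, bits=16):
    """Sample a signal (function of time in seconds, values in [-1.0, 1.0])
    at fixed intervals and quantize each sample to a signed integer code."""
    levels = 2 ** bits              # 16 bits -> 65536 possible values
    max_code = levels // 2 - 1      # 32767 for 16-bit samples
    n_samples = int(duration_s * sample_rate_hz)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                  # sampling instant
        code = round(signal(t) * max_code)      # quantization to the nearest step
        codes.append(max(-max_code - 1, min(max_code, code)))
    return codes

def pcm_bitrate(channels, bits, sample_rate_hz):
    """Uncompressed PCM bitrate in bits per second."""
    return channels * bits * sample_rate_hz

# Hypothetical test input: a 440 Hz sine tone, 1 ms long.
tone = lambda t: math.sin(2 * math.pi * 440.0 * t)
samples = pcm_encode(tone, duration_s=0.001, sample_rate_hz=44_100)

# Audio CD: 2 channels x 16 bits x 44 100 samples/s = 1 411 200 bit/s (about 1.4 Mbit/s)
cd_bitrate = pcm_bitrate(channels=2, bits=16, sample_rate_hz=44_100)
```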

Illustration of audio concepts
[Figure: a waveform plotted as amplitude against time, with the wavelength and the sampling interval marked.]

Audio compression techniques
(a) Delta modulation:
  • Extremely simple, used sometimes for speech coding
  • 1-bit quantizer for amplitude differences: 0 = −Δ, 1 = +Δ
(b) Adaptive Differential Pulse Code Modulation (ADPCM)
  • The next sample value is predicted on the basis of recent history; the prediction error is quantized and coded
  • Used mainly for speech coding, e.g. ITU-T G.726
(c) Subband coding
  • Division of the signal into frequency components (bands)
  • Encoding of bands separately
  • E.g. ITU-T recommendation G.722: High-quality speech at 64 Kbits per second
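
Delta modulation, item (a) above, is small enough to sketch in full. This is an illustrative toy, not a production codec: the function names, the fixed step size, the initial estimate of 0 and the integer test signal are all assumptions.

```python
def delta_encode(samples, step):
    """1-bit quantizer: emit 1 (+step) if the sample is at or above the
    running estimate, otherwise 0 (-step)."""
    bits = []
    estimate = 0                      # assumed starting point of the tracker
    for s in samples:
        bit = 1 if s >= estimate else 0
        estimate += step if bit else -step
        bits.append(bit)
    return bits

def delta_decode(bits, step):
    """Rebuild the staircase approximation from the bit stream."""
    out = []
    estimate = 0
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out

# Hypothetical input: a slow rise followed by a fall.
signal = [0, 3, 6, 8, 9, 9, 7, 4, 1, -2]
bits = delta_encode(signal, step=2)     # [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
recon = delta_decode(bits, step=2)      # [2, 4, 6, 8, 10, 8, 6, 4, 2, 0]
```

The decoder only ever moves by ±Δ per sample, so a step that is too small lags behind steep slopes and a step that is too large adds noise on flat passages. ADPCM addresses this by predicting the next sample and adapting the quantization step.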

MPEG audio
• Sampling rates 32, 44.1 or 48 kHz (or half of these); samples processed in frames; 384 or 1152 samples per frame.
• Subband coding with a bank of 32 filters, each with a bandwidth of 1/64 of the sampling frequency.
• Samples coded with variable quantization steps.
• Psychoacoustics: exploits the masking properties of the human ear.
• Compressed bitrates range from 32 to 224 Kbits per second. Compression factor from 2.7 to 24.
• MPEG Layer I: best for bitrates > 128 Kbits per second (per channel).
• MPEG Layer II: best for bitrates ≈ 128 Kbits per second (per channel).
• MPEG Layer III (MP3, widely used for music on the Internet): best for bitrates ≈ 64 Kbits per second (per channel), compression ≈ 12:1. Applies a modified discrete cosine transform (MDCT) to the subband signals.
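
The figures on this slide follow from simple arithmetic, sketched below. The function names are hypothetical, and the choice of 16-bit, 48 kHz single-channel PCM as the uncompressed reference is an assumption; it happens to reproduce the 12:1 and 24:1 factors quoted above.

```python
def subband_bandwidth_hz(sample_rate_hz):
    """Each of the 32 filters covers 1/64 of the sampling frequency."""
    return sample_rate_hz / 64

def frame_duration_ms(samples_per_frame, sample_rate_hz):
    """Playing time covered by one frame, in milliseconds."""
    return 1000.0 * samples_per_frame / sample_rate_hz

def compression_factor(pcm_bits_per_s, coded_bits_per_s):
    """Ratio of the uncompressed bitrate to the compressed bitrate."""
    return pcm_bits_per_s / coded_bits_per_s

# 32 bands x (44 100 / 64) Hz together span 0 .. 22 050 Hz, half the sampling rate.
band_hz = subband_bandwidth_hz(44_100)
total_hz = 32 * band_hz

# Frame lengths at 48 kHz for the two frame sizes.
short_frame_ms = frame_duration_ms(384, 48_000)
long_frame_ms = frame_duration_ms(1152, 48_000)

# Assumed reference: one channel of 16-bit PCM at 48 kHz = 768 000 bit/s.
pcm_mono = 16 * 48_000
factor_at_64k = compression_factor(pcm_mono, 64_000)
factor_at_32k = compression_factor(pcm_mono, 32_000)
```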

Audio data retrieval
(a) Based on metadata
  • Additional attributes can be attached to audio data (as with images and video), e.g. speaker, date, duration, composer, orchestra, instrument, ...
  • Attributes can be connected to the whole audio sequence or to some parts of it (e.g. the movements of a symphony).
  • General document retrieval techniques usually apply.

Audio data retrieval (cont.)
(b) Speech recognition:
  • Proximity search of the waveform; feature extraction e.g. from coefficients of the DCT-transformed signal.
  • Some fuzziness involved
  • Simple application:
    • Giving voice commands to a user interface.
  • Advanced application:
    • Parsing of spoken sentences and conversion e.g. to database queries
    • Can be coupled with natural language understanding techniques.
  • Usually based on a predefined set of patterns and associated phonetic rules.

Audio data retrieval (cont.)
(c) Speaker recognition:
  • Application: security systems.
  • Sensitive to the physical condition (e.g. flu) of the speaker.
  • Variations:
    • Text-dependent recognition (simpler):
      Restricted set of possible words/sentences.
      Comparison of digital waveforms.
    • Text-independent recognition (more difficult):
      Based e.g. on voice pitch recognition.
      More elaborate sentences from particular users must be stored, and complex verification algorithms are run against the spoken samples.

Audio data retrieval (cont.)
(d) Recognition and retrieval of songs (recorded music)
Query input alternatives:
  • Query-by-humming: Succeeds for clearly distinguishable melodies (or themes), in spite of small pitch errors. Similarity measure uses some kind of edit distance
  • Tapping the tempo: Complements humming/singing
  • Playing a (virtual) keyboard
Output:
  • Ranked list of candidate songs
Example search engine:
  • Musipedia (http://www.musipedia.org/)
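
One way to make query-by-humming tolerant of small pitch errors is to compare melodic contours rather than exact pitches. The sketch below is an assumption about how such a matcher could look, not a description of any particular search engine: it reduces each melody to up/down/repeat steps and ranks a toy catalog by Levenshtein edit distance. All names and the catalog contents are hypothetical.

```python
def contour(pitches):
    """Reduce a pitch sequence (e.g. MIDI note numbers) to U/D/R steps:
    U = up, D = down, R = repeated pitch."""
    steps = []
    for prev, cur in zip(pitches, pitches[1:]):
        steps.append("U" if cur > prev else "D" if cur < prev else "R")
    return "".join(steps)

def edit_distance(a, b):
    """Levenshtein distance with unit costs, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]

def rank_songs(hummed_pitches, catalog):
    """Return catalog titles ordered from most to least similar contour."""
    query = contour(hummed_pitches)
    scored = [(edit_distance(query, contour(p)), title)
              for title, p in catalog.items()]
    return [title for _, title in sorted(scored)]

# Hypothetical toy catalog: opening notes as MIDI numbers.
catalog = {
    "Twinkle Twinkle": [60, 60, 67, 67, 69, 69, 67],
    "Ode to Joy": [64, 64, 65, 67, 67, 65, 64, 62],
}
# A hummed query whose first note is slightly flat.
ranking = rank_songs([59, 60, 67, 67, 69, 69, 67], catalog)
```

The contour ignores key and interval size, so out-of-tune humming still matches, at the cost of weaker discrimination between melodies with similar shapes.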

Encoding and retrieval of (synthetic) music
• Music encoding:
  • For digital electronic instruments (no singing!)
  • Timing of note-on/note-off events
  • Control of instrument and playback parameters (pitch, loudness)
  • Can be played with a synthesizer
• Encoding formats:
  • MIDI (Musical Instrument Digital Interface)
  • MPEG-4 SA (Structured Audio)
  • MusicXML (notes represented using structured markup)
• Retrieval criteria:
  • Notes: Generalization of string matching (but: polyphony!)
  • Time-dependent parameters: Instruments, tempo, volume, ...
  • Textual metadata: Title, composer, artist, genre, date, ...
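
The note-on/note-off model and note-based matching can be sketched as follows. This is a simplified illustration and not a MIDI parser: the `(time, kind, pitch)` event tuples, the tick values and the function names are hypothetical, and the matcher handles only a single monophonic voice, which sidesteps the polyphony problem flagged above.

```python
def events_to_notes(events):
    """Pair note-on/note-off events into (pitch, start, duration) tuples.
    Each event is (time, kind, pitch) with kind 'on' or 'off'."""
    active = {}                       # pitch -> start time of the sounding note
    notes = []
    for time, kind, pitch in sorted(events, key=lambda e: e[0]):
        if kind == "on":
            active[pitch] = time
        elif kind == "off" and pitch in active:
            start = active.pop(pitch)
            notes.append((pitch, start, time - start))
    return sorted(notes, key=lambda n: n[1])

def intervals(pitches):
    """Pitch differences between consecutive notes (transposition-invariant)."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def find_melody(query_pitches, track_pitches):
    """Start indexes where the query occurs in the track in any key."""
    q, t = intervals(query_pitches), intervals(track_pitches)
    return [i for i in range(len(t) - len(q) + 1) if t[i:i + len(q)] == q]

# Hypothetical event stream: four notes, 480 ticks each.
events = [
    (0, "on", 60), (480, "off", 60),
    (480, "on", 62), (960, "off", 62),
    (960, "on", 64), (1440, "off", 64),
    (1440, "on", 60), (1920, "off", 60),
]
notes = events_to_notes(events)
track = [pitch for pitch, _, _ in notes]
hits = find_melody([65, 67, 69], track)   # same tune, five semitones higher
```

Matching on intervals turns melody search into ordinary substring matching over integers. Polyphonic music needs a richer representation, since several notes can sound at once and the melody may move between voices.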

Indexing of audio data
• Indexing of metadata (external attributes):
  • As with any other documents: inverted indexes, multi-attribute indexes, signature files, etc.
• Indexing of audio signal:
  • First split into segments (= frames, windows). Segmentation requires some rules, e.g. 'quiet' zones are often good split points.
  • Transformation (e.g. DCT) of each segment into features
  • A multidimensional index is built from groups of the features (e.g. main DCT coefficients).
  • Proximity queries (nearest neighbor, or k nearest neighbors of the query sample) should be supported by the index.
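
The signal-indexing pipeline above (segment, transform, index, proximity query) can be sketched in plain Python. Several simplifications are assumptions made for brevity: fixed-size frames replace rule-based segmentation, the first few DCT-II coefficients stand in for the "main" coefficients, and a linear scan stands in for a real multidimensional index. All names and the test signal are hypothetical.

```python
import math

def frames(samples, size):
    """Split into fixed-size, non-overlapping frames."""
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]

def dct2(frame):
    """Unnormalized DCT-II of one frame."""
    n = len(frame)
    return [sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(frame))
            for k in range(n)]

def features(frame, keep=8):
    """Feature vector: the first `keep` DCT coefficients."""
    return dct2(frame)[:keep]

def build_index(samples, size, keep=8):
    """List of (frame position, feature vector) pairs."""
    return [(pos, features(f, keep))
            for pos, f in enumerate(frames(samples, size))]

def knn(index, query_frame, k=1, keep=8):
    """Positions of the k frames nearest to the query (Euclidean distance)."""
    q = features(query_frame, keep)
    scored = sorted((math.dist(q, feat), pos) for pos, feat in index)
    return [pos for _, pos in scored[:k]]

# Hypothetical signal: four 64-sample frames, each a different sine tone.
size, rate = 64, 8000
signal = [math.sin(2 * math.pi * f * n / rate)
          for f in (62.5, 125.0, 187.5, 250.0)
          for n in range(size)]
index = build_index(signal, size)

# Query: frame 2 with a small constant offset added to every sample.
query = [x + 0.01 for x in signal[2 * size:3 * size]]
nearest = knn(index, query, k=1)
```

For large collections the linear scan in `knn` would be replaced by a spatial structure that supports nearest-neighbor search, which is what the slide means by a multidimensional index.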
