CS525u Multimedia Computing
Introduction

Introduction Purpose
• Brief introduction to:
  – Digital Audio
  – Digital Video
  – Perceptual Quality
  – Network Issues
  – The “Science” (or lack of) in “Computer Science”
• Get you ready for research papers!
• Introduction to:
  – Silence detection (for project 1)

Groupwork
• Let’s get started!
• Consider audio or video on a computer
  – Examples you have seen, or
  – Guess how it might look
• What are two conditions that degrade quality?
  – Giving technical name is ok
  – Describing appearance is ok

Introduction Outline
• Background
  – Digital Audio (Linux MM, Ch2)
  – Graphics and Video (Linux MM, Ch4)
  – Multimedia Networking (Kurose, Ch6)
• Audio Voice Detection (Rabiner)
• MPEG (Le Gall)
• Misc

Digital Audio
• Sound produced by variations in air pressure
  – Can take any continuous value
  – Analog component
• Computers work with digital
  – Must convert analog to digital
  – Use sampling to get discrete values

Digital Sampling
• Sample rate determines number of discrete values
Digital Sampling
• Half the sample rate

Digital Sampling
• Quarter the sample rate

Sample Rate
• Nyquist’s Theorem: to accurately reproduce a signal, must sample at twice the highest frequency
• Why not always use a high sampling rate?
  – Requires more storage
  – Complexity and cost of analog to digital hardware
  – Typically want an adequate sampling rate

Sample Size
• Samples have discrete values
• How many possible values?
  – Sample Size
  – Common is 256 values from 8 bits

Sample Size
• Quantization error from rounding
  – Ex: 28.3 rounded to 28
• Why not always have a large sample size?
  – Storage increases per sample
  – Analog to digital hardware becomes more expensive

Introduction Outline
• Background
  – Digital Audio (Linux MM, Ch2)
  – Graphics and Video (Linux MM, Ch4)
  – Multimedia Networking (Kurose, Ch6)
• Audio Voice Detection (Rabiner)
• MPEG (Le Gall)
• Misc
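The sample rate and sample size slides above can be tried out with a short sketch. The function names, the 0 to 255 input range, and the evenly spaced (linear) levels are assumptions made for illustration; they are not taken from the slides.

```python
def min_sample_rate(highest_freq_hz):
    """Nyquist: sample at twice the highest frequency to be reproduced."""
    return 2 * highest_freq_hz


def num_levels(bits_per_sample):
    """Number of discrete values a sample of the given size can take."""
    return 2 ** bits_per_sample


def quantize(value, lo, hi, bits_per_sample):
    """Round value to the nearest of 2**bits evenly spaced levels in [lo, hi].

    Returns (level_index, reconstructed_value). The difference between
    value and reconstructed_value is the quantization error.
    """
    levels = num_levels(bits_per_sample)
    step = (hi - lo) / (levels - 1)
    index = round((value - lo) / step)
    index = max(0, min(levels - 1, index))  # clamp out-of-range input
    return index, lo + index * step


if __name__ == "__main__":
    print(min_sample_rate(4000))        # rate needed for a 4 kHz signal
    print(num_levels(8))                # 8-bit samples
    index, rebuilt = quantize(28.3, 0, 255, 8)
    print(index, rebuilt, 28.3 - rebuilt)  # the slide's 28.3 -> 28 example
```

With 8-bit samples over a 0 to 255 range the step is exactly 1, so 28.3 lands on level 28 and the quantization error is about 0.3, matching the slide's example.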
Review
• What is the relationship between samples and fidelity?
  – Why not always have a high sample frequency?
  – Why not always have a large sample size?

Groupwork
• Think of as many uses of computer audio as you can
• Which require a high sample rate and large sample size? Which do not? Why?

Back of the Envelope Calculations
• Telephones typically carry digitized voice
• 8 kHz (8000 samples per second)
• 8-bit sample size
• For 10 seconds of speech:
  – 10 sec x 8000 samp/sec x 8 bits/samp = 640,000 bits or 80 Kbytes
  – Fit 3 minutes on floppy
• Fine for voice, but what about music?

More Back of the Envelope Calculations
• Can only represent 4 kHz frequencies (why?)
• Human ear can perceive 10-20 kHz
  – Used in music
• CD quality audio:
  – sample rate of 44,100 samples/sec
  – sample size of 16 bits
  – 60 min x 60 secs/min x 44,100 samp/sec x 2 bytes/sample x 2 channels = 635,040,000 bytes or about 600 Mbytes
• Can use compression to reduce

Audio Compression
• Above sampling assumed linear scale with respect to intensity
• Human ear not keen at very loud or very quiet
• Companding uses a modified logarithmic scale to represent a greater range of values with a smaller sample size
  – µ-law effectively stores 12 bits of data in an 8-bit sample
  – Used in U.S. telephones
  – Used in Sun computer audio
  – MP3 for music

MIDI
• Musical Instrument Digital Interface
  – Protocol for controlling electronic musical instruments
• MIDI message
  – Which device
  – Key press or key release
  – Which key
  – How hard (controls volume)
• MIDI file can play ‘song’ to MIDI device
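The back-of-the-envelope figures for telephone and CD audio above can be checked with a few lines of arithmetic. This is a sketch only: the helper name `pcm_bytes` is hypothetical, and the floppy capacity of 1,474,560 bytes (a "1.44 MB" disk) is an assumption, since the slides do not say which floppy is meant.

```python
def pcm_bytes(seconds, sample_rate_hz, bits_per_sample, channels=1):
    """Storage for uncompressed, linearly sampled audio, in bytes."""
    return seconds * sample_rate_hz * bits_per_sample * channels // 8


# Telephone-quality voice: 8000 samples/sec, 8 bits, one channel.
phone_10s = pcm_bytes(10, 8000, 8)

# CD-quality audio: 44,100 samples/sec, 16 bits, two channels, one hour.
cd_1h = pcm_bytes(60 * 60, 44100, 16, channels=2)

# Assumed floppy size (not stated in the slides).
FLOPPY_BYTES = 1_474_560
floppy_minutes = FLOPPY_BYTES / pcm_bytes(1, 8000, 8) / 60

if __name__ == "__main__":
    print(phone_10s)       # 80000 bytes, i.e. 640,000 bits
    print(cd_1h)           # 635040000 bytes
    print(floppy_minutes)  # roughly 3 minutes of telephone-quality voice
```

The same function also answers the "why not always use more?" questions from the earlier slides: doubling any one of rate, sample size, or channel count doubles the storage.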
Sound File Formats
• Raw data has samples (interleaved w/stereo)
• Need way to ‘parse’ raw audio file
• Typically a header
  – Sample rate
  – Sample size
  – Number of channels
  – Coding format
  – …
• Examples:
  – .au for Sun µ-law, .wav for IBM/Microsoft

Example Sound Files

Outline
• Introduction
  – Digital Audio (Linux MM, Ch2)
  – Graphics and Video (Linux MM, Ch4)
  – Multimedia Networking (Kurose, Ch6)
• Audio Voice Detection (Rabiner)
• MPEG (Le Gall)
• Misc

Graphics and Video
“A Picture is Worth a Thousand Words”
• People are visual by nature
• Many concepts hard to explain or draw
• Pictures to the rescue!
• Sequences of pictures can depict motion
  – Video!

Graphics Basics
• Computer graphics (pictures) made up of pixels
  – Each pixel corresponds to region of memory
  – Called video memory or frame buffer
• Write to video memory
  – monitor displays with raster cannon

Monochrome Display
• Pixels are on (black) or off (white)
  – Dithering can appear gray
Grayscale Display
• Bit-planes
  – 4 bits per pixel, 2^4 = 16 gray levels

Color Displays
• Combine red, green and blue
• 24 bits/pixel, 2^24 = 16 million colors
• But now requires 3 bytes per pixel

Video Palettes
• Still have 16 million colors, only 256 at a time
• Complexity to lookup, color flashing
• Can dither for more colors, too

Video Wrapup
• xdpyinfo

Introduction Outline
• Background
  – Digital Audio (Linux MM, Ch2)
  – Graphics and Video (Linux MM, Ch4)
  – Multimedia Networking (Kurose, Ch6, Sections 6.1 to 6.3)
• Audio Voice Detection (Rabiner)
• MPEG (Le Gall)
• Misc
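The pixel-depth figures on the display slides above follow from powers of two, and the frame buffer cost follows from the resolution. The sketch below assumes a 1024 x 768 display purely for illustration; the slides do not give a resolution, and the function names are hypothetical.

```python
def num_colors(bits_per_pixel):
    """Distinct values a pixel can take at the given depth."""
    return 2 ** bits_per_pixel


def frame_buffer_bytes(width, height, bits_per_pixel):
    """Video memory needed for one full frame, rounded up to whole bytes."""
    return (width * height * bits_per_pixel + 7) // 8


if __name__ == "__main__":
    # Assumed resolution, not from the slides.
    width, height = 1024, 768
    for name, depth in [("monochrome", 1), ("grayscale", 4),
                        ("palette", 8), ("true color", 24)]:
        print(name, num_colors(depth), frame_buffer_bytes(width, height, depth))
```

An 8-bit palette keeps the frame buffer at one byte per pixel, a third of the 24-bit cost, which is the trade the Video Palettes slide describes: all 16 million colors remain available, but only 256 of them at a time.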
Internet Traffic Today
• Internet dominated by text-based applications
  – Email, FTP, Web Browsing
• Very sensitive to loss
  – Example: lose a byte in your blah.exe program and it crashes!
• Not very sensitive to delay
  – 10’s of seconds ok for web page download
  – Minutes for file transfer
  – Hours for email delivery

Multimedia on the Internet
• Multimedia not as sensitive to loss
  – Words from sentence lost still ok
  – Frames in video missing still ok
• Multimedia can be very sensitive to delay
  – Interactive session needs one-way delays less than 1 second!
• New phenomenon is jitter!

Jitter
[Figure not recovered; surviving label: Jitter-Free]

Classes of Internet Multimedia Apps
• Streaming stored media
• Streaming live media
• Real-time interactive media

Streaming Stored Media
• Stored on server
• Examples: pre-recorded songs, famous lectures, video-on-demand
• RealPlayer and Netshow
• Interactivity, includes pause, ff, rewind…
• Delays of 1 to 10 seconds or so
• Not so sensitive to jitter

Streaming Live Media
• “Captured” from live camera, radio, T.V.
• 1-way communication, maybe multicast
• Examples: concerts, radio broadcasts, lectures
• RealPlayer and Netshow
• Limited interactivity…
• Delays of 1 to 10 seconds or so
• Not so sensitive to jitter
Hurdles for Multimedia on the Internet
• IP is best-effort
  – No delivery guarantees
  – No bandwidth guarantees
  – No timing guarantees
• So … how do we do it?
  – Not too well for now
  – This class is largely about techniques to make it better!

Real-Time Interactive Media
• 2-way communication
• Examples: Internet phone, video conference
• Very sensitive to delay
  – < 150 ms very good
  – < 400 ms ok
  – > 400 ms lousy

Multimedia on the Internet
• The Media Player
• Streaming through the Web
• The Internet Phone Example

The Media Player
• End-host application
  – Real Player, Windows Media Player
• Needs to be pretty smart
• Decompression (MPEG)
• Jitter-removal (Buffering)
• Error correction (Repair, as a topic)
• GUI with controls (HCI issues)
  – Volume, pause/play, sliders for jumps

Streaming through a Plug-In
Must still use TCP!

Streaming through a Web Browser
Must download whole file first!
Streaming through the Media Player

An Example: Internet Phone
• Specification
• Removing Jitter
• Recovering from Loss

Internet Phone: Specification
• 8 Kbytes per second, send every 20 ms
  – 20 ms * 8 Kbytes/sec = 160 bytes per packet
• Header per packet
  – Sequence number, time-stamp, playout delay
• End-to-End delay of 150–400 ms
• UDP
  – Can be lost
  – Can be delayed different amounts

Internet Phone: Removing Jitter
• Use header information to reduce jitter
  – Sequence number and Timestamp
• Two strategies:
  – Fixed playout delay
  – Adaptive playout delay

Fixed Playout Delay

Adaptive Playout Delay
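The packet-size arithmetic and the fixed playout delay strategy above can be sketched as follows. The slide figure for fixed playout delay did not survive, so the rule used here (play each packet at its timestamp plus a constant delay, and treat packets arriving after that as lost) is the usual textbook description and is an assumption about what the slide showed. The tuple layout and function names are hypothetical.

```python
def payload_bytes(rate_bytes_per_sec, interval_ms):
    """Audio bytes carried by each packet when sending every interval_ms."""
    return rate_bytes_per_sec * interval_ms // 1000


def fixed_playout(packets, playout_delay_ms):
    """Schedule packets with a fixed playout delay.

    packets: iterable of (seq, timestamp_ms, arrival_ms), where
    timestamp_ms is when the sender generated the packet.
    Returns (played, late): played is a list of (seq, playout_time_ms),
    late is a list of sequence numbers that missed their playout time.
    """
    played, late = [], []
    for seq, timestamp_ms, arrival_ms in sorted(packets):
        playout_time = timestamp_ms + playout_delay_ms
        if arrival_ms <= playout_time:
            played.append((seq, playout_time))
        else:
            late.append(seq)  # arrived too late; treated as lost
    return played, late


if __name__ == "__main__":
    print(payload_bytes(8000, 20))  # 160 bytes per packet

    # Made-up arrival times showing variable network delay (jitter).
    packets = [(1, 0, 90), (2, 20, 250), (3, 40, 120)]
    print(fixed_playout(packets, playout_delay_ms=150))
```

Played packets come out exactly 20 ms apart regardless of how unevenly they arrived, which is the point of the buffer. A larger fixed delay loses fewer packets but pushes the end-to-end delay toward the 400 ms limit; an adaptive scheme would adjust the delay from observed network delays instead of fixing it.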
Internet Phone: Recovering from Loss