Control Mechanisms for Packet Audio in the Internet

Jean-Chrysostome Bolot    Andrés Vega-García
INRIA
B. P. 93
06902 Sophia-Antipolis Cedex
France
{bolot, avega}@sophia.inria.fr

0743-166X/96 $5.00 © 1996 IEEE

Abstract

The current Internet provides a single class best effort service. From an application's point of view, this service amounts in practice to providing channels with time-varying characteristics such as delay and loss distributions. One way to support real time applications such as interactive audio given this service is to use control mechanisms that adapt the audio coding and decoding processes based on the characteristics of the channels, the goal being to maximize the quality of the audio delivered to the destinations. In this paper, we describe and analyze a set of such control mechanisms. They include a jitter control mechanism and a combined error and rate control mechanism. These mechanisms have been implemented and evaluated over the Internet and the MBone. Experiments indicate that they make it possible to establish and maintain reasonable quality audioconferences even across fairly congested connections.

1 Introduction

The transmission of voice over packet switched networks was an active research area in the late 70's and the early 80's [29]. Much of the work then focused on using packet switching for both voice and data in a single network. Packet voice, and more generally packet audio applications, have recently again become of interest. This interest has been fueled by the availability of supporting hardware (microphones now come standard with most workstations), of increased bandwidth throughout the Internet, and by the development of the MBone [7]. A variety of audio tools such as vat [17] or Nevot [24] have been available for a few years, and they have been used to audiocast conferences. Recently, several more tools have been announced, which claim to provide toll-quality workstation or PC audio over the Internet for a fraction of the cost of a telephone call (see [5] for pointers to these tools and other information related to packet audio).

However, the Internet provides a simple single class best effort service. From a connection's point of view, the best effort service amounts in practice to offering a channel with time-varying characteristics such as delay and loss distributions [2, 21]. These characteristics are not known in advance since they depend on the (a priori unknown) behavior of other connections throughout the network. This makes it essentially impossible to provide performance guarantees such as minimum loss rate or maximum delay. Thus, it is not clear how well applications with minimum guaranteed requirements such as audio applications can work over the Internet. Experimental evidence suggests that, although the quality of the audio delivered by Internet tools has improved, audio quality is still mediocre in many audio conferences. This is clearly a concern since audio quality has been found to be more important than video quality or audio/video synchronization to successfully carry out collaborative work [15].

It should be pointed out that bad audio quality is often caused by problems having little to do with either the network service or the audio tools themselves. The experience accumulated with the audiocasting of MICE [20] and IETF meetings suggests that badly tuned or set up microphones and speakers are responsible for many such problems. However, all these can be addressed by users at their own sites. Furthermore, their impact is expected to decrease as users become familiar with the tools and the tools themselves become more user friendly. In any case, the most persistent problems with audio quality are caused by the network, or rather by the impact of traffic in the network on the stream of audio packets. Two approaches have emerged to tackle this problem.

One approach is to extend current protocols and switch scheduling disciplines to provide the desired requirements. This approach requires that admission control, reservation, and/or sophisticated scheduling mechanisms be implemented in the network. These mechanisms are not yet implemented in the Internet, and their design, analysis, and evaluation are still an active research area [26]. Thus, we have not pursued this approach so far.

Another approach is to adapt applications to the
service provided by the network. This amounts in practice to adapting applications to the time-varying characteristics of the connection over which the application data packets are sent, the goal being to maximize the quality of the data delivered to the destinations. Experimental evidence suggests that the quality of the audio depends essentially on the number of lost packets and on the delay variations between successive packets. Thus, the most important network characteristics for audio applications are the delay variance (or jitter), and the loss distributions. Furthermore, for live audio applications such as audioconferences, the average end-to-end delay must be small to allow interactions between participants.

The goal then in this approach is to develop mechanisms that attempt to eliminate or at least minimize the impact of packet loss and delay jitter on the quality of the audio delivered to the destinations. We have developed a set of such mechanisms. One mechanism adjusts the playout time of audio packets at the destination, the objective being to minimize the impact of delay jitter. A second mechanism adds redundancy information in the audio packets sent by the source, the objective being to minimize the impact of packet loss. A third mechanism controls the rate at which packets are sent over a connection, the objective being to match the send rate to the capacity of the connection and hence to minimize packet loss. The second and third mechanisms both attempt to minimize the impact of packet loss, and they really are two sides of a joint error/rate control mechanism.

These mechanisms have been implemented in a new audio tool developed at INRIA. For lack of space (and as suggested by reviewers) we do not describe the jitter control mechanism in this paper. We focus instead on the rate and error control mechanisms. In Section 2, we describe the structure of the audio tool. In Section 3, we characterize the loss process of audio packets, and describe and evaluate a packet loss recovery scheme. In Section 4, we describe and evaluate a joint error and rate control scheme. Section 5 concludes the paper.

2 The audio tool

The structure of the audio tool is shown in Figure 1 below. It is being developed within the MICE project in collaboration with a group at University College London (UCL). Work at UCL has focused on device-independent audio input, efficient mechanisms for silence detection, automatic gain control, and echo cancellation, and on the evaluation of the auditory quality of the signal delivered to the destinations. Work at INRIA has focused on coding schemes, and on jitter, rate, and error control mechanisms.

Figure 1: Structure of the audio tool (sender and receiver sides; the receiver side includes the playout buffer and the audio output)

The coding schemes available at this time use 8-kHz sampled speech with bit rates varying from a few kb/s to 64 kb/s. Specifically, they include a 64-kb/s µ-law PCM, various adaptive delta modulation (ADM) coders with rates varying from 16 kb/s (for ADM2) to 56 kb/s (for ADM6), a 13 kb/s GSM coder, and a 4.8 kb/s LPC low bit rate coder. Work is underway to include wideband speech coders. The PCM, ADM6, ADM5, and GSM coders deliver high quality audio with MOS scores above 3.5. The ADM2, ADM3, and LPC coders deliver audio with a somewhat lower quality. However, even a mediocre low bit rate coder turns out to be useful for error control purposes (refer to Section 3).

The boxes in the figure which involve one of the control mechanisms of interest in the paper have been highlighted. They include the redundancy box (which involves the error control mechanism), the congestion information and feedback information boxes (which involve the error/rate control mechanism), and the playout buffer box (which involves the jitter control mechanism).

The audio packets are sent from the source to the destination(s) using IP (or its multicast extension), UDP, and RTP. Each audio packet carries a timestamp and a sequence number. The timestamp is used to measure end-to-end delays, and the sequence number is used to detect packet losses.

3 A loss recovery mechanism

Anecdotal evidence suggests that audio quality is still mediocre in many audio connections because of