Multimedia Systems WS 2010/2011 31.01.2011 M. Rahamatullah Khondoker (Room # 36/410 ) University of Kaiserslautern Department of Computer Science Integrated Communication Systems ICSY http://www.icsy.de
Outline
Review of Compression: Dictionary Methods
• LZ77
• LZ78
• LZW
Abraham Lempel and Jacob Ziv
Dictionary-based Compression
Dictionary-based Compression
Introduction
The Huffman algorithm depends on a statistical model of the data.
Dictionary-based methods
• Do not use a statistical model of the data
• Do not employ variable-length codes
• Are based on the assumption that a typical data file has redundancies in the form of patterns and repetitions of data symbols
• Select strings of symbols from the input and employ a dictionary (static or dynamic) to encode each string as a token
• Achieve compression when the input file contains many repetitions and each token is smaller than the string it replaces
• Write symbols to the output in raw form when they are not found in the dictionary
• Write the raw items and tokens to the output in such a way that the decoder can distinguish them
• Must define what happens when the dictionary fills up
Dictionary-based Compression Methods
Several dictionary-based compression methods exist. Among them, we will discuss:
LZ77
• Is simple, but its output tokens are triplets and therefore large, which limits its efficiency
LZ78
• Generates tokens that are pairs, so each token is smaller than an LZ77 triplet
LZW
• Outputs single-item tokens, the most compact token format of the three
LZ77 method
LZ77 (sliding window)
LZ77 (LZ1) method
• Uses part of the previously seen input stream as the dictionary
• Is based on a sliding window: it maintains a window onto the input stream and shifts the input through that window from right to left as strings of symbols are encoded
• The window has two parts: a search buffer (thousands of bytes long) and a look-ahead buffer (tens of bytes long)
• The search buffer contains the current dictionary, that is, the symbols that have recently been input and encoded
• The look-ahead buffer contains the text to be encoded next
• In the example, the encoder searches for e in the search buffer and prepares the token (offset, length, next symbol in the look-ahead buffer) = (16,3,e)
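The search for the longest match can be sketched in Python as follows. This is a minimal illustration, not the lecture's reference implementation: the function name lz77_encode, the buffer sizes, and the tie rule (among equally long matches, keep the one farthest back) are assumptions. The tie rule was chosen because it yields the (16,3,e) token for the "sir sid eastman easily teases …" text.

```python
def lz77_encode(text, search_size=4096, lookahead_size=16):
    """Return a list of (offset, length, next_symbol) tokens (illustrative sketch)."""
    tokens = []
    pos = 0  # boundary between search buffer (left) and look-ahead buffer (right)
    while pos < len(text):
        best_offset, best_length = 0, 0
        window_start = max(0, pos - search_size)
        # Always leave one symbol to serve as the third field of the token.
        max_length = min(lookahead_size - 1, len(text) - pos - 1)
        # Scan the search buffer backwards, nearest candidates first.
        for start in range(pos - 1, window_start - 1, -1):
            length = 0
            while length < max_length and text[start + length] == text[pos + length]:
                length += 1
            # ">=" keeps the last match found, i.e. the farthest one on ties.
            if length > 0 and length >= best_length:
                best_offset, best_length = pos - start, length
        tokens.append((best_offset, best_length, text[pos + best_length]))
        pos += best_length + 1
    return tokens


if __name__ == "__main__":
    for token in lz77_encode("sir sid eastman easily teases sea sick seals"):
        print(token)
```

A symbol with no match in the search buffer is emitted as a token with offset 0 and length 0, so raw symbols and matches share the same triplet format.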
LZ77 (sliding window)
Example string: “sir sid eastman easily teases sea sick seals….”
The first five steps of the encoding are shown below.
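To see how such tokens map back to text, a decoder can be sketched in a few lines of Python. The function name lz77_decode is hypothetical, and the token list first_five is an assumption: it is what a greedy longest-match encoder would be expected to emit for the start of the example string, not values copied from the slide.

```python
def lz77_decode(tokens):
    """Rebuild the text from (offset, length, next_symbol) tokens (illustrative sketch)."""
    out = []
    for offset, length, symbol in tokens:
        start = len(out) - offset
        # Copy symbol by symbol so that a match may overlap the text being produced.
        for i in range(length):
            out.append(out[start + i])
        out.append(symbol)
    return "".join(out)


# Assumed tokens for the first five encoding steps of "sir sid ...".
first_five = [(0, 0, "s"), (0, 0, "i"), (0, 0, "r"), (0, 0, " "), (4, 2, "d")]

if __name__ == "__main__":
    print(lz77_decode(first_five))  # prints "sir sid"
```

The decoder needs no search at all, only a copy from the already decoded text, which is why LZ77 decoding is much cheaper than encoding.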
LZ78 method
LZ78
LZ78 (LZ2) method
• Uses no search buffer, look-ahead buffer, or sliding window
• Uses a dictionary of previously encountered strings; the dictionary is empty at the beginning, and its size is limited by the available memory
• Outputs two-field tokens (index, symbol): the index is a pointer into the dictionary, and the symbol is the code of a symbol
• Does not delete any entry from the dictionary
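The token construction described above can be sketched in Python. This is a minimal illustration under stated assumptions: the name lz78_encode is hypothetical, index 0 is reserved for the empty string so that real entries start at 1, and the dictionary is allowed to grow without bound (no policy for a full dictionary is modelled).

```python
def lz78_encode(text):
    """Return a list of (index, symbol) tokens (illustrative sketch)."""
    dictionary = {}  # string -> index; index 0 stands for the empty string
    tokens = []
    current = ""
    for symbol in text:
        candidate = current + symbol
        if candidate in dictionary:
            # Keep extending the match while it is still a known string.
            current = candidate
        else:
            tokens.append((dictionary.get(current, 0), symbol))
            dictionary[candidate] = len(dictionary) + 1
            current = ""
    if current:
        # Input ended inside a known string: emit it as (index of its prefix, last symbol).
        tokens.append((dictionary.get(current[:-1], 0), current[-1]))
    return tokens


if __name__ == "__main__":
    print(lz78_encode("sir sid e"))
```

Every token adds exactly one new entry (the matched string plus the symbol), so the dictionary and the token list grow in step.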
LZ78
Example string: “sir sid eastman easily teases sea sick seals….”
The following table shows some steps of encoding the string.
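A decoder makes the meaning of the (index, symbol) tokens concrete. The sketch below is an assumption-laden illustration: the name lz78_decode is hypothetical, dictionary indices are assumed to start at 1 with index 0 meaning the empty string, and the token list example_tokens is what such an encoder would be expected to produce for the prefix "sir sid e", not values taken from the slide.

```python
def lz78_decode(tokens):
    """Rebuild the text from (index, symbol) tokens (illustrative sketch)."""
    entries = [""]  # entries[0] is the empty string; real entries start at 1
    out = []
    for index, symbol in tokens:
        phrase = entries[index] + symbol
        entries.append(phrase)  # the decoder rebuilds the encoder's dictionary
        out.append(phrase)
    return "".join(out)


# Assumed tokens for the prefix "sir sid e" of the example string.
example_tokens = [(0, "s"), (0, "i"), (0, "r"), (0, " "), (1, "i"), (0, "d"), (4, "e")]

if __name__ == "__main__":
    print(lz78_decode(example_tokens))  # prints "sir sid e"
```

The dictionary is never transmitted: the decoder reconstructs it from the tokens in the same order in which the encoder built it.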
LZW method
LZW
LZW method
• A popular variant of LZ78, developed by Terry Welch in 1984
• Its main feature is that it eliminates the second field of the token, so a token consists of just a pointer into the dictionary
• It starts by initializing the dictionary with all the symbols of the alphabet
• Because the dictionary is initialized this way, the next input symbol is always found in the dictionary
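A minimal Python sketch of this encoding loop follows. It assumes an 8-bit alphabet, so the dictionary is initialized with the 256 single-byte strings and new entries start at index 256. The name lzw_encode is hypothetical, and the dictionary is allowed to grow without limit.

```python
def lzw_encode(data: bytes):
    """Return the list of dictionary indices for the input bytes (illustrative sketch)."""
    # Initialize the dictionary with every symbol of the 8-bit alphabet.
    dictionary = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in dictionary:
            current = candidate
        else:
            # Emit the pointer for the longest known string, then learn the new one.
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)
            current = bytes([value])
    if current:
        codes.append(dictionary[current])  # flush the last pending string
    return codes


if __name__ == "__main__":
    print(lzw_encode(b"sir sid"))  # prints [115, 105, 114, 32, 256, 100]
```

Because every single symbol is already in the dictionary, the else branch never sees an empty current string, which is what lets the token shrink to a single pointer.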
LZW
Example of LZW method

Initial dictionary:
Index  Entry
0      NULL
1      SOH
…
32     SP
…
97     a
98     b
99     c
…
107    k
108    l
…
255    255

Encoding steps (I is the current string; ¶ denotes the space character, code 32):
I    In dict?  New entry?  Output
s    Y
si   N         256-si      115 (s)
i    Y
ir   N         257-ir      105 (i)
r    Y
r¶   N         258-r¶      114 (r)
¶    Y
¶s   N         259-¶s      32 (¶)
…

The above table is constructed from the string “sir sid eastman …”
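The rows of such a trace can be generated with a short Python sketch. The function name lzw_trace and the row layout (string, in dictionary?, new index, output code) are assumptions made for illustration, as is the 8-bit initial dictionary. The sketch records only the steps taken while reading input and does not emit the final pending code.

```python
def lzw_trace(data: bytes):
    """Return rows (I, in_dict, new_index, output_code) of an LZW encoding trace (sketch)."""
    dictionary = {bytes([i]): i for i in range(256)}  # assumed 8-bit alphabet
    rows = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in dictionary:
            rows.append((candidate, "Y", None, None))
            current = candidate
        else:
            new_index = len(dictionary)
            dictionary[candidate] = new_index
            # Output the code of the string that was matched before this symbol.
            rows.append((candidate, "N", new_index, dictionary[current]))
            current = bytes([value])
            # The single symbol that restarts the match is always in the dictionary.
            rows.append((current, "Y", None, None))
    return rows


if __name__ == "__main__":
    for row in lzw_trace(b"sir s"):
        print(row)
```

For the input b"sir s" the rows follow the same pattern as the table: an "N" row that adds an entry and outputs a code, followed by a "Y" row for the symbol that restarts the match.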
Thanks for your attention
Any questions, comments or concerns?
M. Rahamatullah Khondoker, M.Sc. Integrated Communication Systems ICSY University of Kaiserslautern Department of Computer Science P.O. Box 3049 D-67653 Kaiserslautern Phone: +49 (0)631 205-26 43 Fax: +49 (0)631 205-30 56 Email: khondoker@informatik.uni-kl.de Internet: http://www.icsy.de