



  1. Computer Architecture (applied computer science), Urbino Worldwide Campus
     02 Information theory
     02.03 Representation of non-numerical sets
     • Texts
     • Images
     • Signals (Audio/Video)
     • Redundancy and compression
     Alessandro Bogliolo, ISTI (Information Science and Technology Institute)

     Text
     1. A text is a sequence of characters
     2. Each character is taken from a finite alphabet
     3. Using a constant-size encoding for the characters, a text is encoded as a concatenation of character codes
     4. ASCII: 7-bit encoding
     5. Extended ASCII: 8-bit encoding
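
As a small illustration of the constant-size character encoding described in the Text slide, the following Python sketch encodes a string as a concatenation of 7-bit ASCII codes and decodes it back. The function names `encode_ascii7` and `decode_ascii7` are hypothetical, chosen only for this example.

```python
def encode_ascii7(text: str) -> str:
    """Encode text as a concatenation of 7-bit ASCII codes (a string of '0'/'1')."""
    chunks = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            raise ValueError(f"{ch!r} is not a 7-bit ASCII character")
        chunks.append(format(code, "07b"))  # constant size: always 7 bits
    return "".join(chunks)


def decode_ascii7(bits: str) -> str:
    """Split the bit string into 7-bit chunks and map each back to a character."""
    if len(bits) % 7 != 0:
        raise ValueError("length must be a multiple of 7")
    return "".join(chr(int(bits[i:i + 7], 2)) for i in range(0, len(bits), 7))


encoded = encode_ascii7("Hi")
print(encoded)                 # 10010001101001
print(decode_ascii7(encoded))  # Hi
```

Because every code has the same length, the decoder needs no separators: it can cut the bit string every 7 bits.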

  2. Images
     1. An image is a matrix of points with assigned colors
     2. An image contains infinitely many points, and each point may take infinitely many colors
     3. Both space and color discretization are required
     4. Discretized points are called pixels
     5. Pixels are organized in a matrix
     6. Using a constant-size encoding for each pixel, an image is encoded as a concatenation of pixel codes, to be read in a given order

     Color (gray) levels
     [Figure: gray scale split into 16 intervals, labeled with the 4-bit codes 0000 to 1111]
     • The encoding associates a unique code with an interval of gray levels
     • All gray levels within the interval are associated with the same code, thus losing information
     • The original gray level cannot be exactly reconstructed from the code
     • Decoding associates each code with a unique gray level (representative of a class)
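
The gray-level slide can be sketched in Python as follows. This is only an illustration under assumptions that are not in the slides: gray levels are normalized to the range [0, 1], the 16 intervals have equal width, code 0000 is the lowest interval, and the representative of each interval is its midpoint. The names `encode_gray` and `decode_gray` are hypothetical.

```python
N_BITS = 4              # code length, matching the 16 codes in the figure
N_CODES = 2 ** N_BITS   # 16 intervals (assumed equal width)


def encode_gray(level: float) -> str:
    """Map a gray level in [0.0, 1.0] to the 4-bit code of its interval."""
    if not 0.0 <= level <= 1.0:
        raise ValueError("gray level must be in [0, 1]")
    index = min(int(level * N_CODES), N_CODES - 1)  # level 1.0 falls in the last interval
    return format(index, f"0{N_BITS}b")


def decode_gray(code: str) -> float:
    """Return the representative gray level of a code (assumed: interval midpoint)."""
    return (int(code, 2) + 0.5) / N_CODES


print(encode_gray(0.30))    # 0100
print(encode_gray(0.26))    # 0100 (same interval, same code)
print(decode_gray("0100"))  # 0.28125
```

Both 0.30 and 0.26 decode to 0.28125, which shows why the original gray level cannot be exactly reconstructed from the code.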

  3. 2D images
     $\text{size} = n_x \cdot n_y \cdot \log_2 n_{lev}$
     [Figure: pixel matrix with n_x pixels along x and n_y pixels along y; each pixel takes one of n_lev gray levels]

     Example
     [Figure: six image panels labeled 100x100x8 bit, 100x100x1 bit, 50x50x1 bit, 50x50x8 bit, 10x10x8 bit, 10x10x1 bit]
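
The size formula for 2D images can be checked against the panel labels of the Example slide with a short Python sketch. It assumes that a label such as "100x100x8 bit" means 100 by 100 pixels with 8 bits per pixel, i.e. 2^8 gray levels, and it rounds the bits per pixel up to a whole number, which the slide formula does not state. The name `image_size_bits` is hypothetical.

```python
import math


def image_size_bits(n_x: int, n_y: int, n_lev: int) -> int:
    """Size in bits of an n_x by n_y image with n_lev gray levels per pixel."""
    bits_per_pixel = math.ceil(math.log2(n_lev))  # rounding up is an assumption
    return n_x * n_y * bits_per_pixel


# Panel labels from the Example slide: (side in pixels, bits per pixel)
for side, bits in [(100, 8), (100, 1), (50, 1), (50, 8), (10, 8), (10, 1)]:
    size = image_size_bits(side, side, 2 ** bits)
    print(f"{side}x{side}x{bits} bit -> {size} bits")
```

Under these assumptions the largest panel (100x100x8 bit) takes 80,000 bits and the smallest (10x10x1 bit) takes 100 bits.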

  4. Analog and digital signals
     • Signal: time-varying physical quantity
       – Analog: continuous-time, continuous-value
       – Digital: discrete-time, discrete-value
     • The digital encoding of a continuous signal entails:
       – Sampling (i.e., time discretization)
       – Quantization (i.e., value discretization)
     $\text{size} = s_{rate} \cdot T \cdot s_{size}$
     where $s_{rate}$ is the sampling rate, $s_{size}$ is the sample size, and $T$ is the duration

     Audio: time series
     [Figure: plot of value against time]
     $\text{size} = s_{rate} \cdot T \cdot s_{size} = s_{rate} \cdot T \cdot \log_2 n_{lev}$
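
Sampling and quantization, together with the resulting size, can be sketched in Python as below. The sketch assumes a signal given as a function of time with values in [-1, 1] and uniform quantization into `n_lev` levels; these choices, the 1 Hz test tone, and the names `digitize` and `signal_size_bits` are illustrative and not taken from the slides.

```python
import math


def digitize(signal, duration_s: float, s_rate: int, n_lev: int) -> list:
    """Sample signal(t) at s_rate samples per second and quantize to n_lev levels.

    signal is assumed to return values in [-1.0, 1.0].
    """
    n_samples = int(duration_s * s_rate)
    samples = []
    for k in range(n_samples):
        t = k / s_rate                        # sampling: time discretization
        value = signal(t)
        level = int((value + 1.0) / 2.0 * n_lev)
        samples.append(min(level, n_lev - 1)) # quantization: value discretization
    return samples


def signal_size_bits(s_rate: int, duration_s: float, n_lev: int) -> int:
    """size = s_rate * T * s_size, with s_size = log2(n_lev) bits per sample."""
    s_size = math.ceil(math.log2(n_lev))
    return int(s_rate * duration_s) * s_size


def tone(t: float) -> float:
    """Illustrative 1 Hz sine wave."""
    return math.sin(2 * math.pi * t)


samples = digitize(tone, duration_s=1.0, s_rate=8, n_lev=16)
print(samples)
print(signal_size_bits(s_rate=8, duration_s=1.0, n_lev=16))  # 8 samples * 4 bits = 32
```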

  5. Video
     $\text{size} = s_{rate} \cdot T \cdot s_{size} = s_{rate} \cdot T \cdot \log_2 n_{col} \cdot n_x \cdot n_y$
     where $s_{rate}$ = frame rate, $n_{col}$ = number of colors, $n_x \cdot n_y$ = frame size
     [Figure: sequence of n_x by n_y frames along the time axis, with color as the pixel value]

     Redundancy
     • Redundant encoding: encoding that makes use of more than the minimum number of digits required by an exact encoding
       $N > \lceil \log_M |S| \rceil$ ($N$ digits, each taking one of $M$ values, used to encode the elements of a set $S$)
     • Motivations for redundancy:
       – Providing more expressive/natural encoding/decoding rules
       – Reliability (error detection). Ex: parity encoding
       – Noise immunity / fault tolerance (error correction). Ex: triplication
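
The video size formula above can be evaluated with a few lines of Python. The frame rate, duration, color count, and frame size in the usage line are illustrative values, not taken from the slides, and `video_size_bits` is a hypothetical name. Rounding the bits per pixel up to a whole number is also an assumption of this sketch.

```python
import math


def video_size_bits(s_rate: float, duration_s: float,
                    n_col: int, n_x: int, n_y: int) -> int:
    """size = s_rate * T * log2(n_col) * n_x * n_y."""
    n_frames = int(s_rate * duration_s)           # s_rate is the frame rate
    bits_per_pixel = math.ceil(math.log2(n_col))  # rounding up is an assumption
    return n_frames * bits_per_pixel * n_x * n_y


# Illustrative values: 25 frames/s, 10 s, 2^24 colors, 640x480 frames
print(video_size_bits(25, 10, 2 ** 24, 640, 480))  # 1843200000
```

Here the sample size of a video is a whole frame, so it is the image size formula multiplied by the number of frames.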

  6. Redundancy: examples
     • Parity encoding:
       – A parity bit is used to guarantee that all codewords have an even number of 1's
       – Single errors are detected by means of a parity check

       Irredundant   codeword   parity check
       0010          00101      0
             error:  01101      1

     • Triple redundancy:
       – Each character is repeated 3 times
       – Single errors are corrected by means of majority voting

       codeword:       000000111000
       error:          000000111010
       voting result:  0 0 1 0

     Compression
     • Lossy compression
       – Compression achieved at the cost of reducing the accuracy of the representation
       – The original representation cannot be restored
       – Always effective
     • Lossless compression
       – Compression achieved by either removing redundancy or leveraging content-specific opportunities
       – The original representation can be restored
       – Not always effective
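
The two redundancy examples (parity encoding and triple redundancy) can be reproduced with the following Python sketch, which works on strings of '0' and '1' and uses the same bit patterns as the slide. Placing the parity bit at the end of the codeword follows the slide's example; the function names are hypothetical.

```python
def add_parity(bits: str) -> str:
    """Append a parity bit so that the codeword has an even number of 1s."""
    return bits + str(bits.count("1") % 2)


def parity_check(codeword: str) -> int:
    """Return 0 if the number of 1s is even (no error detected), 1 otherwise."""
    return codeword.count("1") % 2


def triplicate(bits: str) -> str:
    """Repeat each bit 3 times."""
    return "".join(b * 3 for b in bits)


def majority_vote(codeword: str) -> str:
    """Decode a triplicated codeword by majority voting on each group of 3 bits."""
    groups = [codeword[i:i + 3] for i in range(0, len(codeword), 3)]
    return "".join("1" if g.count("1") >= 2 else "0" for g in groups)


print(add_parity("0010"))             # 00101
print(parity_check("00101"))          # 0
print(parity_check("01101"))          # 1 (single error detected)
print(triplicate("0010"))             # 000000111000
print(majority_vote("000000111010"))  # 0010 (single error corrected)
```

The parity check only reports that an error occurred, while majority voting also recovers the original value, at the cost of three times as many bits.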
