 
3. Back to fundamentals
Bits, bytes and all that
CL1 2003/4-3

Once upon a time …

Analogue and Digital
• Modern computers are digital computers
• Given a quantity (e.g. the voltage at some point in a circuit)
  – Decide whether there is a voltage (1) or no voltage (0)
• Fast: lots of very simple decisions made very quickly
• Easy to transmit: degradation has to be extreme before you get an error
• The alternative was analogue computers
  – Very restricted; utterly unlike digital computers
  – Analogue computation still lurks on the sidelines

The binary bit (see handout)
• The basic unit handled by all processes in a digital computer is the binary bit.
• It may take the value 0 or 1.
• 3 bits: 000, 001, 010, 011, 100, 101, 110, 111
  – 8 (2^3) possibilities. They can represent the numbers 0 to 7
  – … or the numbers -4 to +3, or 0 to 7/8 in steps of 1/8
  – … or North, North-East, East, …
  – … or red, green, blue, cyan, yellow, magenta, black, white
  – … etc. etc.: any collection of 8 things.
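
As a rough illustration of the point above, the sketch below lists all eight 3-bit patterns and reads each one in several ways. The helper name `interpret`, the ordering of the compass points, and the use of two's complement for the signed reading are assumptions made for this example; the slide does not specify them.

```python
from itertools import product

# All 3-bit patterns in counting order: 000, 001, ..., 111.
patterns = ["".join(bits) for bits in product("01", repeat=3)]

# Assumed ordering of the eight compass points (clockwise from North).
compass = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def interpret(pattern: str) -> tuple:
    """Read one 3-bit pattern as unsigned, signed, fraction and compass point."""
    unsigned = int(pattern, 2)                            # 0 .. 7
    # Assumption: two's complement, so patterns 100..111 mean -4..-1.
    signed = unsigned - 8 if unsigned >= 4 else unsigned  # -4 .. +3
    fraction = unsigned / 8                               # 0 .. 7/8
    return unsigned, signed, fraction, compass[unsigned]


for pattern in patterns:
    print(pattern, interpret(pattern))
```

The bits themselves never change; only the agreed interpretation does.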

Collections of bits - 1
• 8 bits together can represent 256 (2^8) possible values:
  – Numbers 0 to 255 (or -128 to +127), or a character.
  – A group of 8 bits is called a byte or an octet
• 10 bits together can hold 2^10 or 1024 possible values; 1024 is close to 1000, and ‘K’ in computer terminology usually means 1024 rather than 1000.
• Powers of 2 are ‘round’ numbers in computing terms
• Binary numbers are unwieldy (2003 in base 10 = 11111010011 in base 2) and are usually represented in octal (base 8) or hexadecimal (base 16) notation.
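
For a quick check of these figures, Python's built-in `bin`, `oct` and `hex` functions print the same number in each notation. This is only a convenience sketch; the variable names are chosen for the example.

```python
year = 2003

# The same value written in base 2, base 8 and base 16.
as_binary = bin(year)   # prefixed with 0b
as_octal = oct(year)    # prefixed with 0o
as_hex = hex(year)      # prefixed with 0x

print(as_binary, as_octal, as_hex)

# The 'round' numbers mentioned above.
print(2 ** 8, 2 ** 10)
```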

Base N examples
In normal (base 10) arithmetic, 1234 stands for
• 1 * 1000 + 2 * 100 + 3 * 10 + 4 * 1, i.e.
• 1 * 10^3 + 2 * 10^2 + 3 * 10^1 + 4 * 10^0
In base 2, 1101 means
• 1 * 2^3 + 1 * 2^2 + 0 * 2^1 + 1 * 2^0 (= thirteen)
In base 8 (octal), 123 means
• 1 * 8^2 + 2 * 8^1 + 3 * 8^0 (= eighty-three)
In base 16 (hexadecimal), 1D means
• 1 * 16^1 + D * 16^0 (= twenty-nine) [‘D’ = 13]
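
The positional rule in these examples can be sketched as a small function. `value_of` and `DIGITS` are names invented for this illustration; Python's built-in `int(text, base)` does the same job.

```python
DIGITS = "0123456789ABCDEF"


def value_of(numeral: str, base: int) -> int:
    """Evaluate a numeral digit by digit, most significant digit first."""
    total = 0
    for ch in numeral.upper():
        digit = DIGITS.index(ch)          # 'D' -> 13, etc.
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        # Shifting the running total up one place multiplies it by the base.
        total = total * base + digit
    return total


for numeral, base in [("1234", 10), ("1101", 2), ("123", 8), ("1D", 16)]:
    print(numeral, "in base", base, "=", value_of(numeral, base))
```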

Hexadecimal
• Hexadecimal arithmetic is base 16
• Digits are 0-9 and A-F
• There is a direct relationship between hexadecimal and binary: each hex digit represents four binary bits
• 2003 (base 10) is 11111010011 (base 2), which groups as 111 | 1101 | 0011 = 7D3 in hexadecimal
• Written 0x7D3, #7D3 etc.
• (Similarly, each octal digit represents 3 binary bits)
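
The grouping trick can be written out directly: pad the bit string on the left to a multiple of four bits, then translate each group of four into one hex digit. The function name `binary_to_hex` is an assumption for this sketch.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits: str) -> str:
    """Convert a string of 0s and 1s to hex by grouping bits in fours."""
    width = -(-len(bits) // 4) * 4        # round length up to a multiple of 4
    padded = bits.zfill(width)            # pad with leading zeros
    groups = [padded[i:i + 4] for i in range(0, width, 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


print(binary_to_hex("11111010011"))       # the slide's example, 2003
```

Grouping in threes instead of fours gives octal in the same way.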

Note …
• You will not be asked to do hex or octal arithmetic in an exam, but you might be asked to ‘count on your fingers’ in binary, hex or octal
• The key issue here is to recognise these numbers when you see them and have some idea of what lies behind them
• Why, for instance, numbers like 32, 64, 255 (256) are significant
• See handout for more background

Collections of bits - 2
• The maximum number of bits a processor can handle in a single operation is called the word length, and in modern machines it is typically 32 or 64.
• 2^(word length) - 1 is the largest (unsigned) integer the computer can conveniently store exactly.
• The power of a processor depends on the raw processor speed and the word length, among other things.
• The speed of a computer system depends on many other factors as well.
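
To put numbers on the word-length formula, the sketch below evaluates it for a few common word lengths. The function name `largest_unsigned` is chosen for this example only.

```python
def largest_unsigned(word_length: int) -> int:
    """Largest unsigned integer that fits in a word of the given length."""
    return 2 ** word_length - 1


for word_length in (8, 16, 32, 64):
    print(f"{word_length:2d} bits: {largest_unsigned(word_length):,}")
```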

Examples
• GENERAL PROTECTION FAULT, addr = 2FEF 3C00 [“Blue Screen of Death”; hex address of where the error occurred]
• <body bgcolor="#F0F0C0"> [hex number describing the amounts of Red, Green and Blue (RGB) in a Web page colour, each in the range 0-255]
• 129.215.128.57 [Internet ‘IP’ address; each number will be between 0 and 255]
• Computer ‘MAC’ address: 00-C0-86-4A-07-F6
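
Each of these examples is a handful of bytes written in a convenient notation. The sketch below unpacks three of them; the function names `parse_colour`, `parse_ipv4` and `parse_mac` are hypothetical helpers written for this illustration, not part of any library.

```python
def parse_colour(code: str) -> tuple:
    """Split a Web colour such as '#F0F0C0' into (red, green, blue)."""
    digits = code.lstrip("#")
    # Two hex digits per colour channel, so each value is 0..255.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def parse_ipv4(text: str) -> list:
    """Split a dotted IP address into its four byte values."""
    octets = [int(part) for part in text.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {text!r}")
    return octets


def parse_mac(text: str) -> bytes:
    """Turn a MAC address written as hex pairs into its six raw bytes."""
    return bytes.fromhex(text.replace("-", ""))


print(parse_colour("#F0F0C0"))
print(parse_ipv4("129.215.128.57"))
print(list(parse_mac("00-C0-86-4A-07-F6")))
```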

Units of measure
Power of 2   Power of 10   Unit         Abbrev.   In words
2^0          10^0          1 byte       -         1
2^10         ~10^3         1 Kilobyte   Kb        1 Thousand
2^20         ~10^6         1 Megabyte   Mb        1 Million
2^30         ~10^9         1 Gigabyte   Gb        1 Billion (US)
2^40         ~10^12        1 Terabyte   Tb        1,000 Billion
2^50         ~10^15        1 Petabyte   Pb        1,000,000 Billion
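
The ‘~’ signs in the table hide a gap that grows with each unit. As a rough sketch, the code below computes how much larger each power of 2 is than the matching power of 10; the helper `binary_excess` is a name made up for this example.

```python
UNITS = ["Kilobyte", "Megabyte", "Gigabyte", "Terabyte", "Petabyte"]


def binary_excess(step: int) -> float:
    """Percentage by which 2^(10*step) exceeds 10^(3*step)."""
    return (2 ** (10 * step) / 10 ** (3 * step) - 1) * 100


for step, name in enumerate(UNITS, start=1):
    size = 2 ** (10 * step)
    print(f"1 {name} = {size:,} bytes, {binary_excess(step):.1f}% above 10^{3 * step}")
```

The difference is small for a Kilobyte but is already over 12% for a Petabyte.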

A few sample data sets
300 page novel                                        0.5 Mb
Screen capture of a PC screen at high resolution      2.3 Mb
3 minute sound track recorded as MP3                  2 Mb
30 second video clip at low resolution                2 Mb
Edinburgh telephone directory                         20 Mb
Typing 4 characters/second, 8 hrs/day for a year      42 Mb
Corporate customer database, e-Science                n Tb
Library of Congress                                   100 Tb
Video of your entire life                             1 Pb
Human brain, big e-science                            n Pb
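
The typing figure can be checked with simple arithmetic, at one byte per character. The sketch assumes typing on all 365 days of the year, which the slide does not state explicitly.

```python
CHARS_PER_SECOND = 4
SECONDS_PER_HOUR = 3600
HOURS_PER_DAY = 8
DAYS_PER_YEAR = 365   # assumption: no days off

# One byte per character, as in single-byte ASCII text.
bytes_per_year = (
    CHARS_PER_SECOND * SECONDS_PER_HOUR * HOURS_PER_DAY * DAYS_PER_YEAR
)

print(f"{bytes_per_year:,} bytes")
print(f"{bytes_per_year / 10 ** 6:.1f} million bytes")
print(f"{bytes_per_year / 2 ** 20:.1f} units of 2^20 bytes")
```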

Size limits
• A computer needs to be able to index each separate byte in memory or in a file (e.g. byte 123 in file ‘fred’).
• The limits on memory and file sizes (also disk blocks) will be binary ‘round numbers’.
• This is why you may find that memory on a PC is limited to 4Gb, or that hard disks on old PCs are limited to 2Gb, etc.
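
A sketch of where such limits come from: an index of n bits can distinguish 2^n bytes. The link drawn here between 4Gb and a 32-bit index, and between 2Gb and a 31-bit index (a signed 32-bit offset), is the usual explanation but is an assumption as far as these slides go.

```python
GIGABYTE = 2 ** 30


def addressable_bytes(index_bits: int) -> int:
    """Number of distinct byte positions an index of this width can name."""
    return 2 ** index_bits


# Assumption: 32-bit unsigned index, and 31 usable bits in a signed one.
for index_bits in (31, 32):
    size = addressable_bytes(index_bits)
    print(f"{index_bits}-bit index: {size:,} bytes = {size // GIGABYTE} Gb")
```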

Characters are coded
• One per byte, ASCII encoding (parity bit shown in brackets):
  A        0x41   (0)100 0001
  B        0x42   (0)100 0010
  C        0x43   (1)100 0011
  c        0x63   (0)110 0011   (lower-case)
  Ctrl-A   0x01   (1)000 0001
• Can only represent 128 characters this way
• The 8th bit is reserved for error detection (parity)
• Even parity => an even number of ‘1’s in the code, counting the parity bit
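
The parity bits in the table can be reproduced with a few lines of code. This is a minimal sketch of even parity over 7-bit ASCII; `with_even_parity` is a name invented for the example.

```python
def with_even_parity(ch: str) -> str:
    """Format a 7-bit ASCII character as '(p)xxx xxxx' with even parity."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not a 7-bit ASCII character")
    ones = bin(code).count("1")
    # The parity bit is 1 exactly when the 7 data bits hold an odd number
    # of 1s, so that the full 8 bits always hold an even number.
    parity = ones % 2
    bits = format(code, "07b")
    return f"({parity}){bits[:3]} {bits[3:]}"


for ch in ["A", "B", "C", "c", "\x01"]:
    print(f"0x{ord(ch):02X}", with_even_parity(ch))
```

If a single bit is flipped in transit, the count of 1s becomes odd and the receiver can tell that something went wrong, although not which bit.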

Some characters don’t travel well
• ASCII: 7 bits + 1 (parity), 128 values
• €, ¥, §, à, é, ç as coded need 8 bits
• Alternatives:
  – Latin 1 (extended ASCII): 8 bits, 256 values
  – Unicode (original 16-bit form): 16 bits, 65536 values
• But then …
• “$” => “local currency symbol” (£, $, …) etc.
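
One way to see which characters fit in which encoding is to ask Python to encode them and watch for failures. The helper `encodable` is an assumption for this sketch; the codec names `ascii`, `latin-1` and `utf-16-be` are standard Python codecs.

```python
def encodable(ch: str, codec: str) -> bool:
    """Return True if the character can be represented in the codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for ch in "A€¥§àéç":
    fits = {codec: encodable(ch, codec)
            for codec in ("ascii", "latin-1", "utf-16-be")}
    print(ch, fits)

# é is a single byte in Latin 1 but two bytes in 16-bit Unicode.
print("é".encode("latin-1"), "é".encode("utf-16-be"))
```

Note that the euro sign is not in Latin 1 itself; single-byte codings that include it are later variants, so it only ‘fits in 8 bits’ if both ends agree on one of those.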

Email character quoting
• I went to see “Une Femme Française” at the Filmhouse. Tickets were £6.50
Becomes
• I went to see =B4=B4Une Femme Fran=E7aise=B4=B4 at the Filmhouse. Tickets were =A36.50.
Or …
• <html><body>I went to see “Une Femme Fran{aise” at the Filmhouse. Tickets were £6.50.</html></body>
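
The ‘=E7’ style of quoting is known as quoted-printable, and Python's standard `quopri` module implements it. The sketch below assumes the text was first encoded as Latin 1, which matches the =E7 and =A3 codes on the slide; it does not attempt to reproduce the slide's handling of the quotation marks.

```python
import quopri

original = "Française £6.50"

# Assumption: the mail program sends the text as Latin 1 bytes.
latin1_bytes = original.encode("latin-1")

# Each byte above 127 is replaced by '=' followed by two hex digits.
quoted = quopri.encodestring(latin1_bytes)
print(quoted)

# The receiving end reverses the process.
restored = quopri.decodestring(quoted).decode("latin-1")
print(restored)
```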

Encoding Binary files
• Need a general mechanism for sending programs, images etc. as text mail attachments
  – 1. A method of encoding the binary file
  – 2. Some means of telling the far end what it is
• Multipurpose Internet Mail Extensions (MIME)

Your frog is in the mail …
From: "John Butler" <jhb@ed.ac.uk>
To: <asw@dcs.ed.ac.uk>
Subject: frog
Date: Mon, 01 Oct 2002 11:45:36 +0100    [date, timezone]
MIME-Version: 1.0
Content-Type: image/gif; name="FROG.GIF"    [file type]
Content-Transfer-Encoding: base64    [file encoding]
Content-Disposition: attachment; filename="FROG.GIF"

[encoded file]
R0lGODdhgwBwAPcAAP///97W1s7Gxntzc+/e3pyMjJSEhMalpYxzc7WUlKWEhJxzc3taWiEYGBAI
CBgICGsYGGsICL0YEM4YELUQCL0QCJwYELUYEIwQCJwQCLVaUnMQCKWlpZRCObUpGFIQCMaUjL2E
e4xSSrVCMaUpGM5rWoQhEM6Ec61jUsYxEGtKQlo5MTEQCLU5GNY5EHtCMa1aQq1SOaU5GNZzUmMx
Ib1aOXsxGOeljLVzWoxKMc5KGJRrWueEWkohEMZKGM69tWtaUt6Uc1IxISkYEJxSMc7OzrWEa2NC
MYRSOc5rOdZaGLWlnM6Uc85aGK2UhM6ce4xjSrVzSrVaIZyEc4RrWs6EUueUWrVrOcZjIdZrIc6t
lJRzWu+te7WEWuela5xrQsZ7OdZ7MZxaIcZrIca9tYR7c0IxIb1rIb21rbWtpVpSSufGpdathHtj
Sr2Ua96te86UWhgQCEoxGNaMQt6MOdZ7IWtjWue9jOe1e86cY8aUWoRSGLWUa4xjMc6MObV7MdaM
Ma2chOe1c96ta9alY6V7Qr2MSs6EIYRzWq2MWt6lUs6MKd6UKda9lO+9a86cSr2MOeelOZyEWt61
a97OrfferefOnOfGhM6ta2tSIc6cOZxzIbWEIe/n1rWtnJSMe4yEc1JKOda9hO/OhN61WtalOc6c
… etc. …
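
As a sketch of what the receiving mail program does with a message like this, the code below parses the MIME headers with Python's standard `email` package and decodes just the first 16 characters of the encoded file. The variable names are chosen for the example, and the reading of the decoded bytes assumes the standard GIF layout (a 6-byte signature followed by little-endian width and height).

```python
import base64
from email import message_from_string

# Headers as in the message above, with only the start of the payload.
raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: image/gif; name="FROG.GIF"\n'
    "Content-Transfer-Encoding: base64\n"
    'Content-Disposition: attachment; filename="FROG.GIF"\n'
    "\n"
    "R0lGODdhgwBwAPcA\n"
)

msg = message_from_string(raw)
print(msg.get_content_type())            # what sort of file it is
print(msg["Content-Transfer-Encoding"])  # how it was encoded
print(msg.get_filename())                # suggested file name

# 16 base64 characters decode to 12 bytes: the start of the GIF header.
header = base64.b64decode(msg.get_payload())
signature = header[:6]
width = int.from_bytes(header[6:8], "little")
height = int.from_bytes(header[8:10], "little")
print(signature, width, height)
```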

Base64 Binary file encoding
• Take 3 bytes of the file (3 * 8 = 24 bits)
• Encode as 4 * 6-bit values (4 * 6 = 24); turn each value (0-63) into a printable character by looking it up in a 64-character alphabet (A-Z, a-z, 0-9, +, /)
• Repeat.
• The file increases in size by about a third (every 3 bytes become 4 characters) but doesn’t get mangled en route
• The MIME type lets the receiving mail process know what sort of file it is
• Alternatives: uuencode (Unix), binhex (Mac)
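
The 3-bytes-to-4-characters step can be sketched by hand and compared with Python's standard `base64` module. `encode_group` and `ALPHABET` are names invented for this illustration, and the sketch ignores the ‘=’ padding used when a file's length is not a multiple of 3.

```python
import base64
import string

# The standard base64 alphabet: values 0..63 map to these characters.
ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly 3 bytes (24 bits) as 4 base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch only handles complete 3-byte groups")
    bits24 = int.from_bytes(three_bytes, "big")
    # Peel off four 6-bit values, most significant first.
    return "".join(ALPHABET[(bits24 >> shift) & 0x3F]
                   for shift in (18, 12, 6, 0))


print(encode_group(b"GIF"))
print(base64.b64encode(b"GIF").decode("ascii"))

# Size growth: 300 bytes in, 400 characters out.
print(len(base64.b64encode(bytes(300))))
```

The hand-rolled group encoder and the library call should agree on any complete 3-byte group.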

Key Points
• Computers manipulate bits (0, 1) very quickly
• Binary numbers are unwieldy, so use hex or octal instead
  – understand counting in hex etc.
• Characters are encoded as ASCII
• Some encoding is needed to send non-text information over email, e.g. MIME encoding
• Recognise an encoding when you see it