FY04: Introduction to the use of computers
jennifer george
Acknowledgement: Jeremy Gow
Last Week
Lecture
- Mass storage: hard disks, optical, flash
- Huge increases in capacity over the years
- Filesystems
- Files and directories
- Unix, OS X and Windows all different
- Windows also uses drives
- Can be shared over a network

Last week's Lab
- Linux Server
- PuTTY
- SSH
- VNC
- emacs
Today
- Measuring digital data
- Bits
- Bytes
- Kilobytes
- Megabytes
- ...
- SI and Binary units
More for today
- Binary files
- Hexadecimal
- Text files
- Character sets
- Text encodings
- ASCII, Unicode

The Analogue World
- Information is continuous (it varies smoothly, without breaks)
The Digital World
- Information is discontinuous (broken into chunks)
- Modern computing is digital (not analogue)

Bits: The foundation of digital computing
- A bit is the smallest possible chunk of information
- It is the difference between two possibilities: on/off, up/down, yes/no, heads/tails...
- Traditionally 0 or 1 (Binary digIT)
- Unit of storage (written b)
- Space used to store something as 0s and 1s
Everything digital is made of bits

Bytes
- A byte is 8 bits
- Written B, so 8b = 1B
- Unit of storage
- This image is 7395b, about 924B
- Related units
- nybble: 4 bits (0.5 bytes)
- crumb: 2 bits (0.25 bytes)
Binary: Numbers as bits
- Representing numbers using bits
- 117 = 64 + 32 + 16 + 4 + 1
- A full byte is 255 = 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1

Binary: Powers of 2
- Binary is based on powers of 2
- $117 = 2^6 + 2^5 + 2^4 + 2^2 + 2^0$
- A full byte is $2^8 - 1 = 2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^2 + 2^1 + 2^0$
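The decomposition into powers of 2 can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the function names `to_binary` and `powers_of_two` are made up for illustration.

```python
def to_binary(n: int) -> str:
    """Return the binary digits of a non-negative integer, most significant bit first."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        bits.append(str(n % 2))  # the remainder is the next bit, least significant first
        n //= 2
    return "".join(reversed(bits))


def powers_of_two(n: int) -> list[int]:
    """Return the powers of 2 that sum to n, largest first."""
    return [1 << i for i in range(n.bit_length() - 1, -1, -1) if n & (1 << i)]


print(to_binary(117))      # 1110101
print(powers_of_two(117))  # [64, 32, 16, 4, 1]
print(to_binary(255))      # 11111111
```

Each 1 in the binary string corresponds to one power of 2 in the sum, which is why a byte of eight 1s is 255.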
Group exercise: Your Age in Binary
- In groups of 4 or 5
- Work out your individual ages in binary
- Work out your combined age in binary
- I'm 100001 (tomorrow I'll be 100010)

The Kilobyte (kB)
- 1000 bytes
- 8000 bits
- Half a page of text
- A small icon
- About 7 magnetic swipe cards
The Megabyte (MB)
- One million bytes (1,000,000 = $10^6$)
- 1000 kilobytes
- A thick book
- A minute of MP3 (128 kb/s)
- 6 sec of CD audio
- A digital photo (a few MB)

The Gigabyte (GB)
- One billion bytes (1,000,000,000 = $10^9$)
- 1000 megabytes
- TV quality film (a few GB)
- 17 hours of MP3 (128 kb/s)
- English Wikipedia (2.7 GB)
- The Human Genome (3 GB)
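The audio figures follow from dividing a size in bits by a bitrate. Here is a rough Python sketch of that arithmetic, using SI units throughout. The helper name `seconds_of_audio` is hypothetical, and the CD bitrate of 1411.2 kb/s (44,100 samples/s × 16 bits × 2 channels) is an assumption that does not appear on the slides.

```python
BITS_PER_BYTE = 8
MB = 10**6  # SI megabyte
GB = 10**9  # SI gigabyte


def seconds_of_audio(size_bytes: float, bitrate_kbps: float) -> float:
    """Seconds of audio that fit in size_bytes at bitrate_kbps (kilobits per second, SI)."""
    return size_bytes * BITS_PER_BYTE / (bitrate_kbps * 1000)


print(seconds_of_audio(MB, 128))         # 62.5 seconds: about a minute of MP3
print(seconds_of_audio(GB, 128) / 3600)  # roughly 17.4 hours of MP3
print(seconds_of_audio(MB, 1411.2))      # roughly 5.7 seconds of CD audio (assumed bitrate)
```

Note the mix of units: storage is counted in bytes (B) while bitrates are quoted in bits (b), so the factor of 8 matters.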
The Terabyte (TB)
- One trillion bytes (1,000,000,000,000 = $10^{12}$)
- 1000 gigabytes
- Library of Congress (20 TB of text)
- YouTube (600 TB in 2006)

The Petabyte (PB)
- One quadrillion bytes (1,000,000,000,000,000 = $10^{15}$)
- 1000 terabytes
- Large Hadron Collider (15 PB/year)
- Google storage (??? PB)
- All printed material (200 PB)
Beyond the Petabyte
- Exabyte ($10^{18}$)
- A year of US telephone calls (9.25 EB)
- Zettabyte ($10^{21}$)
- All electronic data (1.8 ZB by 2011)
- 1 gram of DNA (2.25 ZB)
- "All words ever spoken" as 32 kb/s audio (42 ZB)
- Yottabyte ($10^{24}$)
- The internet?

Group exercise: How much data do you own?
- In groups of 3 or 4
- Estimate how much digital data you each own
- Photos, music etc.
- What takes up the most space?
- Laptops, iPods, phones...
- 1 GB = 1000 MB
- 1 MB = 1000 kB
SI Prefixes
- Le Système International d'Unités
- Many uses: kilobits, kilobytes, kilometres, ...
- 1 kilobyte = 1000 bytes

Binary Prefixes
- Based on powers of 2 (like binary)
- Used for data only
- More convenient when using binary addresses
- 1 kilobyte = 1024 bytes
SI versus Binary
- Each unit now has two different meanings
- Is a kilobyte 1000 or 1024 bytes?
- Binary kB is 2.4% larger than SI kB

IEC Binary Prefixes
- Attempt in 1999 to resolve ambiguity
- Rename binary prefixes (for bytes only)
- kilobyte becomes kibibyte
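The 2.4% gap at the kilobyte level widens with each larger prefix, because the factor 1024/1000 is applied once per step. A small Python sketch of that calculation follows; the function name `prefix_gap_percent` is made up for illustration.

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta"]


def prefix_gap_percent(power: int) -> float:
    """Percentage by which 1024**power exceeds 1000**power."""
    return (1024**power / 1000**power - 1) * 100


for power, name in enumerate(PREFIXES, start=1):
    print(f"{name}: binary unit is {prefix_gap_percent(power):.1f}% larger than SI")
```

The first line printed is the 2.4% figure for the kilobyte; the later lines show why the ambiguity matters more for large disks than for small files.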
Binary files
- Files are zeros and ones (grouped into bytes)
- Designed to be interpreted in some way
- Text (bytes → characters)
- Image (bytes → pixels)
- MP3 files (bytes → sounds)
- ...
- Each uses a different encoding (stuff → bytes)

Binary: Numbers as bits
- Representing numbers using bits
- 117 = 64 + 32 + 16 + 4 + 1
- A full byte is 255 = 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1
Hexadecimal: Binary for humans
- Binary is hard for people to read & write
- Can translate to hexadecimal (base-16)
- 01111010 → 7A

Hexadecimal: Converting to and from binary
- Each hex digit represents four bits
- Two hex digits is one byte, e.g. 7A → 0111 1010
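Because each hex digit stands for exactly four bits, the conversion can be done one nybble at a time. A minimal Python sketch, with the hypothetical helper names `byte_to_hex` and `hex_to_byte`:

```python
def byte_to_hex(bits: str) -> str:
    """Convert an 8-bit string such as '01111010' to two hex digits."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly eight 0/1 characters")
    high, low = bits[:4], bits[4:]  # split the byte into two nybbles
    return f"{int(high, 2):X}{int(low, 2):X}"


def hex_to_byte(hex_digits: str) -> str:
    """Convert hex digits such as '7A' to 4-bit groups separated by spaces."""
    return " ".join(f"{int(digit, 16):04b}" for digit in hex_digits)


print(byte_to_hex("01111010"))  # 7A
print(hex_to_byte("7A"))        # 0111 1010
```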
Hexadecimal: Example

Text files
- Text files contain a sequence of characters
- e.g. emails, web pages, ...
- They are binary files + a text encoding
- Encoding defines the byte(s) for each character
- Encodings may have different character sets
ASCII: Character set
- American Standard Code for Information Interchange
- 128 characters
- 95 printing characters (inc. space)
- !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
- 33 control characters
- Tab, line feed, bell, ... (mostly obsolete)

ASCII: Encoding
- A character is a single byte
- Printing characters...
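To see the one-byte-per-character rule in action, the sketch below encodes a short string as ASCII and prints each byte as eight bits. It is illustrative only: the function name `ascii_bits` and the sample name "Ada" are assumptions, not taken from the slides.

```python
def ascii_bits(text: str) -> list[str]:
    """Encode text as ASCII and return each byte as an 8-bit binary string."""
    data = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII characters
    return [f"{byte:08b}" for byte in data]


for char, bits in zip("Ada", ascii_bits("Ada")):
    print(char, bits)
# A 01000001
# d 01100100
# a 01100001
```

The leading bit of every byte is 0, since ASCII only defines 128 codes and so needs just 7 of the 8 bits.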
ASCII: Example

Unicode: Universal Character Set
- Over 100,000 characters
- From world and historical scripts
- Alphabetic characters
- Technical & mathematical symbols
- Combination characters (ligatures, accents)
- Control characters (new line etc.)
Unicode
- http://unicode.org/charts/

Unicode: Latin characters
Unicode: Arabic characters

Unicode: CJK characters
Unicode: Georgian characters

Unicode: Choice of encodings
- UCS-4 (simple)
- 4 bytes per character
- UTF-16 (e.g. Windows)
- Usually 2 bytes, some use 4
- UTF-8 (e.g. Unix)
- ASCII characters need 1 byte (compatible!)
- Others need 2, 3 or 4 bytes
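The byte counts for each encoding can be measured directly. The following Python sketch is an illustration under assumptions: it uses Python's `utf-32-be` codec as a stand-in for UCS-4, picks big-endian variants so that no byte-order mark is counted, and the sample characters and the name `encoded_sizes` are chosen for this example.

```python
# Encoding name on the slide -> Python codec (big-endian, so no byte-order mark is added)
ENCODINGS = {"UCS-4": "utf-32-be", "UTF-16": "utf-16-be", "UTF-8": "utf-8"}


def encoded_sizes(char: str) -> dict[str, int]:
    """Number of bytes needed to store char in each encoding."""
    return {name: len(char.encode(codec)) for name, codec in ENCODINGS.items()}


# Letter A, pound sign, euro sign, and a musical G clef from outside the 16-bit range
for char in ["A", "\u00a3", "\u20ac", "\U0001D11E"]:
    print(f"U+{ord(char):04X}", encoded_sizes(char))
```

Only the ASCII letter fits in one UTF-8 byte; the pound sign needs 2, the euro sign 3, and the clef 4, while UTF-16 needs 4 bytes only for the clef.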
Text encoding: Example
- Encode the string "£4 = €5"

Word processing files
- Word processing applications
- Microsoft Word, Open Office Writer, Pages, Star Office, Abiword, KWord, ...
- Used to represent text, but with large amounts of formatting information
- Can include graphics, charts and more
- Don't usually use a standard text encoding
Group activity
- Your name in binary (ASCII encoding)

Summary
- Binary files
- Hexadecimal makes binary easier to read
- Text files = binary file + text encoding
- Encodings have different character sets
- ASCII and Unicode
- Reading: Brookshear §1.4
Reading
- http://en.wikipedia.org/wiki/Orders_of_magnitude_(data)
- http://en.wikipedia.org/wiki/Binary_prefix