Language and Computers (Ling 384)
Topic 1: Text and Speech Encoding

Adriane Boyd ∗
Department of Linguistics, OSU
Autumn 2005

∗ The course was created by Markus Dickinson, Detmar Meurers and Chris Brew.


Outline

Writing systems
  Alphabetic
  Syllabic
  Logographic
  Systems with unusual realization
  Relation to language
  Comparison of systems
Encoding written language
  ASCII
  Unicode
  Typing it in
Spoken language
  Transcription
  Why speech is hard to represent
  Articulation
  Acoustics
Relating written and spoken language
  From Speech to Text
  From Text to Speech


Language and Computers: where to start?

◮ If we want to do anything with language, we need a way to represent language.
◮ We can interact with the computer in several ways:
  ◮ write or read text
  ◮ speak or listen to speech
◮ The computer has to have some way to represent
  ◮ text
  ◮ speech


Writing systems used for human languages

What is writing?

  “a system of more or less permanent marks used to represent an utterance in such a way that it can be recovered more or less exactly without the intervention of the utterer.”
  (Peter T. Daniels, The World’s Writing Systems)

  “Words that stay.”
  (-Jen (Jim Henson), The Dark Crystal)

Different types of writing systems are used:
◮ Alphabetic
◮ Syllabic
◮ Logographic

Much of the information on writing systems and the graphics used are taken from the amazing site http://www.omniglot.com.


Alphabetic systems

Alphabets (phonemic alphabets)
◮ represent all sounds, i.e., consonants and vowels
◮ Examples: Etruscan, Latin, Korean, Cyrillic, Runic, International Phonetic Alphabet

Abjads (consonant alphabets)
◮ represent consonants only (sometimes plus selected vowels; vowel diacritics generally available)
◮ Examples: Arabic, Aramaic, Hebrew


Alphabet example: Fraser

An alphabet used to write Lisu, a Tibeto-Burman language spoken by about 657,000 people in Myanmar, India, Thailand and in the Chinese provinces of Yunnan and Sichuan.

(from: http://www.omniglot.com/writing/fraser.htm)


Abjad example: Phoenician

An alphabet used to write Phoenician, created between the 18th and 17th centuries BC; assumed to be the forerunner of the Greek and Hebrew alphabets.

(from: http://www.omniglot.com/writing/phoenician.htm)


A note on the letter-sound correspondence

◮ Alphabets use letters to encode sounds (consonants, vowels).
◮ But the correspondence between spelling and pronunciation in many languages is quite complex, i.e., not a simple one-to-one correspondence.
◮ Example: English
  ◮ same spelling, different sounds: ough: ought, cough, tough, through, though, hiccough
  ◮ silent letters: knee, knight, knife, debt, psychology, mortgage
  ◮ one letter, multiple sounds: exit, use
  ◮ multiple letters, one sound: the, revolution
  ◮ alternate spellings: jail or gaol; but seagh is not a possible spelling for chef (despite sure, dead, laugh)


More examples for non-transparent letter-sound correspondences

French
(1) a. tailles → [taj]
    b. étais, était, étaient → [etE]

Irish
(2) a. Baile Átha Cliath (Dublin) → [bl’a: kli uh]
    b. samhradh (summer) → [sauruh]
    c. scríobhaim (I write) → [Sgri:m]

What is the notation used within the [ ] ?
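
The English ough examples can be made concrete with a small lookup table. The following is a minimal sketch, not part of the original slides: the name `OUGH_SOUNDS`, the helper `sounds_by_spelling`, and the transcriptions themselves are illustrative assumptions (approximate broad transcriptions that vary by dialect). It groups words by the sound that the shared spelling stands for, showing that one spelling maps to several sounds.

```python
from collections import defaultdict

# Approximate broad transcriptions of the "ough" portion of each word.
# These values are illustrative assumptions, not taken from the slides,
# and they differ between dialects of English.
OUGH_SOUNDS = {
    "ought": "ɔː",
    "cough": "ɒf",
    "tough": "ʌf",
    "through": "uː",
    "though": "oʊ",
    "hiccough": "ʌp",
}


def sounds_by_spelling(lexicon, spelling):
    """Group the words containing `spelling` by the sound listed for them."""
    groups = defaultdict(list)
    for word, sound in lexicon.items():
        if spelling in word:
            groups[sound].append(word)
    return dict(groups)


groups = sounds_by_spelling(OUGH_SOUNDS, "ough")
for sound, words in groups.items():
    print(f"ough -> [{sound}]: {', '.join(words)}")
print(f"{len(groups)} distinct sounds for one spelling")
```

With a one-to-one writing system, every word sharing a spelling would fall into a single group; here each of the six words ends up in its own group.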