
Language and Computers – where to start?



Linguistics 384: Language and Computers
Topic 1: Text and Speech Encoding
Scott Martin∗
Dept. of Linguistics, OSU
Spring 2008
∗ The course was created by Chris Brew, Markus Dickinson and Detmar Meurers.

Outline
Writing systems
  Alphabetic
  Syllabic
  Logographic
  Systems with unusual realization
  Relation to language
  Comparison of systems
Encoding written language
  ASCII
  Unicode
  Typing it in
Spoken language
  Transcription
  Why speech is hard to represent
  Articulation
  Acoustics
Relating written and spoken language
  From Speech to Text
  From Text to Speech

Language and Computers – where to start?
◮ If we want to do anything with language, we need a way to represent language.
◮ We can interact with the computer in several ways:
  ◮ write or read text
  ◮ speak or listen to speech
◮ The computer has to have some way to represent
  ◮ text
  ◮ speech

Writing systems used for human languages
What is writing?
“a system of more or less permanent marks used to represent an utterance in such a way that it can be recovered more or less exactly without the intervention of the utterer.” (Peter T. Daniels, The World’s Writing Systems)
Different types of writing systems are used:
◮ Alphabetic
◮ Syllabic
◮ Logographic
Much of the information on writing systems and the graphics used are taken from the amazing site http://www.omniglot.com.

Alphabetic systems
Alphabets (phonemic alphabets)
◮ represent all sounds, i.e., consonants and vowels
◮ Examples: Etruscan, Latin, Korean, Cyrillic, Runic, International Phonetic Alphabet
Abjads (consonant alphabets)
◮ represent consonants only (sometimes plus selected vowels; vowel diacritics generally available)
◮ Examples: Arabic, Aramaic, Hebrew

Alphabet example: Fraser
An alphabet used to write Lisu, a Tibeto-Burman language spoken by about 657,000 people in Myanmar, India, Thailand and in the Chinese provinces of Yunnan and Sichuan.
(from: http://www.omniglot.com/writing/fraser.htm)

Abjad example: Phoenician
An abjad used to write Phoenician, created between the 18th and 17th centuries BC; assumed to be the forerunner of the Greek and Hebrew alphabets.
(from: http://www.omniglot.com/writing/phoenician.htm)

A note on the letter-sound correspondence
◮ Alphabets use letters to encode sounds (consonants, vowels).
◮ But the correspondence between spelling and pronunciation in many languages is quite complex, i.e., not a simple one-to-one correspondence.
◮ Example: English
  ◮ same spelling – different sounds: ought, cough, tough, through, though, hiccough
  ◮ silent letters: knee, knight, knife, debt, psychology, mortgage
  ◮ one letter – multiple sounds: exit, use
  ◮ multiple letters – one sound: the, revolution
  ◮ alternate spellings: jail or gaol; but chef does not have an alternative seagh (despite sure, dead, laugh)

More examples for non-transparent letter-sound correspondences
French
(1) a. Versailles → [veRsai]
    b. été, étais, était, étaient → [ete]
Irish
(2) a. Baile Átha Cliath (Dublin) → [bl’a: kli uh]
    b. samhradh (summer) → [sauruh]
    c. scríobhaim (I write) → [shgri:m]
What is the notation used within the []?
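The "same spelling – different sounds" case for English can be made concrete with a small Python sketch. It records a rough sound value for the letters "ough" in each of the example words and checks whether the spelling maps to a single sound. The transcriptions and the helper names (`OUGH_SOUNDS`, `distinct_sounds`, `is_one_to_one`) are illustrative assumptions, not taken from the slides, and the actual pronunciations vary by dialect.

```python
# Rough IPA-style values for the letters "ough" in each word.
# These transcriptions are illustrative assumptions; actual
# pronunciations vary by dialect.
OUGH_SOUNDS = {
    "ought": "ɔː",
    "cough": "ɒf",
    "tough": "ʌf",
    "through": "uː",
    "though": "oʊ",
    "hiccough": "ʌp",
}


def distinct_sounds(spelling_to_sound):
    """Return the set of different sounds recorded for one spelling."""
    return set(spelling_to_sound.values())


def is_one_to_one(spelling_to_sound):
    """True only if the spelling stands for the same sound in every word."""
    return len(distinct_sounds(spelling_to_sound)) == 1


if __name__ == "__main__":
    for word, sound in OUGH_SOUNDS.items():
        print(f"{word:10s} ough -> [{sound}]")
    print("one-to-one?", is_one_to_one(OUGH_SOUNDS))
```

Under these assumed transcriptions, every word gives "ough" a different value, so the check reports that the correspondence is not one-to-one. A table built the other way round (one sound, several spellings) would illustrate the "multiple letters – one sound" case in the same manner.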
