
Unicode and ISO/IEC 10646



  1. INTERNATIONAL ORGANIZATION FOR STANDARDIZATION
     ORGANISATION INTERNATIONALE DE NORMALISATION
     ISO/IEC JTC 1/SC 2/WG 2
     Universal Multiple-Octet Coded Character Set (UCS)

     L2/04-028
     ISO/IEC JTC 1/SC 2/WG 2 N 2696
     2004-01-22

     Title: Presentation Foils from National Workshop on Unicode, New Delhi, Sept 24-26, 2003
     Source: V.S. Umamaheswaran, umavs@ca.ibm.com
     References:
     Action: For information to WG2
     Distribution: ISO/IEC JTC 1/SC 2/WG 2

     At the request of our convener, Mr. Mike Ksar, I have packaged the set of foils (modified slightly) that I presented at the National Workshop on Unicode, New Delhi, Sept 24-26, 2003, organized by the Ministry of Information and Communication Technology, India. Some of you involved with JTC1/SC2/WG2 and the Unicode Technical Committee may find them of some use. In particular, slide number 4 of the second presentation (on page 14), titled 'Framework for Discussion', was also used in WG2 meeting M44 during our ad hoc on Tibetan. It summarizes the principles to follow when proposing additions or changes to the standard.

  2. Unicode and ISO/IEC 10646
     V.S. Umamaheswaran
     umavs@ca.ibm.com
     IBM Toronto Lab, Canada

     2003-09-25, Session 10, National Workshop on Unicode, New Delhi

     Topics
     • Unicode and ISO/IEC 10646
     • UCA and 14651
     • Processes
     • Guidelines for Proposals
     • Organize the Expertise

  3. Unicode and ISO/IEC 10646

                     Unicode          10646
     Code Space      0 to x10FFFF     0 to x10FFFF*
     Repertoire      Same             Same
     Supp. Planes    Same             Same
     BMP non CJKV    Same             Same
     BMP CJKV        Single Col       CJKV Cols
     Chart Creation  Common DB        Common DB

     Unicode and ISO/IEC 10646
     [Figure on this slide not recoverable from the extracted text.]
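The shared code space in the table above can be illustrated with a small sketch. It assumes only what the slide states (code points run from 0 to x10FFFF, with a BMP and supplementary planes); the helper names `plane_of` and `is_bmp` are hypothetical and exist only for this example.

```python
MAX_CODE_POINT = 0x10FFFF  # top of the code space listed on the slide
PLANE_SIZE = 0x10000       # 65,536 code points per plane


def plane_of(code_point: int) -> int:
    """Return the plane number: 0 is the BMP, 1 to 16 are supplementary planes."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"U+{code_point:04X} is outside the code space")
    return code_point // PLANE_SIZE


def is_bmp(code_point: int) -> bool:
    """True when the code point lies in the Basic Multilingual Plane."""
    return plane_of(code_point) == 0


if __name__ == "__main__":
    # DEVANAGARI LETTER A, a CJK ideograph, a musical symbol, the last code point
    for cp in (0x0905, 0x4E00, 0x1D11E, MAX_CODE_POINT):
        print(f"U+{cp:04X}: plane {plane_of(cp)}, BMP={is_bmp(cp)}")
```

Integer division by the plane size is enough here because every plane has the same size; anything above x10FFFF is rejected rather than mapped to a plane.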

  4. Unicode and ISO/IEC 10646

                     Unicode                  10646
     Publication     Web; Book; Dot Release   Edition + Amds (1 volume end of 2003)
                     Book Style               ISO Style
     Conformance     = Level 3                Levels 1, 2, 3 (use 3 for Indic)
     BiDi            Defined                  Refers to Unicode
     Normalization   Defined                  Refers to Unicode

     Unicode and ISO/IEC 10646

                     Unicode                  10646
     Combining Info  Property + TRs + Text    List + Minimal Text
     Format Chars    Property                 Some Listed
     Script Info     Lot of Detail            Minimal
     Annotations     Many more                Some in Annex
     Naming Rules    Uses 10646               Defined
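The normalization and combining rows above can be made concrete with a minimal sketch using Python's standard `unicodedata` module. The sample strings and the `describe` helper are assumptions chosen for illustration; they do not come from the slides. The Devanagari sequence shows why combining marks matter for Indic text, which is presumably the reason the slide recommends Level 3.

```python
import unicodedata


def describe(text: str):
    """List (code point, name, canonical combining class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.combining(ch))
        for ch in text
    ]


# A precomposed Latin letter and its canonically equivalent combining sequence.
composed = "\u00E9"                                     # LATIN SMALL LETTER E WITH ACUTE
decomposed = unicodedata.normalize("NFD", composed)    # "e" + COMBINING ACUTE ACCENT
recomposed = unicodedata.normalize("NFC", decomposed)  # back to the precomposed form

# A Devanagari conjunct: KA + VIRAMA + SSA. The virama is a combining mark.
devanagari = "\u0915\u094D\u0937"

if __name__ == "__main__":
    print(describe(decomposed))
    print(recomposed == composed)
    for row in describe(devanagari):
        print(row)
```

A nonzero combining class marks a character that attaches to a preceding base character; the class values are what normalization uses to reorder marks.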

  5. Unicode and ISO/IEC 10646

                                     Unicode    10646
     Properties + Processing Rules   Defined    Out of scope
     UTF-8, -16, -32/UCS4            Same       Same
     Compressions                    Defined    Not included
     ...                             ...        ...

     Conforming to Unicode will automatically conform to 10646 Level 3, plus lots more.

     Unicode Collation Algorithm and ISO/IEC 14651
     • Synchronized with each other
     • Share the same concepts for weights, categories and tailoring
     • Tailoring required in both
     • Default weights and repertoire identical in both, generated from the same database
     • 14651 Editions + Amds versus UCA Versions
     • Conforming to UCA will also conform to 14651, plus more functions
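The "Same / Same" entry for the encoding forms can be seen in a short sketch: one code point, three byte serializations. The function name `encoding_forms` and the sample characters are assumptions for illustration; the codec names are Python's standard ones, with big-endian variants chosen so that no byte order mark is emitted.

```python
def encoding_forms(ch: str) -> dict:
    """Return the bytes of one character in UTF-8, UTF-16BE and UTF-32BE as hex."""
    def hex_bytes(data: bytes) -> str:
        return " ".join(f"{b:02X}" for b in data)

    return {
        "UTF-8": hex_bytes(ch.encode("utf-8")),
        "UTF-16BE": hex_bytes(ch.encode("utf-16-be")),
        "UTF-32BE": hex_bytes(ch.encode("utf-32-be")),
    }


if __name__ == "__main__":
    # A BMP character (DEVANAGARI LETTER A) and a supplementary-plane character.
    for ch in ("\u0905", "\U0001D11E"):
        print(f"U+{ord(ch):04X}", encoding_forms(ch))
```

The supplementary-plane character takes four bytes in every form; in UTF-16 those four bytes are a surrogate pair.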

  6. Processes
     [Process diagram not recoverable from the extracted text.]

     Processes
     [Process diagram; surviving labels: 2 Ballots (Draft, Final); 12-18 months.]

  7. Processes
     • UTC has additional procedures for preparing and processing Technical Reports
     • See FAQ page at Unicode site

     Processes
     • Membership in SC2
         • National Bodies
         • Ex: INCITS in USA, SCC in Canada, BIS in India
         • Roster on SC2 site www.dkuug.dk/JTC1/SC2
     • Membership in UTC
         • Review by all members and experts
         • Voting by Corporate Members
         • Government of India is a Corporate Member
         • Roster on Unicode site

  8. Proposal Guidelines
     Do your homework
     • Check if already encoded (see http://www.unicode.org/standard/where/)
         • Check charts in Unicode V4
         • Also charts in TRs:
             • TR15 Normalization charts
             • TR10 Collation charts
             • TR21 Case map charts
             • TR24 Script charts
         • or, for legacy sets, ICU Charmaps or equivalents

     Proposal Guidelines
     • May be in a block with recognized name ..
     • Search Nameslist file in Unicode Database; the name could be in Annotations
     • Shape in standard can be a variant (see handout page 2)
     • Is it a glyph (from a font, for example)?
       http://www.unicode.org/reports/tr17/#Characters vs. Glyphs
       and TR 15285, Character Glyph Model
       http://isotc.iso.ch/livelink/livelink/fetch/2000/2489/Ittf_Home/PubliclyAvailableStandards.htm??Redirect=1
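The "check if already encoded" step can be partly automated. The sketch below uses Python's standard `unicodedata` module; the helper names `is_named` and `find_by_name` are hypothetical. Two caveats apply: the module reflects whatever Unicode Character Database version ships with the interpreter (see `unicodedata.unidata_version`), not necessarily Unicode V4, and it exposes formal character names only, so aliases that appear only in the names list annotations still need a manual search.

```python
import unicodedata


def is_named(ch: str) -> bool:
    """True when the character has a formal name in the bundled database."""
    return unicodedata.name(ch, None) is not None


def find_by_name(fragment: str, first: int, last: int):
    """Return (code point, name) pairs in [first, last] whose name contains fragment."""
    fragment = fragment.upper()
    hits = []
    for cp in range(first, last + 1):
        name = unicodedata.name(chr(cp), None)
        if name is not None and fragment in name:
            hits.append((cp, name))
    return hits


if __name__ == "__main__":
    print("Database version:", unicodedata.unidata_version)
    # Exact lookup by formal name; raises KeyError when no such name exists.
    print(f"U+{ord(unicodedata.lookup('DEVANAGARI LETTER A')):04X}")
    # Substring search restricted to the Devanagari block (U+0900 to U+097F).
    for cp, name in find_by_name("letter ka", 0x0900, 0x097F):
        print(f"U+{cp:04X} {name}")
```

Restricting the scan to one block keeps the search fast and mirrors the advice to look first in a block with a recognized name.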

  9. Proposal Guidelines
     • Character may be under consideration
         • Look in Unicode Pipeline: http://www.unicode.org/alloc/Pipeline.html
     • Check if previously considered and rejected: http://www.unicode.org/alloc/rejected.html
     • Also for any accepted pending scripts: http://www.unicode.org/pending/pending.html

     Proposal Guidelines
     Do your homework
     For an entire script, check out the ROADMAPS:
       http://www.unicode.org/roadmaps
       http://www.dkuug.dk/JTC1/SC2/WG2/docs/roadmaps.html
     Legend used in the roadmaps:
     • Bold text: already encoded
     • (Bold text between parentheses): proposal accepted
     • (Text between parentheses): under consideration
     • ¿Text between question marks?: exploratory
     • ???: possible future, no suggestions
     • Hot links for latest proposal included
