GLOBALISATION
• I bet it is quite natural to dream about writing software that is sold around the world…
• However, there may be some small obstacles on the way to selling your software worldwide.
• Today we study potential problems and solutions.
• Terms:
  - Localisation = adjusting software locally
  - Globalisation, internationalisation = creating software in such a way that it is easy to localise it to different countries.

4.10.2003 Software Engineering 2004 Jyrki Nummenmaa

History
• As a little warm-up, consider the situation about 20 years ago.
• At that time there was an increasing interest in creating software products which could be sold to different customers within Finland.
• For customising the software, it was important to put all user interface constants into a place where they are easy to change.
• Program code is not such a place -> parameter files or a database are a much better choice.

Simple example
• http://java.sun.com/docs/books/tutorial/i18n/index.html deals with internationalisation issues.
• We have a quick look at their example.
• In the example, multilingual texts are managed using
  - a locale, identified by a (country, language) pair,
  - resource bundles, one per locale,
  - property files, where strings are identified by keys.
• Strings are identified by keywords within a locale.

What should be globalised?
• Messages
• Labels on GUI components
• Online help
• Sounds
• Colors
• Graphics
• Icons
• Dates
• Times
• Numbers
• Currencies
• Measurements
• Phone numbers
• Honorifics and personal titles
• Postal addresses
• Page layouts
- Our example in the previous slide only dealt with a simple message!
- Labels can also be managed in a fairly straightforward manner, if enough space is reserved for them.
- Now let's have a look at the rest…

Locales
• As we saw, globalisation in Java is based on the use of Locales.
• A locale is identified by a language (compulsory), country (optional) and variant (optional).
• A class whose behaviour is based on the use of a locale is called locale-sensitive.
• You can find the locales available to a locale-sensitive class by using the getAvailableLocales() method.
• There is also a default locale for a Java Virtual Machine, and it can be accessed by Locale.getDefault().
• Different objects may use different locales.

Identify what needs to be managed through locales
• As you think about locales, you will find out that you have
  - data items such as messages and sounds, which change altogether with the locale,
  - data items which remain the same, but whose formatting changes, e.g. dates and numbers, and
  - possibly data items not to be localised (internal use, interface to another application, …).
• Design the globalisation: identify which is which.
• Arrange your data items into resource bundles (e.g. items for the same form in the same bundle, so that you will not need to load unnecessary objects).
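The locale and resource bundle idea from the slides above can be sketched roughly as follows. This is a minimal illustration, not the tutorial's own example: the class name LocaleBundleSketch, the keys greeting and farewell, and the texts are all hypothetical. In a real application each bundle would normally be a property file (for instance one per locale) loaded with ResourceBundle.getBundle(); here the property text is held in strings so that the sketch runs without extra files.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class LocaleBundleSketch {
    // Hypothetical property-file contents, one entry per locale.
    // Assumption: normally these would live in separate .properties files.
    private static final Map<Locale, String> PROPERTY_TEXTS = new HashMap<>();
    static {
        PROPERTY_TEXTS.put(Locale.forLanguageTag("en-US"), "greeting=Hello\nfarewell=Goodbye\n");
        PROPERTY_TEXTS.put(Locale.forLanguageTag("fi-FI"), "greeting=Terve\nfarewell=Hei hei\n");
    }

    /** Builds the bundle for a locale, falling back to en-US if none is defined. */
    static ResourceBundle bundleFor(Locale locale) {
        String text = PROPERTY_TEXTS.get(locale);
        if (text == null) {
            text = PROPERTY_TEXTS.get(Locale.forLanguageTag("en-US"));
        }
        try {
            return new PropertyResourceBundle(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Looks up a string by its key within the given locale. */
    static String message(Locale locale, String key) {
        return bundleFor(locale).getString(key);
    }

    public static void main(String[] args) {
        Locale finnish = Locale.forLanguageTag("fi-FI");
        System.out.println(message(finnish, "greeting"));
        System.out.println(message(Locale.US, "greeting"));
        // The JVM default locale depends on the machine running the code.
        System.out.println("Default locale: " + Locale.getDefault());
    }
}
```

The program code only ever refers to keys, so adding a language means adding one more bundle rather than touching the code.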
Formats - numbers
• Numbers are formatted differently in different countries, e.g.:
  345 987,246 – France
  345.987,246 – Germany
  345,987.246 – US
• Java includes a NumberFormat class that can be used to format numbers, currencies (no exchange rates, though :) and percentages.
• You can use the NumberFormat class both to create formatted strings and to parse strings.
• You can also provide your own patterns, if this is not enough for you…

Dates and Times
• As with numbers, dates and times are represented differently.
• Similarly, there is a DateFormat class, which you can use to create standard date and time formats.
• Here again, you may customise, and you may also define your own names for things such as weekdays.

Messages containing variable parts
• Examples:
  - 405,390 people have visited your website since January 1, 1998. (1)
  - The <devicename> number <devicenumber> has been activated. (2)
• Word order may change between languages, which may make it impossible to translate message (1) correctly if the translatable part is assumed to be just the text between the number and the date.
• In message (2) the word “activated” may require a different translation in some languages (e.g. French) depending on the gender of the word for the device name.
• Basic rule of thumb: If you can avoid messages containing these variable parts, then do so!

Class MessageFormat
• With the MessageFormat class you can define a message template, which gives the message text and shows where to format the changing data and how.
• With ChoiceFormat, you can choose between strings based on a number you give as a parameter (this is particularly handy for managing plurals).

Characters
• US-ASCII – 7 bit
• ISO 8859-X, where X is the part number – an 8-bit system – if the 8th bit is 0, then the other 7 bits represent a US-ASCII character.
• Windows 125x codepages – similar to ISO 8859-X, but not the same of course – typical Windows interoperability nightmare…
• Unicode – meant to represent all characters from all languages. Needs more than 8 bits (Java's char uses 16), and there are several encoding schemes. Some use a variable number of bytes: UTF-8, for instance, uses one byte for US-ASCII characters and two or more bytes for the rest.
• http://www.unicode.org/index.html

Chinese and Japanese
• Thousands of symbols.
• Unicode can represent them, but you need more pixels on the screen as well.
• In Japanese there are several writing systems.
• Text input can be done as follows:
  1. The user types in the word in some phonetic writing system based on Latin characters.
  2. The system shows the characters (there may be many) matching the phonetic writing.
  3. The user picks the right character.
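The NumberFormat, MessageFormat and ChoiceFormat classes mentioned in the formatting slides above could be used along these lines. This is a sketch under stated assumptions: the class name FormatSketch, the helper names and the message template are hypothetical, and the choice pattern follows the style documented for java.text.MessageFormat. Note that the template still hard-codes English word order, which is exactly the translation risk the slides warn about.

```java
import java.text.MessageFormat;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

class FormatSketch {
    /** Formats a number using the separators of the given locale. */
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    /** Parses a locale-formatted number string back into a double. */
    static double parseNumber(String text, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number for " + locale + ": " + text, e);
        }
    }

    // Hypothetical template: the choice element picks a phrase by the count,
    // which is how plurals can be handled inside a MessageFormat pattern.
    static final String ACTIVATED_TEMPLATE =
        "{0,choice,0#No devices have|1#One device has|1<{0,number,integer} devices have}"
        + " been activated.";

    /** Fills the template with a count, formatted for the given locale. */
    static String activatedMessage(int count, Locale locale) {
        return new MessageFormat(ACTIVATED_TEMPLATE, locale).format(new Object[] { count });
    }

    public static void main(String[] args) {
        double value = 345987.246;
        System.out.println(formatNumber(value, Locale.US));       // 345,987.246
        System.out.println(formatNumber(value, Locale.GERMANY));  // 345.987,246
        // The French grouping separator is a space-like character that
        // varies between JDK versions, so no output is claimed here.
        System.out.println(formatNumber(value, Locale.FRANCE));
        System.out.println(parseNumber("345.987,246", Locale.GERMANY));
        for (int count : new int[] { 0, 1, 3 }) {
            System.out.println(activatedMessage(count, Locale.US));
        }
    }
}
```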
Korean
• In the Korean writing system (hangul), characters are composed from parts based on which character follows which.
• There is a limited number of building blocks, i.e. character parts (24 basic letters: 14 consonants and 10 vowels).

Writing order
• Latin – left to right.
• In Chinese and Japanese, the traditional writing order is top-down, with columns running right-to-left.
• Nowadays text is often written in ordinary horizontal left-to-right order.
• In Arabic and Hebrew, the text itself is written right-to-left, but all Latin names (like yours, probably) are written left-to-right in the middle of the right-to-left text.

Character properties
• Don't do:
    if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')) // ch is a letter
• In Java, char represents a Unicode character (a 16-bit UTF-16 code unit).
• You can use class Character to check for things such as white space, digits, upper and lower case.
• E.g.: Character.isDigit(ch), Character.isLetter(ch), Character.isLowerCase(ch)
• You can also use .getType() and predefined constants to check things like:
    if (Character.getType('a') == Character.LOWERCASE_LETTER)

Comparing characters and strings
• You can use the Collator class, e.g.:
    Collator myCollator = Collator.getInstance();
    if (myCollator.compare("abc", "ABC") < 0)
        System.out.println("abc is less than ABC");
    else
        System.out.println("abc is greater than or equal to ABC");
• getInstance() also takes a locale as a parameter.
• You can customise the rules used in the comparisons.

Finding boundaries of words, sentences, etc.
• The boundaries may, of course, be defined differently in different languages.
• Initialise BreakIterator with one of these methods:
  - getCharacterInstance
  - getWordInstance
  - getSentenceInstance
  - getLineInstance
• E.g.
    BreakIterator sentenceIterator = BreakIterator.getSentenceInstance(currentLocale);
• One BreakIterator only works with one type of break.

Colors, gestures, other symbols
• E.g. in the Far East there is a lot of symbolism in colors, names, numbers, etc. (e.g. red is a good color, 4 is a bad number, etc.)
• Also, for instance, hand gestures vary from one place to another – what is good here may be bad elsewhere.
• Even in Europe there is variance. Consider tick marks: x (good here, bad in UK), √ (not exactly like this, however good in UK, bad here).
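The Character, Collator and BreakIterator classes covered in the slides above might be combined as in the following sketch. The class name TextSketch, the helper names and the sample strings are hypothetical, chosen only for illustration; the library calls themselves are standard java.text and java.lang methods.

```java
import java.text.BreakIterator;
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class TextSketch {
    /** Locale-independent letter test; also accepts letters outside a-z and A-Z. */
    static boolean isLetter(char ch) {
        return Character.isLetter(ch);
    }

    /** Returns a copy of the words sorted by the collation rules of the locale. */
    static List<String> sortWords(List<String> words, Locale locale) {
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(Collator.getInstance(locale)); // Collator is a Comparator
        return sorted;
    }

    /** Splits text into sentences using a sentence BreakIterator. */
    static List<String> sentences(String text, Locale locale) {
        BreakIterator iterator = BreakIterator.getSentenceInstance(locale);
        iterator.setText(text);
        List<String> result = new ArrayList<>();
        int start = iterator.first();
        for (int end = iterator.next(); end != BreakIterator.DONE;
                start = end, end = iterator.next()) {
            result.add(text.substring(start, end).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // '\u00e4' is the letter a with diaeresis; a range check on a-z would miss it.
        System.out.println(isLetter('\u00e4'));
        // A plain String sort would put "Cherry" first because of its upper-case C.
        System.out.println(sortWords(List.of("banana", "apple", "Cherry"), Locale.US));
        System.out.println(sentences("Hello, world! How are you?", Locale.US));
    }
}
```

Each of the helpers takes the locale as a parameter, so different parts of a program can work with different locales at the same time.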