95f8ae74eb5a33004fc8c34dbfe2792b.ppt
- Number of slides: 52
TIP Try and stay awake… kick sleeping neighbors. Don’t blink! Copyright, 1998 © Alexander Schonfeld
Introduction
- Internationalization (i18n) is the process of designing an application so that it can be adapted to different languages and regions without requiring engineering changes.
- Localization (l10n) is the process of adapting software for a specific region or language by adding locale-specific components and translating text.
Organization of Presentation
- What is i18n?
- Java example of messages
- What is a “locale”?
- Formatting data in messages
- Translation issues
- Date/Time/Currency/etc.
- Unicode and support in Java
- Iteration through text
Why is i18n important?
- Build once, sell anywhere…
- Modularity demands it!
  - Ease of translation
- “With the addition of localization data, the same executable can be run worldwide.”
Characteristics of i18n...
- Textual elements such as status messages and GUI component labels are not hardcoded in the program. Instead, they are stored outside the source code and retrieved dynamically.
- Support for new languages does not require recompilation.
- Other culturally dependent data, such as dates and currencies, appear in formats that conform to the end user's region and language.
Really why…
- Carmageddon
The rest is Java… why?
- Java:
  - is readable!
  - has the most complete built-in i18n support.
  - easily illustrates correct implementation of many i18n concepts.
  - concepts can be extended to any language.
- For more info see:
  - www.coolest.com/i18n
  - java.sun.com/docs/books/tutorial/i18n
Java Example: Messages...
Before:
    System.out.println("Hello.");
    System.out.println("How are you?");
    System.out.println("Goodbye.");
Too much code! After:
Sample Run…
    % java I18NSample fr FR
    Bonjour.
    Comment allez-vous?
    Au revoir.
    % java I18NSample en US
    Hello.
    How are you?
    Goodbye.
1. So What Just Happened?
- Created MessagesBundle_fr_FR.properties, which contains these lines:
    greetings = Bonjour.
    farewell = Au revoir.
    inquiry = Comment allez-vous?
  (What the translator deals with.)
- In the English one?
2. Define the locale...
- Look!
3. Create a ResourceBundle...
- Look!
4. Get the Text from the ResourceBundle...
- Look!
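Steps 2 to 4 can be sketched roughly as follows. This is a minimal illustration, not the deck's original listing: the class name `I18NSketch` and the helper `loadBundle` are hypothetical, and the bundle text is held in memory so the example runs without any files. With real `MessagesBundle_*.properties` files on the classpath, the helper would be replaced by `ResourceBundle.getBundle("MessagesBundle", currentLocale)`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class I18NSketch {
    // In-memory stand-ins for MessagesBundle_en_US.properties and
    // MessagesBundle_fr_FR.properties (assumption: same keys as on the slides).
    static final String EN_US =
        "greetings = Hello.\nfarewell = Goodbye.\ninquiry = How are you?\n";
    static final String FR_FR =
        "greetings = Bonjour.\nfarewell = Au revoir.\ninquiry = Comment allez-vous?\n";

    // Hypothetical helper standing in for
    // ResourceBundle.getBundle("MessagesBundle", currentLocale).
    static ResourceBundle loadBundle(Locale currentLocale) {
        String text = "fr".equals(currentLocale.getLanguage()) ? FR_FR : EN_US;
        try {
            return new PropertyResourceBundle(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Step 2: define the locale. Step 3: create the bundle. Step 4: get the text.
    static String message(String language, String country, String key) {
        // Equivalent to the slides' new Locale(language, country).
        Locale currentLocale =
            new Locale.Builder().setLanguage(language).setRegion(country).build();
        ResourceBundle messages = loadBundle(currentLocale);
        return messages.getString(key);
    }

    public static void main(String[] args) {
        String language = args.length == 2 ? args[0] : "en";
        String country = args.length == 2 ? args[1] : "US";
        System.out.println(message(language, country, "greetings"));
        System.out.println(message(language, country, "inquiry"));
        System.out.println(message(language, country, "farewell"));
    }
}
```

The program logic never mentions a specific language; only the bundle contents change per locale, which is what the translator edits.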
What is a “locale”?
- Locale objects are only identifiers.
- After defining a Locale, you pass it to other objects that perform useful tasks, such as formatting dates and numbers.
- These objects are called locale-sensitive, because their behavior varies according to Locale.
- A ResourceBundle is an example of a locale-sensitive object.
Did you get that?
    "fr" "FR"
    currentLocale = new Locale(language, country);
    message = ResourceBundle.getBundle("MessagesBundle", currentLocale);
        MessagesBundle_en_US.properties
        MessagesBundle_fr_FR.properties
        MessagesBundle_de_DE.properties
    message.getString("inquiry")
        greetings = Bonjour.
        farewell = Au revoir.
        inquiry = Comment allez-vous?
Got a program… need to…
- What do I have to change?
- What’s easily translatable?
- What’s NOT?
  - “It said 5:00 pm on that $5.00 watch on May 5th!”
  - “There are 5 watches.”
- Unicode characters.
- Comparing strings.
What do I have to change?
- Just a few things…
  - messages
  - labels on GUI components
  - online help
  - sounds
  - colors
  - graphics
  - icons
  - dates
  - times
  - numbers
  - currencies
  - measurements
  - phone numbers
  - honorifics and personal titles
  - postal addresses
  - page layouts
What’s easily translatable? Isolate it!
- Status messages
- Error messages
- Log file entries
- GUI component labels
  - BAD!
        Button okButton = new Button("OK");
  - GOOD!
        String okLabel = ButtonLabel.getString("OkKey");
        Button okButton = new Button(okLabel);
What’s NOT (easily translatable)?
- “At 1:15 PM on April 13, 1998, we attack the 7 ships on Mars.”
MessageBundle_en_US.properties
    template = At {2,time,short} on {2,date,long}, we attack the {1,number,integer} ships on planet {0}.
    planet = Mars
- {2,time,short}: The time portion of a Date object. The "short" style specifies the DateFormat.SHORT formatting style.
- {2,date,long}: The date portion of a Date object. The same Date object is used for both the date and time variables. In the Object array of arguments, the index of the element holding the Date object is 2.
- {1,number,integer}: A Number object, further qualified with the "integer" number style.
- {0}: The String in the ResourceBundle that corresponds to the "planet" key.
What’s NOT = “Compound Messages”
- Example!
1. Compound Messages: messageArguments...
- Set the message arguments…
- Remember, the numbers in the template refer to the index in messageArguments!
2. Compound Messages: create formatter...
- Don’t forget to set the Locale of the formatter object...
3. Compound Messages:
- Get the template we defined earlier…
- Then pass in our arguments!
- And finally RUN...
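The three compound-message steps might look like the sketch below. The class name `CompoundMessageSketch` is hypothetical, and the template is inlined rather than read from `MessageBundle_en_US.properties`. Passing the locale to the `MessageFormat` constructor is one way to make sure the locale is set before the pattern is parsed, which is the point of step 2.

```java
import java.text.MessageFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class CompoundMessageSketch {
    // Inlined copy of the "template" entry from the slides
    // (normally fetched with messages.getString("template")).
    static final String TEMPLATE =
        "At {2,time,short} on {2,date,long}, we attack the "
        + "{1,number,integer} ships on planet {0}.";

    static String format(Locale currentLocale, String planet, int ships, Date when) {
        // Step 1: the array index of each argument matches the {n} in the template.
        Object[] messageArguments = { planet, ships, when };

        // Step 2: create the formatter with the locale already set, so the
        // date, time, and number sub-formats are built for that locale.
        MessageFormat formatter = new MessageFormat(TEMPLATE, currentLocale);

        // Step 3: pass in the arguments.
        return formatter.format(messageArguments);
    }

    public static void main(String[] args) {
        Date when = new GregorianCalendar(1998, Calendar.APRIL, 13, 13, 15).getTime();
        System.out.println(format(Locale.US, "Mars", 7, when));
        System.out.println(format(Locale.GERMANY, "Mars", 7, when));
    }
}
```

Note that only the embedded date, time, and number change with the locale here; the surrounding sentence still has to come from a translated template in the matching bundle.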
Sample Run…
    currentLocale = en_US
    At 1:15 PM on April 13, 1998, we attack the 7 ships on the planet Mars.
    currentLocale = de_DE
    Um 13.15 Uhr am 13. April 1998 haben wir 7 Raumschiffe auf dem Planeten Mars entdeckt.
(Note: I modified the example and don’t speak German, so I couldn’t translate my changes; the German does not match.)
What’s NOT (easily translatable)?
- Answer = Plurals!
    There [are no files | is one file | are 2 files] on XDisk.
  - 3 possibilities for output templates.
  - Possible integer value in one of the templates.
  - The disk name (XDisk) is also variable...
Plurals(s)’ses!?!
ChoiceBundle_en_US.properties
    pattern = There {0} on {1}.
    noFiles = are no files
    oneFile = is one file
    multipleFiles = are {2} files
Result: There are 2 files on XDisk.
Plurals!
- What’s different?
- Now we even index our templates… see fileStrings, indexed with fileLimits.
- First create the array of templates.
How =
- Not just a pattern...
- Now we have formats too...
And...
- Before, we just called format directly after applyPattern...
- Now we have setFormats too.
- This is required to give us another layer of depth in our translation.
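The plural handling described above can be sketched as follows. The class name `PluralSketch` is hypothetical and the strings from `ChoiceBundle_en_US.properties` are inlined. The `ChoiceFormat` picks one of the three templates based on the file count, and `setFormats` attaches it to the `{0}` slot of the outer pattern.

```java
import java.text.ChoiceFormat;
import java.text.Format;
import java.text.MessageFormat;
import java.text.NumberFormat;
import java.util.Locale;

class PluralSketch {
    static String describe(int numFiles, String disk) {
        Locale currentLocale = Locale.US;

        // fileStrings is indexed by fileLimits: 0 -> noFiles, 1 -> oneFile,
        // 2 and above -> multipleFiles.
        double[] fileLimits = { 0, 1, 2 };
        String[] fileStrings = { "are no files", "is one file", "are {2} files" };
        ChoiceFormat choiceForm = new ChoiceFormat(fileLimits, fileStrings);

        // Not just a pattern: the formats array adds the second layer.
        MessageFormat messageForm = new MessageFormat("", currentLocale);
        messageForm.applyPattern("There {0} on {1}.");
        Format[] formats = { choiceForm, null, NumberFormat.getInstance(currentLocale) };
        messageForm.setFormats(formats);

        // Argument 0 selects the template, argument 2 fills the {2} inside it.
        Object[] messageArguments = { numFiles, disk, numFiles };
        return messageForm.format(messageArguments);
    }

    public static void main(String[] args) {
        for (int numFiles = 0; numFiles < 4; numFiles++) {
            System.out.println(describe(numFiles, "XDisk"));
        }
    }
}
```

A translator can change both the limits and the strings per locale, which matters for languages whose plural rules differ from English.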
Sample Run…
    currentLocale = en_US
    There are no files on XDisk.
    There is one file on XDisk.
    There are 2 files on XDisk.
    There are 3 files on XDisk.
    currentLocale = fr_FR
    Il n'y a pas des fichiers sur XDisk.
    Il y a un fichier sur XDisk.
    Il y a 2 fichiers sur XDisk.
    Il y a 3 fichiers sur XDisk.
Numbers and Currencies!
- What’s wrong with my numbers?
  - We say: 345,987.246
  - Germans say: 345.987,246
  - French say: 345 987,246
Numbers...
- Supported through NumberFormat!
    Locale[] locales = NumberFormat.getAvailableLocales();
- Shows what locales are available. Note that you can also create custom formats if needed.
    345 987,246    fr_FR
    345.987,246    de_DE
    345,987.246    en_US
Money!
- Supported with: NumberFormat.getCurrencyInstance!
    9 876 543,21 F     fr_FR
    9.876.543,21 DM    de_DE
    $9,876,543.21      en_US
Percents?
- Supported with: NumberFormat.getPercentInstance!
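A small sketch of the three `NumberFormat` factories named above; the class name `NumberSketch` and its helper methods are hypothetical. Exact output for some locales (separator characters, currency symbols) depends on the locale data shipped with the JDK in use, so treat the sample tables on the slides as indicative rather than exact.

```java
import java.text.NumberFormat;
import java.util.Locale;

class NumberSketch {
    static String number(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    static String currency(double value, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(value);
    }

    static String percent(double value, Locale locale) {
        return NumberFormat.getPercentInstance(locale).format(value);
    }

    public static void main(String[] args) {
        Locale[] locales = { Locale.FRANCE, Locale.GERMANY, Locale.US };
        for (Locale locale : locales) {
            // Same values, different grouping, decimal, and currency conventions.
            System.out.println(locale + "  " + number(345987.246, locale)
                + "  " + currency(9876543.21, locale)
                + "  " + percent(0.75, locale));
        }
    }
}
```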
“A Date and Time…”
- Supported with:
  - DateFormat.getDateInstance
        DateFormat dateFormatter = DateFormat.getDateInstance(DateFormat.DEFAULT, currentLocale);
  - DateFormat.getTimeInstance
        DateFormat timeFormatter = DateFormat.getTimeInstance(DateFormat.DEFAULT, currentLocale);
  - DateFormat.getDateTimeInstance (takes a date style and a time style)
        DateFormat dateTimeFormatter = DateFormat.getDateTimeInstance(DateFormat.LONG, DateFormat.LONG, currentLocale);
Date example...
- Supported with: DateFormat.getDateInstance!
    9 avr 98     fr_FR
    9.4.1998     de_DE
    09-Apr-98    en_US
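The date factories above can be exercised with a sketch like this one. The class name `DateSketch` and the helper `longDate` are hypothetical. The DEFAULT-style output differs between JDK releases, so the strings printed by a current JDK will likely not match the 1998 sample output on the slide.

```java
import java.text.DateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class DateSketch {
    static String longDate(Date date, Locale locale) {
        return DateFormat.getDateInstance(DateFormat.LONG, locale).format(date);
    }

    public static void main(String[] args) {
        Date date = new GregorianCalendar(1998, Calendar.APRIL, 9, 13, 15).getTime();
        Locale[] locales = { Locale.FRANCE, Locale.GERMANY, Locale.US };
        for (Locale currentLocale : locales) {
            DateFormat dateFormatter =
                DateFormat.getDateInstance(DateFormat.DEFAULT, currentLocale);
            DateFormat timeFormatter =
                DateFormat.getTimeInstance(DateFormat.DEFAULT, currentLocale);
            // getDateTimeInstance needs both a date style and a time style.
            DateFormat dateTimeFormatter = DateFormat.getDateTimeInstance(
                DateFormat.LONG, DateFormat.LONG, currentLocale);
            System.out.println(currentLocale + "  " + dateFormatter.format(date)
                + "  |  " + timeFormatter.format(date)
                + "  |  " + dateTimeFormatter.format(date));
        }
    }
}
```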
Characters...
- 16 bit!
- 65,536 characters
- Encodes all major languages
- In Java, char is a Unicode character
- See unicode.org/
    0x0000 ... 0xFFFF: ASCII, Greek, Symbols, Kana, Internal, Future Use, etc...
Java support for the Unicode char...
- Character API:
  - isDigit
  - isLetterOrDigit
  - isLowerCase
  - isUpperCase
  - isSpaceChar
  - isDefined
- Unicode char values accessed with:
    String eWithCircumflex = new String("\u00EA");
Java support for the Unicode char...
- Example of some repair…
  - BAD!
        if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')) // ch is a letter
  - GOOD!
        if (Character.isLetter(ch)) // ch is a letter
Java support for the Unicode char...
- Get the Unicode category for a char:
  - LOWERCASE_LETTER
  - UPPERCASE_LETTER
  - MATH_SYMBOL
  - CONNECTOR_PUNCTUATION
  - etc...
    if (Character.getType('_') == Character.CONNECTOR_PUNCTUATION) // ch is a “connector”
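To see why the range check is the "BAD" version, the sketch below runs both tests on an accented letter. The class name `CharacterSketch` and the helper `isAsciiLetter` are hypothetical names used only for this illustration.

```java
class CharacterSketch {
    // The "BAD" check from the slide: only recognizes unaccented Latin letters.
    static boolean isAsciiLetter(char ch) {
        return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
    }

    public static void main(String[] args) {
        char eWithCircumflex = '\u00EA';

        // The range check rejects the accented letter; the Character API accepts it.
        System.out.println("range check:        " + isAsciiLetter(eWithCircumflex));
        System.out.println("Character.isLetter: " + Character.isLetter(eWithCircumflex));

        // Unicode category lookup for the underscore.
        boolean connector =
            Character.getType('_') == Character.CONNECTOR_PUNCTUATION;
        System.out.println("'_' is a connector: " + connector);
    }
}
```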
Comparing Strings
Strings of the world unite!
- Called “string collation”
- Collation rules provided by the Collator class
- Rules vary based on Locale
- Note:
  - can customize rules with RuleBasedCollator
  - can optimize collation time with CollationKey
Collator!
- As always, make a new class...
- Note the Unicode char definitions.
- Finally, note the use of collator.compare.
Sample Run!
- The English Collator returns:
    peach
    pêche
    péché
    sin
- According to the collation rules of the French language, the preceding list is in the wrong order. In French, "pêche" should follow "péché" in a sorted list. The French Collator thus returns:
    peach
    péché
    pêche
    sin
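A rough sketch of locale-sensitive sorting with `Collator`; the class name `CollatorSketch` and the helper `sort` are hypothetical. The relative order of the two accented words depends on the collation rules bundled with the JDK in use, so no expected output is shown for them here.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class CollatorSketch {
    static List<String> sort(Locale locale, String... words) {
        // Collator implements Comparator, so it can drive List.sort directly;
        // internally this calls collator.compare on pairs of words.
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(Arrays.asList(words));
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) {
        // Accented words written with Unicode escapes, as on the slide.
        String[] words = { "p\u00EAche", "sin", "peach", "p\u00E9ch\u00E9" };
        System.out.println("en_US: " + sort(Locale.US, words));
        System.out.println("fr_FR: " + sort(Locale.FRANCE, words));
    }
}
```

When the same strings are compared many times, for example in a large sort, converting them once to `CollationKey` objects with `collator.getCollationKey` avoids re-applying the rules on every comparison.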
Detecting Text Boundaries
Beware!!! The END of the word is coming!
- Important for? Word processing functions such as selecting, cutting, pasting text… etc. (double-click and select)
- BreakIterator class (imaginary cursor)
  - Character boundaries: getCharacterInstance
  - Word boundaries: getWordInstance
  - Sentence boundaries: getSentenceInstance
  - Line boundaries: getLineInstance
BreakIterator:
- First we create our wordIterator.
- Then attach the iterator to the target text.
- Loop through the text finding boundaries and set them to carets in our footer string.
    She stopped. She said, "Hello there," and then went on.
    ^ ^^ ^^^^ ^^ ^
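The three steps above might be written as in this sketch. The class name `WordBoundarySketch` and its helpers are hypothetical; the caret line is built the way the slide describes, one caret under each boundary position.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class WordBoundarySketch {
    static List<Integer> wordBoundaries(String target, Locale locale) {
        // Create the word iterator, then attach it to the target text.
        BreakIterator wordIterator = BreakIterator.getWordInstance(locale);
        wordIterator.setText(target);

        // Loop through the text, collecting every boundary offset.
        List<Integer> boundaries = new ArrayList<>();
        for (int b = wordIterator.first(); b != BreakIterator.DONE; b = wordIterator.next()) {
            boundaries.add(b);
        }
        return boundaries;
    }

    // Builds the "footer" line: a caret under each boundary, spaces elsewhere.
    static String markBoundaries(String target, Locale locale) {
        char[] footer = new char[target.length() + 1];
        Arrays.fill(footer, ' ');
        for (int b : wordBoundaries(target, locale)) {
            footer[b] = '^';
        }
        return new String(footer);
    }

    public static void main(String[] args) {
        String text = "She stopped. She said, \"Hello there,\" and then went on.";
        System.out.println(text);
        System.out.println(markBoundaries(text, Locale.US));
    }
}
```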
BreakIterator: I only speak English...
- You see this = Arabic for “house”
- Although this word contains three user characters, it is composed of six Unicode characters:
    String house = "\u0628" + "\u064e" + "\u064a" + "\u0652" + "\u067a" + "\u064f";
- Really only 3 user characters… (Imagine the characters masked on top of each other…)
BreakIterator:
- First note creating the Arabic/Saudi Arabia Locale.
- Then notice our 6 Unicode chars of text.
- Looping through the text finding boundaries yields only 3 breaks after the beginning.
    0 2 4 6
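A sketch of the character-boundary loop for the six-char string above; the class name `CharacterBoundarySketch` is hypothetical. Each base letter is followed by one combining mark, so the iterator reports a boundary only after every second char.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class CharacterBoundarySketch {
    static List<Integer> characterBoundaries(String text, Locale locale) {
        BreakIterator charIterator = BreakIterator.getCharacterInstance(locale);
        charIterator.setText(text);
        List<Integer> boundaries = new ArrayList<>();
        for (int b = charIterator.first(); b != BreakIterator.DONE; b = charIterator.next()) {
            boundaries.add(b);
        }
        return boundaries;
    }

    public static void main(String[] args) {
        // Arabic / Saudi Arabia locale.
        Locale arabic = new Locale.Builder().setLanguage("ar").setRegion("SA").build();

        // Six Unicode chars: three base letters, each followed by a combining mark.
        String house = "\u0628" + "\u064e" + "\u064a" + "\u0652" + "\u067a" + "\u064f";

        System.out.println(house.length() + " chars, boundaries at "
            + characterBoundaries(house, arabic));
    }
}
```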
BreakIterator:
- It works with:
    Please add 1.5 liters to the tank! “It’s up to us.”
    ^ ^ ^
- Problems with:
    "No man island... every man..."
    ^ ^ ^^
    My friend, Mr. Jones, has a new dog. The dog's name is Spot.
    ^ ^ ^ ^
BreakIterator:
- Returns places where you can split a line (good for word wrapping):
    She stopped. She said, "Hello there," and then went on.
    ^ ^ ^ ^ ^ ^ ^
- According to a BreakIterator, a line boundary occurs after the end of a sequence of whitespace characters (space, tab, newline).
BreakIterator:
- Java provides:
    Non-Unicode chars -> InputStreamReader -> Unicode chars -> OutputStreamWriter -> Non-Unicode
    FileInputStream fis = new FileInputStream("test.txt");
    InputStreamReader defaultReader = new InputStreamReader(fis);
    String defaultEncoding = defaultReader.getEncoding();
    FileOutputStream fos = new FileOutputStream("test.NEW");
    Writer out = new OutputStreamWriter(fos, "UTF8");   // "UTF8" = output encoding format
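The reader/writer conversion above can be tried without touching the file system, as in this sketch. The class name `EncodingSketch` and its helpers are hypothetical, and byte-array streams stand in for the `test.txt` and `test.NEW` files on the slide.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class EncodingSketch {
    // Unicode chars -> bytes in the chosen output encoding (UTF-8 here).
    static byte[] encodeUtf8(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Bytes in a known encoding -> Unicode chars.
    static String decodeUtf8(byte[] data) {
        StringBuilder text = new StringBuilder();
        try (Reader in = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int c;
            while ((c = in.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String word = "p\u00EAche";
        byte[] encoded = encodeUtf8(word);
        System.out.println(word.length() + " chars -> " + encoded.length + " bytes");
        System.out.println("round trip ok: " + decodeUtf8(encoded).equals(word));
    }
}
```

Naming the encoding explicitly on both sides avoids depending on the platform default that `getEncoding()` reports.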
For more info on i18n and:
- W3C and i18n
  - The future of HTTP, HTML, XML, CSS2…
- GUIs
- The OTHER character sets…
  - Scary stuff… those ISO standards
- UNIX/clones
  - C programming for i18n
  - X/Open I18N Model
Go forth and internationalize...