c208f8b294d4f4613e349cd08f3366d1.ppt
- Number of slides: 86
DEV-23: Global Applications and Code Pages
Jordi Sastre
Application Architect, PSC IT
Introduction
§ Global applications need to deal with several languages, countries and time zones
§ Do's and don'ts about globalization using OpenEdge® technology
§ Based on real experience from an IT department
§ Not a complete review of OpenEdge features
DEV-23: Global Applications and Code Pages © 2007 Progress Software Corporation
Agenda
§ Code Pages Overview
§ OpenEdge Settings
§ Common Mistakes
§ Hints & Tips
§ Linguistic Sorting and Collation
§ Time Zones
§ Summary
§ Questions
Code Pages Overview
§ A code page is a table that maps characters to numbers (code points)
§ ASCII was created in 1963 to encode 128 characters (code points 0 to 127) based on the English alphabet
§ ASCII = "American Standard Code for Information Interchange"
§ EBCDIC = "Extended Binary Coded Decimal Interchange Code"
§ 8-bit code pages appeared for other languages, encoding up to 256 characters
Code Pages Overview
§ All code pages include the ASCII encoding in the first 128 code points (0 to 127), except EBCDIC
§ A single code page does not contain all characters for all languages, except Unicode
§ A character may have different code points in different code pages
§ Data may become corrupted when transferred between two different code pages
Data Corruption
§ The same byte, E8, is a different character in each code page:
  • ISO8859-1 (France): E8 = "è"
  • 1250 (Czech Republic): E8 = "č"
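The effect on this slide can be reproduced outside OpenEdge. The following is a minimal Python sketch, not ABL; the codec names `iso8859_1` and `cp1250` are assumed to correspond to the ISO8859-1 and 1250 code pages named on the slide.

```python
# Assumption: Python's "iso8859_1" and "cp1250" codecs stand in for the
# ISO8859-1 and 1250 code pages shown on the slide.
raw = bytes([0xE8])  # one byte travelling between two systems

as_french = raw.decode("iso8859_1")  # how an ISO8859-1 system reads it
as_czech = raw.decode("cp1250")      # how a 1250 system reads it

print(as_french, as_czech)
```

The byte never changes; only the table used to interpret it does.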
Data Corruption
§ English uses the 128 ASCII codes that are common to all code pages, including Unicode
§ Problems may occur when:
  • Handling non-English data
  • Using platforms with non-English settings
  • Pasting MS Office text, even in English
8-bit Code Pages
§ ISO8859-1 and ISO8859-2 were defined by ISO and mainly used on Unix systems.
§ 1250 and 1252 were defined by Microsoft and used on MS Windows.
§ IBM437, IBM850 and IBM852 were defined by IBM and used on PC-DOS/MS-DOS.
8-bit Code Pages
§ ISO8859-1, IBM850 and 1252 are used for Western European languages:
  • Danish, Dutch, English, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish, Swedish, etc.
§ ISO8859-2, IBM852 and 1250 are used for Central European languages:
  • Czech, Hungarian, Polish, German, etc.
§ IBM437 is mainly used for English, although it contains some extra characters
8-bit Code Pages
§ Examples of character encoding:
      ISO8859-1  ISO8859-2  1252  1250  IBM437  IBM850  IBM852
  a   61         61         61    61    61      61      61
  á   E1         E1         E1    E1    A0      A0      A0
  È   C8         n/a        C8    n/a   n/a     D4      n/a
  Č   n/a        C8         n/a   C8    n/a     n/a     AC
  “   n/a        n/a        93    93    n/a     n/a     n/a
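A table like this one can be regenerated rather than looked up. The sketch below uses Python, not ABL, and assumes that Python's codec names map onto the seven code pages in the order ISO8859-1, ISO8859-2, 1252, 1250, IBM437, IBM850, IBM852; `code_point` is a helper name introduced here for illustration.

```python
# Assumed mapping from the slide's code pages to Python codec names.
CODE_PAGES = ["iso8859_1", "iso8859_2", "cp1252", "cp1250",
              "cp437", "cp850", "cp852"]


def code_point(ch, codec):
    """Return the hex code of ch in the given code page, or 'n/a' if absent."""
    try:
        return ch.encode(codec).hex().upper()
    except UnicodeEncodeError:
        return "n/a"


# One row per character: a, á, È, Č and the left double quotation mark.
table = {ch: [code_point(ch, cp) for cp in CODE_PAGES] for ch in "aáÈČ\u201c"}

for ch, row in table.items():
    print(ch, " ".join(row))
```

The left double quotation mark is the kind of character that arrives when text is pasted from MS Office, which is why it only exists in the Windows code pages.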
8-bit Code Pages
§ Where to find code page tables:
  • 10.1B Internationalizing Applications manual (IBM850 and ISO8859-1)
  • http://www.microsoft.com/globaldev/reference/cphome.mspx
  • http://www-03.ibm.com/servers/eserver/iseries/software/globalization/codepages.html
  • http://en.wikipedia.org
  • http://www.fileformat.info/charset/index.htm
Unicode
What is Unicode?
"Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language."
http://www.unicode.org/standard/WhatIsUnicode.html
§ ISO/IEC 10646
§ It covers virtually ALL characters in the world!
Unicode and UTF
§ Unicode stands for "Unique Code"
§ UTF stands for "Unicode Transformation Format"
§ UTF is not a code page, but an encoding format for the Unicode code page
§ UTF encodes Unicode code points into 1 to 4 bytes
§ UTF-8, UTF-16 and UTF-32 are three basic encoding forms supported by Unicode
§ All UTF formats handle all Unicode code points
UTF Encoding Examples
  Unicode   UTF-8         UTF-16        UTF-32
  U+004D    4D            00 4D         00 00 00 4D
  U+00A1    C2 A1         00 A1         00 00 00 A1
  U+00E1    C3 A1         00 E1         00 00 00 E1
  U+0470    D1 B0         04 70         00 00 04 70
  U+4E9C    E4 BA 9C      4E 9C         00 00 4E 9C
  U+10302   F0 90 8C 82   D8 00 DF 02   00 01 03 02
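Encoding examples of this kind are easy to get wrong by hand, so it can help to compute them. This is a small Python sketch (not ABL) that produces the big-endian UTF-8, UTF-16 and UTF-32 forms of a code point; `utf_forms` is a name introduced here, and `bytes.hex(" ")` assumes Python 3.8 or later.

```python
def utf_forms(code_point):
    """Return the UTF-8/16/32 byte sequences (big-endian, no BOM) as hex."""
    ch = chr(code_point)
    return {
        "UTF-8": ch.encode("utf-8").hex(" ").upper(),
        "UTF-16": ch.encode("utf-16-be").hex(" ").upper(),
        "UTF-32": ch.encode("utf-32-be").hex(" ").upper(),
    }


for cp in (0x004D, 0x00A1, 0x00E1, 0x0470, 0x4E9C, 0x10302):
    print(f"U+{cp:04X}", utf_forms(cp))
```

U+10302 lies outside the Basic Multilingual Plane, so its UTF-16 form is a surrogate pair (D800 followed by DF02) rather than a single 16-bit unit.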
Unicode Conversion
§ All code pages convert to Unicode
§ Unicode may not convert to other code pages
§ Diagram: "ü" from IBM437, IBM852, IBM850, 1252 or ISO8859-1 converts to Unicode; a Unicode character converted back to IBM437, IBM852, IBM850, 1252 or ISO8859-1 may come out as "?"
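The one-way nature of the conversion can be shown with a short Python sketch (not ABL). The codec names are assumed to match the five code pages on the slide, and `to_code_page` is a helper introduced here; Python's `errors="replace"` substitutes "?" for characters the target code page lacks, which is only an analogy for what a conversion table may do.

```python
# Assumed Python equivalents of the code pages on the slide.
CODE_PAGES = ["cp437", "cp852", "cp850", "cp1252", "iso8859_1"]


def to_code_page(text, codec):
    """Convert Unicode text to a code page, writing '?' for unmappable characters."""
    return text.encode(codec, errors="replace")


# "ü" exists in all five code pages, so it survives a round trip through Unicode.
round_trips = [to_code_page("ü", cp).decode(cp) for cp in CODE_PAGES]

# "Č" has no slot in 1252, so converting from Unicode loses it.
lost = to_code_page("Č", "cp1252")
print(round_trips, lost)
```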
OpenEdge Settings
§ Database settings
  • _db-xl-name: Database code page
  • _db-coll-name: Database collation
§ Startup parameters
  • -cpinternal: Process code page
  • -cpstream: Input/Output code page
  • -cpcoll: Process collation
  • -d: Date format
  • -E: Numeric format
More OpenEdge Settings
§ -cplog: Code page for log files (-cpstream)
§ -cpterm: Code page for screen I/O (-cpstream)
§ -cpprint: Code page for printing (-cpstream)
§ -numsep: Separator for thousands (-E)
§ -numdec: Separator for decimals (-E)
§ -cprcodein/-cprcodeout: Code page for compiled code (-cpinternal)
§ -lng: Translation Manager language
Even More OpenEdge Settings
§ convmap.cp: Character Processing Tables
§ progress.ini: Fonts
(More parameters in documentation)
OpenEdge Settings: _db-xl-name, -cpinternal and -cpstream
§ Database: _db-xl-name
§ OpenEdge process and GUI: -cpinternal
§ CHUI keyboard, screen, printer and OS files: -cpstream
§ OpenEdge code page conversions!
OpenEdge Settings (diagram)
§ Processes, each with its own -cpinternal and -cpstream: DB SERVER (_mprosrv), WEBSPEED™ (_progres -web), APPSERVER™ (_proapsv), GUI CLIENT (prowin32), CHUI CLIENT (_progres)
§ Database: _db-xl-name
§ -cpstream side of the processes: OS files, printer, web browser (WebSpeed), keyboard and screen (CHUI client)
OpenEdge Settings
§ Since OpenEdge 10 supports UTF-8 in most processes…
§ … just configure all OE settings to UTF-8!
§ Well, not really. We need to look at:
  • Operating System
  • Web Server
  • Printer drivers
  • Data from/to other systems
  • OCX's
  • Terminal Emulators, etc.
OpenEdge Settings: _db-xl-name (metaschema field)
§ Database should use Unicode (UTF-8) to ensure support for all characters
OpenEdge Settings: -cpinternal (startup parameter)
§ Processes should use Unicode to ensure support for all characters
§ Best if -cpinternal matches database
§ Batch Client (_progres -b) can use Unicode, but Character Client (_progres) cannot
§ Interfaces with Windows controls
OpenEdge Settings: -cpstream (startup parameter)
§ -cpstream is the main cause of data corruption when set incorrectly
§ It tells the code page of input/output data from/to files
§ On Character Client it also tells the code page of keyboard and screen
§ Rule of thumb:
  • Set -cpstream to match the Operating System code page
  • Use ABL to override -cpstream when needed
OpenEdge Settings: -cpstream (startup parameter)
§ Unix/Linux code page
  % locale charmap
  ISO8859-1
§ DOS code page
  C:\>mode con cp
  Status for device CON:
  ----------------------
  Code page: 437
OpenEdge Settings: convmap.cp (OpenEdge file)
§ Contains the Character Processing Tables
§ DLC/convmap.cp
§ DLC/prolang/convmap.dat
§ OpenEdge 10.1B out of the box contains:
  • 54 code pages
  • 595 code page conversion tables
  • 491 collation tables
§ More tables can be added
OpenEdge Settings: progress.ini (OpenEdge file)
§ Use appropriate fonts for code page and language:
  • Recommended to replace MS Sans Serif with Microsoft Sans Serif
  • MS Gothic or MS Mincho for Japanese
  • MS Song for Chinese
  • Use script when needed
  font0=Courier New, size=8, script=russian
  font0=Courier New, size=8, script=easteurope
Not an OpenEdge Setting: Windows Fonts
§ Linked fonts
§ Information about Windows fonts:
  http://www.microsoft.com/typography/fonts/default.aspx
  http://www.microsoft.com/globaldev/getwr/steps/wrg_font.mspx
OpenEdge Settings: Summary
§ Meet requirements for Input/Output:
  • -cpinternal for process and GUI I/O (UTF-8)
  • -cpstream for file I/O and CHUI I/O (OS)
§ Decide the code page when exporting data
§ Know the code page when importing data
Common Mistakes: Loading or importing data with the wrong code page
§ File bytes: C4 8C 7A 65 63 68
  • Read as 1250: ÄŚzech
  • Read as ISO8859-1: Ä zech
  • Read as UTF-8: Čzech
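The three readings on this slide come from the same six bytes. A minimal Python sketch (not ABL), with Python codec names assumed to stand in for 1250 and ISO8859-1:

```python
# The bytes from the slide: "Čzech" written as UTF-8.
data = bytes.fromhex("C4 8C 7A 65 63 68")

as_utf8 = data.decode("utf-8")        # the intended reading
as_1250 = data.decode("cp1250")       # C4 -> Ä, 8C -> Ś
as_latin1 = data.decode("iso8859_1")  # C4 -> Ä, 8C -> an invisible control character

print(as_utf8, as_1250, repr(as_latin1))
```

Nothing fails in any of the three cases, which is what makes this mistake easy to miss: a wrong code page usually produces wrong characters, not an error.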
Byte Order Mark (BOM)
§ Identifies the UTF encoding of a data file
§ Unicode point U+FEFF
§ U+FEFF is also encoded:
  • UTF-8: EF BB BF
  • UTF-16 BE: FE FF
  • UTF-16 LE: FF FE
  • UTF-32 BE: 00 00 FE FF
  • UTF-32 LE: FF FE 00 00
§ OpenEdge understands BOMs when reading
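How a reader can use the BOM is sketched below in Python (not ABL, and not a description of how OpenEdge implements it). `detect_bom` is a name introduced for illustration; the BOM constants come from Python's standard `codecs` module.

```python
import codecs

# Longer BOMs are checked first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
]


def detect_bom(data):
    """Return the UTF form announced by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


print(detect_bom(bytes.fromhex("EF BB BF C4 8C 7A 65 63 68")))
```

A file without a BOM returns `None`, in which case the reader has to be told the code page some other way.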
Byte Order Mark (BOM)
§ Caution!
§ File bytes: EF BB BF C4 8C 7A 65 63 68
  • Read with 1250: Čzech
  • Read with ISO8859-1: Čzech
  • Read with UTF-8: Čzech
Common Mistakes: Loading or importing data with the wrong code page
  (…)
  "imuller" "Ian Muller" "Y" "C" 1657 283200
  "jdoe" "Jane Doe" "N" "U" 3275 450010
  "jsmith" "John Smith" "Y" "C" 1450 323700
  "jsanchez" "Juan Sánchez" "Y" "C" 4250 323900
  .
  PSC
  filename=users
  records=000001133
  ldbname=mydatabase
  timestamp=2007/03/28-20:55:03
  numformat=44,46
  dateformat=mdy-1950
  map=NO-MAP
  cpstream=ISO8859-1
  .
  0000143373
Common Mistakes: Updating data with the wrong code page
§ OS = 1252: the user types "à", which arrives as E0
§ _progres with -cpstream IBM850 and -cpinternal IBM850: E0 is kept as E0
§ _mprosrv with -cpinternal ISO8859-1 and _db-xl-name ISO8859-1: E0 is converted to D3
§ The database stores D3, which is "Ó"
Common Mistakes: Updating data with the CORRECT code page
§ OS = 1252: the user types "à", which arrives as E0
§ _progres with -cpstream 1252 and -cpinternal IBM850: E0 is converted to 85
§ _mprosrv with -cpinternal ISO8859-1 and _db-xl-name ISO8859-1: 85 is converted to E0
§ The database stores E0, which is "à"
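The two update paths (wrong and correct -cpstream) can be traced byte by byte. The Python sketch below is a simplified model, not ABL and not OpenEdge's actual conversion code; `store` is a hypothetical helper, and the codec names `cp1252`, `cp850` and `iso8859_1` are assumed to match 1252, IBM850 and ISO8859-1.

```python
def store(keyboard_bytes, cpstream, cpinternal="cp850", db_code_page="iso8859_1"):
    """Model of the path: keyboard (-cpstream) -> client (-cpinternal) -> database."""
    text = keyboard_bytes.decode(cpstream)   # client interprets the input
    in_memory = text.encode(cpinternal)      # client holds it in -cpinternal
    # server converts from the client's code page to the database code page
    return in_memory.decode(cpinternal).encode(db_code_page)


typed = "à".encode("cp1252")     # the OS delivers E0

wrong = store(typed, "cp850")    # -cpstream does not match the OS
right = store(typed, "cp1252")   # -cpstream matches the OS

print(wrong.hex(), right.hex())
```

In the wrong case every conversion works exactly as configured; the damage is done at the first step, where E0 is taken to be IBM850's "Ó".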
Common Mistakes: Updating data with the wrong code page
§ Diagram: _progres -web with -cpstream UTF-8
Common Mistakes: Incorrect tools to verify data
§ Notepad sometimes guesses the code page based on the content
§ Notepad understands BOM, Excel doesn't
§ Startup parameters in Procedure Editor
§ Fonts in progress.ini
§ Terminal Emulator needs to be configured to support remote OS code page
§ Use a hexadecimal editor
§ Two wrongs may make it look right
Tips & Hints: Development and Integration
§ When starting development, make sure all the components have the correct code page settings
§ Each application may need different code page settings
§ When integrating, review the code page settings of all applications and processes involved
Tips & Hints: How to display the code page settings
  MESSAGE
    "Database = " DBCODEPAGE(1) SKIP
    "Collation = " DBCOLLATION(1) SKIP
    "-cpinternal = " SESSION:CPINTERNAL SKIP
    "-cpstream = " SESSION:CPSTREAM SKIP
    "-cpcoll = " SESSION:CPCOLL
    VIEW-AS ALERT-BOX.
Tips & Hints: Temp-tables using Word Indexes
§ Temp-tables use their own word-break tables for word indexes
§ Use -ttwrdrul parameter
§ Database word-break table: proutil -C wbreak-compiler, proutil -C word-rules
§ Progress clients (prowin32, _progres [-web]): -ttwrdrul
Tips & Hints: Input/Output
§ When using OUTPUT TO, know the code page the output needs to be converted to, which depends on how the file will be used
§ When using INPUT FROM, know in what code page the imported data was encoded
§ To override the -cpstream default:
  OUTPUT TO file CONVERT TARGET "UTF-8".
  INPUT FROM file CONVERT SOURCE "UTF-8".
§ Stamp code page, especially for integration
Tips & Hints: UTF-8 can be multi-byte!
§ Many UTF-8 characters are more than one byte:
  DEFINE VARIABLE c AS CHARACTER INIT "á".
  MESSAGE LENGTH(c) SKIP
          LENGTH(c, "RAW")
          VIEW-AS ALERT-BOX.
  returns 1 and 2 in a session with -cpinternal UTF-8
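The difference between character length and byte length is not specific to ABL. A minimal Python sketch of the same check, with `iso8859_1` assumed as the single-byte comparison:

```python
c = "á"

chars = len(c)                            # number of characters
utf8_bytes = len(c.encode("utf-8"))       # bytes when held as UTF-8
latin1_bytes = len(c.encode("iso8859_1")) # bytes in a single-byte code page

print(chars, utf8_bytes, latin1_bytes)
```

Code that sizes buffers, truncates strings or pads fixed-width records by byte count needs to account for this when moving to UTF-8.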
Tips & Hints: CHR() and ASC()
§ Use CHR() and ASC() with code page parameters
§ Do not hard-code encoding values
§ See examples…
Tips & Hints: CHR() and ASC(), Example 1
§ Detecting non-breaking blank spaces (NBSP)
  CASE SESSION:CPINTERNAL:
    WHEN "UTF-8" THEN
      IF c = CHR(49824) THEN MESSAGE "NBSP" VIEW-AS ALERT-BOX.
    WHEN "ISO8859-1" THEN
      IF c = CHR(160) THEN MESSAGE "NBSP" VIEW-AS ALERT-BOX.
  END CASE.
§ Better code:
  IF c = CHR(49824, SESSION:CPINTERNAL, "UTF-8") THEN
    MESSAGE "NBSP" VIEW-AS ALERT-BOX.
Tips & Hints: CHR() and ASC(), Example 2
§ OpenEdge silently ignores incorrect values to ASC() or CHR()
  /* When run with -cpinternal UTF-8 it returns YES because
     160 is not a valid UTF-8 encoding.
     When run with -cpinternal 1252 it returns NO. */
  MESSAGE CHR(160) = "" VIEW-AS ALERT-BOX.
  /* Always returns NO */
  MESSAGE CHR(49824, SESSION:CPINTERNAL, "UTF-8") = "" VIEW-AS ALERT-BOX.
Tips & Hints: CHR() and ASC(), Example 3
§ CHR() and ASC() work with encoding values, as opposed to code points
§ For example, this code run on a session with -cpinternal UTF-8
  DEFINE VARIABLE c AS CHARACTER NO-UNDO.
  c = "á".
  MESSAGE ASC(c) VIEW-AS ALERT-BOX.
  returns 50081 (C3 A1) and not 225 (00 E1).
§ Unicode: U+00E1; UTF-8: C3 A1
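The numbers 50081 and 49824 are simply the UTF-8 byte sequences read as one integer. A Python sketch of that arithmetic (not ABL; `encoding_value` is a helper introduced here, not an OpenEdge function):

```python
def encoding_value(ch, codec):
    """Read the encoded bytes of ch as one big-endian integer."""
    return int.from_bytes(ch.encode(codec), "big")


a_acute_utf8 = encoding_value("á", "utf-8")        # bytes C3 A1
a_acute_code_point = ord("á")                      # Unicode code point U+00E1
nbsp_utf8 = encoding_value("\u00a0", "utf-8")      # bytes C2 A0
nbsp_latin1 = encoding_value("\u00a0", "iso8859_1")  # byte A0

print(a_acute_utf8, a_acute_code_point, nbsp_utf8, nbsp_latin1)
```

This is why a value that is valid in one code page (160 for NBSP in ISO8859-1) is not a valid encoding value in UTF-8, where the same character is 49824.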
Tips & Hints: Unicode points
§ If needed, Unicode points can be used:
  DEFINE VARIABLE c AS CHARACTER NO-UNDO.
  c = "á".
  MESSAGE c = "~u00E1" SKIP
          c = CHR(50081) SKIP
          c = CHR(225, "UTF-8", "1252")
          VIEW-AS ALERT-BOX.
Tips & Hints: Un-corrupting data
§ OS = 1252: the user typed "à", which arrived as E0
§ _progres with -cpstream IBM850 and -cpinternal IBM850: E0 was kept as E0
§ _mprosrv with -cpinternal ISO8859-1 and _db-xl-name ISO8859-1: E0 was converted to D3
§ The database stores D3, which is "Ó"
Tips & Hints: Un-corrupting data
§ ISO8859-1 database with data encoded in IBM850
§ Run on session with -cpinternal iso8859-1
  FOR EACH myTable EXCLUSIVE-LOCK:
    RUN FixChar(INPUT-OUTPUT myTable.myField).
  END.

  PROCEDURE FixChar:
    DEF INPUT-OUTPUT PARAM c AS CHAR NO-UNDO.
    /* CODEPAGE-CONVERT(string, target code page, source code page) */
    c = CODEPAGE-CONVERT(c, "IBM850", "ISO8859-1").
  END PROCEDURE.
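The repair works because it undoes the one conversion that should not have happened. A Python sketch of the same idea at byte level (not ABL; `fix_bytes` is a hypothetical helper, and `iso8859_1` and `cp850` are assumed equivalents of ISO8859-1 and IBM850):

```python
def fix_bytes(stored):
    """Undo a mistaken IBM850 -> ISO8859-1 conversion on stored bytes."""
    # Reading as ISO8859-1 and writing as IBM850 reverses the bad conversion.
    # This raises UnicodeEncodeError if a character has no IBM850 equivalent.
    return stored.decode("iso8859_1").encode("cp850")


corrupted = bytes([0xD3])        # what the database holds: "Ó"
repaired = fix_bytes(corrupted)  # the byte the user originally typed

print(repaired.hex(), repaired.decode("iso8859_1"))
```

A repair like this is only safe on rows known to be corrupted in this specific way; running it on correct data would corrupt it instead.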
Tips & Hints: BOM
§ How to output UTF-8 BOM to a file
  OUTPUT TO text.txt CONVERT TARGET "UTF-8".
  PUT CONTROL "~357~273~277". /* BOM: octal for EF BB BF */
  PUT UNFORMATTED "UTF-8 text".
  OUTPUT CLOSE.
§ Intended for Notepad (.txt) or web browser (.html)
Tips & Hints: Web browser needs to map WebSpeed's -cpstream
§ _progres -web runs with -cpstream UTF-8; which encoding does the web browser use?
§ Original outputHeader procedure:
  PROCEDURE outputHeader:
    output-content-type ("text/html").
  END PROCEDURE.
Tips & Hints: Web browser needs to map WebSpeed's -cpstream (1)
§ Use OpenEdge's convcp.p procedure
  PROCEDURE outputHeader:
    DEF VAR cMimeCP AS CHAR NO-UNDO.
    RUN adecomm/convcp.p (SESSION:CPSTREAM, "ToMime", OUTPUT cMimeCP).
    output-content-type ("text/html; charset=" + cMimeCP).
  END PROCEDURE.
Tips & Hints: Web browser needs to map WebSpeed's -cpstream (2)
§ User Defined Function
  PROCEDURE outputHeader:
    output-content-type ("text/html; charset=" + GetMimeCP(SESSION:CPSTREAM)).
  END PROCEDURE.
§ GetMimeCP converts OpenEdge code page names to MIME names
§ See example…
Tips & Hints: GetMimeCP example
  FUNCTION GetMimeCP RETURNS CHAR (INPUT progress-CodePage AS CHAR):
    DEF VAR pro-cplist AS CHAR INIT "1250,1251,1252,1253,1254,1255,1256,1257,1258,620-2533,BIG-5,EUCJIS,GB2312,IBM037,IBM273,IBM277,IBM278,IBM284,IBM297,IBM437,IBM500,IBM851,IBM852,IBM857,IBM858,IBM861,IBM862,IBM866,ISO8859-10,ISO8859-15,ISO8859-2,ISO8859-3,ISO8859-4,ISO8859-5,ISO8859-6,ISO8859-7,ISO8859-8,ISO8859-9,KOI8-R,KSC5601,ROMAN-8,SHIFT-JIS,UCS2,UTF-8".
    DEF VAR MIME-cplist AS CHAR INIT "Windows-1250,Windows-1251,Windows-1252,Windows-1253,Windows-1254,Windows-1255,Windows-1256,Windows-1257,Windows-1258,TIS-620,Big5,EUC-JP,GB_2312-80,IBM037,IBM273,IBM277,IBM278,IBM284,IBM297,IBM437,IBM500,IBM851,IBM852,IBM857,IBM00858,IBM861,IBM862,IBM866,ISO-8859-10,ISO-8859-15,ISO-8859-2,ISO-8859-3,ISO-8859-4,ISO-8859-5,ISO-8859-6,ISO-8859-7,ISO-8859-8,ISO-8859-9,KOI8-R,KS_C_5601-1987,hp-roman8,Shift_JIS,UTF-16,UTF-8".
    DEF VAR i AS INT.
    /* Both lists are positional: entry i of one maps to entry i of the other */
    i = LOOKUP(progress-CodePage, pro-cplist).
    RETURN IF i = 0 THEN "Unknown" ELSE ENTRY(i, MIME-cplist).
  END FUNCTION.
Tips & Hints: Caution with numeric format
§ Do not store decimal values in char fields
  /* prog1.p */
  DEFINE VARIABLE d AS DECIMAL INIT 123.45.
  CREATE table.
  char1 = STRING(d).

  /* prog2.p */
  FIND FIRST table.
  DISPLAY DECIMAL(table.char1).
§ prog2.p will fail if run with a different -E or -numdec than prog1.p
§ Comma-delimited lists
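The failure mode is the same in any language that converts numbers to text using a session-dependent separator. A Python sketch under that assumption (not ABL; `to_text` and `from_text` are hypothetical stand-ins for STRING() and DECIMAL() under a given decimal separator):

```python
def to_text(value, decimal_sep):
    """Format a number with the writer's decimal separator (stand-in for STRING)."""
    return f"{value:.2f}".replace(".", decimal_sep)


def from_text(text, decimal_sep):
    """Parse a number with the reader's decimal separator (stand-in for DECIMAL)."""
    return float(text.replace(decimal_sep, "."))


stored = to_text(123.45, ",")          # written by a session using European format
same_format = from_text(stored, ",")   # read back with the same format: works

try:
    from_text(stored, ".")             # read back with American format
    mismatch_failed = False
except ValueError:
    mismatch_failed = True

print(stored, same_format, mismatch_failed)
```

Storing the value in a numeric field avoids the problem, because the separator then only matters at display time.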
Tips & Hints: Date and Numeric formats can be changed at run time
  DEFINE VARIABLE mynum AS DECIMAL NO-UNDO.

  SESSION:DATE-FORMAT = "mdy".
  DISPLAY SESSION:DATE-FORMAT TODAY SKIP.
  SESSION:DATE-FORMAT = "dmy".
  DISPLAY SESSION:DATE-FORMAT TODAY FORMAT "99-99-9999" SKIP.
  SESSION:DATE-FORMAT = "ymd".
  DISPLAY SESSION:DATE-FORMAT TODAY FORMAT "9999.99" SKIP.

  mynum = 12345.67.
  SESSION:NUMERIC-FORMAT = "American".
  DISPLAY SESSION:NUMERIC-FORMAT STRING(mynum) SKIP.
  SESSION:NUMERIC-FORMAT = "European".
  DISPLAY SESSION:NUMERIC-FORMAT STRING(mynum) SKIP
    WITH NO-LABELS.
Tips & Hints: Miscellaneous
§ Never use the "undefined" code page
§ If the source and target code pages are the same, no conversion happens
§ If we always make the same mistake, we may not notice the data corruption
§ r-code is encoded using -cpinternal
§ Source files are encoded using -cpstream
§ Recognize UTF-8 read as iso8859-1:
  • ö becomes Ã¶
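UTF-8 text that has been read as iso8859-1 has a recognizable shape, and the misreading can often be reversed. A Python sketch (not ABL; `repair_mojibake` is a name introduced here for illustration):

```python
def repair_mojibake(text):
    """Return the repaired text if it looks like UTF-8 read as ISO8859-1, else None."""
    try:
        repaired = text.encode("iso8859_1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None  # not representable, or the bytes are not valid UTF-8
    return repaired if repaired != text else None


garbled = "ö".encode("utf-8").decode("iso8859_1")  # simulate the wrong read

print(garbled, repair_mojibake(garbled))
```

Plain ASCII text and correctly decoded text both return `None`, so the helper can be used as a cheap screening test before attempting any repair.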
Tips & Hints: DBA reminder
§ How to create a UTF-8 word-break table:
  > proutil -C wbreak-compiler %DLC%\prolang\convmap\utf8-bas.wbt 1
  > copy proword.1 %DLC%
§ How to create a UTF-8 database:
  > prodb <db> %DLC%\prolang\utf\empty.db
  > proutil <db> -C word-rules 1
§ How to start a UTF-8 client:
  > _progres -b -cpinternal UTF-8 -ttwrdrul 1
  > prowin32 -cpinternal UTF-8 -ttwrdrul 1
Linguistic Sorting and Collation
§ Collation: Set of rules for ordering and comparing character data
§ OpenEdge supports 54 ICU (International Components for Unicode) collations with UTF-8
§ Local databases vs global databases
§ COMPARE and COLLATE
Linguistic Sorting and Collation: Sorting with Basic collation
  FOR EACH mytable BY myfield:
    DISPLAY myfield WITH FONT 8.
  END.
§ Basic: Aaa, Ááá, Äää, Ççç, Ĉĉĉ, Bbb, Ccc, Zzz
Linguistic Sorting and Collation: Sorting with English collation
  FOR EACH mytable BY COLLATE(myfield, "CASE-INSENSITIVE", "ICU-UCA"):
    DISPLAY myfield WITH FONT 8.
  END.
§ Basic: Aaa, Ááá, Äää, Ççç, Ĉĉĉ, Bbb, Ccc, Zzz
§ ICU-UCA: Aaa, Ááá, Äää, Bbb, Ccc, Ĉĉĉ, Ççç, Zzz
Linguistic Sorting and Collation: Sorting with Finnish collation
  FOR EACH mytable BY COLLATE(myfield, "CASE-INSENSITIVE", "ICU-fi"):
    DISPLAY myfield WITH FONT 8.
  END.
§ Basic: Aaa, Ááá, Äää, Ççç, Ĉĉĉ, Bbb, Ccc, Zzz
§ ICU-UCA: Aaa, Ááá, Äää, Bbb, Ccc, Ĉĉĉ, Ççç, Zzz
§ ICU-fi: Aaa, Ááá, Bbb, Ccc, Ĉĉĉ, Ççç, Zzz, Äää
Linguistic Sorting and Collation: Comparing with Basic collation
  FOR EACH mytable WHERE myfield >= "C" BY myfield:
    DISPLAY myfield WITH FONT 8.
  END.
§ Basic: Ccc, Zzz
Linguistic Sorting and Collation: Comparing with English collation
  FOR EACH mytable
      WHERE COMPARE(myfield, ">=", "C", "CASE-INSENSITIVE", "ICU-UCA")
      BY COLLATE(myfield, "CASE-INSENSITIVE", "ICU-UCA"):
    DISPLAY myfield WITH FONT 8.
  END.
§ Basic: Ccc, Zzz
§ ICU-UCA: Ccc, Ĉĉĉ, Ççç, Zzz
Linguistic Sorting and Collation: Comparing with Finnish collation
  FOR EACH mytable
      WHERE COMPARE(myfield, ">=", "C", "CASE-INSENSITIVE", "ICU-fi")
      BY COLLATE(myfield, "CASE-INSENSITIVE", "ICU-fi"):
    DISPLAY myfield WITH FONT 8.
  END.
§ Basic: Ccc, Zzz
§ ICU-UCA: Ccc, Ĉĉĉ, Ççç, Zzz
§ ICU-fi: Ccc, Ĉĉĉ, Ççç, Zzz, Äää
Linguistic Sorting and Collation: Global Setup
§ Caution with performance!
§ Database: -cpcoll ICU-uca
§ AppServer: -cpcoll ICU-uca, temp-tables; uses client collation in COMPARE and COLLATE
§ Clients, each with its own temp-tables:
  • English user: -cpcoll ICU-en
  • French user: -cpcoll ICU-fr
  • Czech user: -cpcoll ICU-cs
  • Finnish user: -cpcoll ICU-fi
  RUN ASprg.p ON hAppServer
    (INPUT SESSION:CPCOLL,
     INPUT USERID,
     INPUT <other parameters>,
     OUTPUT TABLE ttMytable).
Time Zones: Considerations
§ Timestamps: client vs server vs GMT
§ Display time: saved vs converted
§ Database queries: saved vs converted
http://www.csgnetwork.com/timezonemap.html
Time Zones: Extra consideration
§ Daylight Saving Time for time conversions
§ Map legend: DST used, DST no longer used, DST never used
http://en.wikipedia.org/wiki/Daylight_saving
Time Zones: OS Support
§ Operating Systems have time zone tables
  • Solaris: /usr/share/lib/zoneinfo
  • HP-UX: /usr/lib/tztab
  • Red Hat: /usr/share/zoneinfo
  • Windows: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones
§ Java uses its own time zone tables
§ OpenEdge relies on the platform
Time Zones: DATETIME and DATETIME-TZ data types
  DEFINE VARIABLE dt AS DATETIME.
  DEFINE VARIABLE dtz AS DATETIME-TZ.
  dt = NOW.
  dtz = NOW.
  MESSAGE dt SKIP dtz VIEW-AS ALERT-BOX.
§ This is offset, not Time Zone!
Time Zones: Timestamping
§ Database: all times are GMT
§ AppServer: gets OS time in GMT
§ User: GMT is converted to the user's time zone
Time Zones: Displaying times
§ Database holds GMT times; the AppServer converts GMT to the user's time zone
§ Example for a stored time of 12:30 GMT:
  User location         Summer   Winter
  Bedford, USA          08:30    07:30 (-1)
  Berlin, Germany       14:30    13:30 (-1)
  Brisbane, Australia   22:30    22:30 (0)
  Sydney, Australia     22:30    23:30 (+1)
Time Zones: Database tables
§ users
  10  user-id     C  X(8)        User ID
  20  tz-id       C  X(4)        Time zone ID
§ timezones
  10  tz-id       C  X(4)        Time zone ID
  20  tz-name     C  X(40)       Time zone name
§ tz-changes
  10  tz-id       C  X(4)        Time zone ID
  20  tz-date     D  99/99/9999  Date that the changes apply from
  30  min-1       I  ->>>9       Normal minutes of difference from GMT
  40  min-2       I  ->>>9       Minutes of difference from GMT during DST
  50  from-month     >9          Month when DST starts
  60  from-day       9           Code for day when DST starts
  70  from-time      99:99       Time when DST starts
  80  to-month       >9          Month when DST ends
  90  to-day         9           Code for day when DST ends
  100 to-time        99:99       Time when DST ends
Time Zones: ABL functions
§ GetGMT() to get current time in GMT
  FUNCTION GetGMT RETURNS DATETIME ():
    DEF VAR dtGMT AS DATETIME NO-UNDO.
    /* TIMEZONE is the session's offset from GMT in minutes */
    dtGMT = ADD-INTERVAL(NOW, - TIMEZONE, 'MINUTES').
    RETURN dtGMT.
  END FUNCTION.
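The arithmetic behind GetGMT() is a subtraction of the local offset in minutes, and the display step is the reverse addition. A Python sketch of both directions (not ABL; `to_gmt` and `to_user_time` are hypothetical helpers, and the offsets are supplied by the caller rather than looked up from any time zone table):

```python
from datetime import datetime, timedelta


def to_gmt(local, offset_minutes):
    """Subtract the local offset from GMT, as ADD-INTERVAL(NOW, -TIMEZONE, 'MINUTES') does."""
    return local - timedelta(minutes=offset_minutes)


def to_user_time(gmt, offset_minutes):
    """Add the user's offset from GMT to a stored GMT time."""
    return gmt + timedelta(minutes=offset_minutes)


# A server at GMT+2 stamps 14:30 local time as 12:30 GMT.
stamp = to_gmt(datetime(2007, 3, 28, 14, 30), 120)

# A user at GMT-4 sees that stamp as 08:30.
shown = to_user_time(stamp, -240)

print(stamp, shown)
```

The offset is only valid for a given date: the same user may be at -240 minutes in summer and -300 in winter, which is why a per-zone table of DST changes is needed on top of this arithmetic.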
Time Zones: ABL functions
§ ConvertDT() to convert GMT to user's time
  FUNCTION ConvertDT RETURNS DATETIME
      (INPUT pdtNow AS DATETIME NO-UNDO,
       INPUT pcTz-id AS CHARACTER NO-UNDO):
    DEF VAR dtOut AS DATETIME NO-UNDO.
    /* Latest rule for this time zone that applies on or before the given date */
    FIND LAST tz-changes NO-LOCK
      WHERE tz-changes.tz-id = pcTz-id
        AND tz-changes.tz-date <= DATE(pdtNow) NO-ERROR.
    (...)
    RETURN dtOut.
  END FUNCTION.
Summary
§ UTF-8 for database and -cpinternal as a start
§ Know the code page of data getting into and out of OpenEdge (-cpstream / CONVERT)
§ Two wrongs may make it look right
§ It's not only about conversion, but checking results as well: use hexadecimal tools
§ Take a look at the 10.1B Internationalizing Applications manual
§ Code pages are tricky, but fun!
Questions?
Thank you for your time