- Number of slides: 129
Introduction to Localization World Conference Berlin 2009 Richard Sikes, Angelika Zerfass, Daniel Goldschmidt
Agenda • Introduction – the problem, problem definition • Tools • Localization Process 101
Agenda Localization, internationalization, globalization, translation, regionalization… too many “-ation” terms. During the next three sessions we will make sense of them for you.
Agenda Globalization to understand requirements (for going global) Internationalization to enable products to meet requirements Localization to fulfill requirements
The problem
The Problem • A well-known company developed a powerful CRM (Customer Relationship Management) product • The first and main market was, as usual, the USA • The board decided that it was time to penetrate new markets: Europe, Far East, Middle East • The R&D department claimed: no problem, we are fully Unicode… let’s go!
The Problem Ouch…
The Problem #1 – String Externalization • All the GUI (graphical user interface) had to be translated to the target languages • But lots of strings were hard-coded (written directly into the code)
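A minimal sketch of what string externalization means in practice, written in Python. The catalog `MESSAGES`, its keys, and the helper `tr` are hypothetical names chosen for illustration; a real product would typically load per-locale resource files rather than keep a dictionary in the source.

```python
# Hypothetical message catalog. In a real product these entries would live
# in per-locale resource files that translators can work on, not in code.
MESSAGES = {
    "en": {"greeting": "Welcome", "save": "Save"},
    "de": {"greeting": "Willkommen", "save": "Speichern"},
}


def tr(key: str, locale: str, fallback: str = "en") -> str:
    """Look up a UI string by key, falling back to the default locale."""
    return MESSAGES.get(locale, {}).get(key) or MESSAGES[fallback][key]


print(tr("save", "de"))  # string from the German catalog
print(tr("save", "es"))  # no Spanish catalog yet, falls back to English
```

The code refers to strings only by key, so adding a language means adding a catalog, not editing the program.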
The Problem #2 – Sorting • After translating the GUI, the first installation took place in Spain • Some customers were unhappy: many indexes and lexical orders were corrupted • In Traditional Spanish, the letters “CH” and “LL” have their own positions in the sort order • A, B, C, CH, D… K, L, LL, M, … etc. • Expected order: Curioso – Chalina – Luz – Llama
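The sort-order problem can be sketched in Python as follows. This is an illustration only: the alphabet list, `RANK`, and `traditional_spanish_key` are assumptions made for the example, accented vowels are not handled, and a production system would rely on a proper collation library instead.

```python
import re

# Traditional Spanish alphabet: "ch" sorts after "c", "ll" after "l".
ALPHABET = "a b c ch d e f g h i j k l ll m n ñ o p q r s t u v w x y z".split()
RANK = {letter: i for i, letter in enumerate(ALPHABET)}
# Match the digraphs first, then any single character.
UNIT = re.compile(r"ch|ll|.", re.DOTALL)


def traditional_spanish_key(word: str) -> list:
    """Split a word into collation units and rank each one."""
    return [RANK.get(u, len(ALPHABET)) for u in UNIT.findall(word.lower())]


words = ["Llama", "Luz", "Chalina", "Curioso"]
print(sorted(words))                               # plain code-point order
print(sorted(words, key=traditional_spanish_key))  # traditional order
```

With the plain sort, “Chalina” lands before “Curioso” and “Llama” before “Luz”, which is exactly the corruption the Spanish customers reported.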
The Problem The second installation, in Germany, had three problems: – The search function didn’t work – The financial and numerical functions were buggy – Many strings were cut off in the GUI
The Problem #3 – Collation
• Combining characters:
  – Ü (Latin capital letter U with diaeresis, 0x00DC) vs. U¨ (Latin capital letter U, 0x0055, followed by combining diaeresis, 0x0308)
  – ç (Latin small letter C with cedilla, 0x00E7) vs. c¸ (Latin small letter C, 0x0063, followed by combining cedilla, 0x0327)
• Ligatures: ﬁ = fi
• Case sensitive/insensitive
• Accent sensitive/insensitive
• Upper case of ß (Latin small letter sharp S) = SS
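Why the search broke can be shown with Python's standard `unicodedata` module. This is a sketch, not the product's actual fix; the helper name `search_key` is an assumption for the example.

```python
import unicodedata

precomposed = "\u00dc"   # Ü stored as a single code point
decomposed = "U\u0308"   # U followed by a combining diaeresis

# The two look identical on screen but compare as different strings.
print(precomposed == decomposed)
# After normalization to the same form they compare equal.
print(unicodedata.normalize("NFC", decomposed) == precomposed)


def search_key(text: str) -> str:
    """Normalize compatibility forms (e.g. ligatures), then fold case."""
    return unicodedata.normalize("NFKC", text).casefold()


print(search_key("Straße") == search_key("STRASSE"))
print(search_key("\ufb01"))  # the "fi" ligature becomes two letters
```

Comparing `search_key` values instead of raw strings gives a case-insensitive, ligature-insensitive match; accent-insensitive matching would need an extra step that strips combining marks.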
The Problem #4 – Numerical format • 4.500 (UK) ≠ 4.500 (DE) • 4,500 (UK) = 4.500 (DE) • 4.500 (UK) = 4,500 (DE)
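A small Python sketch of locale-aware number handling. The `SEPARATORS` table and both helper functions are hypothetical and cover only the two conventions on the slide; real code would take the separators from locale data.

```python
# Hypothetical per-locale separators: (grouping, decimal).
SEPARATORS = {"en-GB": (",", "."), "de-DE": (".", ",")}


def format_number(value: float, locale: str, decimals: int = 0) -> str:
    """Format with the grouping and decimal separators of the locale."""
    group, decimal = SEPARATORS[locale]
    text = f"{value:,.{decimals}f}"  # always "," grouping and "." decimal
    # Swap through a placeholder so one replacement does not undo the other.
    return text.replace(",", "\0").replace(".", decimal).replace("\0", group)


def parse_number(text: str, locale: str) -> float:
    """Parse a number written with the separators of the locale."""
    group, decimal = SEPARATORS[locale]
    return float(text.replace(group, "").replace(decimal, "."))


print(format_number(4500, "en-GB"), format_number(4500, "de-DE"))
print(parse_number("4.500", "en-GB"), parse_number("4.500", "de-DE"))
```

The same text “4.500” parses to four and a half under the UK convention and to four thousand five hundred under the German one, which is why financial functions must never parse numbers without knowing the locale.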
The Problem #5 – Length
• German strings are usually longer than in most languages
• English: Redo / German: Wiederherstellen
• English: Skip / German: Zeilensprung
The Problem #6 – Date Format • The client from Spain called after 2 months; the license had expired earlier than expected! Does 01/07/2006 mean “July 1, 2006” or “January 7, 2006”?
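The license problem can be reproduced with Python's `datetime` module. The variable names are illustrative; the point is only that the same text yields two different dates depending on the assumed field order.

```python
from datetime import datetime

raw = "01/07/2006"

# The same text, read with the US (month first) and Spanish (day first) order.
as_us = datetime.strptime(raw, "%m/%d/%Y").date()
as_spain = datetime.strptime(raw, "%d/%m/%Y").date()

print(as_us.isoformat())     # 2006-01-07
print(as_spain.isoformat())  # 2006-07-01
print((as_spain - as_us).days, "days apart")
```

Storing and exchanging dates in an unambiguous form such as ISO 8601 (year-month-day), and formatting them per locale only for display, avoids this class of bug.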
The Problem #6 – Date Format, Calendars
• The first day of the week is Monday... or Sunday (weekend)
• Year length
• Week numbers (ISO? Other?)
• Last Monday
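How much week numbering depends on convention can be seen with one date in Python. This sketch uses only the standard `datetime` module; the chosen date is an example.

```python
from datetime import date

new_year = date(2006, 1, 1)  # 1 January 2006 fell on a Sunday

# ISO 8601: weeks start on Monday, and week 1 contains the first Thursday.
iso_year, iso_week, iso_weekday = new_year.isocalendar()
print(iso_year, iso_week, iso_weekday)  # 2005 52 7

# C-style week numbers: %U counts weeks starting on Sunday,
# %W counts weeks starting on Monday.
print(new_year.strftime("%U"), new_year.strftime("%W"))  # 01 00
```

The same day is week 52 of the previous year, week 1, or week 0, depending on which rule the software silently assumes.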
The Problem #7 - Encoding The installation in Russia was catastrophic: • All imported data from the legacy systems was full of question marks. • All data inserted by the user couldn’t be retrieved from the database • This was the first installation using a non “Western European” encoding!
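The question marks are easy to reproduce in Python. This is a sketch of the failure mode, not of the product's actual data path; Latin-1 stands in here for whichever Western European encoding the legacy import assumed.

```python
text = "Привет"  # Russian for "hello"

# A Western European code page has no Cyrillic letters, so with lenient
# error handling every letter is replaced by a question mark.
lossy = text.encode("latin-1", errors="replace")
print(lossy)  # b'??????'

# A Unicode encoding round-trips the same text without loss.
round_trip = text.encode("utf-8").decode("utf-8")
print(round_trip == text)
```

Once the data has been written as question marks the original characters are gone, which is why the stored records could not be retrieved afterwards.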
The Problem #8 – Segmentation • In Japan the problem got even worse: the parsers stopped working. • In Japanese, there are no white spaces between words, so the tokenizers didn’t work properly. • Tokenization is the process of demarcating and possibly classifying sections of a string of input characters.
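A whitespace tokenizer fails on Japanese in the simplest possible way, as this Python sketch shows. The two sentences are example data with the same meaning.

```python
english = "I am a student"
japanese = "私は学生です"  # the same sentence; no spaces between words

print(english.split())   # four tokens
print(japanese.split())  # a single token: the whole sentence
```

Languages written without spaces need a dictionary-based or statistical word breaker (for example, the break iterators provided by the ICU library) instead of splitting on whitespace.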
The Problem #9 – Politics The Hebrew website had some minor issues: When localizing a website for Israel, which map shall we use: • The one with Judea and Samaria • The one with the Palestinian Authority • The one without the occupied territories “Judea and Samaria” vs. “occupied territories”
The Problem #10 – Grammar • Singular? Plural? • Male, female, something else? • How to translate concatenated strings?
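The singular/plural question can be sketched in Python. The rule function and the `TEMPLATES` table are hypothetical; the Russian rule shown follows the usual one/few/many pattern for whole numbers, and a real project would take such rules from locale data rather than hand-code them.

```python
def plural_category(n: int, lang: str) -> str:
    """Pick a plural category for a whole number in the given language."""
    if lang == "ru":
        if n % 10 == 1 and n % 100 != 11:
            return "one"
        if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
            return "few"
        return "many"
    return "one" if n == 1 else "other"  # English-style rule


# Each category gets its own complete, translatable template.
TEMPLATES = {
    "en": {"one": "{n} file", "other": "{n} files"},
    "ru": {"one": "{n} файл", "few": "{n} файла", "many": "{n} файлов"},
}


def count_files(n: int, lang: str) -> str:
    return TEMPLATES[lang][plural_category(n, lang)].format(n=n)


for n in (1, 3, 11, 21):
    print(count_files(n, "en"), "|", count_files(n, "ru"))
```

Code that simply appends an “s” when the count is not 1 cannot be translated into a language with three plural forms.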
The Problem #10 – Grammar
String concatenation example: The Winfax Installer has found %s.
• Case
  – Microsoft: S=“Outlook”
  – Netscape: S=“Netscape Mail”
  – Notes: S=“Notes Email”
  – Else: that you have no email provider.
The Problem #11 – Graphics & Symbols The OK gesture: • English-speaking: OK • France: zero, nothing, worthless • Mediterranean: a rude sign • Japan: money • Brazil & Germany: vulgar, obscene gesture
The Problem – more issues
• Color scheme
• Time zone
• Paper sizes (A4 vs. Letter)
• Phone numbers
• Address format
• Temperature
• Measurements
Culture is Everywhere “If I'm selling to you, I speak your language. If I'm buying, dann müssen Sie Deutsch sprechen (then you must speak German)” Willy Brandt
Problem Definition
Terms Globalization • Adaptation of marketing strategies to regional requirements of all kinds. Internationalization • Engineering of a product to enable efficient adaptation of that product to local requirements. Localization • Localization is the process of adapting a (software) product and accompanying materials to suit a target-market locale.
Terms Locale • A locale is a geographic region defined by a combination of language and cultural norms. “Locale” is not to be confused with “language.” For example: fr-FR, fr-CA, fr-CH. Fully supporting locales requires: – Globalization – to understand requirements – Internationalization – to enable products to meet requirements – Localization – to fulfill requirements
Globalization, Internationalization, Localization [Diagram] GLOBALIZATION: Expansion of marketing strategies to address regional requirements of all kinds. INTERNATIONALIZATION: Engineering of a product to enable efficient adaptation to local requirements. LOCALIZATION: Adapting software and accompanying materials to suit target-market locales (German, French, Chinese, Japanese, Portuguese).
Costs that are generated in one place become visible in another.
Globalization Expansion of marketing strategies to address regional requirements of all kinds
Globalization IMPLICATIONS:
• International market research
• Prioritize local markets through business case analysis
• Development of separate business cases for emerging markets
• Product planning with serving of diverse markets in mind
• Tracking of revenues by locale
• Extensive liaison with foreign sales offices and resources
Globalization is a mind set as much as a task set.
Internationalization Engineering of a product to enable efficient adaptation to local requirements
Internationalization IMPLICATIONS: • Removal of cultural assumptions (such as date formats) • Implementation of support for global norms (such as language character sets or accounting procedures). Internationalization is an expansion of product capability to be local-generic.
Localization The process of adapting software and accompanying materials to suit a target-market locale with the goal of making the product "transparent" to that locale, so that native users would interact with it as if it were developed there and for that locale alone.
Localization IMPLICATIONS: • Language and character set support • Support for various format settings such as decimal delimitation, time/date display, and other such norms. • Conformance with locale-specific technical norms. Localization imposes constraints on software’s regional applicability.
Localization • Success: The product appears to have been developed in the target market • Failure: Users can easily notice that the program was adapted (Please read the instructions on the package of hygiene products in the bathroom…)
Internationalizing the UI
The Other Side of the Fence: What Localization Managers Often Face Internally
• Lack of Understanding re Localization Issues and Processes
• Poorly Internationalized Software
• Underestimation of the “Ripple Effect” Caused by Changes
• Inadequate Version Control
• Core Project Slippage
• Marketing Managers Who Can’t Plan Ahead
• Changing Priorities
• Inadequate International Quality Assurance
• FUD About Localization
HOW CAN YOU HELP?
Concatenation – Definition • Building sentences out of two or more separate parts using replaceable string variables. • Changes in situation will cause the calling string to call a different sub-string. This can lead to various types of problems: – Linguistic logic hiccoughs – The translator can’t determine what or where the substrings are • Programmers LOVE concatenation!
Concatenation – Example
The Winfax Installer has found %s.
• Case
  – Microsoft: S=“Outlook”
  – Netscape: S=“Netscape Mail”
  – Notes: S=“Notes Email”
  – Else: that you have no email provider.
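One common way around the concatenation problem is to give every case its own complete sentence with a named placeholder. The Python sketch below is an illustration under that assumption; the keys, the `MESSAGES` table, and `installer_message` are hypothetical names.

```python
from typing import Optional

# Every case is a complete sentence that a translator can see and reorder.
# Only a product name is ever substituted, never a sentence fragment.
MESSAGES = {
    "found_client": "The Winfax Installer has found {client}.",
    "no_client": "The Winfax Installer has found that you have no email provider.",
}


def installer_message(client: Optional[str]) -> str:
    """Build the message from a full-sentence template."""
    if client is None:
        return MESSAGES["no_client"]
    return MESSAGES["found_client"].format(client=client)


for name in ("Outlook", "Netscape Mail", "Notes Email", None):
    print(installer_message(name))
```

The “else” branch no longer depends on English word order, so each target language can phrase the two sentences independently.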
Concatenation – Excel example
Concatenation String probably not found by translator or hard coded.
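CGI code snippet in Perl

    # Fill a template file: replace each {{KEY}} marker with the value
    # stored under KEY in the hash reference $HTML, then print the result.
    sub print_form {
        my ($content);
        my ($template, $HTML) = @_;
        # The English error text below is hard-coded in the source.
        open (FILE, "<$template") or die "Couldn't open $template: $!\n";
        while (<FILE>) {
            s/\{\{(.*?)\}\}/$HTML->{$1}/g;
            $content .= $_;
        }
        close FILE;
        print $content;
    }

    # Report an error through the error template.
    sub error_out {
        my (%HTML);
        $HTML{CGI} = $cgi;
        $HTML{ERROR} = shift;
        # Pass the hash by reference, as print_form expects.
        print_form("$path_templates/error.html", \%HTML);
    }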
Introduction to Localization Tools and Evaluation Localization World Conference Berlin 2009 Angelika Zerfass, Richard Sikes, Daniel Goldschmidt
[Diagram] Possible file preparation · Project Management / Workflow Management · Source language files · Software Localization Tool · Self-developed tool · Target language files · API · Alignment · Terminology Extraction · Term base / term list · Translation Memory Tool · Word count tool · Macros · Possible text extraction · File conversion · Editor of TM tool · QA tool · Creation of target language file · DTP
Software Localization Tool
• A tool to test the localizability of software
  – Pseudo translation
• A tool to translate text in software applications
  – GUI (graphical user interface): menus, dialogs
  – Error messages, system messages
• A tool to adapt the GUI to the translation
  – Resizing dialog boxes
  – Flipping contents of dialog boxes for right-to-left languages
  – Adaptation of icons, graphics
Dialog view Navigation Translation list
Translation Memory Tool – A system (most often a database) that stores a source sentence plus its translation as a pair, a so-called “segment pair” – During translation, the translation memory compares the segment to be translated with the segments in the database. – If a match is found (same or similar segment), the stored translation is offered as a suggestion. – The translator decides whether the translation can be accepted or has to be changed. – The TM system does NOT translate by itself; it is not a machine translation system!
Translation Memory Tool
ORIGINAL VERSION
  SOURCE: The 4-mm tip electrode of the steerable 7F cryoablation catheter is provided with the refrigerant halocarbon (Freon®) by a double lumen in the catheter shaft.
  TARGET: La pointe de 4 mm du cathéter de cryoablation (d'un calibre de 7 F) est alimentée en réfrigérant (protoxide d'azote) par un double conduit situé dans la tige du cathéter.
NEW VERSION
  SOURCE: The 6-mm tip electrode of the steerable 7F cryoablation catheter is provided with the refrigerant nitrogen oxide (N2O) by a double lumen in the catheter shaft.
  TARGET: La pointe de 4 mm du cathéter de cryoablation (d'un calibre de 7 F) est alimentée en réfrigérant (protoxide d'azote) par un double conduit situé dans la tige du cathéter.
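The lookup a translation memory performs can be approximated in Python with the standard `difflib` module. This is a rough sketch: commercial TM tools use their own matching algorithms and scoring, and the names `memory` and `best_match` are assumptions for the example.

```python
from difflib import SequenceMatcher

old_source = ("The 4-mm tip electrode of the steerable 7F cryoablation catheter "
              "is provided with the refrigerant halocarbon (Freon) by a double "
              "lumen in the catheter shaft.")
old_target = ("La pointe de 4 mm du cathéter de cryoablation (d'un calibre de 7 F) "
              "est alimentée en réfrigérant (protoxide d'azote) par un double "
              "conduit situé dans la tige du cathéter.")
new_source = ("The 6-mm tip electrode of the steerable 7F cryoablation catheter "
              "is provided with the refrigerant nitrogen oxide (N2O) by a double "
              "lumen in the catheter shaft.")

# The memory maps stored source segments to their translations.
memory = {old_source: old_target}


def best_match(segment: str, memory: dict) -> tuple:
    """Return (similarity, stored source, stored target) for the closest entry."""
    return max(
        (SequenceMatcher(None, segment, src, autojunk=False).ratio(), src, tgt)
        for src, tgt in memory.items()
    )


score, matched_source, suggestion = best_match(new_source, memory)
print(f"fuzzy match: {score:.0%}")
print("suggestion:", suggestion)
```

The score is below 100%, so the old translation is only a suggestion: the translator still has to change the tip size and the refrigerant by hand.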
SDL Trados Workbench and TagEditor Terminology window TM window Translation fields in Word
MemoQ Sentence and terminology matches Source and target language columns for translation
Localization Processes [Diagram] Translation Memory Tool · Software Localization Tool · Pseudo translation · Import into TM · File preparation · Export of segment pairs for TM · Extract translatable segments · Export of terminology · Translation of software · Import into term base · Translation of online help, readme files, manuals, web pages, packaging…
Terminology Management • Components of TM tools or stand-alone solutions • Connect to the TM systems and localization tools during translation • Manage additional information like explanations, definitions, classifications and graphics • Ensure the consistent use of terms over the whole project through term checks • Term extraction (monolingual and bilingual)
Term base
Term Extraction • Concordance tools – Extracting all words and word combinations up to x words from a monolingual document • Statistical extraction – Extracting the most frequent terms from monolingual or bilingual sources • Linguistic extraction – Extracting by rules and with the help of language analysis (e.g., all noun phrases up to 4 words)
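The statistical approach can be sketched in a few lines of Python. The function name, its parameters, and the sample text are assumptions for illustration; real extraction tools add stop-word filtering and linguistic rules on top of raw counts.

```python
import re
from collections import Counter


def candidate_terms(text: str, max_words: int = 3, min_count: int = 2) -> list:
    """Count all word sequences up to max_words and keep the repeated ones."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(
        " ".join(words[i:i + n])
        for n in range(1, max_words + 1)
        for i in range(len(words) - n + 1)
    )
    return [(term, c) for term, c in counts.most_common() if c >= min_count]


sample = ("The translation memory stores segments. "
          "A translation memory is not machine translation.")
print(candidate_terms(sample))
```

The output is only a candidate list; a terminologist still decides which entries are real terms worth adding to the term base.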
Term Extraction
Alignment • Old source and target language documents are read into the alignment component of the TM tool • The tool segments the files and tries to connect the segments that belong together, thus creating segment pairs • A translator checks the alignment • Results are imported into a TM system for reuse with new translations
Example: Déjà Vu
Project Management and Workflow Tools • Project Creation in TM tool – packaging of project files • Workflow Tool – Automation of processes (file conversion, pre-translation, packaging, sending out package to assigned translator…) • Project Management Tools – Offers and invoicing – Data on customers and vendors
TM: interactive translation
  – interactive process
  – almost all language pairs possible
  – creation of a repository
  – recycling of translations independent of the format of the source document
MT: machine translation
  – fully automated process
  – only works for the language pair the system was created for
  – text is usually pre-edited and/or post-edited
  – good systems are relatively costly
  – very fast
How do you evaluate which tool is right for you?
Test the tool…
• Get a demo from the vendor with your own files
• Get a testing license (usually 4 weeks)
• Run a pilot project
• Discuss useful features with users of the tool
  – Translators, project managers of service providers, developers
• Create a test matrix and run a small project
What to evaluate • General – Vendor company, number of developers, user base, responsiveness… • By requirement – List what the tool should be able to do • By feature – Test the existing features for importance and performance
Coffee Break – 15 Minutes
Introduction to Localization Process 101 Localization World Conference Berlin 2009 Angelika Zerfass, Richard Sikes, Daniel Goldschmidt
Localization - Recap
Localization - Recap The i18n and l10n problem is a mixture of: • Technical Issues • Cultural Issues • Political Issues • Language / Linguistic Issues • Aesthetic Issues
Localization - Recap The 3 layers approach: • Transportation • Application • Display
Localization - Recap Layer 1 – Transportation (“handle with care” sticker): moving data from A to B. Usually not locale dependent
Localization - Recap Layer 2 – Application: doing something with the data (e.g., sorting, searching, casing, date/time format etc.). Usually locale dependent
Localization - Recap Layer 3 – Display: presentation layer, localization readiness (resource externalization). Usually locale dependent
Localization – Recap Internationalized Software Architecture
Localization vs. Internationalization • Internationalization -> Generalization • Localization -> Customization
Localization vs. Internationalization [Diagram: Internationalization with Localization English, Localization Hebrew, Localization Chinese, Localization French] Globalization = Internationalization + N × Localization
Localization vs. Internationalization • Internationalization is an essential process for preparing the product for localization • The i18n process has two deliverables: – Generic version of the application – Software components of the Localization Kit
Localization vs. Internationalization • You don’t need to actually read and write 22 languages • i18n is software engineering, not a linguistic process – There are cross-over concepts, however, such as: • Allowing sufficient white space for language growth in documentation • Not hard-coding page references in books • Planning website architecture to support multilingual content and navigation • l10n is mainly project management!
The Process
Who’s involved?
• Content providers (Editors, technical writers, R&D teams etc.)
• Localization project managers (on publisher side, on vendor side)
• Localization engineers (on publisher side or vendor side)
• Translators (In house, freelance, Single Language Vendor, sub contractors)
• Reviewers (In house, freelance, Single Language Vendor, sub contractors, regional office employees)
• Quality Assurance specialists (on publisher side, on vendor side)
• Finance personnel
• Program managers
• Product marketing managers
• Webmasters
A short To-Do list
• Researching and gathering components to be localized
• Preparing the content (text segmentation, resource extraction etc.)
• Pseudo localization and proactive i18n QA on core code
• Leveraging against existing TMs
• Effort estimation, costing
• Management approval
• Work assignment
• Translation and localization
• Proof reading / Editing / Reviewing
• Testing, if applicable
• TM updates, maintenance of linguistic assets
• Delivery
• Billing
The Traditional Process [Diagram] Content providers · Preparing · Leveraging · Effort assessment · Translating · Reviewing · Packaging and delivery · Updating linguistic assets · Linguistic assets: TMs, terms, glossaries · Content repository
Preparation
Preparation • Research and collect all relevant components - be sure to have everything you need • Create LBOM (localization bill of materials) • Prepare the content (text segmentation, resource extraction etc. ) using the appropriate tools.
Preparation • Run a pseudo-localization to test localization readiness • Check: – Externalization of strings – Adaptation of the GUI (length, date, time, currency etc. ) – Handling of string concatenations – Software functionality – Data entry, transfer, persistence, and redisplay – and…
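A pseudo-localization pass can be as simple as the Python sketch below. The accent table, the placeholder pattern, and the 40% expansion factor are assumptions chosen for the example, not values from a particular tool.

```python
import re

# Replace plain vowels with accented ones so untranslated (hard-coded)
# strings stand out and character handling gets exercised.
ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")
# Placeholders such as %s, %d or {name} must survive untouched.
PLACEHOLDER = re.compile(r"(%[sd]|\{[^}]*\})")


def pseudo_localize(text: str, expansion: float = 0.4) -> str:
    """Accent the text, pad it to simulate expansion, and bracket it."""
    parts = PLACEHOLDER.split(text)
    # With one capture group, odd indices hold the placeholders.
    body = "".join(
        part if i % 2 else part.translate(ACCENTED)
        for i, part in enumerate(parts)
    )
    padding = "~" * max(1, round(len(text) * expansion))
    return f"[{body}{padding}]"


print(pseudo_localize("Save %s"))
print(pseudo_localize("Hello {name}"))
```

In the running product, any string without brackets was never externalized, and any string missing its closing bracket was cut off by the layout.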
Preparation • Prepare glossary – add new terms/update changed terms • If you don’t have a glossary – prepare one, send it for translation and approve it BEFORE work starts • If you as a client own the TM – provide vendor with most recent version • If your vendor owns the TM – be sure the last (clean) version is being used (and also try to change your contract so that you get ownership of the TM)
Preparation Prepare a “Localization Kit”: A Localization Kit contains everything that anyone who touches the project needs to know in order to do their work.
Localization Kit includes:
• Product:
  – Text strings
  – Menus
  – Dialogs
  – Shortcut keys
  – Images
  – Functional l10n components (tax rules)
  – Documentation and OLH files
  – …
• Glossaries
• TMs
• Localization Guidelines and Expectations
Preparation • Leverage the content against your TMs • Get comparative quotes and time estimates • Obtain information regarding resource availability
Preparation: The Vendor • The vendor is your best friend! • However, this friend sells words (for translation)!
Preparation: The Vendor What to consider: • Rate: 25–30 US cents per word • Pace: 1500 words/day Price should include: • Translation • Editing • Proof reading Not included: • Project management • QA cost • DTP Consider training the vendor’s translators and proof readers: it will give them insight into the product
Preparation: The Vendor Be sure to establish the following: • Processes • Escalation process • Location of translators • Single focal point • Localization material • Deliverable • TM ownership • What are you paying for • Bug fixing responsibility • Service Level Agreement (SLA)
Preparation: The Vendor Be sure to determine what you are paying for: • Price per word • Discount for repetitions • Word counting in source language or target language? • QA? • Bug fixing?
Translation
Translation Basic premises: • Translation is expensive Example: – 1 million words = $250,000 per language – A 10-language localization project could easily cost $2.5M • Glossaries are required • Translation memories are required
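The cost example is plain multiplication, shown here as a small Python helper. The function name and parameters are illustrative; the rate of $0.25 per word is the one implied by the figures on the slide, and discounts for repetitions or TM matches are ignored.

```python
def translation_cost(words: int, rate_per_word: float, languages: int = 1) -> float:
    """Words times per-word rate times number of target languages."""
    return words * rate_per_word * languages


per_language = translation_cost(1_000_000, 0.25)
ten_languages = translation_cost(1_000_000, 0.25, languages=10)
print(f"${per_language:,.0f} per language, ${ten_languages:,.0f} for 10 languages")
```

Because the cost scales with every extra language, each word removed from the source or recovered from a translation memory is saved ten times over.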
Context in Translation Translators need to know what to translate and what not to translate (tags, code etc.) Expose only translatable content to them – don’t run the risk of having your code broken Translators need to know the context: • Surrounding text, dialog etc. • e.g., “display”: German anzeigen (to display) vs. German Anzeige (a display)
Testing / QA
Testing / QA 5 types of testing: • Before localization – i18n testing – l10n readiness testing (pseudo localization) • After localization – Cosmetic testing – Linguistic testing – Functional testing
Testing / QA Effort Estimations:
• i18n QA: the same timeframe as the original acceptance tests
• Pseudo localization: the same timeframe as the original acceptance tests
• Cosmetic/linguistic: one pass on all dialogs/screens/menus etc. Usually a matter of days.
• Functional testing: the same timeframe as the full test cycle of the original product
Testing / QA i18n testing: • Is the software really locale independent? • Does your software know how to handle data in different languages (double-byte enabled?)
Testing / QA Cosmetic Testing:
• Check to see if the UI is broken
• Dialogs, buttons, menus etc. – have they been properly localized?
• Chinese words are shorter, but the characters are taller!
• French word lengths…
Testing / QA Linguistic testing: • Does the translation make sense in the context? • Edite vs. Edition • Share vs. Shares
Testing / QA Functional testing: • Full acceptance test of the product in target language • Usually not done due to cost and time
Testing / QA In country reviewing: • Resources in or from the country/market, who know the target market and target language to check if localization makes sense
Document Quality Control • Document QC is another kind of Quality Control, and is just as important (sometimes). • Issues to watch for: – Linguistic – Technical – Layout • Pagination • Screenshots and surrounding text in sync • Cross-references and hyperlinks • Conditional text
Project Wrap • • TM update Delivery Invoice Management Post-mortem
Planning Tips
Planning Tips • Kick-off meeting – Touch on all aspects of the project: size, timeline, number of languages etc. • Analysis of source meeting – Outline potential L10n/I18n issues with source code • Scheduling and budgeting – Based on size, timeline, number of languages etc., schedule resources and quotes • Terminology setup – Create a glossary by leveraging existing glossaries and adding terminology with tools such as SDL Trados TermExtract • Preparation of source material • and…
Planning Tips • Translation of Software – Translation, editing and proof-reading (TEP) of software • Translation of documentation – Translation, editing and proof-reading (TEP) of documentation • Testing the Software – Testing of software for functional, linguistic and cosmetic defects • Screen Capture – Capture screenshots for documentation, help files • DTP – Prepare the hard copy of the documents
Planning Tips • Start planning from the end: focus on the release date • Make sure that you work within a realistic timeframe – allow extra time in case things go wrong (buffers, slippage, holidays) • Check the required time for QA • Estimate the number of words and be sure what you are paying for (source/target) • Rule of thumb: Number of words / 2000 = number of translator days for translation – Software = slower – Flowing documentation ~ faster – Diminishing returns as more translators are added
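The rule of thumb translates directly into a small Python helper. The function name is illustrative, and the 50,000-word project is an invented example; the 1500 words/day figure is the vendor pace quoted earlier in the deck.

```python
import math


def translator_days(words: int, words_per_day: int = 2000) -> int:
    """Rule of thumb: word count divided by daily throughput, rounded up."""
    return math.ceil(words / words_per_day)


project_words = 50_000
print(translator_days(project_words))        # rule-of-thumb pace
print(translator_days(project_words, 1500))  # slower pace, e.g. software strings
```

The result is translation time only; editing, proof reading, QA, and schedule buffers come on top.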
Planning Tips
• Keep in mind that translations can start before all resources are ready
• You can start translating your material once the GUI is frozen
• Think about running QA for several languages in parallel
• Remember that the process might require several iterations
Pitfalls
Pitfalls “We are not doing any localization nor translation. We will give our distributors in each country a discount, and they take care of it” Careful – consider the following: • Who is in the end responsible for quality? • Who owns the Intellectual Property? • No leveraging of handling the localization for all countries at once.
Pitfalls “There is no need for a localization process; once we release the product, we will prepare Excel files with the strings to be translated” Careful – consider the following:
• Has your software been prepared for localization?
• Be ready for surprises in the code
• Consider pseudo localization
• Translation out of context can result in errors and/or excessive project management time
Pitfalls “Philippe, from engineering, speaks French fluently, let’s ask him to translate the GUI of our product!” Careful – consider the following: • Languages are evolving – therefore the best translations will be done by in-country translators • What about localization? • What about using translation tools? • Leveraging, Terminology, Glossary?
Q/A • Ask now……
Thank you for your attention
Backup slides
Jargon
Jargon • g11n • i18n • l10n • Sim ship • MLV • SLA • Translation Memory (TM) • Segment • Matching (100%, ICE, Partial, Fuzzy) • Leveraging • Alignment • Glossary, Glossary building • Terminology management • Machine Translation • Localizing • Marketing • Translating • Guideline • NDA • Software l10n • Resource ID • Context • Localization Tool • QA • Linguistic QA • Cosmetic QA • Functional QA • Reviewing • Proof Reading • Localization Readiness • Pseudo Localization • Single source • Word count • CMS • Publisher
Preparation Localization Kit includes:
• Product:
  – Text strings
  – Menus
  – Dialogs
  – Shortcut keys
  – Images
  – Functional l10n components (tax rules)
  – Documentation and OLH files
  – …
• Glossaries
• TMs
• Localization Guidelines and Expectations
Preparation of software • User Interface – Pseudo-translation to test for localizability • Are buttons large enough for text expansion? • Is there hard-coded text in the software? • Can the characters of the target language be displayed correctly? – Hiding or locking of non-translatable text with a software localization tool – Re-use of old projects by alignment – Setup of a terminology list
Preparation of documents • Internationalization of documents – Spaced layout – “Simplified English” – Single Sourcing with conditional text or text layers within one document – Content Management System with text modules • Text preparation – Text extraction or file conversion into a format that translation memory systems can deal with – Hiding non-translatable text layers / columns / paragraphs
Other preparation steps • Terminology – Extraction, collection, creation of lists (Excel) or term bases • Alignment (re-use of old projects) • Setup of process automation (Workflow Management)
Testing
Software Testing • Before translation – Internationalization testing • Is the software locale independent? • Will the software be able to accommodate different character sets, date formats, measurements…? – Localization testing • Pseudo translation or simulated translation to find out if the characters of the target language can be displayed correctly • Will expanding text of the target language still fit the buttons and text fields? • During translation – Spell check and terminology checks – Checks that the translator did not forget any access keys (underlined letters that allow calling a menu or menu point by keyboard) – Checks that the same access key has not been used twice in a menu
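The two access-key checks can be automated with a short Python sketch. It assumes the common resource convention in which “&” marks the access key and “&&” is a literal ampersand; the function names and the German sample menu are made up for the example.

```python
from collections import defaultdict
from typing import Optional


def access_key(label: str) -> Optional[str]:
    """Return the lower-cased character after the first single '&', if any."""
    cleaned = label.replace("&&", "")  # "&&" is a literal ampersand
    pos = cleaned.find("&")
    if pos == -1 or pos + 1 >= len(cleaned):
        return None
    return cleaned[pos + 1].lower()


def check_menu(labels: list) -> tuple:
    """Return (labels without an access key, keys used more than once)."""
    missing, seen = [], defaultdict(list)
    for label in labels:
        key = access_key(label)
        if key is None:
            missing.append(label)
        else:
            seen[key].append(label)
    duplicates = {k: v for k, v in seen.items() if len(v) > 1}
    return missing, duplicates


menu = ["&Speichern", "Speichern &unter", "&Schließen", "Drucken"]
missing, duplicates = check_menu(menu)
print("missing:", missing)
print("duplicates:", duplicates)
```

Running such a check on every translated menu catches clashes that are easy to introduce, because the natural access letter usually changes with the translation.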
Software Testing • After translation – Cosmetic checks • Check that the UI is not broken • Layout of dialogs, buttons, menus etc. • Chinese words are shorter, but the characters are taller! • German word lengths… – Linguistic checks • Does the translation make sense? – Functional test • Full acceptance test of the product in target language • Has the translation in any way “broken” the functionality? • Are data input, stored, manipulated, and redisplayed accurately? • (often not done because of time and cost constraints)
Document Testing
• Resources in or from the country/market, who know the target market and target language, check if the localization makes sense
  – Spell check, formal check (punctuation…) and terminology check on translated text
  – Cosmetic checks
    • Pagination, layout should be checked and possibly redone
    • Correct index
    • Cross-references, hyperlinks (online help)
    • Correct screenshots
  – Linguistic checks
    • Does the translation make sense?
    • Do software and documentation correspond?
  – Functional test
    • Cross-references in online help
    • Hyperlinks on web pages
What does it take to be a good L10n Project Manager?
What does it take to be a good L10n Project Manager? • Adaptability / versatile thinker – think outside the box, come up with non-orthodox solutions • Technically inclined – know the basics of what an L10n engineer’s daily work entails • Localization industry experience – translation background, editing background • Attention to detail – see defects, potential pitfalls, have a good eye for layout/design • Skilled in writing and presentation – comfortable writing in native and potentially other languages
What does it take to be a good L10n Project Manager? • Interest in and awareness of foreign cultures: – read foreign language books, watch foreign language movies, enjoy “diversity”… • One or more non-English languages – helpful to know basics or the concept of non-European languages (e.g., Chinese, Japanese…) • Instinct for prioritization – know how to get your ducks in a row… • Pragmatic, realistic approach to problem-solving – have processes in place, but don’t follow them slavishly if faced with a worst-case scenario and…
Localization Quality Assurance: Skill Set • Comfortable with diverse language software versions • Ability to distinguish between languages, i. e. German from Dutch • Versatility of OS, Platform, & Database language versions • Generic QA methodology • Creation and usage of scripted QA tools