8/14/2019 E-business Globalization Guide
ibm.com/redbooks
Front cover
e-business Globalization
Solution Design Guide: Getting Started
Xiao Hui Zhu
Ming Zhu Cui
Bei Shu
Yi Zhen Xu
Xia Li
Ming Li
Fei Qu
Easily comprehend state-of-the-art globalization technologies
See how best practice design
guidelines can work for you
Learn ways to achieve
cost-effective globalization
International Technical Support Organization
e-business Globalization Solution Design Guide: Getting Started
December 2002
SG24-6851-00
Copyright International Business Machines Corporation 2002. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
First Edition (December 2002)
Note: Before using this information and the product it supports, read the information in "Notices" on page vii.
Copyright IBM Corp. 2002. All rights reserved.
Contents
Notices
  Trademarks
Preface
  The team that wrote this redbook
  Become a published author
  Comments welcome
Part 1. Introduction
  Chapter 1. What is globalization?
  Chapter 2. Why is globalization necessary?
  Chapter 3. How to implement globalization
Part 2. Globalization application design
  Chapter 4. Single Executable
  Chapter 5. Unicode support
  Chapter 6. Locale model
  Chapter 7. Localization pack
  Chapter 8. Input and output of multilingual data
  Chapter 9. Linguistic services
  Chapter 10. Global Business Object
  Chapter 11. Localization
Part 3. Our Global Travel Shanghai Demo: A working example
  Chapter 12. Overview
    12.1 Multilingual front-end
      12.1.1 Multilingual user interface
      12.1.2 Multilingual main functions
    12.2 Multilingual Web Services
  Chapter 13. Environment
    13.1 Architecture
      13.1.1 Development environment
      13.1.2 Runtime environment
    13.2 Product globalization capabilities
      13.2.1 IBM WebSphere Application Server Advanced Edition V4.0
      13.2.2 IBM DB2 Universal Database
  Chapter 14. A development methodology for globalized applications
  Chapter 15. Design and development
    15.1 Single Executable
    15.2 Unicode support
    15.3 Locale model
      15.3.1 Structure of locale model
      15.3.2 Identification of user locale
      15.3.3 Implementation of locale-sensitive features
      15.3.4 Locale-sensitive features displayed in Our Global Travel Shanghai Demo
    15.4 Localization pack
    15.5 Machine translation
      15.5.1 What is machine translation?
      15.5.2 WebSphere Translation Server
      15.5.3 Solution for Our Global Travel Shanghai Demo
    15.6 Global Business Object
    15.7 Localization
      15.7.1 Locale model
      15.7.2 GBO
      15.7.3 Localization packs
  Chapter 16. Testing
    16.1 Function testing
    16.2 Translation testing
    16.3 Globalization feature testing
    16.4 Linguistic testing
    16.5 Browser testing
    16.6 Usability testing
  Chapter 17. Maintenance
    17.1 Adding new languages
      17.1.1 Locale-related computing
      17.1.2 Language-dependent content
    17.2 Changing or adding globalization features
Part 4. Appendixes
  Appendix A. Server-side installation and configuration for Our Global Travel Shanghai Demo
    A.1 IBM HTTP Server V1.3.19
      A.1.1 Install IBM HTTP Server
      A.1.2 Configure IBM HTTP Server
    A.2 IBM DB2 Universal Database V7.2.1
      A.2.1 Install DB2 Universal Database Server
      A.2.2 Configure the DB2 Universal Database Server
    A.3 IBM WebSphere Application Server V4.0
      A.3.1 Install WebSphere Application Server Advanced Edition V4.0
      A.3.2 Configure WebSphere Application Server Advanced Edition V4.0
    A.4 IBM WebSphere Translation Server V1.0
    A.5 UDDI Registry Center
    A.6 IBM WebSphere Personalization Server V4.0
  Appendix B. Client-side installation and configuration for Our Global Travel Shanghai Demo
    B.1 Installation
    B.2 Configuration
      B.2.1 System settings configuration
      B.2.2 Browser settings configuration
  Appendix C. CSS and artwork globalization
    C.1 How to make CSS Single Executable
      C.1.1 Avoid locale-related restrictions
    C.2 Avoid language-dependent restrictions
    C.3 Further considerations for bi-directional data display
Glossary
Related publications
  IBM Redbooks
    Other resources
  Referenced Web sites
  How to get IBM Redbooks
    IBM Redbooks collections
Index
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrates programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.
Trademarks
The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:
AFP, AIX/L, AIX, DB2 Universal Database, DB2, IBM, Netfinity, Redbooks (logo), S/390, SP, Tivoli, ViaVoice, VisualAge, VTAM, WebSphere
The following terms are trademarks of International Business Machines Corporation and Lotus Development Corporation in the United States, other countries, or both:
Lotus Notes Lotus Notes
The following terms are trademarks of other companies:
ActionMedia, LANDesk, MMX, Pentium and ProShare are trademarks of Intel Corporation in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.
C-bus is a trademark of Corollary, Inc. in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
SET, SET Secure Electronic Transaction, and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC.
Preface
"Globalization is not a feature; it is an architecture."¹ Globalization is the proper design and execution of systems, software, services, and procedures so that one instance of software, executing on a single server or end-user machine, can process multilingual data and present culturally correct information (for example, collation, date, and number formats).
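To make "culturally correct information" concrete, the following is a minimal sketch and is not taken from the redbook. It assumes only the standard Java classes java.text.NumberFormat and java.text.Collator; the class and method names (LocaleSketch, formatNumber, sortForLocale) are hypothetical and chosen for illustration. One program formats the same number and sorts the same words differently depending on the Locale it receives at run time.

```java
import java.text.Collator;
import java.text.NumberFormat;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical illustration: one executable, behavior selected by Locale at run time.
public class LocaleSketch {

    // Format a number using the grouping and decimal conventions of the given locale.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Return a copy of the words sorted by the collation rules of the given locale.
    static String[] sortForLocale(String[] words, Locale locale) {
        String[] copy = words.clone();
        Arrays.sort(copy, Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        double amount = 1234567.5;
        System.out.println(formatNumber(amount, Locale.US));      // 1,234,567.5
        System.out.println(formatNumber(amount, Locale.GERMANY)); // 1.234.567,5

        // "\u00C9mile" is "Emile" with an acute accent on the capital E.
        String[] words = {"zebra", "\u00C9mile", "apple"};

        // Plain code-point order puts the accented word after "zebra".
        String[] binary = words.clone();
        Arrays.sort(binary);
        System.out.println(Arrays.toString(binary));

        // French collation treats the accented E as a variant of E.
        System.out.println(Arrays.toString(sortForLocale(words, Locale.FRENCH)));
    }
}
```

The point of the sketch is that the locale is data, not a compile-time decision: nothing in the code is specific to one language, so supporting a further locale requires no new build.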
As the Internet increasingly drives the economy, today's market is quickly becoming geared toward multinational participation and international transactions. The challenge for companies that intend to thrive in this environment is that globalization cannot simply be dropped on top of existing applications. Globalization permeates so many areas that it must be taken into consideration from the very beginning of the development cycle.
This redbook presents an architecture, a working example, and an accompanying set of methodologies. The sample solution is built on WebSphere Application Server and the DB2 Universal Database, together with Web Services technologies incorporating dynamic e-business concepts. We introduce IBM's recommended globalization architecture and show how it works throughout the application development cycle. We also explain, from the customer's point of view, how to plan and design a multilingual solution, with our working example validating the soundness of this architecture.
Our target audience includes design architects who are new to, or at the entry level of, e-business globalization. Software developers can also use this book as a reference when developing globalized e-business applications.
The GCL
The Globalization Certification Laboratory (GCL) is an organization established by IBM Corporate Globalization. GCL provides the following services to IBM internal and external customers:
Globalization Comprehensive Interoperability Test Services: Tests products and solutions to verify from a globalization perspective whether they behave properly in various e-business scenarios.
Globalization Enablement Services: Enables customers' products and solutions with globalization features (multilingual capabilities, proper data format, etc.).
Globalization Consultation Services: Provides consultation services in the architectural design of customers' products and solutions in order to minimize their expenditures for globalization application development.
Other globalization-specific Test Services: Provides testing services for production platforms covering one or more specific text encodings or locales.
The team that wrote this redbook
This redbook was produced by a team of specialists from the Globalization Certification Laboratory working together with the International Technical Support Organization, Raleigh Center.
You can contact members of the GCL team at [email protected].
1 Addison P. Phillips, Globalization Architect/Manager, Globalization Engineering, webMethods, Inc.
Figure 0-1 The team that wrote this redbook. Front row (left to right): CP Chang, Xia Li, Bei Shu, Fei Qu, Xiao Hui Zhu, Ming Zhu Cui, Feng Zheng. Back row (left to right): Yi Zhen Xu, Ting Yong Zhu, Buck Stearns, Ming Li, Yang Wang
Xiao Hui Zhu is an advisory software engineer at the IBM Development Lab in China. She has worked for IBM since November 1994, starting her career as a project manager in the Globalization organization and performing various roles, including tester, architect, coordinator, and consultant. Currently, she is the technical leader for the Globalization Certification Laboratory located in Shanghai.
Xiao Hui Zhu wrote:
Chapter 2, "Why is globalization necessary?" on page 9
Chapter 3, "How to implement globalization" on page 13
Chapter 4, "Single Executable" on page 17
Chapter 5, "Unicode support" on page 21
Chapter 1, "What is globalization?" on page 3 (with Yi Zhen Xu)
Chapter 6, "Locale model" on page 23 (with Xia Li)
Chapter 8, "Input and output of multilingual data" on page 37 (with Yi Zhen Xu)
Chapter 9, "Linguistic services" on page 43 (with Xia Li)
Chapter 14, "A development methodology for globalized applications" on page 87 (with Ming Zhu Cui)
Chapter 10, "Global Business Object" on page 51 (with Xia Li and Ming Li)
Ming Zhu Cui is a software engineer in the Globalization Certification Laboratory. She received her MS in Computer Sciences at Oxford Brookes University in 2000. She joined IBM in May 2001 and has been involved in Globalization Inter-operability Testing as a developer and tutorial writer. Her main interests lie in Java programming, globalization solutions, and technical writing.
Ming Zhu Cui wrote:
Chapter 12, "Overview" on page 63
Chapter 13, "Environment" on page 77
Chapter 15, "Design and development" on page 91
Chapter 16, "Testing" on page 119
Chapter 17, "Maintenance" on page 135
Appendix A, "Server-side installation and configuration for Our Global Travel Shanghai Demo" on page 141
Appendix B, "Client-side installation and configuration for Our Global Travel Shanghai Demo" on page 157
Chapter 14, "A development methodology for globalized applications" on page 87 (with Xiao Hui Zhu)
Appendix C, "CSS and artwork globalization" on page 165 (with Fei Qu)
Bei Shu is a software engineer in the Globalization Certification Laboratory. She joined IBM in April 2001 and has been involved in many globalization solution projects as tester, developer, and coordinator. She has a strong interest and extensive experience in XML-related technologies and their contribution to globalization.
Bei Shu wrote:
Chapter 7, "Localization pack" on page 29
Chapter 11, "Localization" on page 57
Yi Zhen Xu is a software engineer in the Globalization Certification Laboratory. He joined IBM in April 2001 and has participated in many globalization solution projects as tester, developer, and team leader. His areas of specialty include globalization technologies, XML-related technologies, Voice Server, and Web-based application development.
Yi Zhen Xu wrote:
Chapter 1, "What is globalization?" on page 3 (with Xiao Hui Zhu)
Chapter 8, "Input and output of multilingual data" on page 37 (with Xiao Hui Zhu)
Xia Li is a software engineer with the Globalization Certification Laboratory. She has extensive experience in globalized Web site development and globalization interoperability test projects as test coordinator. She holds a BS in English for Science and Technology.
Xia Li wrote:
Chapter 6, "Locale model" on page 23 (with Xiao Hui Zhu)
Chapter 9, "Linguistic services" on page 43 (with Xiao Hui Zhu)
Chapter 10, "Global Business Object" on page 51 (with Xiao Hui Zhu and Ming Li)
Ming Li is a software engineer with the Globalization Certification Lab. He joined GCL one year ago and primarily focuses on J2EE architecture, Web Services, and EIP. He has extensive development experience in Java-based Web applications, took part in developing the Translation Communication Tool (TCT), and is interested in open-source Java projects such as Tomcat and JBoss.
Ming Li wrote:
Chapter 10, "Global Business Object" on page 51 (with Xiao Hui Zhu and Xia Li)
Fei Qu is the Artwork Designer of the Globalization Certification Laboratory located in Shanghai, P. R. China.
Fei Qu contributed:
All screen graphics
Appendix C, "CSS and artwork globalization" on page 165 (with Ming Zhu Cui)
Editorial staff
Buck Stearns was managing editor. He is a Solution Development IT Specialist for IGS Business Development at ITSO's Raleigh Center. Prior to joining ITSO, he worked two years as a mobile employee assigned to Tivoli Services, and previously logged over 25 years in the banking and insurance industries. He has extensive experience in IT management disciplines, and holds undergraduate and graduate degrees in English from the University of North Carolina at Chapel Hill.
Gail Christensen of ITSO Raleigh served as executive editor.
Linda Robinson of ITSO Raleigh was our graphics design supervisor.
Contributors
Thanks to the following people for their contributions to this project:
Feng Zheng is the Software Engineering Manager of the Globalization Certification Laboratory located in Shanghai, P. R. China. Feng Zheng wrote the section entitled "The GCL" at the beginning of this Preface.
Ting Yong Zhu is a software engineer in the IBM Research Lab in China, and has worked for IBM since August 2000. He received his BS in Mathematics and MS in Computer Science and Engineering from East China Normal University. His interests include exploring the Linux world and object-oriented technologies. He is one of the technical reviewers of this redbook.
Yang Wang is a software engineer at the Globalization Certification Laboratory and has worked for IBM since November 2000. Yang Wang has joined or led many globalization solution projects with various roles, including tester, coordinator, developer, and project leader, and has become one of the key engineers in the lab. Yang Wang was our other technical reviewer.
Thomas Hampp-Bahnmueller is Technical Team Lead, Text Analysis Framework, Globalization Architectural Technical Team (GATT), Germany, and D.J. McCloskey is Principal, Software Development, Dublin, Ireland. They provided much of the information used in Chapter 9, "Linguistic services" on page 43.
And thanks to the following people working in the Globalization Center of Competency (GCoC) and Globalization Architecture and Technology Team (GATT) for their contributions to this book:
Ahmed Talaat, Globalization Development Manager, Bidirectional Scripts, Cairo, Egypt
Akio Kido, Linux Globalization, Yamato, Japan
Akira K Oda, Manager, Globalization Center of Competency, Yamato, Japan
Alexis Cheng, DB2 UDB Globalization, Markham, Ontario, Canada
Art Day, S/390 Software Design, Poughkeepsie, New York, USA
Charles Pau, Director Globalization Architecture and Technology, Cambridge, Massachusetts, USA
CP Chang, Manager of Globalization, CDL, Shanghai, PRC
Debasish Banerjee, WebSphere Internationalization Architect, Rochester, Minnesota, USA
Dennis Hebert, WebSphere Application Server Development, Research Triangle Park, North Carolina, USA
Elizabeth Cuan, AIX National Language Support Development, Austin, Texas, USA
Israel Ervin Gidali, Globalization Manager, GCoC, Complex Text Languages, Petah Tikva, Israel
Joe Ross, Tivoli Internationalization, Austin, Texas, USA
Julius Griffith, Globalization Support, User Technology Solutions Team, San José, California, USA
Katsushi Takeuchi, Senior Product Development Manager, Lotus Development, Westford, Massachusetts, USA
Kentaroh Noji, Globalization Architecture, Globalization Center of Competency, Yamato, Japan
Mark Davis, Chief Globalization Architect, Globalization Center of Competency, San José, California, USA
Markus Scherer, GCoC San Jose/Unicode/ICU International Components for Unicode for C/C++ Project Leader, Globalization Center of Competency, San José, California, USA
Matitiahu Allouche, Bidi Architect, GCoC, Bidirectional Scripts, Israel
Mike Moriarty, Corporate Globalization Strategy and Architecture, National Language Support and Information Development, Rochester, Minnesota, USA
Ranat Thopunya, Manager, Globalization Center of Competency, Bangkok, Thailand
Rasha Morgan, IT Specialist, National Language Support and Business Services, Cairo, Egypt
Takaaki Shiratori, Globalization Architecture, Yamato, Japan
Tetsuji Orita, DBCS SPA, Code Page Standard, Yamato, Japan
Thomas McBride, Technical CEM for Globalization, Integrated File System, NetServer, e-business Management and Integration, Rochester, Minnesota, USA
V.S. Umamaheswaran, IBM Standards Projects Authority for SIRS 030, Coded Character Sets, IBM Rep to UTC/CAC/JTC1/SC2, Globalization Center of Competency and Language Services, Markham, Ontario, Canada
William Nettles, Distributed Strategy, Strategy/Architecture and Planning, San José, California, USA
W.J. (Bill) Sullivan, Program Director for Globalization, National Language Support and Information Development, Southbury, Connecticut, USA
Yukiko Kane, Technical Advisor, Globalization and Production Planning Services, Research Triangle Park, North Carolina, USA
Become a published author
Join us for a two- to six-week residency program! Help write an IBM Redbook dealing with specific products or solutions, while getting hands-on experience with leading-edge technologies. You'll team with IBM technical professionals, Business Partners and/or customers.
Your efforts will help increase product acceptance and customer satisfaction. As a bonus, you'll develop a network of contacts in IBM development labs, and increase your productivity and marketability.
Find out more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
We want our Redbooks to be as helpful as possible. Send us your comments about this or other Redbooks in one of the following ways:
Use the online "Contact us" review redbook form found at:
ibm.com/redbooks
Send your comments in an Internet note to:
Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HZ8 Building 662
P.O. Box 12195
Research Triangle Park, NC 27709-2195
Part 1 Introduction
This part includes a general introduction to the globalization area from the end user's perspective and describes what technology can provide.
Chapter 1. What is globalization?
One of the key aspects of globalization is the ability to handle multiple languages. As we all know, much of the original development in the field of computers was done in English. But compared to English, most other languages involve a greater degree of computer processing. Many of these languages, such as Chinese, have a very large character set, while others, such as French, have accent marks, and still others, such as Arabic, have bi-directional (left-to-right and right-to-left) input and output.
Figure 1-1 Chinese has a very large character set
Figure 1-2 French has its accent marks
Figure 1-3 Arabic follows bi-directional input/output schema
Les journalistes de la presse spécialisée mondiale ont reconnu les performances des logiciels IBM ViaVoice.
Computer software was originally designed to work with English, one of the simplest languages in terms of characters. No wonder then that we face difficulties today using more complicated character sets. Although English is still the most commonly used language on the Internet, we cannot assume that it will always maintain its present level of dominance.
Internet business is global. It expands the traditional transaction region to a global level. End users in this global community increasingly expect to interact with Web pages and applications in their native languages, and these cultural expectations should be satisfied by computer applications. A particular concern might be language differences, sorting systems, calendar presentation, or even the cultural suitability of icons. Therefore, a fundamental challenge in developing multilingual Web applications is to customize user interactions to each user's cultural expectations.
Thinking internationally
As with any Web site, developing a multilingual Web site begins with identifying your audience. Only when you are clearly aware of the purpose of your site can you begin to design that site to meet the requirements and expectations of future customers.
Figure 1-4 Identifying the audience is the first step to designing a multilingual Web site
The more languages that a multilingual Web site provides, the more effort its development takes. Yet it is generally far less expensive to build and maintain a single multilingual Web site than to duplicate efforts with parallel sites for different languages. Even the same language can vary widely in different countries and regions. For example, people in Mainland China write in Simplified Chinese, while people in Hong Kong and Taiwan use Traditional Chinese. Simplified and Traditional Chinese differ from each other in both glyphs and wording.
Figure 1-5 Simplified and Traditional Chinese differ from each other in both glyphs and wording
Good translation should translate not only the language itself but also the relevant culture. Translators should translate the source English message into the target language using a tone that is right for the specific audience. In addition, local individuals, such as ordinary native readers, lawyers, and the marketing team, should be asked to review the contents. This will guarantee that the message is conveyed around the world accurately and with proper regard to legal requirements. It is also the developers' responsibility to ensure that the site is designed suitably and works well for every target audience.
Figure 1-6 Multilingual sites should be designed for every target audience
Winning globally
To win in global business, you must compete from a global perspective. Globalization technology is becoming more and more prevalent in a variety of fields. Up-to-date Web sites now provide online translation services, some of which even work with spoken language.
Research shows that only 8% of the world's population speaks English as their first language. If your Web application can communicate with users only in English, you might well lose your customers. To make your site multilingual is to make it communicate directly with as many customers as possible across the remaining 92%. In this way, your site can attract more customers and thus benefit from more business opportunities. Multilingual Web applications multiply your e-business.
Figure 1-7 Multilingual Web applications can multiply your e-business
Globalization is everywhere
Consider a simple travel application. What can globalization bring the customer? A travel agent wants to set up a Web site to serve tourists from around the world. In general, tourists as a group are much more likely to visit a Web site if they can read its contents easily. At the very least, this site must be able to:
Provide sightseeing information in the user's language and cultural setting (for example, date format or currency symbol), based on the browser's setting or the user's selection.
A more sophisticated site can also take cultural differences into consideration when conducting business. For example, it could recommend different itineraries for people of different national backgrounds.
Distinguish between information that is dependent upon the server's cultural setting and that which is dependent on the client's, and then ensure the integrity of that information. When a ticket price in US dollars is displayed on a Japanese client machine, that price might either be converted to Japanese Yen or left in US dollars. In either case, using a particular currency symbol carries with it the responsibility to ensure the accuracy of its corresponding amount.
Accept input in the user's language and cultural setting.
Store and display user information in the user's own language and format (for example, name and address).
Figure 1-8 Serve users in their own languages and formats
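The first requirement in the list above, choosing a language from the browser's setting or the user's selection, can be sketched as follows. This is a minimal illustration in Python and is not taken from this book: the names `SUPPORTED`, `parse_accept_language`, and `choose_language` are hypothetical, and a production site would normally rely on its Web framework or on ICU for language negotiation.

```python
# Hypothetical list of languages the travel site has been translated into.
SUPPORTED = ["en", "ja", "zh-CN", "zh-TW"]


def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a tag without a q-value carries the highest weight
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not wanted"
        if quality > 0:
            weighted.append((-quality, position, tag.strip()))
    # Highest quality first; ties keep the order used in the header.
    return [tag for _, _, tag in sorted(weighted)]


def choose_language(header, user_choice=None, default="en"):
    """Prefer an explicit user selection, then the browser's preferences."""
    candidates = [user_choice] if user_choice else []
    candidates += parse_accept_language(header)
    by_lower = {tag.lower(): tag for tag in SUPPORTED}
    for tag in candidates:
        lowered = tag.lower()
        if lowered in by_lower:
            return by_lower[lowered]
        # Fall back from a regional tag such as "ja-JP" to plain "ja".
        base = lowered.split("-")[0]
        if base in by_lower:
            return by_lower[base]
    return default


if __name__ == "__main__":
    print(choose_language("ja-JP, en;q=0.8"))
    print(choose_language("fr, zh-TW;q=0.7, en;q=0.3"))
    print(choose_language("fr", user_choice="zh-CN"))
```

An explicit user selection is checked before the browser header, so a traveler using a borrowed machine can still override the browser's configured language.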
Some applications might need more capabilities in addition to these common customer needs for globalization, and technologies are advancing aggressively to make all of the following things happen:
Machine-assisted translation might be able to provide on-the-fly translations for rapidly changing contents (for example, in the case of tourism Web sites). Note, however, that since the technology is still immature, client expectations must be set appropriately wherever it is used.
Pervasive computing is making e-business an any place, any time phenomenon. Voice technology (both speech-to-text and text-to-speech) lets people easily check account balances via telephones and in their own languages.
Figure 1-9 Voice servers and the enterprise environment
The portal concept makes customization easier and more culture-oriented so that globalization features can be merged together seamlessly with other design considerations.
Globalization requirements are everywhere. They serve various cultural expectations for users around the world interacting with computer systems. Their differences can range from the obvious to the subtler, such that sometimes customers are not even aware of their existence.
For example, currency amount can be displayed in a form compatible with its users' cultural conventions, and the date can be represented correctly based on their preferred calendar systems (for example, the Chinese lunar calendar, the Arabic calendar, or the Hebrew calendar).
While formatting and handling text (either displaying or printing), text boundary analysis can locate appropriate points for word-wrapping text so that it can fit within specific margins or general linguistic boundaries for whole-word searching or indexing.
Figure 1-10 Line break and word break differ in different languages (footnote 1)
While comparing and sorting strings, some cultural conventions must take precedence over other properties in order to produce the culturally expected result. Along with other script-dependent considerations, punctuation must sometimes be ignored. Accent differences are occasionally treated as key sorting properties. When a sequence of two or more letters must be treated as a single letter in sorting, specific guidance is required. In English, the sorting order of words is generally straightforward. In Simplified Chinese, sorting by PinYin (the phonetic romanization) in alphabetical order is the most widely used method, but sometimes the character count overrides that order.
For example, the words TongXue, LanTianBaiYun, ShiJie, and Da (given here in PinYin) can be sorted alphabetically as: Da, LanTianBaiYun, ShiJie, TongXue.
But users might instead prefer to use character counts as the primary sorting rule and alphabetic order as the secondary: Da, ShiJie, TongXue, LanTianBaiYun.
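The two orderings can be sketched with ordinary sort keys. This is only an illustration under stated assumptions: each word is represented by its PinYin syllables, one syllable standing in for one Chinese character, and the names `WORDS`, `pinyin_key`, and `count_then_pinyin_key` are hypothetical. Real PinYin collation (for example, through ICU) also takes tones and syllable boundaries into account.

```python
# Each word is listed as its PinYin syllables; one syllable stands for one
# Chinese character, because the characters themselves are not shown here.
WORDS = [
    ("Tong", "Xue"),
    ("Lan", "Tian", "Bai", "Yun"),
    ("Shi", "Jie"),
    ("Da",),
]


def pinyin_key(word):
    """Primary rule: alphabetical order of the PinYin spelling."""
    return "".join(word).lower()


def count_then_pinyin_key(word):
    """Primary rule: number of characters; secondary rule: PinYin order."""
    return (len(word), pinyin_key(word))


def spell(words):
    """Join the syllables of each word for display."""
    return ["".join(word) for word in words]


if __name__ == "__main__":
    print(spell(sorted(WORDS, key=pinyin_key)))
    print(spell(sorted(WORDS, key=count_then_pinyin_key)))
```

Because the sort key is just a function, an application can let each user choose the rule without changing the sorting code itself.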
1 This illustration is from Introduction to ICU (http://oss.software.ibm.com/icu/userguide/boundaryAnalysis.html)
Moreover, different languages sort the same characters differently. For example, in Swedish, z comes before ö if sorted in ascending order, while in German, z comes after ö.
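As a rough sketch of why one sorting routine cannot serve every language, the following Python fragment applies two deliberately simplified tailorings to the same letters. The alphabet string and both key functions are hypothetical illustrations, not the rules a collation library such as ICU actually ships.

```python
import unicodedata

# Simplified Swedish alphabet: a-ring, a-umlaut and o-umlaut follow z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"


def swedish_key(word):
    """Treat the three extra Swedish vowels as separate letters after z."""
    word = unicodedata.normalize("NFC", word.lower())
    unknown = len(SWEDISH_ALPHABET)  # characters outside the alphabet sort last
    key = []
    for ch in word:
        index = SWEDISH_ALPHABET.find(ch)
        key.append(index if index >= 0 else unknown)
    return key


def german_key(word):
    """Sort umlauted vowels together with their base letters."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original spelling breaks ties, for example between "o" and "o-umlaut".
    return (base, word)


if __name__ == "__main__":
    letters = ["\u00f6", "z", "o"]
    print(sorted(letters, key=swedish_key))
    print(sorted(letters, key=german_key))
```

The same three letters come out in a different order under each key, which is why the sort rule has to be selected by locale rather than hard-coded.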
More technical details will be introduced in the following chapters.
Chapter 2. Why is globalization necessary?
Figure 2-1 Single Executable for all is the basis for globalization
Flexibility
Since any server can support all available languages, the customer can design its network and deploy servers based on load levels and resources rather than language support requirements. For example, one individual server can easily handle visits from both Chinese and American customers even though there is as much as a 13-hour time zone difference separating the two.
Figure 2-2 A well-globalized solution can balance loads effectively
Lower total cost of ownership
An IBM customer can use the same version and patch level of a product throughout the world, thereby reducing the cost of support, maintenance, and training. IBM itself therefore sets a very good example by shipping a software product with only one code-base for all languages, thereby greatly reducing unnecessary expense for itself while simplifying customer usage.
Consistent data handling
Customers using multiple solutions expect each included product to handle data identically and consistently with established industry standards, for example, in collation and date/time formatting. In order to develop a multilingual application, they most likely will have picked up an assortment of global programming tools. If these tools can interact with one another, this creates added value for each tool and thus improves the total solution.
Shorter time to market
If various localized versions for different geographies are handled separately rather than centrally in a global product development, the multinational product owner cannot deploy a product until all language versions are available. The waiting time can last from a couple of days to several weeks or even many months, and the functions and features might not be strictly consistent.
Following the methodologies covered in this book (especially the concepts of the Single Executable and localization packs) greatly shortens the time it takes to deliver all localized versions, and allows them to be delivered to all geographies at the same time (that is, worldwide simultaneous general availability).
Consistent delivery
Upon implementing the methodologies discussed in this book, a product owner will not be compelled to create separate language versions for all maintenance releases, updates, fix packs, patches, etc. The Single Executable model means that changes to executables can be delivered independently of their translation so that new translated versions are needed only when there are changes to the product's user interface.
Chapter 3. How to implement globalization
Generally speaking, globalization belongs in the category of ease-of-use technologies. Offering ease-of-use solutions to your customers is critical to business success. Various globalization elements are available for ease-of-use solutions, ranging from the explicit to the subtle. This chapter gives a brief introduction to implementing globalization, and the remainder of the book adds many detailed explanations and examples.
Globalization functions can be enabled by the operating system, software product, or business application. The operating system usually provides basic support, being the minimum set of baseline requirements that a globalization solution needs, but all three work together in the following ways in order to provide an integrated and seamless globalized solution (footnote 1):
1. The end user can input, view, and print characters from diverse languages, and a system should be able to accept data, process it, and output results correctly.
Some languages such as English, French, German, and Spanish are easy to handle, while others are complicated in terms of the programming required for computer processing. Languages such as Thai, Hebrew, and Arabic are called complex display languages. Hebrew and Arabic have letters that are displayed from right to left. Since they also mix in other languages and numbers that display from left to right, they require what is called bi-directional support. Thai and Arabic have characters that change vertical position or shape depending on the characters around them, and require contextual support.
Special devices are being envisioned and then designed by globalization professionals as new script requirements emerge. The standard keyboard is the most common input medium, while desktop or notebook displays and printers are the most prevalent output devices. Software assistants such as Input Method Editors (IMEs) are employed to support data entry of composed characters or large character sets. Now that pervasive computing devices such as cell phones, personal digital assistants (PDAs), and pagers have increasingly important roles, they are constantly being equipped with new input/output mechanisms as research labs continue turning out their pioneering technologies. On-screen keyboards not only improve the appearance of computers, but also bring improved functionality to the end user. Speech recognition introduces a brand-new input method with unprecedented globalization challenges in that computers now must recognize numerous spoken languages and dialects. Handwriting recognition similarly helps people interact with the computer more naturally and easily, while challenging it to detect various written scripts and individual handwriting styles.
1 The four categories here follow the IBM G11N organization's opinions, which can be found at http://eou2.austin.ibm.com/global/global_int.nsf/Publish/982. However, the details are written by GCL.
Figure 3-1 Voice technology architecture
So that physical devices can support the full character set of diversified languages, the operating system must at the very least provide corresponding support for IMEs, fonts, and layout software.
2. Correct cultural support of data. For example, date/time/number/currency must be displayed/processed appropriately in formats that users prefer. (See Chapter 6, Locale model on page 23.) Cultural support can be accomplished through the use of locales and locale-sensitive functions. IBM's recommended cultural support solution is International Components for Unicode (ICU), an open source project that it sponsors. ICU can be provided to operating systems, software products, and business applications to meet their globalization needs.
Advanced cultural support might involve business logic. For example, income tax calculators must reflect tax amounts based on nation-specific income policies and the individual's reported income.
3. Multilingual support through Unicode technology. Unicode is the universal character-encoding scheme for written characters and text, including character sets used by many of the world's written scripts. Because it provides a consistent way of handling multilingual text interchange internationally, Unicode is in widespread use today. It has been widely accepted as the default encoding for many industry standards such as HTML and XML, and it underlies Java's capabilities for multilingual support, thus providing one of the principal foundations of e-business.
4. Users can choose language and cultural preferences. The working example introduced in Part 3, Our Global Travel Shanghai Demo: A working example on page 61 clearly explains how to make this happen.
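Item 1 above notes that Hebrew and Arabic text needs bi-directional support. As a small sketch of how a program can detect that need, the following Python fragment reads the bidirectional class that the Unicode Character Database assigns to each character. The helper names are hypothetical, and the fragment only classifies characters; it does not implement the full Unicode Bidirectional Algorithm that rendering engines apply.

```python
import unicodedata


def bidi_classes(text):
    """Return (character name, bidirectional class) for each character."""
    return [
        (unicodedata.name(ch, "UNNAMED"), unicodedata.bidirectional(ch))
        for ch in text
    ]


def needs_bidi_support(text):
    """True if the text contains a strongly right-to-left character.

    "R" is used for Hebrew letters and "AL" for Arabic letters.
    """
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


if __name__ == "__main__":
    sample = "A\u05d0\u06271"  # Latin A, Hebrew alef, Arabic alef, digit one
    for name, bidi_class in bidi_classes(sample):
        print(name, bidi_class)
    print(needs_bidi_support("Hello"))
    print(needs_bidi_support("\u05e9\u05dc\u05d5\u05dd 123"))
```

Digits keep a left-to-right class even inside right-to-left text, which is the mixing of directions that the paragraph above describes.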
Figure 3-1 labels (Voice Technology Architecture): Public Switched Telephone Network; Dialogic Hardware and Software; Telephony and Media Component; VoiceXML Browser with Reco Engine and TTS Engine; Language Support Component; System Management Component; Web/Application Server with Global VoiceXML Application; Enterprise Server with Enterprise Data.
Part 2 Globalization application design
This part includes a general description of how to develop multilingual applications.
Chapter 4. Single Executable
The world abounds with software products and applications encompassing a wide range of technical areas, and it is impossible to have a single architecture that will work for all of them. Nevertheless, there still must be interrelated base elements to enable a successful and efficient global e-business solution. In this chapter, we briefly introduce the building blocks essential to composing a multilingual system. Above all, it is most important that each product or application provide a Single Executable that supports all languages. This is key to ensuring that a globalized system can be designed, built, and maintained efficiently and correctly.
This methodology has many benefits. For a product owner, it greatly simplifies development, testing, and support. For a product user, only one body of globally executable code must be installed per platform using a straightforward system configuration method to make it work for different languages.
There is no easy way to evolve from single language-only applications to their globalized counterparts. To surmount the significant obstacles that will confront you, several approaches have been devised for the delivery of multi-language translated applications. From today's standpoint, earlier thinking seems rather crude, and we can readily see how far software engineers have progressed in this area.
Basically there are three kinds of approaches to enabling applications with language awareness and cultural sensitivity, as illustrated in Figure 4-1 on page 18.
Figure 4-1 Three different program category types
1. Programs with messages, menus, and cultural behavior embedded in their code
This is the most expensive approach among the three, because each language needs its own separate program, and each of these programs can serve only its own particular language. Therefore, costs are escalated by the redundant testing, maintenance, and support required for such kinds of programs.
2. Programs with separated but bound or linked messages, menus, and cultural behavior
This approach shows some improvement over the first. Here the application is generated from a common program source that bundles culture-sensitive files. Program source code is separated from culture-related considerations, thus making it easier to maintain and leverage the existing investment. However, the executable still can support only the language sets packaged, and functional testing, maintenance, and support must still be repeated for each language. Even if there is only a single body of source code, it might have certain assumptions burned into it at compile time (for example, the use of single byte-only or multiple byte-aware string libraries).
3. Single Executable programs that dynamically retrieve resources
This is a dramatic improvement over the previous two approaches, and the benefits are many. In this approach, software programs are developed that allow the Single Executable produced by source compilation to handle the cultural needs of all supported locales. The difference from the second approach is that cultural and language-independent program code calls cultural and language-dependent information at runtime, thereby greatly reducing the expenditure of cost and effort otherwise invested throughout the product life cycle.
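The third approach can be sketched in a few lines of Python. The catalog contents and the names `CATALOGS`, `get_message`, and `greet` are hypothetical; in a real product the catalogs would live in separate resource files (one per locale) shipped beside the executable, for example as Java resource bundles or ICU resource bundles.

```python
# Hypothetical message catalogs, kept apart from the program logic.
CATALOGS = {
    "en": {"greeting": "Welcome", "farewell": "Goodbye"},
    "fr": {"greeting": "Bienvenue", "farewell": "Au revoir"},
    "de": {"greeting": "Willkommen"},  # "farewell" is not translated yet
}
DEFAULT_LANGUAGE = "en"


def get_message(key, language):
    """Look up a message at run time, falling back to the default language."""
    for candidate in (language, DEFAULT_LANGUAGE):
        text = CATALOGS.get(candidate, {}).get(key)
        if text is not None:
            return text
    return key  # last resort: show the key so the gap is visible in testing


def greet(language):
    """The program logic is identical for every language."""
    return get_message("greeting", language)


if __name__ == "__main__":
    for language in ("en", "fr", "de", "ja"):
        print(language, greet(language), get_message("farewell", language))
```

Adding a language means adding an entry to the catalog data; neither `get_message` nor `greet` changes, which is the property the Single Executable approach relies on.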
Employing this best-of-breed technique brings up several design and implementation considerations:
1. The executable source code must logically be the only one used for all the supported locales.
2. Only one executable should be built/tested for all supported locales, and there should be no lag in code availability.
3. Only one version of the executable should be manufactured and distributed, although package options might be made available (footnote 1).
4. Only one logic fix pack should apply in all supported locales.
5. The addition of new locales will generally require no modifications or additions to the program executable.
6. Locale-sensitive operations are supplied through a common API support mechanism that provides a full set of globalization functions.
7. All functions must behave correctly for all supported locales, including but not limited to the following:
Number representation
Date representation
Time representation
Currency representation
Messaging
User interface
1 Supported locale resources can be freely selected for packaging with the Single Executable.
Chapter 5. Unicode support
An encoding system is a method of assigning numbers to individual characters so that a computer can process those characters. In computing's early days, there were hundreds of different encoding systems spanning many different language sets. No single system existed that could process every single character from all languages throughout the world. Another drawback was that those encoding systems might conflict with one another in that the same character could have different number representations in different systems. Then Unicode evolved.
The Unicode Consortium, found at http://www.unicode.org, publishes Unicode information.
Now when the world wants to talk, it speaks in Unicode. Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. Unicode can represent every character in the world by providing a single consistent character-to-number mapping schema. Figure 5-1 illustrates some of these character-to-number mappings.
Figure 5-1 A single consistent character-to-number mapping schema
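The character-to-number mapping can be inspected directly in most programming languages. The following Python sketch (the helper name `describe` is hypothetical) prints the Unicode code point of a character together with the bytes used to store it in UTF-8, one of the standard Unicode encoding forms.

```python
def describe(ch):
    """Return the code point in U+XXXX notation and the UTF-8 bytes in hex."""
    code_point = "U+{:04X}".format(ord(ch))
    utf8_bytes = ch.encode("utf-8").hex()
    return code_point, utf8_bytes


if __name__ == "__main__":
    # Latin A, e with acute accent, and a common Chinese character.
    for ch in ("A", "\u00e9", "\u4e2d"):
        print(ch, *describe(ch))
```

The code point is the same on every platform; only the number of bytes needed to store it varies with the encoding form chosen.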
Unicode can be used as a lingua franca across systems and languages. The International Components for Unicode (ICU) and Java are the IBM-recommended ways to handle Unicode text. With Unicode, characters in different character sets can be displayed in the same Web page simultaneously, as illustrated in Figure 5-2.
Figure 5-2 Characters from different scripts can be displayed in the same Web page simultaneously
Chapter 6. Locale model
In English, the word locale means a place where something happens or has happened. This is a key term in globalization. Its meaning is not straightforward, so it is best defined in several short sentences.
In globalization, the word locale was borrowed by software engineering from geography to indicate that human cultural expectations of computer behavior fall into clumps that can be grouped together, most commonly by language and country or region. This clumping of expectations has allowed the use of computer standards that describe sets of related expectations, such as how dates and times are formatted and how words are sorted. For the purposes of this architecture, a locale means a specification of a language and country/region, or a specification of a language, country/region, and variant. Thus a locale can be specified by a string such as French-Belgium. It does not mean a data structure that contains information for a language and country/region.
Locale is used by the software industry in general to mean any of the following relatedconcepts:
The set of people who share a set of common expectations about their computer interactions
The common expectations of computer behaviors that those people share
The name given to one of those particular sets of expectations or people
The computer-readable data (and sometimes code) that encapsulates those behaviors
A locale model contains assumptions about all of these cultural features. In particular, any adequate locale model used in a global e-business system must meet these requirements:
The locale model accounts for at least language and country/region and has some additional way of specifying variants.
It includes support for the major categories of locale-dependent computing.
It provides for hierarchical fall-back behavior at either the source or runtime levels.
It allows different locales to be set per client. For multi-client server software, this means that there must be a way to have different locale processing for each client context (which may be per thread, depending on the client interaction model).
The locale model supports conventions that allow all locale-sensitive components of an e-business system to communicate appropriately about locale settings.
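The requirements above for hierarchical fall-back and per-client locales can be sketched as follows. The locale data, the `root` level, and the function names are hypothetical and only loosely modeled on the fall-back scheme used by Java and ICU; the point of the sketch is that the locale is passed in with each request instead of being a process-wide setting.

```python
# Hypothetical locale data: each level stores only what differs from its parent.
LOCALE_DATA = {
    "root": {"decimal_separator": ".", "date_order": "YMD"},
    "fr": {"decimal_separator": ",", "date_order": "DMY"},
    "fr_BE": {"currency": "EUR"},
    "en": {"date_order": "MDY"},
    "en_US": {"currency": "USD"},
}


def fallback_chain(locale_name):
    """Return the lookup order, for example fr_BE, then fr, then root."""
    parts = locale_name.split("_")
    chain = ["_".join(parts[:n]) for n in range(len(parts), 0, -1)]
    chain.append("root")
    return chain


def lookup(locale_name, key):
    """Find a setting for one client's locale, walking up the hierarchy."""
    for name in fallback_chain(locale_name):
        data = LOCALE_DATA.get(name, {})
        if key in data:
            return data[key]
    raise KeyError(key)


if __name__ == "__main__":
    # Two clients served by the same process, each with its own locale.
    for client_locale in ("fr_BE", "en_US"):
        print(client_locale,
              lookup(client_locale, "currency"),
              lookup(client_locale, "decimal_separator"),
              lookup(client_locale, "date_order"))
```

A locale with no data of its own still resolves through `root`, so supporting a new country/region can start with only the few values that differ.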
This chapter covers several frequently used locale models. It presents the complications introduced by dates, times, currencies, etc. Typically, programs call international services such as those found in Java or ICU to handle these complications.
Numbers and mathematics
The decimal system (base 10) is used in almost every country of the world. However, number formats vary considerably, and there are still traditional (non-decimal) numbering systems such as Roman numerals that are used in important contexts.
Table 6-1 Number format
Currency format
Currency format is usually composed of a locale, its currency name, its currency subunits (footnote 1), its currency symbol (footnote 2), positive format, negative format (footnote 3), currency codes (footnote 4), and currency separators (footnote 5). Currency separators include thousands separators, decimal separators, decimal position, field length, and padding character (footnote 6).
Different countries/regions have different formats and rules for currency. Table 6-2 and Table 6-3 on page 25 show typical currencies used in international banking:
Table 6-2 Currency format
1 For example, the Egyptian pound contains 100 piasters.
2 Whether symbols should be displayed to the right or to the left.
3 Whether the minus sign should be displayed to the right or to the left.
4 Defined by ISO 4217: 1995, for the currency code used in international banking.
5 For example, $9,876,543.21.
6 The symbol used to pad out the format to a specific string length.
Table 6-3 Currency separators
The situation is slightly more complicated in real life, since a given locale can have different formats for different currencies. For example, people in the US might want to display a chart showing both dollars and rupees, but using an English format for rupees rather than the Indian format (that is, with Hindi letters).
Date
Table 6-4 shows the common and short formats for presenting dates within several countries/regions. When further precision is required, there are additional considerations. For example:
Whether a leading zero should be used for the day and month.
Although numbers are usually presented in decimal format, other locale-specific characters might be desired, such as in Hebrew.
Differences might exist between storage format (keyboard sequence) and presentation format, such as with bi-directional scripts.
Table 6-4 Date format
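To illustrate why short date formats need locale information, the following Python sketch renders one date with several conventional field orders. The pattern table is an assumption made for this example (it is not the content of Table 6-4), and the names `SHORT_DATE_PATTERNS` and `format_short_date` are hypothetical; production code would obtain the patterns from Java or ICU locale data.

```python
from datetime import date

# Assumed short-date conventions, for illustration only.
SHORT_DATE_PATTERNS = {
    "en_US": "{m:02d}/{d:02d}/{y:04d}",  # month first
    "en_GB": "{d:02d}/{m:02d}/{y:04d}",  # day first
    "de_DE": "{d:02d}.{m:02d}.{y:04d}",  # day first, period as separator
    "ja_JP": "{y:04d}/{m:02d}/{d:02d}",  # year first
}


def format_short_date(value, locale_name):
    """Format a date with the short pattern registered for the locale."""
    pattern = SHORT_DATE_PATTERNS[locale_name]
    return pattern.format(y=value.year, m=value.month, d=value.day)


if __name__ == "__main__":
    travel_day = date(2002, 12, 3)
    for locale_name in SHORT_DATE_PATTERNS:
        print(locale_name, format_short_date(travel_day, locale_name))
```

The en_US and en_GB results use the same digits in a different order, so a date string shown without its locale can be read as either 3 December or 12 March.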
Time
Table 6-5 on page 26 shows formats for presenting the time, and they can be adjusted depending on various business circumstances. For more precision, we can add other parameters. For example:
Time zone information (EST, CST, GMT, etc.) might need to be appended to the representation.
Where situations warrant, the separator symbol between minutes and seconds can be omitted.
Regarding AM and PM indicators, different geographies can have different understandings as to what constitutes midnight and noon.
Weekends, holidays, and daylight saving time can make things even more complicated.
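The considerations above can be sketched with Python's standard `datetime` module. The zone table uses fixed offsets so that the example stays self-contained, which means it deliberately ignores daylight saving time; the names `ZONES` and `format_time` are hypothetical, and the AM/PM rule shown (noon is 12:00 PM, midnight is 12:00 AM) is only one of the conventions in use.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for illustration; real code needs time zone rules with DST.
ZONES = {
    "EST": timezone(timedelta(hours=-5), "EST"),
    "GMT": timezone(timedelta(0), "GMT"),
    "JST": timezone(timedelta(hours=9), "JST"),
}


def format_time(moment, zone_name, use_24_hour=True):
    """Render one instant in the given zone, with the zone name appended."""
    local = moment.astimezone(ZONES[zone_name])
    if use_24_hour:
        clock = "{:02d}:{:02d}".format(local.hour, local.minute)
    else:
        hour12 = local.hour % 12 or 12  # 0 and 12 are both shown as 12
        suffix = "AM" if local.hour < 12 else "PM"
        clock = "{}:{:02d} {}".format(hour12, local.minute, suffix)
    return "{} {}".format(clock, local.tzname())


if __name__ == "__main__":
    departure = datetime(2002, 12, 1, 17, 30, tzinfo=timezone.utc)
    for zone_name in ZONES:
        print(format_time(departure, zone_name),
              "|", format_time(departure, zone_name, use_24_hour=False))
```

Storing the instant once (here in UTC) and converting only for display keeps server-side data consistent while each client sees its own local time.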
Currency separators by locale (Table 6-3):
Locale   Thousands Separator   Decimal Separator   Decimal Position   Field Length     Padding Character
ar_EG    apostrophe            comma               2                  Not applicable   None
en_US    comma                 period              2                  Not applicable   Not applicable
zh_CN    comma                 period              2                  12               None
Important: Since both Java and ICU have multiple currency support (see http://oss.software.ibm.com/icu4j/doc/com/ibm/icu/util/Currency.html), numeric amounts should always be paired with their corresponding ISO currency tags. Otherwise, people might mistake values expressed in British pounds for Japanese yen.
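A minimal sketch of this advice in Python follows. The `Money` class and `format_money` function are hypothetical; the separator table reuses the locale values listed for Table 6-3, and the output style (amount followed by the ISO 4217 code) is just one possible presentation.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP

# (thousands separator, decimal separator) per locale, as listed in Table 6-3.
SEPARATORS = {
    "ar_EG": ("'", ","),
    "en_US": (",", "."),
    "zh_CN": (",", "."),
}


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str  # ISO 4217 code such as "USD", "GBP" or "JPY"

    def __add__(self, other):
        # Refuse to mix currencies instead of silently adding the numbers.
        if other.currency != self.currency:
            raise ValueError(
                "cannot add {} to {}".format(other.currency, self.currency))
        return Money(self.amount + other.amount, self.currency)


def format_money(money, locale_name, decimals=2):
    """Format the amount with the locale's separators and its ISO code."""
    thousands, decimal_point = SEPARATORS[locale_name]
    quantum = Decimal(1).scaleb(-decimals)
    rounded = money.amount.quantize(quantum, rounding=ROUND_HALF_UP)
    sign = "-" if rounded < 0 else ""
    whole, _, fraction = str(abs(rounded)).partition(".")
    groups = []
    while len(whole) > 3:
        groups.insert(0, whole[-3:])
        whole = whole[:-3]
    groups.insert(0, whole)
    text = thousands.join(groups)
    if fraction:
        text += decimal_point + fraction
    return "{}{} {}".format(sign, text, money.currency)


if __name__ == "__main__":
    fare = Money(Decimal("9876543.21"), "USD")
    print(format_money(fare, "en_US"))
    print(format_money(Money(Decimal("9876543.21"), "EGP"), "ar_EG"))
```

Keeping the currency code inside the value object means the formatting locale can change from client to client without any risk of the amount being relabeled.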
Table 6-5 Time format
Calendar
The Gregorian calendar is used today in most places in the world and is the standard calendar for international business transactions. However, some countries/regions still use their own calendars for historical, political, religious, cultural, or even astrological reasons.
For thousands of years, the Chinese people have used their own lunar calendar (still unofficially used even today), which tracks the moon's orbit around the earth. The ancient Chinese used this calendar to guide their annual planting, and this custom has persisted into the 21st century. In China, many senior citizens do not know their birth date by the Gregorian calendar, but only by the Chinese lunar calendar.
In countries/regions such as Japan, the local calendar has an additional era name that is associated with the reign of the emperor. It follows the Gregorian calendar for months and days, but the year count restarts from year 1 when a new emperor begins his reign.
The Hijri calendar used in some Arabic countries/regions is more sophisticated. It begins inthe year 625 Gregorian and is a lunar calendar where each month begins with the new moon.Consequently, the number of days in the month is not fixed each year, but changesdepending on this cycle.
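As a rough illustration of these calendar differences, the sketch below converts one Gregorian date into the Japanese and Hijri systems. It assumes the java.time.chrono classes, which were added to the JDK well after this guide was written; CalendarViews and its method names are hypothetical. The JDK's Hijri support follows the Umm al-Qura variant, so other Hijri variants may differ by a day.

```java
import java.time.LocalDate;
import java.time.chrono.HijrahDate;
import java.time.chrono.JapaneseDate;
import java.time.chrono.JapaneseEra;
import java.time.temporal.ChronoField;

/** Hypothetical helper that views one ISO (Gregorian) date in other calendar systems. */
final class CalendarViews {
    /** Era in force on the given date, for example Heisei. */
    static JapaneseEra japaneseEra(LocalDate isoDate) {
        return JapaneseDate.from(isoDate).getEra();
    }

    /** Year counted from the start of the era, which restarts at 1 with each new era. */
    static int japaneseYearOfEra(LocalDate isoDate) {
        return JapaneseDate.from(isoDate).get(ChronoField.YEAR_OF_ERA);
    }

    /** Hijri year according to the JDK's Umm al-Qura data. */
    static int hijriYear(LocalDate isoDate) {
        return HijrahDate.from(isoDate).get(ChronoField.YEAR);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2002, 12, 1);
        System.out.println(japaneseEra(date) + " " + japaneseYearOfEra(date));
        System.out.println("Hijri year " + hijriYear(date));
    }
}
```

The same instant is stored once (as an ISO date) and only converted for presentation, which keeps storage independent of the user's calendar.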
Telephone
Telephone numbers vary in length from country to country, but certain fields are common. For example, a hypothetical call from one country to another might require a calling sequence of 011-xxxx-yyy-000-0000, consisting of:
011, the international access code from the USA and Canada
xxxx, the country code (1 to 4 digits)
yyy, the area code (1 to 3 digits, with no leading zero)
000-0000, the local portion (usually 8 digits or less)
To place an international telephone call, you must know the international access code of the country from which you are dialing as well as the country code of the country you are trying to reach.
Table 6-6 International telephone codes
Country/Region   International Access Code   CCITT/ITU Code   Internal Phone Format
Egypt            00                          20               (12) 3456789
Germany          00                          49               12345-6789012345678
United States    011                         1                (123) 456-7890
China            00                          86               (10)65391188
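The calling sequence described above can be assembled from its four fields. The sketch below is a simplified illustration: DialString and international are hypothetical names, the hyphen separator simply mirrors the example sequence in the text, and real dialing plans have more rules than this.

```java
/** Hypothetical helper that joins the four fields of an international calling sequence. */
final class DialString {
    static String international(String accessCode, String countryCode,
                                String areaCode, String localNumber) {
        // The area code is dialed without a leading zero on international calls.
        String area = areaCode.replaceFirst("^0+", "");
        return String.join("-", accessCode, countryCode, area, localNumber);
    }

    public static void main(String[] args) {
        // Calling a Beijing number from the USA, using the codes from Table 6-6.
        System.out.println(international("011", "86", "010", "65391188"));
    }
}
```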
Measure
The measurement systems used in various countries/regions differ due to historical and linguistic reasons. For example, people in the United States still tend to use miles instead of kilometers. The common terms used for the same units can also differ. Kg and kilogram are recognized metric designations in the USA and Canada, but not in Greece, China, Russia, and many other countries/regions. Another example is the size of paper used in printers and typewriters, which is inconsistent throughout the world.
Icons
Icons are pictures of objects or actions. An icon's meaning can differ from one locale to another, and this must be explicitly recognized in the design. To reduce the chance of a product rejection in a particular country/region or culture, we might:
Allow for icon substitution (not bind icons into executable code)
Aim for widespread acceptance, or prepare different icons to suit different cultures
If possible, avoid using icons that are similar to an offensive symbol in the target countries/regions
Conventions
Many other things vary considerably in form and meaning within particular countries/regions and cultures. For example:
Abbreviations: The same symbol may have different meanings in different countries/regions. For example, some people may interpret an X as crossing out what is not desired rather than indicating what is to be selected.
Question marks: A great many languages (such as English, French, and German) use the question mark to indicate interrogation, but there are some exceptions to this. In Spanish, questions always begin with an inverted question mark (¿). For example, ¿Qué es eso? The Greek question mark looks very much like a semicolon, while the Greek semicolon resembles an elevated period.
Percent symbol: The most common symbol used to indicate percent is %, as in 37%. There are exceptions. The Dutch language as written in Belgium and the Netherlands sometimes uses pct (as in 37 pct) to represent percentages. In the province of Québec in Canada, the number and symbol are written as 37 %, with a space separating the two. In Turkey, the percent symbol is written before the number, as in %37. In Arabic countries/regions, the percent sign should logically be written after the number, but since writing progresses from right to left, it is displayed on the left.
Pound sign/number sign symbol: The symbol # is known as the pound sign, hash mark, or number sign in various countries/regions, but unknown in many others.
Wildcard symbols: A wildcard symbol is any graphical character used to specify an indefinite argument in a search or other query. The asterisk (*), question mark (?), and ampersand (&) are commonly used wildcard symbols.
Navigation and motoring: In the UK, India, Japan, and South Africa, people drive on the left side of the road. In North America, South America, China, most of Europe and Africa, and the Middle East, they drive on the right.
Numeric superstitions: Superstitions are beyond exhaustive cataloging, but a few numeric superstitions are worth specific mention because of their impact on individual and mass behavior. For example, the number 8 is usually associated with wealth in China, while the number 13 is treated as an unlucky sign in most western countries/regions.
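Of the conventions listed above, the percent symbol is one that a program can delegate to its locale library instead of concatenating "%" by hand. The following is a minimal sketch using the JDK's java.text.NumberFormat; PercentDemo is a hypothetical class name, and the exact strings printed for each locale depend on the locale data shipped with the JDK in use, so no output is shown here.

```java
import java.text.NumberFormat;
import java.util.Locale;

/** Hypothetical demo: let the locale decide where the percent symbol goes. */
final class PercentDemo {
    static String percent(double fraction, Locale locale) {
        // The formatter chooses symbol placement, spacing, and digits for the locale.
        return NumberFormat.getPercentInstance(locale).format(fraction);
    }

    public static void main(String[] args) {
        Locale[] locales = {
            Locale.US,
            Locale.CANADA_FRENCH,
            Locale.forLanguageTag("tr-TR"),
            Locale.forLanguageTag("ar-EG")
        };
        for (Locale locale : locales) {
            System.out.println(locale + " -> " + percent(0.37, locale));
        }
    }
}
```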
Sorting order
The sorting order of two strings is the order in which they should appear when sorted. This order is typically derived according to weights given to each of the characters in the string. There are, however, a number of complicating factors for different languages.1 In the context of national language support, a correct sort must produce the following:
Predictable results: The sort result must always be the same, regardless of the initial order of the items.
Culturally expected order: A person will easily find an item in a sorted list only if it is sorted in the expected order.
People usually expect items to be sorted in alphabetical sequence. For example, a user who is searching for the item H90U42 and finds L49M31 first will expect H90U42 to precede L49M31, and will therefore search for H90U42 among the items occurring before L49M31, taking no notice of the items that follow it.
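Both requirements can be illustrated with the JDK's java.text.Collator, which compares strings by locale-specific weights rather than by raw character codes. This is a minimal sketch; LocaleSort and sorted are hypothetical names, and English is used only because its expected order is easy to verify by eye.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper that sorts by the collation rules of a locale. */
final class LocaleSort {
    /** Returns a new list sorted in the order a reader of the given locale expects. */
    static List<String> sorted(List<String> items, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> copy = new ArrayList<>(items);
        copy.sort(collator); // Collator is a Comparator, so it plugs into List.sort
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("cherry", "Banana", "apple");
        // A plain code-point sort would put "Banana" first because 'B' < 'a'.
        System.out.println(sorted(words, Locale.ENGLISH));
    }
}
```

Sorting the same items in any initial order yields the same list, which is the predictability requirement, while the collator supplies the culturally expected order.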
Conclusion
In addition to all of the things discussed above, many other customs and practices differ from country/region to country/region. Different countries/regions have different lucky (fortunate) colors, while colors in general are perceived differently in different countries/regions. Business etiquette is very culture-sensitive. For example, business dress codes are strictly enforced in Arab countries/regions.
To be successful in business on a worldwide basis, it is imperative that you understand and accommodate such cultural differences. In the realm of e-business, that means satisfying individual cultural needs by providing sound locale model software programs.
1 For more information, see Section 5.17 in the Unicode Standard.
Chapter 7. Localization pack
Since Single Executable, discussed in Chapter 4, "Single Executable" on page 17, means to have one and only one executable for multiple locales, we must use a standardized approach (which we call localization packs) for working with different sets of locale-specific program data.
There are two types of localization packs, based on the type of program that uses them:
Application-dependent (such as menus, dialogs, and other user-interface elements)
Application-independent (such as collation tables, transliteration rules, and the names of date and time elements)
A very simple example helps to explain this concept. In an application program, a single key (Msg1) is associated with a single string value (Hello), so that when we want other language versions, corresponding localization packs can easily be created. See Example 7-1.
Example 7-1 A simple example containing the greetings in different languages
English Version:
STRINGTABLE
BEGIN
    Msg1 Hello
END
Simplified-Chinese Version:
STRINGTABLE
BEGIN
    Msg1 你好
END
French Version:
Japanese Version:
The localization pack manager is the module that manages the location, loading, and accessing of localization pack resources.
There are various localization pack formats such as Java resource bundles, but in this chapter we discuss only the XML format (which has been recommended by the IBM globalization organization).
Java has different formats for resource bundles, among which the most commonly used are ListResourceBundle and PropertyResourceBundle. A ListResourceBundle is actually compiled Java code that is hard to read outside the Java environment. A PropertyResourceBundle is backed by property files holding unstructured key-value mappings, and the ISO 8859-1 character encoding is used when saving properties to or loading them from a stream. For characters that cannot be directly represented in this encoding, Unicode escapes are used. This can make PropertyResourceBundle files hard to read.
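The readability problem is easy to see in a small sketch. The properties text below is what a translator would have to work with for a two-character Chinese greeting, and the bundle only turns the escapes back into characters at load time. EscapedBundleDemo and load are hypothetical names, and an in-memory string stands in for a real properties file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

/** Hypothetical demo of Unicode escapes in a property resource bundle. */
final class EscapedBundleDemo {
    // The file content as stored: escapes rather than the Chinese characters themselves.
    static final String PROPERTIES_TEXT = "Msg1=\\u4f60\\u597d\n";

    static ResourceBundle load(String propertiesText) {
        try {
            // The parser decodes the \\uXXXX escapes while reading the key-value pairs.
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Stored as: " + PROPERTIES_TEXT.trim());
        System.out.println("Loaded as: " + load(PROPERTIES_TEXT).getString("Msg1"));
    }
}
```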
XML files can store human-readable and well-structured information, although in general Java resource bundles have better performance. Most XML discussion will be reserved until later on in this book. However, since this format has various implementation approaches, we will cover some of those in this chapter.
XML source format and implementations
Considering that localization packs need a cross-platform format and an all-in-one character repository, the IBM globalization organization recommends XML because:
It is platform-independent (flexible enough to accommodate the need for various platforms)
By default it uses Unicode for document encoding so that it is capable of processing multilingual data without data loss
It is an Internet standard, meaning that it can meet content format requirements for Web applications
Furthermore, due to its popularity there are already many tools for working with XML files. For example, XML Spy (see http://www.xmlspy.com) is an Integrated Development Environment (IDE) for XML that contains many useful functions that can simplify typical XML editing tasks. It offers different presentations of an XML document, including:
Enhanced grid view
Database/table view
Schema design view
Text view
Browser view
The enhanced grid view is XML Spy's core presentation and editing view. This view allows you to see and directly manipulate elements in your XML document, such as the actual data that it contains.
Figure 7-1 XML Spy: enhanced grid view
XML Spy shows the hierarchical structure of any XML-compliant document through a set of nested containers that can easily be expanded and collapsed to get a clear picture of the document's structure. All items contained in an XML document such as the XML declaration, document type declaration, or any element that contains child elements are displayed in a structured way that allows for easy manipulation of content and structure simultaneously.
A hierarchical item is represented with a gray side bar and a tiny arrow. An element is denoted with the icon , and an attribute is denoted with the icon =.
The enhanced grid view manipulates data in a graphical way, so editing in this view is considerably more comfortable. For example, you can:
Click the side bar to expand or collapse the item
Drag and drop elements
Insert new rows
Copy/paste your data to and from other applications such as Excel and Access
When opening any XML document, XML Spy uses its built-in incremental validating parser both to check the document for well-formedness and to validate it against any specified DTD or XSD schema. The same parser is also used while editing a document that refers to one of these schemas in order to provide intelligent editing help and immediately display any validation error encountered.
For localization packs with XML format, it is also important to validate source and translated files both before and after translation. XML Spy is an excellent tool for the preparation, editing, and maintenance of XML localization packs.
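The same before-and-after check can also be automated. The sketch below uses the JDK's standard DOM parser to confirm that a pack is well formed and that a translated pack still carries the same message keys as its source. PackChecker and its methods are hypothetical names, the check covers well-formedness only (no DTD or XSD validation), and treating every leaf element as a message key is an assumption about the pack layout.

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical pre- and post-translation check for XML localization packs. */
final class PackChecker {
    /** Parses the text; any failure, including malformed XML, becomes IllegalArgumentException. */
    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    static boolean isWellFormed(String xml) {
        try {
            parse(xml);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    /** Names of all elements without element children (assumed to be the message keys). */
    static Set<String> messageKeys(String xml) {
        Set<String> keys = new TreeSet<>();
        NodeList all = parse(xml).getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Node node = all.item(i);
            boolean leaf = true;
            for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                if (c.getNodeType() == Node.ELEMENT_NODE) {
                    leaf = false;
                    break;
                }
            }
            if (leaf) {
                keys.add(node.getNodeName());
            }
        }
        return keys;
    }

    /** True when the translation neither dropped nor added any message key. */
    static boolean sameKeys(String sourceXml, String translatedXml) {
        return messageKeys(sourceXml).equals(messageKeys(translatedXml));
    }
}
```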
The typical e-business application essentially uses HTML as the interface shown to end users. In a multilingual e-business environment, the XML-based localization pack manager can act as a kind of XML parser and therefore have different implementations as dictated by the various HTML generation approaches. Here we present three modes for localization pack manager implementation: embed mode, extend mode, and synthesize mode.
Suppose we have a page containing the greeting message hello and need to show that word in the language of a specific locale (for example, zh_CN). The target HTML is shown in Figure 7-2.
Figure 7-2 A simple page containing the greeting message
The localization pack XML files that store the multi-language greeting messages are shown in Figure 7-3 and Figure 7-4.
Figure 7-3 Localizationpack_en_US.xml
Figure 7-4 Localizationpack_zh_CN.xml
For other languages, Localizationpack_xx_XX.xml (where xx_XX is the locale string) can be produced if necessary.
Usually, the programs that generate the result HTML are JSPs, servlets, or portlets. We only provide pseudocode here for demonstration purposes.
Embed mode
This mode is so called because the code accessing the localization pack XML is embedded in programs such as JSPs, servlets, or portlets, where the HTML source is produced line by line. The working flow shown in Figure 7-5 gets the current locale, selects the corresponding localization pack file for that locale in order to construct the XML instance tree, and accesses the nodes in the tree to insert them into the right place in the resulting HTML.
To fetch the translated keywords from localization packs, many Java-based XML parser APIs can be applied (such as DOM or SAX). We recommend that you use a parser API that supports XPath, because XPath is an excellent language for path expressions (working much like a directory path in a computer file system to identify nodes in an XML document). Thus it is easy to locate the message Hello by the path expression //greeting/Msg1, //Msg1, or /localizationpack/greeting/Msg1.
Figure 7-5 Localization pack implementation: Embed mode
This method is simple and classic, with no extra files except the localization packs themselves. Note, however, that since the page layout and program data are merged in the program code, the program might need to be rebuilt if the layout changes.
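The embed-mode flow can be sketched with the JDK's javax.xml.xpath API. The pack structure used here is inferred from the path expression /localizationpack/greeting/Msg1 given above; the EmbedModeDemo class, its method names, and the HTML skeleton are assumptions made for illustration, and an in-memory string stands in for the pack file selected by locale.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical embed-mode lookup: the program itself reads the pack and emits HTML. */
final class EmbedModeDemo {
    /** Step 1: map the current locale to a pack file name such as Localizationpack_zh_CN.xml. */
    static String packFileName(Locale locale) {
        return "Localizationpack_" + locale.getLanguage() + "_" + locale.getCountry() + ".xml";
    }

    /** Step 2: evaluate a path expression against the pack and return the node's text. */
    static String lookup(String packXml, String path) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(path, new InputSource(new StringReader(packXml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad path or pack: " + path, e);
        }
    }

    /** Step 3: place the text into HTML produced by the program (real code should escape it). */
    static String greetingHtml(String packXml) {
        return "<html><body><p>"
                + lookup(packXml, "/localizationpack/greeting/Msg1")
                + "</p></body></html>";
    }

    public static void main(String[] args) {
        String pack = "<localizationpack><greeting><Msg1>Hello</Msg1></greeting></localizationpack>";
        System.out.println(packFileName(Locale.US));
        System.out.println(greetingHtml(pack));
    }
}
```

Because the HTML skeleton lives inside greetingHtml, any layout change means editing and rebuilding the program, which is the drawback noted above.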
Extend mode
Besides the Java XML parser, you can also use XSL (eXtensible Stylesheet Language) to parse XML using XPath path expressions. In this mode, the result HTML is not generated line by line in JSPs, servlets, or portlets, but rather transformed from an XML file by an XSL file, as shown in Figure 7-7 on page 35.
Figure 7-6 Localization pack implementation: Extend mode
The program creates a temporary XML file containing dynamic data from back-end logic and applies an external XSL file to transform it to the result HTML. It is the XSL's responsibility to manage the localization packs and obtain the required locale data. The following XSL syntax can access nodes from another XML file:1
Using XSL separates the page layout from the program data and saves the coding otherwise needed for the XML parser. All you need to do when the layout changes is to modify your XSL files without rebuilding the program.
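One way to realize extend mode is the standard XSLT 1.0 document() function, driven from Java through javax.xml.transform. The sketch below is an assumption about how such a stylesheet could look, not the syntax from the original text: the stylesheet, the packUri parameter, the page and user element names, and the ExtendModeDemo class are all hypothetical, and a temporary file stands in for the pack chosen for the current locale.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical extend-mode flow: layout and pack lookup live in the stylesheet. */
final class ExtendModeDemo {
    // The stylesheet owns the page layout; document() pulls text from the pack file.
    static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='html'/>"
        + "<xsl:param name='packUri'/>"
        + "<xsl:template match='/page'>"
        + "<html><body><p>"
        + "<xsl:value-of select='document($packUri)/localizationpack/greeting/Msg1'/>"
        + ", <xsl:value-of select='user'/>"
        + "</p></body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms the dynamic data XML into HTML, reading messages from the given pack. */
    static String render(String dataXml, String packXml) {
        try {
            Path pack = Files.createTempFile("Localizationpack_", ".xml");
            try {
                Files.write(pack, packXml.getBytes(StandardCharsets.UTF_8));
                Transformer transformer = TransformerFactory.newInstance()
                        .newTransformer(new StreamSource(new StringReader(XSL)));
                // Tell the stylesheet which pack to load for the current locale.
                transformer.setParameter("packUri", pack.toUri().toString());
                StringWriter out = new StringWriter();
                transformer.transform(new StreamSource(new StringReader(dataXml)),
                        new StreamResult(out));
                return out.toString();
            } finally {
                Files.deleteIfExists(pack);
            }
        } catch (IOException | TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String data = "<page><user>Alice</user></page>";
        String pack = "<localizationpack><greeting><Msg1>Hello</Msg1></greeting></localizationpack>";
        System.out.println(render(data, pack));
    }
}
```

The Java side never mentions the page layout, so a layout change touches only the stylesheet text.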
Synthesize mode
Similar to extend mode, synthesize mode also transforms XML with XSL to get the result HTML. But this XML is actually an HTML containing the page layout with the to-be-translated messages marked by the syntax . The way to produce such HTML is the same as that in embed mode.
1 That is, not the XML file being transformed, which is the data XML.
Important: Since the number of files doubles because each program needs at least one corresponding XSL file, using XSL might have an adverse impact on program performance.
This XSL is different from the XSL used in extend mode. It acts as a translator that searches all tags named , replaces them with the content fetched from the localization pack XML by the path expression defined in the attribute lppath, and copies the remaining parts to the result HTML. In this way, one XSL can be applicable for all transformations. To make the source HTML transformable, all you must do is add an XML header at the top of your HTML.
Figure 7-7 Localization pack implementation: Synthesize mode
Like extend mode, synthesize mode saves the coding for the XML parser. Moreover, the total number of files is small because only one XSL file is needed. The drawback is the need for rebuilding when your layout changes, and the tag remains in the result HTML, although it does not affect your display. The XSL can be revised to make the result HTML more concise.
Conclusion
Using XML as the source format of localization packs provides flexibility in organizing both the localization packs and the localization pack manager.
Chapter 8. Input and output of multilingual data
Input and output are very commonly used computer terms defining a computer's two separate roles in communicating with end users. Input is the process of getting data from users, and output is the process of sending back a comprehensible reply to them.
In this context globalization means the ability to input text in different languages with a keyboard, mouse, or other device and to properly present it in those languages on the screen or printer. Generally, these functions will be supported by the operating system. By using linguistics services, a more human-friendly interface such as speech input can be enabled for the end user.
Complex input
IMEs (input method engines, or editors) have been developed for handling complex input for certain language sets. Language scripts such as Chinese are composed of a set of ideographs, and the supported character set is quite large. Input thus becomes very complex. IMEs are designed to solve this problem. Usually, IMEs are bundled with the operating system. They provide user-friendly, keyboard-based input methods (such as Pin Yin in Chinese), determine the right character based on the user's input, and return it to whatever application is running on the system for further processing.
Figure 8-1 IMEs help to handle complex input for certain language sets
If we want to input Chinese characters, we first need to choose a Chinese IME. Once this IME is active, a panel is usually displayed to accept input. Figure 8-1 shows a Chinese IME that provides the Chinese Pin Yin input method.
When we want to input the Chinese word 中文输入, we key in the letters zhongwenshuru, which is the Pin Yin for the Chinese word.
The Pin Yin Latin alphabet characters are shown on the top line of the panel, with their candidate Chinese glyphs listed just beneath. You select a Chinese word by pressing 1, and its Chinese characters will then become your input.
Multiple IMEs can be installed on the same operating system to provide multilingual input. No matter what the current working locale, the end user is always allowed to choose the preferred IME for a particular input language script. Given the ability to freely switch IMEs, an end user can easily input multilingual data regardless of system locale.
A typical example is Microsoft's Windows operating system, which operates in such a way that end users can select and install the appropriate IMEs, together with their associated hot keys.
Figure 8-2 IME selection panel in Microsoft Windows
In addition, IMEs also let end users choose their preferred encoding method. Windows 2000 has a good range of IMEs, providing both Unicode and non-Unicode1 character set APIs so that the application can define whether values generated by the IME should be in Unicode or non-Unicode encoding. Figure 8-3 shows Windows 2000 Unicode character IMEs in different languages.
1 Also known as legacy or OEM.
Figure 8-3 Unicode characters input by different IMEs
Complex output
In certain scripts such as Japanese and Chinese, input becomes extraordinarily complex and difficult. In languages such as Hebrew, Arabic, and Thai, on the other hand, it is with output that the complexity arises.
Glyphs are visible shapes representing characters in languages. Each language must have enough glyphs to represent all defined composite characters. As far as mapping between characters in memory and glyphs on the scree