
Internationalization 360 Testing - STeP-IN Forum


Internationalization 360˚ Testing

Mahipalsinh Rana
Member of Technical Staff
Sun Microsystems

Agenda
  • Introduction of I18n - 40 minutes
  • Internationalization (I18n) 360˚ testing - 60 minutes
  • Testing standalone applications - 15 minutes
  • Quiz - 10 minutes
  • Testing web applications - 15 minutes
  • Quiz - 10 minutes
  • I18n testing automation - 30 minutes
  • Advanced I18n testing, references - 15 minutes
  • Q/A - 15 minutes

Introduction
  • Understanding of Internationalization (I18n)
  • Why I18n testing
  • Myths about I18n testing
  • Scope of I18n testing
  • Terminology in I18n technology
      • Character set / character repertoire
      • Character code / code point, coded character
      • Encoding, Unicode, UTF-8, UTF-16, UTF-32
      • Glyph, fonts, Input Method Engine (IME)
      • Locale

“Everyone has the right... to seek, receive and impart information and ideas through any media regardless of frontiers”

-- Universal Declaration of Human Rights

Why Globalization

[Image: Sun Portal server in Chinese]

[Image: Yahoo.com in Kannada]

“Visitors linger twice as long as they do at English-only URL's. Business users are 3 times more likely to buy when addressed in their language. Customer service costs drop when instructions are displayed in the user's native language.”

-- 'Strategies for Global Sites', Donald DePalma, Forrester Research Inc.

“One large IT company discovered that a significant percentage of inquiries were coming from South Korea - they created a Korean website and revenues rose by 8 percent.”

-- 'Global eCommerce', Donald J. Plumley, Bowne Global Solutions

What's with the acronyms?

Internationalization ====> I18n. How? There are 18 characters between the "I" and the "n".

With that logic:
  • Localization ====> L10n
  • Globalization ====> G11n
  • Translation ====> T9n

and you can call me M5l ==> Mahipal.
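The abbreviation rule above can be sketched in a few lines of Python. The function name `numeronym` is only an illustrative choice, not something defined in the talk.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + number of inner letters + last letter."""
    if len(word) < 3:
        return word  # too short to have inner letters worth counting
    return f"{word[0]}{len(word) - 2}{word[-1]}"


for term in ("Internationalization", "Localization", "Globalization", "Translation"):
    print(term, "====>", numeronym(term))
```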

Don't they all look the same?
  • Localization
  • Internationalization
  • Globalization
  • Translation

How do they differ and relate?
  • Globalization encompasses I18n and L10n.
  • Internationalization enables localization.
  • An expert in I18n may not be an expert in L10n.

LISA* Definitions

Globalization (G11n)
“Globalization addresses the business issues associated with taking a product global. In the globalization of high-tech products this involves integrating localization throughout a company, after proper internationalization and product design, as well as marketing, sales, and support in the world market.”

Internationalization (I18n)
“Internationalization is the process of generalizing a product so that it can handle multiple languages and cultural conventions without the need for re-design. Internationalization takes place at the level of program design and document development.”

Localization (L10n)
“Localization involves taking a product and making it linguistically and culturally appropriate to the target locale (country/region and language) where it will be used and sold.”

*Localization Industry Standards Association

Why I18n testing?

I18n testing is required to enable product localization in multiple languages.
  • Removing barriers to localization
      • Enabling Unicode
      • Keeping code independent of UI strings
      • Handling legacy character encodings
      • Separating localizable elements from source
  • Enabling code to support local, regional, language, or culturally related preferences

Myths about I18n Testing
  • It is misunderstood as translation testing
  • Only a language expert can perform I18n testing
  • It is done after the product is released
  • It is confused with product localization
  • It is only about string messages

Terminology of I18n
  • What is a character set / character repertoire?
  • What is a character code / code point, coded character?
  • What is Unicode?
  • What is meant by encoding?
      • UTF-8, UTF-16, UTF-32
  • What is a glyph?
  • What is a font?
  • What is an Input Method Engine (IME)?
  • What is a locale?

What is a Character, Character Set?

A character is just an abstract minimal unit of text. It doesn't have a fixed shape (that would be a glyph), and it doesn't have a value.

"A" is a character, and so is "$", a currency symbol.

A character set/repertoire is a collection of characters.

Example: "Making the World Wide Web world wide!"

What is Character Code / Code Point, Coded Character Set?

Character code: a mapping that defines a one-to-one correspondence between the characters in a character repertoire and a set of non-negative integers.

Examples of character codes:
  • ASCII
  • ISO Latin 1, alias ISO 8859-1
  • ISO 10646
  • The Windows character set, which exists in different variations or "code pages" (CP), e.g. Windows code page 1252

A code point is the unique non-negative integer assigned to a character in a character code.

A coded character set is a character set in which each character has been assigned a unique code point.
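As a small illustration of the character-to-integer mapping described above, the following Python sketch prints code points in the usual U+XXXX notation. The helper name `code_point_label` is hypothetical. Python's built-in `ord()` and `chr()` work on Unicode code points.

```python
def code_point_label(ch: str) -> str:
    """Return the code point of a single character in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


for ch in ("A", "$", "\u00e9"):
    # ord() maps character -> code point; chr() maps it back (one-to-one).
    assert chr(ord(ch)) == ch
    print(repr(ch), "->", code_point_label(ch))

# ISO 8859-1 and Windows code page 1252 are different character codes,
# but both assign the same single-byte value to the letter e-acute.
print("\u00e9".encode("latin-1"), "\u00e9".encode("cp1252"))
```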

What is Character Code / Code Point, Coded Character Set?

ASCII character set, one of the earliest character sets

[Image: ASCII character table]

Ex. 8-bit character set

An 8-bit character set that extends 7-bit ASCII covers most of the characters needed by Europeans, but what about the eastern part of the world?

[Image: 8-bit character table]

Unicode

The answer is Unicode. It has characters from almost every written script in the world.
  • European alphabetic scripts
      • Latin, Greek, Cyrillic, Armenian, Georgian, Runic, Ogham, modifier letters
  • Middle East scripts
      • Hebrew, Arabic, Syriac, Thaana
  • South and South-East Asian scripts
      • Devanagari, Bengali, Gujarati, Panjabi, Oriya, Tamil, Telugu, Kannada, Malayalam
  • East Asian scripts
      • Han, Hiragana, Katakana, Hangul, Bopomofo, Yi
  • Symbols
      • Currency symbols, letter-like symbols, mathematical operators, numeric forms, technical symbols, geometrical symbols
  • Additional scripts
      • Ethiopic, Cherokee, Canadian Aboriginal Syllabics, Mongolian

What is Character Encoding?

A mapping from a set of non-negative integers that are elements of a coded character set to a set of sequences of particular code units of some specified width, such as 8-bit, 16-bit or 32-bit integers.

The most commonly used code units are bytes, but 16-bit or 32-bit integers can also be used for internal processing.

Examples are UTF-8, UTF-16 and UTF-32.

UTF-8, UTF-16, UTF-32
  • UTF-32 simply represents each Unicode code point as the 32-bit integer of the same value.
  • UTF-16 uses sequences of one or two unsigned 16-bit code units to encode Unicode code points. [Values U+0000 to U+FFFF are encoded in one 16-bit unit with the same value. Supplementary characters are encoded in two code units.]
  • UTF-8 uses sequences of one to four bytes to encode Unicode code points. [U+0000 to U+007F are encoded in one byte, U+0080 to U+07FF in two bytes, U+0800 to U+FFFF in three bytes, and U+10000 to U+10FFFF in four bytes.]
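The ranges quoted above can be checked with a short Python sketch that counts code units at each boundary. The function name `code_unit_counts` is an illustrative assumption. The big-endian codecs are used only so that no byte order mark is added to the output.

```python
def code_unit_counts(code_point: int) -> dict:
    """Count the code units each UTF needs for a single code point."""
    ch = chr(code_point)
    return {
        "UTF-8 bytes": len(ch.encode("utf-8")),
        "UTF-16 units": len(ch.encode("utf-16-be")) // 2,  # 2 bytes per unit
        "UTF-32 units": len(ch.encode("utf-32-be")) // 4,  # 4 bytes per unit
    }


# First and last code point of each UTF-8 length range.
for cp in (0x0000, 0x007F, 0x0080, 0x07FF, 0x0800, 0xFFFF, 0x10000, 0x10FFFF):
    print(f"U+{cp:04X}", code_unit_counts(cp))
```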

Relation between Character Set and Encoding

Character     A              א              好
Code point    U+0041         U+05D0         U+597D
UTF-8         41             D7 90          E5 A5 BD
UTF-16        00 41          05 D0          59 7D
UTF-32        00 00 00 41    00 00 05 D0    00 00 59 7D

Different encodings yield different byte sequences for the same character in a character set.
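The byte sequences for code points 41, 5D0 and 597D can be reproduced with Python's built-in codecs. This is only a sketch: `hex_bytes` is a hypothetical helper, and big-endian byte order without a byte order mark is assumed, matching the values listed above.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Encode text and show the resulting bytes as space-separated hex pairs."""
    return text.encode(encoding).hex(" ").upper()


for ch in ("\u0041", "\u05D0", "\u597D"):
    print(
        f"U+{ord(ch):04X}",
        "| UTF-8:", hex_bytes(ch, "utf-8"),
        "| UTF-16:", hex_bytes(ch, "utf-16-be"),
        "| UTF-32:", hex_bytes(ch, "utf-32-be"),
    )
```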

Unicode Character Set, Code Set, Encodings
  • Universal character set/repertoire: all character sets are subsets of this huge character repertoire (ASCII set, French, Japanese, Korean, Devanagari).
  • Unicode code points: each Unicode character is assigned a Unicode code point. The range is U+0000 to U+10FFFF.
  • UTF encodings: UTF-8, UTF-16 and UTF-32 are the encoding formats for internal processing.

What is a Glyph?

A glyph is a visual appearance. It is important to distinguish the character concept from the glyph concept. A glyph is a presentation of a particular shape which a character may have when rendered or displayed.

Example: a letter and different glyphs for it: Latin capital letter Z (U+005A)

Z  Z  Z  Z

What is a Font?
  • A repertoire of glyphs comprises a font.
  • A font is a numbered set of glyphs. The numbers correspond to the code positions of the characters (presented by the glyphs).
  • A font that includes the characters of a language must be available for an application to display text in that language.

What is an Input Method Engine (IME)?
  • Input methods capture a sequence of keystrokes and form a character or characters as input for a language.
  • An Input Method Engine (IME) is a program or operating system component that allows computer users to enter complex characters and symbols using a standard Western keyboard. It is also referred to as an Input Method Environment.

What is a Locale?

A locale is a set of parameters that defines the user's language, country and any special variant preferences that the user wants to see in their user interface.

The locale naming convention is usually: language[_territory][.encoding][@modifier]

Example for Hindi with UTF-8 encoding: hi_IN.UTF8

Encoding:
  • Native encodings (ISO 8859-*, Shift_JIS, GB18030, Big5, ISO 2022)
  • Unicode encodings (UTF-8, UTF-16)
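A minimal sketch of splitting a locale name of the form language[_territory][.encoding][@modifier] into its parts. The regular expression, the `LocaleName` type and `parse_locale` are illustrative assumptions, not a platform API, and the pattern is deliberately looser than what any particular operating system accepts.

```python
import re
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    language: str
    territory: Optional[str]
    encoding: Optional[str]
    modifier: Optional[str]


# Each optional part is introduced by its own separator: "_", "." or "@".
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<encoding>[A-Za-z0-9_-]+))?"
    r"(?:@(?P<modifier>[A-Za-z0-9]+))?$"
)


def parse_locale(name: str) -> LocaleName:
    """Split a locale name into language, territory, encoding and modifier."""
    match = _LOCALE_RE.match(name)
    if match is None:
        raise ValueError(f"not a valid locale name: {name!r}")
    return LocaleName(**match.groupdict())  # missing parts come back as None


print(parse_locale("hi_IN.UTF8"))
```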

What is a Locale? Behavior affected by locale
  • Language culture data
      • Sorting, searching, text boundary, text conversion
      • Indexing
  • Country culture data
      • Calendar, date/time/number/currency format
      • People name / mailing address layout

I18n 360˚ Testing Approach
  • What is the traditional approach
  • What is the 360˚ approach
  • Case study
  • Requirement phase
  • Design phase
  • Implementation phase
  • QA phase
  • Documentation

Traditional Approach of I18n Testing
  • Generally starts after a build is released by the development team. In some cases it starts even after product release, when a separate international release is made.
  • Major focus on functionality testing
  • I18n testing is done on the following:
      • Messages
      • Date/calendar
      • Sorting/searching

Traditional Approach of I18n Testing
  • Major architectural flaws related to I18n get caught quite late
  • Support for adding a new language is not provided
  • Global cultural requirements are not considered
  • All the issues are reported in the QA phase, where they take longer to fix
  • Documentation does not cover usage in non-English environments

What is the 360˚ Approach
  • Starts as early as product planning
  • I18n has a role to play in each phase of the product life cycle
  • No corner untouched
  • Helps to design and build better products for global customers

Case Study - Railway Reservation System

Use cases:
  • Search trains/fare
  • Make reservation
  • Know passenger status
  • Know train schedule
  • User management

We will use this case study to illustrate key points in this workshop.

Requirements phase
  • What are the global market requirements?
  • Languages and regions to be supported
  • What date format will be supported?
  • What calendar will be supported?
      • Gregorian, Vikram Samvat, lunar, etc.
  • What cultural requirements are to be taken care of?

Case Study - Requirements phase
  • Who are the customers of this website?
  • What languages will our target customers use?
  • Which payment methods should be made available?
  • Should we display information visually (ex. seat availability) or textually?
  • What kind of Internet access/computers will our target customers use? How will that affect L10n?
  • Should email confirmations/alerts be sent in the local language?

Design phase
  • What approach will be used to support multiple languages?
      • Browser based and/or command-line based
      • List of languages on the website
  • User interface design
      • How will I18n of UI messages be done?
      • How will I18n of UI components be done?
          • Button and drop-down box sizes must accommodate multibyte values
      • Review of images for cultural sensitivity
  • Ex. the sentence "Every <number> days" contains a variable part that comes from a text field. During design, the engineer should externalize the whole sentence as a single string, not as three strings that are concatenated programmatically.
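The "Every <number> days" advice can be sketched as follows. The message key, the in-memory `MESSAGES` catalogue and both function names are hypothetical; a real product would load the strings from resource files, and the Japanese text is included only to show that the number may move within the sentence.

```python
# Hypothetical catalogue: one complete sentence per language, with a placeholder.
MESSAGES = {
    "en": {"repeat.every_n_days": "Every {count} days"},
    "ja": {"repeat.every_n_days": "{count}日ごと"},
}


def repeat_label(lang: str, count: int) -> str:
    """Format the externalized sentence; the translator controls word order."""
    return MESSAGES[lang]["repeat.every_n_days"].format(count=count)


def repeat_label_concatenated(count: int) -> str:
    """Anti-pattern: three fragments glued in code freeze the English word order."""
    return "Every" + " " + str(count) + " " + "days"


print(repeat_label("en", 3))
print(repeat_label("ja", 3))
```

With the single externalized string, supporting a language that puts the number first needs only a new catalogue entry. The concatenated version would need a code change.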

Design phase - I18n-compliant product architecture
  • Consideration of encoding, charset, Bi-Di
  • Standard I18n mechanism or customized I18n solution for each technology used in the product
      • ex. Java I18n, JSP I18n, AJAX I18n, JRuby I18n
  • How will locale fall-back be handled?
  • How will I18n of date, calendar, sorting and searching techniques be done?
  • How will I18n of error messages and system error messages be done?
  • How will I18n of log messages be done?
  • Input/output should handle multibyte characters
  • Which features do not need I18n?

Case Study - Design phase
  • How can the customer change language?
  • How will messages on the website be shown in the customer's local language?
  • How will different encodings be handled?
  • How will UI components on the website handle messages in different languages?
  • How can a user register in a local language?
  • How is I18n of the various technologies done?
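One possible answer to the locale fall-back question is a lookup chain from the most specific locale to a default. The sketch below assumes a chain of language_territory, then language, then a default language. The function names and the in-memory `BUNDLES` data are hypothetical.

```python
def fallback_chain(locale_name: str, default: str = "en") -> list:
    """Return the locales to try in order, e.g. hi_IN -> hi -> en."""
    chain = [locale_name]
    if "_" in locale_name:
        chain.append(locale_name.split("_", 1)[0])  # drop the territory
    if default not in chain:
        chain.append(default)
    return chain


def lookup(key: str, locale_name: str, bundles: dict, default: str = "en") -> str:
    """Return the first translation found along the fall-back chain."""
    for candidate in fallback_chain(locale_name, default):
        if key in bundles.get(candidate, {}):
            return bundles[candidate][key]
    raise KeyError(key)


# Hypothetical bundles: Hindi translates only one of the two messages.
BUNDLES = {
    "en": {"welcome": "Welcome", "book": "Book ticket"},
    "hi": {"welcome": "स्वागत"},
}
print(fallback_chain("hi_IN"))
print(lookup("book", "hi_IN", BUNDLES))  # falls back to the English text
```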

Implementation phase - Interaction with Development Team
  • Setting up common conventions
      • Naming convention for localizable files
      • Directory to store localizable files
      • How to mark non-localizable text in a property file or HTML file
  • Educating developers about I18n best practices
      • Most technologies have a standard way of doing I18n
      • Define a customized I18n solution for technologies which do not have a standard one

Implementation phase - Code review
  • Best way to find the most common I18n issues early
  • Should be done to catch issues in the following:
      • Message externalization
      • Date and calendar I18n
      • Encoding handling, HTTP content header
      • Searching technique I18n
      • Sorting technique I18n
      • Input fields should have clear hints about which characters are allowed
  • I18n implementation should be common across modules for the same technology
      • ex. Java I18n should be done in the same way across modules

Implementation phase - Unit testing
  • Incorporate I18n in developer-level testing
      • Lets developers see for themselves if they broke I18n
      • Helps prevent regressions
      • Improves product quality tremendously

Case Study - Implementation phase
  • Find any hard-coded messages in the code
  • Check the encoding declared in HTML or JSP pages
  • Verify how dates are being displayed
  • Check whether button and drop-down sizes are sufficient for localized characters
  • Include I18n test cases in developer testing

QA phase - I18n Test case writing
  • I18n test plan
      • Which build to start I18n testing with
      • How much testing is required
      • Which areas to focus on more for I18n and which less
  • I18n test case writing and review
      • Review the base team's test cases for functionality coverage
      • Test cases should capture the flow of multibyte data through the product
      • Test cases should cover culture-specific issues
          • Date format changes in various languages
      • Include negative test cases for I18n
          • Fields which do not accept multibyte data

QA phase - Configuration Matrix
  • Configuration matrix for I18n testing
      • Which locales to test
      • Which encodings to test
      • Which platforms to test
          • Install an OS with L10n support
      • Which features to test
          • Hint: test features which the base testing team has already tested

QA phase - Cultural Differences

Language- and culture-specific representation of data, ex. name and address formats are specific to a language:

Format                         Examples
town, province postalcode      China, India
postalcode town-province       Brazil
postalcode town, province      México
town province postalcode       USA, Canada, Australia

  • Symbolism can differ from place to place. For example, the check mark means "incorrect" in some places around the world. Ensure that you do not give the wrong message through your use of colors, symbolism, examples, etc.
  • Be cautious with humour. It doesn't travel well.
  • When dealing with graphics, consider how to deal with text. Ideally the text will be overlaid on a graphic, rather than embedded in it. If the text is within the graphic, try to ensure that you develop it in layers, with text on a separate layer, so that when it comes to translation the text can be easily removed and replaced over complicated backgrounds.
  • Ensure that examples used in the text are understandable by the audience of the translated version.

QA phase - Cultural Differences

[Image caption: "Fast relief, when you need it most!"]

QA phase - Cultural Differences

Color also has different connotations in different parts of the world. For example, a black wedding kimono is not as strange in Japan as it may seem to a European.

QA phase - Cultural Differences
  • Culture-specific order

QA phase - Human Interface

Input
  • Entering data in different languages: is one keystroke equal to one character for non-English languages?
  • The application should parse multibyte input data and process it accordingly
  • The operating system allows data to be entered in various languages
  • The application can also provide a built-in input feature, ex. Orkut

QA phase - Human Interface

Output
  • Displaying data in different languages: what you enter, what is stored in memory and what gets displayed. Is this always a one-to-one mapping?
  • It becomes complex and includes many-to-one mappings
  • Text rendering, reordering and layout of strings become complex
  • One character will not be equal to one glyph
  • Example: languages like Hindi have Complex Text Layout (CTL), which can use a number of glyphs to form a single character

QA phase - Text Processing

What are the considerations when you have to process text in different languages?
  • Text boundary: character/word/sentence/line boundary
      • Chinese and Japanese do not have spaces between words
      • A CTL character may contain multiple code points (and glyphs)
  • Text input/output, encoding conversion
      • Text transferred between applications or external files should have a consistent encoding; otherwise encoding conversion is involved
  • Text layout and direction, vertical and Bi-Di
      • Some Asian countries still use a vertical writing system
      • Arabic and Hebrew use a bidirectional writing system
  • Text sorting and searching
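The point that one CTL character may span several code points can be illustrated with a Devanagari conjunct. This is only a sketch using Python's standard `unicodedata` module; the sample syllable and the helper name `describe` are chosen for illustration.

```python
import unicodedata


def describe(text: str) -> list:
    """List every code point in the text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# One syllable as a reader perceives it: KA + VIRAMA + SSA + VOWEL SIGN I.
syllable = "\u0915\u094D\u0937\u093F"

print(syllable)
print("code points:", len(syllable))
print("UTF-8 bytes:", len(syllable.encode("utf-8")))
for line in describe(syllable):
    print(" ", line)
```

A test that checks field lengths or truncation therefore has to state whether the limit counts bytes, code points or user-perceived characters.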

QA phase - Presentation Matters

Vertical text should be displayed correctly for the languages that use it: text proceeds downwards syllable by syllable, not letter by letter.

QA phase - Presentation Matters

Right-to-left layout: the BBC site in a left-to-right and a right-to-left language.

QA phase - Format

Formatting of data differs between languages and regions.
  • Date/time formats, calendar
      • Date/time formats are different across languages and countries
      • Some countries use a local calendar as their official calendar
  • Number/currency format
      • Show numbers in the format of the language the user prefers
      • Such numbers should be parsed by the number parser for the user's preferred language
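A test for date formats can compare the same date rendered for several locales. The sketch below uses a small hand-written pattern table so that it runs without any locale data installed. The table, the locale names and `format_date` are assumptions for illustration; a real product would rely on its platform's locale data instead.

```python
from datetime import date

# Illustrative short-date patterns only, written in strftime notation.
DATE_PATTERNS = {
    "en_US": "%m/%d/%Y",  # month first
    "en_GB": "%d/%m/%Y",  # day first
    "de_DE": "%d.%m.%Y",  # day first, dot separated
    "ja_JP": "%Y/%m/%d",  # year first
}


def format_date(value: date, locale_name: str) -> str:
    """Render a date with the short-date pattern assumed for the locale."""
    return value.strftime(DATE_PATTERNS[locale_name])


travel_day = date(2009, 3, 4)
for name in DATE_PATTERNS:
    print(name, format_date(travel_day, name))
```

The date 4 March 2009 reads as 3 April to a day-first reader when shown month first, which is exactly the kind of ambiguity such a test should catch.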

QA phase - Message

What are the considerations when dealing with messages in your application?
  • Externalize UI messages and error messages from the program into resource files for localization
  • Organize static content (help files, docs) into language-specific directories
  • Message formatting
      • When a message contains more than one placeholder, consider that the translated message may re-order these placeholders
  • Message encoding
      • Messages should be encoded in the encoding that the application expects
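One way to test placeholder handling is to check that a translated message uses the same set of placeholders as the source message, in any order. The sketch below relies on `string.Formatter.parse` from the Python standard library. The function names and the sample messages are hypothetical.

```python
from string import Formatter


def placeholders(template: str) -> set:
    """Collect the placeholder names used in a str.format-style template."""
    return {
        field
        for _literal, field, _spec, _conversion in Formatter().parse(template)
        if field is not None
    }


def same_placeholders(source: str, translated: str) -> bool:
    """True when both messages use the same placeholders, regardless of order."""
    return placeholders(source) == placeholders(translated)


source = "Train {train} departs at {time}"
reordered = "{time}: Zug {train} fährt ab"  # order changed, nothing lost
incomplete = "Zug {train} fährt ab"         # {time} was dropped

print(same_placeholders(source, reordered))
print(same_placeholders(source, incomplete))
```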

QA phase - Pseudo localization
  • Can be used when the product is yet to be localized
  • Create a localized resource bundle by adding a localized character at the beginning and end of each English message
  • Effective way of finding hard-coded strings

ex. English resource bundle MyMessages.properties
    welcome=Welcome to I18n World
    startProcess=Start the process

Create the resource bundle for Hindi as follows, MyMessages_hi.properties
    welcome=[Hindi character]Welcome to I18n World[Hindi character]
    startProcess=[Hindi character]Start the process[Hindi character]
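Generating the pseudo-localized bundle can be automated. The sketch below wraps every English message in a marker character and flags displayed text that lacks the markers. The two Devanagari marker characters and both function names are arbitrary assumptions, not the ones used in the talk.

```python
PREFIX = "\u0905"  # DEVANAGARI LETTER A, arbitrary marker
SUFFIX = "\u0939"  # DEVANAGARI LETTER HA, arbitrary marker


def pseudo_localize(messages: dict) -> dict:
    """Wrap every message in marker characters to build a pseudo bundle."""
    return {key: f"{PREFIX}{text}{SUFFIX}" for key, text in messages.items()}


def looks_hard_coded(displayed: str) -> bool:
    """Text shown without both markers did not come from the resource bundle."""
    return not (displayed.startswith(PREFIX) and displayed.endswith(SUFFIX))


english = {
    "welcome": "Welcome to I18n World",
    "startProcess": "Start the process",
}
hindi_pseudo = pseudo_localize(english)
for key, text in hindi_pseudo.items():
    print(f"{key}={text}")
```

Running the product with the pseudo bundle and scanning the UI with `looks_hard_coded` points at strings that were never externalized.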

Case Study - QA phase
  • Access the website in a non-English language
  • Register as a non-English user
  • Book a ticket for a non-English passenger
  • Verify that the site is able to display non-English characters correctly
  • Ensure that the website provides correct responses to non-English inputs
  • Check whether the website comes up in the user's language

Documentation
  • How to install the product in a non-English environment
  • How to configure features in a non-English environment
  • How to add a new language to the product
  • Verify that I18n-specific hints and processes are documented correctly
  • Hints to translators regarding culture-specific images in the documentation
  • Case study

Testing Standalone Applications
  • Setting up a localized environment
      • Operating system with L10n support
      • Starting the product in a non-English environment
  • Language selection
  • Application testing with multibyte data
  • Quiz

Testing Web Applications
  • Setting up a localized environment
      • Operating system with L10n support
      • Starting the product in a non-English environment
  • Browser preferred language
  • Content negotiation
  • Precedence of language (user preferred locale, browser preferred locale, platform locale)
  • Application testing with multibyte data
  • Quiz
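The precedence of user preferred, browser preferred and platform locale can be sketched as a small resolver. It is assumed here that the three sources are consulted in exactly that order and that the browser preference arrives as an HTTP Accept-Language header. All function names, the simplified header parsing and the final fallback to "en" are illustrative assumptions.

```python
def parse_accept_language(header: str) -> list:
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for item in header.split(","):
        tag, _, params = item.strip().partition(";")
        if not tag:
            continue
        quality = 1.0  # a tag without q= counts as fully preferred
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        weighted.append((quality, tag.strip()))
    # sorted() is stable, so tags with equal weight keep their header order.
    ranked = sorted(weighted, key=lambda pair: -pair[0])
    return [tag for quality, tag in ranked if quality > 0]


def resolve_locale(user_pref, accept_language, platform_locale, supported):
    """Pick the first supported language: user, then browser, then platform."""
    candidates = [user_pref] if user_pref else []
    candidates += parse_accept_language(accept_language or "")
    candidates.append(platform_locale)
    for tag in candidates:
        if tag in supported:
            return tag
        language = tag.replace("_", "-").split("-")[0]  # hi-IN -> hi
        if language in supported:
            return language
    return "en"  # assumed last-resort default


SUPPORTED = {"en", "hi", "kn"}
print(resolve_locale(None, "kn,en;q=0.5", "en_US", SUPPORTED))
print(resolve_locale("hi", "kn,en;q=0.5", "en_US", SUPPORTED))
```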

Automation Testing
  • Automation framework
      • The automation tool should support multibyte data
      • Leverage from the core testing team
  • Scope of testing to be automated
  • Regression testing
  • Demo

Advanced I18n testing
  • Speech based
      • Higher recognition accuracy can be obtained by tailoring voice input to regional dialects
      • Voice output in the wrong dialect can make an application sound 'foreign'
      • Applications that support regional dialects have a better impact
  • Indic and Bi-Di specific issues
  • Titles and names
  • Different ways of expressing currency
  • Presentation / styling issues
  • Calendars: Vikram Samvat / Saka / Hijri/Islamic

Advanced I18n testing - Internationalized Domain Name (IDN)

There is a lot of demand for non-ASCII domain names, ex.

http://räksmörgås.josefsson.org/mål/franzén.html

Here räksmörgås.josefsson.org is the domain name and /mål/franzén.html is the path.

New standards have come out of the IETF recently that make this possible. W3C personnel contributed to the development of these standards. There are still some hurdles to overcome with regard to security and deployment, but it is possible to use these now. For more information see http://www.w3.org/International/articles/idn-and-iri/ .
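For test data, the ASCII form of an internationalized host name can be produced with Python's built-in "idna" codec, which implements the older IDNA 2003 rules. The sketch below covers only the domain name, not the path, and the two function names are hypothetical.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Convert an internationalized host name to its ASCII-compatible form."""
    return hostname.encode("idna").decode("ascii")


def to_unicode_hostname(hostname: str) -> str:
    """Convert an ASCII-compatible host name back to its Unicode form."""
    return hostname.encode("ascii").decode("idna")


unicode_host = "r\u00e4ksm\u00f6rg\u00e5s.josefsson.org"
ascii_host = to_ascii_hostname(unicode_host)

print(ascii_host)                       # label containing non-ASCII gets an xn-- prefix
print(to_unicode_hostname(ascii_host))  # round-trips to the original name
```

A test can then request the site through both forms of the host name and compare the responses.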

References
  • W3C Internationalization: http://www.w3.org/International/
  • Sun Software Globalization: http://developers.sun.com/techtopics/global/
  • Software Globalization - Architecture, Design, Testing: http://developers.sun.com/techtopics/global/technology/arch/
  • Software Globalization - JES: http://developers.sun.com/techtopics/global/products_platforms/jes/
  • Sun Software Product Internationalization Taxonomy: http://developers.sun.com/dev/gadc/des_dev/i18ntaxonomy
  • Subscribe to the Software Globalization newsletter: http://developers.sun.com/dev/gadc/subscribe/index.html
  • Technical articles on Java Internationalization: http://java.sun.com/developer/technicalArticles/Intl/
  • Java Internationalization Tutorial: http://java.sun.com/docs/books/tutorial/i18n/index.html
  • The Java Tutorial's Weblog: http://blogs.sun.com/thejavatutorials/

Last but not the least!

“Maintain that rapport with your Development team.”

Q/A

Thank you

Mahipalsinh Rana