Open-Source Approaches to Unicode Enablement Panel Discussion

Page 1: Open-Source Approaches to Unicode Enablement

Open-Source Approaches to Unicode Enablement

Panel Discussion

Page 2: Open-Source Approaches to Unicode Enablement

16th International Unicode Conference Amsterdam, the Netherlands, March 2000

C14, C15: Panel on Open-Source Approaches to Unicode Enablement

Agenda

Panel Introductions

Library Descriptions and Demos

What is Open Source?

What is the Open Source experience?

Q and A

Page 3: Open-Source Approaches to Unicode Enablement

Today’s Panel

Arnt Gulbrandsen

Bob Verbrugge

Frank Tang

Helena Shih

Mark Leisher

Steven Loomis

Steven Watt

Tex Texin

Yves Arrouye

Page 4: Open-Source Approaches to Unicode Enablement

Page 5: Open-Source Approaches to Unicode Enablement

Page 6: Open-Source Approaches to Unicode Enablement

Library Descriptions and Demos

Troll: Qt Free Edition

CRL: Assorted Unicode Support

Mozilla: International Library of Mozilla

IBM: International Components for Unicode

Page 7: Open-Source Approaches to Unicode Enablement

Troll’s Qt Free Edition

Arnt Gulbrandsen

Troll Tech

Page 8: Open-Source Approaches to Unicode Enablement

CRL’s Unicode Support

Mark Leisher

Computing Research Laboratory

New Mexico State University

Page 9: Open-Source Approaches to Unicode Enablement

CRL’s Unicode Support

Goal: Provide example resources usable on Unix.

Fonts.

Encoding mapping tables.

Unicode character information.

Algorithms.

Other resources.

Resource availability.

Page 10: Open-Source Approaches to Unicode Enablement

CRL’s Unicode Support

Fonts.

Three bitmap fonts in BDF format were developed and made available.

Arabic

Devanagari

ClearlyU

Page 11: Open-Source Approaches to Unicode Enablement

CRL’s Unicode Support

Encoding mapping tables.

The Unicode Consortium provides mapping tables for converting many of the more common character sets to Unicode. The CSets archive provides supplementary mapping tables for character sets and encodings that are not supplied by the Unicode Consortium.
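As an illustration of how such a mapping table might be consumed, here is a minimal Python sketch. It assumes the common layout used by 8-bit mapping files (a hexadecimal byte value, a hexadecimal Unicode code point, then a `#` comment); the sample rows and the names `load_mapping` and `decode` are hypothetical and are not taken from the CSets archive.

```python
import io

def load_mapping(lines):
    """Parse 'byte <whitespace> code point' rows into a dict, skipping comments."""
    table = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comment
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # row with no Unicode column: byte is unmapped
        table[int(fields[0], 16)] = chr(int(fields[1], 16))
    return table

def decode(data, table, missing="\ufffd"):
    """Convert bytes to a Unicode string, substituting U+FFFD for unmapped bytes."""
    return "".join(table.get(b, missing) for b in data)

# Hypothetical excerpt written in the style of an 8-bit mapping table.
SAMPLE = (
    "# byte\tUnicode\tname\n"
    "0x41\t0x0041\t# LATIN CAPITAL LETTER A\n"
    "0xE9\t0x00E9\t# LATIN SMALL LETTER E WITH ACUTE\n"
    "0xA4\t0x20AC\t# EURO SIGN\n"
)

table = load_mapping(io.StringIO(SAMPLE))
text = decode(b"\x41\xe9\xa4", table)  # 'A', e-acute, euro sign
```

A dictionary keyed by byte value is enough for single-byte character sets; multi-byte encodings would need a different lookup structure.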

Page 12: Open-Source Approaches to Unicode Enablement

CRL’s Unicode Support

Unicode character information.

To facilitate development of Unicode-capable software, a simple character information API and library and a partial bi-directional reordering API and library were developed early on, before standardization efforts really gained momentum. These are the UCData package and the Pretty Good Bidi Algorithm.
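The kind of per-character information such a library exposes can be sketched with Python's standard `unicodedata` module. This is a stand-in chosen for illustration only, not the UCData API; the helper names `describe` and `has_rtl` are hypothetical.

```python
import unicodedata

def describe(ch):
    """Collect a few Unicode Character Database properties for one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),         # e.g. 'Lu', 'Mn'
        "bidi_class": unicodedata.bidirectional(ch),  # e.g. 'L', 'R', 'AL'
        "combining": unicodedata.combining(ch),       # canonical combining class
    }

def has_rtl(text):
    """Return True if any character has a strong right-to-left bidi class."""
    return any(unicodedata.bidirectional(c) in ("R", "AL") for c in text)

info = describe("\u05d0")  # HEBREW LETTER ALEF
```

A check like `has_rtl` is only a first step: it tells a program whether reordering may be needed at all, while the reordering itself requires a bidirectional algorithm.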

Page 13: Open-Source Approaches to Unicode Enablement

CRL’s Unicode Support

Algorithms.

To further encourage independent development of Unicode-capable software, a few basic text search algorithms were converted to use Unicode text. These include:

A Boyer-Moore string search routine.

A glob matching routine called Wildmat.

An almost minimal DFA regular expression routine.
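To show what converting a string search to Unicode text involves, here is a minimal Python sketch of the Horspool simplification of Boyer-Moore. It is not CRL's routine. The point it illustrates is that the bad-character shift table is kept in a dictionary, since a fixed array indexed by character value is impractical for the Unicode repertoire. The function name `horspool_search` is hypothetical.

```python
def horspool_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Bad-character shifts for every pattern character except the last.
    # Later (rightmost) occurrences overwrite earlier ones, giving the
    # smallest safe shift. Characters absent from the table shift by m.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        # Compare right to left, as in Boyer-Moore.
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            return pos
        # Shift based on the text character aligned with the pattern's end.
        pos += shift.get(text[pos + m - 1], m)
    return -1

index = horspool_search("na\u00efve caf\u00e9", "caf\u00e9")
```

Matching here is by code point, so a precomposed character will not match its decomposed equivalent unless both strings are normalized first.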

Page 14: Open-Source Approaches to Unicode Enablement

CRL’s Unicode Support

Other resources.

Some of the other resources made available by CRL are:

Code to test wchar_t type support in C/C++ compilers.

Keyboard arrangements for various languages that have been collected over the years.

Resource Availability.

All of the resources mentioned are freeware and can be found at http://crl.nmsu.edu/~mleisher/.

Page 15: Open-Source Approaches to Unicode Enablement

International Library for Mozilla

Frank Tang

Netscape Communications

Mozilla

Page 16: Open-Source Approaches to Unicode Enablement

International Components for Unicode (ICU)

Helena Shih and Steven Loomis

IBM Unicode Technology Center

Page 17: Open-Source Approaches to Unicode Enablement

Unicode support in the Industry

Lack of a complete set of features in most implementations.

Inconsistent across different environments. Win32 vs. POSIX, for example.

Poor portability.

Unable to share the resources with other products.

Almost no extensibility and customization.

Not a concern for most companies when a product is first designed.

Page 18: Open-Source Approaches to Unicode Enablement

[Diagram labeled "ICU", showing: AS/400 e-Server 720, Netfinity Server, S/390 Server, Apple G3 Macintosh, Microsoft NT Workstation, Sun Ultra 60 Workstation, IBM’s DB/2 Product, World Wide Web]

Page 19: Open-Source Approaches to Unicode Enablement

ICU Objectives

Quality Unicode & I18N support across platforms

Consistent results in both C/C++ and Java

Powerful, portable API available to the Open-Source development community

Important resources sharing mechanism

Outside feedback & contributions improve quality and feature set

Page 20: Open-Source Approaches to Unicode Enablement

ICU Features

Parallel to the i18n architecture in JDK

All components multi-thread safe

Full Unicode string manipulation

Complete locale support, e.g. > 145 locales

Fast and flexible character set conversion

Efficient data loading mechanism

Hierarchical resource bundles with Unicode data

Extensive calendar and timezone support

Date, time, currency, number and message formatting

Page 21: Open-Source Approaches to Unicode Enablement

ICU Features

Locale sensitive sorting (including Thai)

Locale sensitive text boundary detection

Customizable transliteration interface

Unicode text compression algorithm

Fast and compliant Unicode 3.0 Bidi algorithm

Unicode 3.0 normalization support

Most up-to-date Unicode 3.0 character properties
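To make the normalization feature listed above concrete, here is a minimal Python sketch of what normalization support provides. It uses Python's standard `unicodedata` module as a stand-in, not ICU's API, and Python ships data for a later Unicode version than 3.0. The helper name `canonically_equal` is hypothetical.

```python
import unicodedata

composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE (one code point)
decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT (two code points)

def canonically_equal(a, b):
    """Compare two strings after bringing both to the same normalization form."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# The raw strings differ code point by code point...
raw_equal = composed == decomposed
# ...but they are the same text once normalized.
same_text = canonically_equal(composed, decomposed)
# NFC composes; NFKC additionally folds compatibility characters such as
# the 'fi' ligature (U+FB01).
recomposed = unicodedata.normalize("NFC", decomposed)
folded = unicodedata.normalize("NFKC", "\ufb01")
```

Normalizing both sides before comparing, searching, or sorting is what keeps visually identical strings from being treated as different.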

Page 22: Open-Source Approaches to Unicode Enablement

Platform Support

Reference Platforms:

– AIX

– OS/390

– AS/400

– RedHat Linux

– Solaris

– Windows 98, NT4.0 and Win2000

– HP-UX

Working Partners: Sun, IBM, NCR, Xerox, Netscape, Progress, RealNames, Versant, Compuware, GlobalSight, Hotmail, Lotus ...

Page 23: Open-Source Approaches to Unicode Enablement

ICU Documentation

API Documentation

– Updated from header files (like javadoc)

– Available on external web site

User Guide

– Work in progress, feedback welcome

– Initial draft available

Page 24: Open-Source Approaches to Unicode Enablement

ICU4J - ICU for Java

IBM developed extensive I18N library

I18N code added to Java JDK 1.1

Java code ported to C++ -> ICU

ICU available on alphaWorks

Both ICU and Java classes continue development

– Sometimes “leapfrogging” each other with features

ICU open source, moves to developerWorks

2000 March: Java Code open source as “ICU4J”

Page 25: Open-Source Approaches to Unicode Enablement

ICU4J Features

Builds on Java 2 feature set

Feature summary:

– Advanced text boundary detection

– Calendars: Hebrew, Hijri/Islamic, Japanese Gengou, Thai Buddhist

– Spelled-out numbers

– Normalization

– Transliteration

– Standard Unicode compression

Page 26: Open-Source Approaches to Unicode Enablement

Reference Information

ICU Web Sites

– http://oss.software.ibm.com/icu/

developerWorks Unicode site

– http://www.ibm.com/developer/unicode/

The Unicode Standard

– http://www.unicode.org/

developerWorks Java site

– http://www.ibm.com/developer/java/

Page 27: Open-Source Approaches to Unicode Enablement

Demos

Locale Explorer

xliterate-It!

Qt Demo

Page 28: Open-Source Approaches to Unicode Enablement

Agenda

Panel Introductions

Library Descriptions and Demos

What is Open Source?

What is the Open Source experience?

Q and A

Page 29: Open-Source Approaches to Unicode Enablement

ICU OpenSource Objectives

Promotes a cross-platform Unicode strategy

Produces a Unicode technology implementation

Supports important OpenSource products

– Linux, Apache, Mozilla, XML etc.

Page 30: Open-Source Approaches to Unicode Enablement

Open-Source Models

The Apache model

– Web access for CVS repository

– Technical committees

Developer community support

– [email protected] support account

– news.alphaworks.ibm.com discussion newsgroup

Commercial product partnership

– RealNames, Versant, GE ...

Page 31: Open-Source Approaches to Unicode Enablement

Open-Source Models

The Troll Tech model

– Free and Professional Editions

– Distinguish private, open source use from commercial, closed source use

– All contributions accepted and used in both versions.

– Source updated daily

Page 32: Open-Source Approaches to Unicode Enablement

Why contribute to Open Source?

Bob Verbrugge:

– Requires robust I18n and portability

– Implementing alone, cost is considerable

– Sharing development is cost effective

– Shared knowledge with experts

– Ability to influence the end-result

Page 33: Open-Source Approaches to Unicode Enablement

Why contribute to Open Source?

Steve Watt:

– Requires portability and interoperability

– Upgrading existing library to Unicode version 3.0 is a sizable effort

– Commercial libraries did not meet our needs

– Shared effort means our development focus is now aligned with our needs

Page 34: Open-Source Approaches to Unicode Enablement

Why contribute to Open Source?

Steve Watt’s concerns:

– Giving away proprietary technology

– Design by committee

– Will release schedules fit product schedules?

– Will library and product stay in synch?

– Do all participants have common objectives?

Page 35: Open-Source Approaches to Unicode Enablement

Why contribute to Open Source?

Yves Arrouye:

– Share expertise, give something

– Benefits from features developed by others

• Normalization, optimized algorithms

• Character set conversions

– Access to source code

– Using multiple Open Source products

Page 36: Open-Source Approaches to Unicode Enablement

Why contribute to Open Source?

Yves Arrouye’s concerns:

– Management Perceptions

“If it’s free, it must be for play…”

– Entry requirements and qualifications to be able to affect direction or design

– Patch integration, Release control and schedules

– Build stability

Page 37: Open-Source Approaches to Unicode Enablement

Agenda

Panel Introductions

Library Descriptions and Demos

What is Open Source?

What is the Open Source experience?

Q and A