Agenda: Guidelines for Supporting Complex Scripts* on Windows 2000
  Key Concepts
  Overview of Unicode
  Migrating existing applications
  Using Unicode text in resources
*Such as Devanagari and Tamil

Definitions
  Enabling for a script: Adding support for input, display, and output of the script
  Localization: Translating user interface elements
  Globalization: Developing software such that feature design and code design are not limited to a single locale or script

Requirements for Enabling Indian Scripts in Applications on Windows 2000:
  Use Unicode to encode text
  Enable for complex scripts
Note: Many Microsoft products do not yet meet these requirements. However, we're working on it!

Overview of Unicode

Character Set Evolution
  MS-DOS: OEM character sets
  Windows 3.x: ANSI character sets
  Windows 9x: ANSI character sets
  Windows NT: Unicode
    Supported for compatibility: OEM (console) character sets, ANSI character sets

Why do character set differences matter?
  Historically, they fragmented code bases for both Windows and applications
    Single byte: European editions
    Double byte: Far East editions
    Bi-directional: Middle East editions
  Make it difficult to share data
  Make it difficult to develop multilingual applications

What is Unicode?
  A 16-bit character encoding
    A mapping of characters to numbers
    Syntax rules for display of complex scripts
    Not a font or glyph encoding!
    Not a sort algorithm!
  Includes all characters in common use in modern scripts (and others)
  Basis for the ISO 10646 character encoding standard
  Native text encoding for Windows NT

Unicode™ / ISO 10646
  16-bit international character encoding
  Windows 2000 uses Unicode version 2.0
  [Figure: the 16-bit code space from 0x0000 to 0xFFFF, with regions labeled ASCII, Latin, Greek, Arabic, Hebrew, Indian, Thai, Punctuation, Symbols, Kana, Hangul, Ideographs (Hanzi, Kanji, Hanja), Future use, Private use, and Compatibility. Example code units: 0041 ("A"), 9662, FF96, 4F85, 0000 (null).]

Relatives of Unicode
  ISO/IEC 10646
    32-bit ISO standard divided into "planes" of 64K characters each
    Unicode repertoire is plane 0
  UTF-7
    7-bit transformation format
    Not widely used
  UTF-8
    8-bit transformation format
    Used in web pages and some email
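
As a rough illustration of what the 8-bit transformation format does, the following C sketch encodes one 16-bit Unicode character as UTF-8. The function name utf8_encode_bmp is an assumption made for this example and is not part of any Windows API. Surrogate code units are deliberately excluded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one 16-bit Unicode character (not a surrogate code unit) as UTF-8.
   Writes 1 to 3 bytes into out and returns the number of bytes written.
   utf8_encode_bmp is a hypothetical helper name used only in this sketch. */
size_t utf8_encode_bmp(uint16_t cp, unsigned char out[3])
{
    assert(out != NULL);
    assert(cp < 0xD800 || cp > 0xDFFF);   /* surrogates are not characters */

    if (cp < 0x80) {                      /* ASCII: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    /* Everything else in the 16-bit range: 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}
```

ASCII characters stay one byte long, which is one reason UTF-8 text passes through byte-oriented web and mail software. Devanagari and Tamil characters take three bytes each.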

Why Should I Use Unicode and Win32 for Indian Text?
  "My application works fine now!"

Benefits of Using Unicode on Windows 2000
  Share data (e.g., cut and paste) with other Win32 applications
  Make use of full Win32 API for text processing
  Support multilingual documents, including multiple Indian scripts
  Use industry standard encoding

Summary: Use Unicode, the ultimate character encoding
  Represent all text with one unambiguous encoding
  Support multilingual text easily
  Avoid special processing for variable byte-length characters
  Use standard encoding recognized throughout the industry and the world
  Support new scripts that are only supported through Unicode

Migrating Existing Applications to Support Indian Text on Windows 2000…
  Three Migration Scenarios:
    1. ANSI application to Unicode
    2. Standard Win32 application to complex script enabled
    3. Existing Indian language application to Unicode and Win32

Migrating ANSI applications to Unicode
  Overview of "A" and "W" entry points
  How to build a Unicode Win32 Application
  Unicode Applications on Windows 98

Review of the W and A APIs
  Two kinds of window classes: Unicode, ANSI
  Win32 API has two versions of most functions:
    "W" (wide) version handles Unicode
    "A" (ANSI) version assumes the system default code page (character encoding)
  Macros resolve to W or A entry point
  Example: Macro for RegisterClassEx
    #ifdef UNICODE
    #define RegisterClassEx RegisterClassExW
    #else
    #define RegisterClassEx RegisterClassExA
    #endif
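
The selection mechanism can be tried outside Windows with a small stand-in. In the sketch below, every name beginning with Demo or DEMO_ is hypothetical and exists only for illustration. The code mirrors the pattern of the Win32 headers but is not taken from them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for a Win32-style function pair: the A version
   takes 8-bit code page text, the W version takes wide (Unicode) text. */
size_t DemoTextLengthA(const char *text)    { assert(text != NULL); return strlen(text); }
size_t DemoTextLengthW(const wchar_t *text) { assert(text != NULL); return wcslen(text); }

/* Defining DEMO_UNICODE plays the role of -DUNICODE on the command line. */
#define DEMO_UNICODE

#ifdef DEMO_UNICODE
typedef wchar_t DEMO_CHAR;
#define DEMO_TEXT(s)    L##s              /* "abc" becomes L"abc" */
#define DemoTextLength  DemoTextLengthW
#else
typedef char DEMO_CHAR;
#define DEMO_TEXT(s)    s
#define DemoTextLength  DemoTextLengthA
#endif

/* Application code is written once against the generic names. */
size_t title_length(void)
{
    const DEMO_CHAR *title = DEMO_TEXT("Unicode");
    return DemoTextLength(title);
}
```

Removing the DEMO_UNICODE definition rebuilds the same source against the A function, which is the property that lets one code base produce either an ANSI or a Unicode binary.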

To Build a Unicode-enabled Application:
  Automatic in Visual Studio:
    Compile with options -DUNICODE and -D_UNICODE
    Specify wWinMainCRTStartup (for programs whose entry function is wWinMain) in Project Settings/Link/Output/Entry-point symbol
  Or, use only the "W" routines from Win32 API
  Metafiles:
    Use Enhanced Metafiles (EMF)
    Windows Metafiles (WMF) don't support Unicode

For Applications that Must Also Run on Windows 98…
  Use Unicode everywhere with single binary, two code paths:
    On Windows NT, use W entry points
    On Windows 98, convert Unicode to ANSI, use A entry points
    See sample GLOBALDV for example
  See April 1999 Microsoft Systems Journal for details and other options

Migrating Standard Win32 Application to Support Complex Scripts
  Good news: In a Unicode application, it basically just works!

Simple, Plain-text Applications
  Use standard edit control in Visual C/C++
  Use standard Win32 API functions
    Win32 APIs: ExtTextOutW or DrawTextW
    ScriptString API in Uniscribe

Pitfalls in Enabling for Complex Scripts
  When displaying typed text:
    Do not output characters one by one!
    Do save text in a buffer and display the whole string with Uniscribe or Win32 API
  To measure line lengths:
    Do not sum cached character widths
    Do use a GetTextExtent function or Uniscribe

Simple Applications With Formatted Text
  Use rich edit control in Visual C/C++
  Internet Explorer 5.0: Use Document Object Model (more later)

Applications With Advanced Formatting and Layout
  Use script APIs ("Uniscribe")
  See MSJ article of November 1998

What about Visual Basic, Visual J++?
  Visual Basic 6.0
    Standard controls are ANSI, not Unicode
    Use "MS Forms 2.0" controls to use Unicode in controls
    Resource editor does support Unicode
  Visual J++
    Resource editor supports Unicode
    Text output is ANSI only
  Future plans: Make Unicode work everywhere in Visual Studio

Migrating Existing Indian Language Applications to Win32 and Unicode

Step 1 in Migrating Existing Indian Applications
  Follow guidelines for Unicode enabling and complex script enabling

Step 2 in Migrating Existing Indian Applications …
  Provide conversion facility to migrate documents
    From your format to ISCII
    From ISCII to Unicode
      MultiByteToWideChar(<codepage>, …
      Devanagari is codepage 57002
      Tamil is codepage 57004
  See UCONVERT sample
    Included on your CD
    Modified from UCONVERT in Win32 SDK

Using Unicode Text in Resources
  Getting Unicode into Win32 resources
  Multilingual Visual C/C++ applications

Getting Unicode into Win32 Resources
  Create Unicode RC file
    Resource editor in Visual Studio does not support Unicode yet, so
    Generate rc file for English using IDE
    Translate to target language with Unicode editor (e.g., notepad or Word)
    Save as Unicode
  Compile with resource compiler RC.EXE
    RC.EXE does support Unicode
    Compile within Visual Studio IDE
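
For readers who generate translated resource text programmatically rather than through an editor, here is a minimal sketch of the byte layout. It assumes that "Save as Unicode" means what Notepad writes on Windows NT and Windows 2000: little-endian 16-bit code units preceded by a byte order mark. The helper name write_utf16le_with_bom is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Serialize 16-bit Unicode code units as little-endian bytes, preceded by
   the byte order mark U+FEFF (bytes FF FE). Returns the number of bytes
   written, or 0 if the output buffer is too small.
   write_utf16le_with_bom is a hypothetical helper name for this sketch. */
size_t write_utf16le_with_bom(const uint16_t *units, size_t count,
                              unsigned char *out, size_t out_size)
{
    size_t needed = 2 + 2 * count;
    size_t i;

    assert(units != NULL && out != NULL);
    if (out_size < needed)
        return 0;

    out[0] = 0xFF;                        /* byte order mark, low byte first */
    out[1] = 0xFE;
    for (i = 0; i < count; i++) {
        out[2 + 2 * i]     = (unsigned char)(units[i] & 0xFF);  /* low byte */
        out[2 + 2 * i + 1] = (unsigned char)(units[i] >> 8);    /* high byte */
    }
    return needed;
}
```

The resulting buffer can be written to a file in binary mode. Text-mode output should be avoided because newline translation would corrupt the 16-bit code units.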

Implementing Multilanguage User Interface in Applications
  Use satellite resource DLLs
  Default to user settings, but
  Allow user to change
  For details, see:
    April 1999 Microsoft Systems Journal
    GLOBALDV sample code

Multilanguage User Interface
  Initialize to current UI language
    Windows 2000: GetUserDefaultUILanguage()
    Others: Use the language of the O/S
  Allow user to select UI language
    Put language-dependent resources in resource DLLs
    Use naming convention, e.g., res<LANGID>.dll
    Find all resource DLLs, put up list box of choices
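
One possible reading of the naming convention is sketched below. The slide does not say how <LANGID> is written, so the four-digit uppercase hexadecimal form used here is an assumption, and resource_dll_name is a hypothetical helper rather than a Win32 function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build a satellite DLL file name of the form res<LANGID>.dll.
   Assumption: <LANGID> is written as four uppercase hex digits; the slides
   do not specify the format. Returns 1 on success, 0 if the buffer is too
   small. resource_dll_name is a hypothetical helper name. */
int resource_dll_name(unsigned int langid, char *out, size_t out_size)
{
    int written;

    assert(out != NULL);
    assert(langid <= 0xFFFF);             /* a LANGID is a 16-bit value */
    written = snprintf(out, out_size, "res%04X.dll", langid);
    return written > 0 && (size_t)written < out_size;
}
```

With this convention, the Hindi language identifier 0x0439 maps to res0439.dll and Tamil (0x0449) to res0449.dll. An application would pass the value returned by GetUserDefaultUILanguage() on Windows 2000, or the user's explicit choice from the list box.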

Agenda: Using Unicode and Complex Scripts in Enterprise Applications
  Intranet/internet applications
  Unicode support in SQL Server 7.0
  Other Considerations

Intranet/Internet Applications
  Internet Explorer 5.01 on Win32 Platforms
    Displays multilingual text including complex scripts
    Supports complex scripts in Document Object Model
    Supports Indian text through Unicode

Encodings for Multi-lingual Text in Web Pages
  Raw Unicode
    OK for intranet on Windows NT networks
    Not good for internet pages
  Number entities, e.g., &#2325;
    OK for occasional use, e.g., inserting characters not in the main script of page
    Not good for large documents
  UTF-8: Recommended encoding
    Works just about everywhere
    Supported by IE 4.0+, Netscape 4.0+
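
For the occasional-use case, a number entity is simply the character's code point written in decimal between "&#" and ";". The sketch below formats one. The function name format_numeric_entity is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format a code point as a decimal HTML numeric character reference,
   for example U+0915 (DEVANAGARI LETTER KA) as "&#2325;".
   Returns 1 on success, 0 if the buffer is too small.
   format_numeric_entity is a hypothetical helper name. */
int format_numeric_entity(unsigned int code_point, char *out, size_t out_size)
{
    int written;

    assert(out != NULL);
    written = snprintf(out, out_size, "&#%u;", code_point);
    return written > 0 && (size_t)written < out_size;
}
```

Each Devanagari or Tamil character costs seven bytes in this form, against three bytes in UTF-8, which is why entities are a poor fit for large documents.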

Creating UTF-8 Webpages
  Use charset=UTF-8 in META tag
  Save HTML page as UTF-8 using notepad, Word, etc.
  Saving as UTF-8 in Word:
    Select File/Save As Web Page/Tools
    Select Web Options/Encoding
    Change charset designation to UTF-8

Embedded Fonts in Web Pages
  Downloadable fonts used only in web pages
    Deleted when page is closed
  WEFT tool
    Creates embedded font from TTF file
    Saves download time/space by using only those glyphs required for the page
    On Microsoft website, see workshop/author/fontembed/font_embed.asp

Introduction to DHTML
  Based on Document Object Model
  Objects in HTML document
    Text in objects including titles, headers, etc.
    Attributes such as font, color, etc.
    Are accessible via scripts, e.g., JScript or VBScript
  Supported in IE 4.0+
  See various documents under www.microsoft.com/workshop/author for overview

Examples of DHTML
  Heading tag:
    <H1 id=Head1 style="font-weight: normal"
        onmouseover="makeItalic();"
        onmouseout="makeNormal();">
      Sample Dynamic HTML </H1>
  JScript functions that change style of heading text:
    <script language=JavaScript>
    function makeItalic() {
      Head1.style.fontStyle = "italic";
    }
    function makeNormal() {
      Head1.style.fontStyle = "normal";
    }
    </script>

Using Indian Scripts in DHTML
  Use same design rules as static HTML
    Encode in UTF-8
    Use embedded fonts if needed
  Consider multilingual pages
    Display initial page in English
    Offer option to change to other languages

Unicode Support in SQL Server 7.0
  Unicode datatypes in SQL Server 7.0
    NCHAR
    NVARCHAR
    NTEXT
  Indicate Unicode text by N'text' in SQL queries:
    create table myTable (col1 CHAR(8), col2 NCHAR(8))
    insert into myTable (col1, col2) values ('Japan', N'日本')
  Utilities for entering/retrieving Unicode data:
    Query Analyzer
    Data Transformation Services
    Client application using ODBC

Accessing Data Through ODBC
  ODBC supports Unicode data access
  Use Visual C/C++ for read/write
    Use SQL "W" routines, e.g.,
      SQLExecDirectW(SQLHSTMT, LPWSTR, int);
    Specify data type SQL_C_WCHAR as needed:
      SQLBindCol(hstmt, nColumn, SQL_C_WCHAR, szCol, nMaxCol, &cbName);
    See GLOBALDV sample
  Use Visual Basic to retrieve and display

Accessing SQL Server 7.0 Unicode Data through ASP Webpages
  Use standard encodings:
    UTF-8 in web pages
    Unicode in SQL Server 7.0
  Access data through JScript/ODBC
    JScript automatically translates Unicode to current codepage in web page
    Defaults to system codepage
    Specify UTF-8 "codepage" using:
      <%Session.CodePage=65001%>   // Scope=session
      <%@CODEPAGE=65001%>          // Scope=page

Summary of SQL Server 7.0 Unicode Access
  Tool | Storage | Retrieval | Notes and Restrictions
  Enterprise Manager | No | No | Uses the Tabular viewer tool, which is ANSI based
  Query Analyzer | Yes | Yes |
  Data Transformation Services | Yes | Yes | Import/export file format must support Unicode
  ODBC | Yes | Yes | SQL queries through ODBC process Unicode correctly. Must use Unicode APIs and datatypes.
  Visual Basic | Limited | Yes | Must use MS Forms 2.0 controls to display properly in Visual Basic. Cannot enter Indian text in text box.
  JScript in Web page | ? | Yes | Can retrieve and display Indian text in UTF-8 web page using JScript. Storage not yet tested.

Other Considerations …
  Handling Indian text in network applications
    Indic Language Group must be installed on clients
    Only necessary on server if display and input is required locally
  Sharing Documents
    Word 2000 Documents: Must have Indic language group installed on local machine
    HTML: Can use embedded fonts

Break!

OpenType Layout
  David C. Brown, Development Lead, and
  David Meltzer, Program Manager
  Microsoft Corporation

OpenType Layout
  File Format
  Benefits of OpenType
  Layout Features
  Indic Features

OpenType File Format
  sfnt table structure
    Extension of the current TrueType file format
  A single font file may contain
    TrueType outline data
    PostScript (CFF) outline data

Benefits of OpenType
  Support for large character sets
  Multi-script character sets
  Unicode support
  Glyph alternates supported
  Advanced typography supported
  Better protection of font data
  Font embedding controls

Layout Features
  Glyph substitution
  Glyph positioning
  Script and Language information

Glyph Substitution
  Single glyph substitution
  One-to-many substitution
  Multiple glyph substitution
  Aesthetic alternatives
  Contextual glyph substitution

Glyph Positioning

Two-dimensional positioning
Single glyph adjustment
Adjustment of paired glyphs
Cursive attachment
Mark attachment
Contextual positioning
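
Mark attachment comes down to simple arithmetic: the mark is moved so that its anchor point coincides with an anchor point on the base glyph. A minimal sketch, assuming anchors are given as (x, y) offsets in font units relative to each glyph's origin; the function name and the numbers are illustrative only.

```python
def attach_mark(base_origin, base_anchor, mark_anchor):
    """Return the origin at which to draw the mark so both anchors coincide."""
    bx, by = base_origin
    return (bx + base_anchor[0] - mark_anchor[0],
            by + base_anchor[1] - mark_anchor[1])

# Hypothetical values: base glyph drawn at (100, 0) with an above-base anchor
# at (250, 600); the mark's own anchor sits at (20, -10).
mark_origin = attach_mark((100, 0), (250, 600), (20, -10))
print(mark_origin)
```

Because the offset has both an x and a y component, the same calculation serves marks above and below the base.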

Script and Language Information

Layout features encoded by:
  Scripts
  Languages within scripts
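
The script-then-language organization can be sketched as a two-level lookup with a fallback to a script's default language system. The nested dictionary is a stand-in for the font's binary script list, the `"dflt"` key is an assumed placeholder for the default language system, and the feature lists are hypothetical. The script tags (`deva`, `taml`) and language system tags (`MAR `, `HIN `) are registered OpenType tags.

```python
def features_for(layout, script, language=None):
    """Return the feature tags that apply to a script and optional language."""
    languages = layout.get(script)
    if languages is None:
        # Unknown script: fall back to the font's default script, if any.
        languages = layout.get("DFLT", {})
    if language in languages:
        return languages[language]
    # No language-specific entry: use the script's default language system.
    return languages.get("dflt", [])

# Hypothetical font data for illustration only.
layout = {
    "deva": {
        "dflt": ["nukt", "akhn", "rphf", "half"],
        "MAR ": ["nukt", "akhn", "rphf", "half", "locl"],
    },
    "taml": {"dflt": ["akhn", "half"]},
}

print(features_for(layout, "deva", "MAR "))
print(features_for(layout, "deva", "HIN "))
```

In this sketch Marathi gets its own feature list, while Hindi, which has no entry of its own, falls back to the Devanagari default.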

Indic Features

Language Forms
Conjuncts and Typographical Forms
Glyph Positioning

Language Forms

Nukta
Akhand
Reph
Below-base Form
Half Form
Post-base Form
Vattu Variants

Example: Below-base form

Vattu (Below-base form of Ra)

Conjuncts and Typographical Forms

Pre-base substitutions
Below-base substitutions
Above-base substitutions
Post-base substitutions
Halant Forms

Example: Pre-base consonant conjunct

Glyph Positioning

Below-base marks
Above-base marks
Distance control
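
The three groups of Indic features listed on the preceding slides correspond to four-letter feature tags in the OpenType feature registry. The table below pairs each slide entry with its registered tag. The helper `missing_features` is a hypothetical convenience for checking a font's feature list against the full set, not part of any API.

```python
INDIC_FEATURE_TAGS = {
    "Language Forms": {
        "Nukta": "nukt",
        "Akhand": "akhn",
        "Reph": "rphf",
        "Below-base Form": "blwf",
        "Half Form": "half",
        "Post-base Form": "pstf",
        "Vattu Variants": "vatu",
    },
    "Conjuncts and Typographical Forms": {
        "Pre-base substitutions": "pres",
        "Below-base substitutions": "blws",
        "Above-base substitutions": "abvs",
        "Post-base substitutions": "psts",
        "Halant Forms": "haln",
    },
    "Glyph Positioning": {
        "Below-base marks": "blwm",
        "Above-base marks": "abvm",
        "Distance control": "dist",
    },
}

def missing_features(font_tags):
    """Return the Indic feature tags from the table that a font does not list."""
    wanted = {tag for group in INDIC_FEATURE_TAGS.values() for tag in group.values()}
    return sorted(wanted - set(font_tags))

# Hypothetical font that implements only two of the language forms.
print(missing_features({"nukt", "akhn"}))
```

Not every script needs every feature, so a non-empty result is a prompt for review rather than an error.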

Coming Tools for Developing OpenType Fonts

VTT (Visual TrueType)
VOLT (Visual OpenType Layout Tool)

Installing Sample Fonts …

copy …\cssamp\fonts.exe c:\temp
cd c:\temp
fonts /T:c:\temp /C
Use Explorer to drag mangal.ttf and latha.ttf into your winnt\fonts directory.

Resources

OpenType Specification
  http://www.microsoft.com/typography/otspec

Indic Encoding Specification
  Early draft available on your CD
  Contact: [email protected] (an @microsoft.com address)
