22nd International Unicode Conference San Jose, CA September 2002 1
Going Global With SQL Server 2000
Beverly Sherry
Program Manager
Global Release Services for SQL Server
Microsoft Corporation
SQL Server Setup
• Clean Install/Collation
• Upgrades/Collation
Collation
• Collation: what is it, and why do we need it to support our multilingual data?
Collation in SQL Server 6.5 and earlier versions
• No Unicode support
• One code page per server
• One collation per server
Collation in SQL Server 7.0
• Unicode datatypes supported
• Two collations
– Unicode
– Non-Unicode
• Number of collations distilled down to the minimum necessary
• Collation consistency across operating systems
Collation in SQL Server 2000
• Combined code pages and Unicode collations into a single entity
• Flexible model to specify collations at a more granular level
"Windows" collations
• 43 language collations
– Added for unique code pages
– Added for unique ordering
• Suffix meanings
– _BIN (binary)
– _CI/_CS (case sensitivity)
– _AI/_AS (accent sensitivity)
– _KS (kanatype sensitivity: hiragana/katakana)
– _WS (width sensitivity: full/half width)
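The suffixes above can be tried directly in Query Analyzer. The following is a minimal sketch, assuming a SQL Server 2000 instance; the Latin1_General collation names are chosen only for illustration, and the literals are not from the original slides.

```sql
-- List the Windows collations whose names start with Latin1_General,
-- together with the descriptions of their suffixes.
SELECT name, description
FROM ::fn_helpcollations()
WHERE name LIKE 'Latin1_General%'

-- _CI: case-insensitive, so the two literals are expected to compare as equal.
SELECT CASE WHEN 'abc' = 'ABC' COLLATE Latin1_General_CI_AS
            THEN 'equal' ELSE 'different' END AS ci_result

-- _CS: case-sensitive, so the same literals are expected to compare as different.
SELECT CASE WHEN 'abc' = 'ABC' COLLATE Latin1_General_CS_AS
            THEN 'equal' ELSE 'different' END AS cs_result

-- _AI: accent-insensitive, so the accented and unaccented words are
-- expected to compare as equal.
SELECT CASE WHEN N'resume' = N'résumé' COLLATE Latin1_General_CI_AI
            THEN 'equal' ELSE 'different' END AS ai_result
```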
SQL Collations
• Provided for backwards compatibility with prior versions of SQL Server
Collation at four levels
• Server
• Database
• Column
• Expression
Collation at the server level
• Acts as a default for all databases
• Can be changed with RebuildM.exe in the tools\BINN directory (why you would want to do this, and how)
• Querying the server collation:
-- sysname avoids truncation: CONVERT(char, ...) defaults to 30 characters
SELECT CONVERT(sysname, SERVERPROPERTY('collation'))
Upgrade Path
• Unicode datatypes in master database
– DB-object metadata converted to Unicode
– Sort order compatibility
• Scripting in Unicode
• Code page override
[Diagram labels: US, French, Korean; 7.0, 8.0]
Defining your data store
• UCS-2 server storage
• Data type
– Unicode
  • NCHAR, NVARCHAR, NTEXT
  • Metadata: SYSNAME
  • N'Unicode'
– DBCS/SBCS
  • CHAR, VARCHAR, TEXT
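A short illustration of the storage difference between the two families of types, assuming a SQL Server 2000 instance; the variable names are hypothetical. DATALENGTH returns the number of bytes stored, so the UCS-2 value takes two bytes per character.

```sql
DECLARE @ansi varchar(10), @uni nvarchar(10)

SET @ansi = 'abc'    -- stored in the code page of the collation
SET @uni  = N'abc'   -- the N prefix marks a Unicode (UCS-2) literal

-- Expected: 3 bytes for the varchar value, 6 bytes for the nvarchar value.
SELECT DATALENGTH(@ansi) AS ansi_bytes,
       DATALENGTH(@uni)  AS unicode_bytes
```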
Collation at the database level
• Every database has a collation (default is the server collation)
CREATE DATABASE db_test COLLATE Latin1_General_CI_AI
• Collation can be changed using ALTER DATABASE
• Querying the database collation:
SELECT CONVERT(sysname, DATABASEPROPERTYEX('db_test', 'collation'))
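Changing the database collation might look like the sketch below. It assumes the db_test database from the slide exists and that no other connection is using it; French_CI_AS is an arbitrary target collation picked for illustration.

```sql
-- Change the default collation of the database.
ALTER DATABASE db_test COLLATE French_CI_AS

-- Confirm the new database-level collation.
SELECT CONVERT(sysname, DATABASEPROPERTYEX('db_test', 'Collation'))
```

Existing character columns in user tables keep the collation they were created with; the new database collation becomes the default for columns created afterwards.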
Collation at the column level
• Overrides database level collation
CREATE TABLE jobs (
    job_id   smallint IDENTITY(1,1) PRIMARY KEY CLUSTERED,
    -- COLLATE on the column overrides the database collation
    job_desc varchar(50) COLLATE Arabic_CI_AI_KS NOT NULL
             DEFAULT 'New Position - title not formalized yet'
)
Collations by Expressions
SELECT *
FROM Table1
WHERE Field1 = Field2 COLLATE Turkish_CI_AI
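The COLLATE clause can also be applied in ORDER BY, so the same data can be sorted under different rules without changing the column definition. A minimal sketch, reusing the Table1 and Field1 names from the slide (assumed to be a character column); the binary collation is chosen only for contrast.

```sql
-- Sort by the binary code point values of the characters.
SELECT Field1
FROM Table1
ORDER BY Field1 COLLATE Latin1_General_BIN

-- Sort the same column using Turkish, case- and accent-insensitive rules.
SELECT Field1
FROM Table1
ORDER BY Field1 COLLATE Turkish_CI_AI
```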
The Rules of Precedence for Collations
             | Explicit B    | Implicit B   | Default      | No Collation
Explicit A   | Runtime Error | Explicit A   | Explicit A   | Explicit A
Implicit A   | Explicit B    | No Collation | Implicit A   | No Collation
Default      | Explicit B    | Implicit B   | Default      | No Collation
No Collation | Explicit B    | No Collation | No Collation | No Collation
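The table can be exercised with two columns that carry different collations. This is a sketch under assumptions: the temporary table, its column names, and the two collations are hypothetical, and the statements expected to raise an error are left commented out.

```sql
CREATE TABLE #precedence_demo (
    a varchar(20) COLLATE Latin1_General_CI_AS,  -- Implicit A
    b varchar(20) COLLATE French_CS_AS           -- Implicit B
)
INSERT INTO #precedence_demo (a, b) VALUES ('abc', 'ABC')

-- Implicit A with Implicit B gives No Collation, and a comparison with
-- No Collation is expected to fail with a collation conflict error:
-- SELECT * FROM #precedence_demo WHERE a = b

-- Implicit A with Explicit B gives Explicit B, so this comparison resolves:
SELECT * FROM #precedence_demo
WHERE a = b COLLATE Latin1_General_CI_AS

-- Explicit A with a different Explicit B is the Runtime Error cell:
-- SELECT * FROM #precedence_demo
-- WHERE a COLLATE Latin1_General_CI_AS = b COLLATE French_CS_AS

DROP TABLE #precedence_demo
```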
Unicode Data Flow
• 8.0 Client
– ODBC 3.7+/OLEDB
– TDS 8.0, 8.0 Netlibs support Unicode
– Character data converted to/from Unicode on client (server code page required on client)
– T-SQL batches received in Unicode, parsed in Unicode on server
• Downlevel client
– DBLIB, ODBC <3.7 clients
– TDS 4.2
– T-SQL batch received in DBCS/ANSI code page of the client
– Translated to Unicode using the server code page on the server
[Diagram labels: Application, OLE/DB, ODBC, DB-Lib, Netlib, TDS 8.0, TDS 4.2, ODS, SQL Server]
Data Flow
ODBC client to server
• Language event is always in Unicode
• Client ACP to Unicode to server
• 'A': converted from Unicode to the server character set
• N'A': sent as Unicode
[Diagram labels, client to server: SQL_C_CHAR and SQL_C_WCHAR on the client; CHAR, NCHAR, and SYSNAME on the server; paths marked ACP, Unicode, Svr CHAR, client-side conversion, and bytes with no translation]
Data Flow
• OLEDB to server:
– SSPROP_INIT_AUTOTRANSLATE as VARIANT_TRUE
• Server code page on the client
[Diagram labels, client to server: DBTYPE_STR and DBTYPE_WSTR on the client; CHAR, NCHAR, and SYSNAME on the server; paths marked ACP, Unicode, Svr CHAR, client-side conversion, and bytes with VARIANT_FALSE]
Data Access
• International T-SQL
– NCHAR and N''
– No name strings in date/time
– ODBC timestamp
– CONVERT with specific style
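The points above could be illustrated as follows. This is a sketch assuming a SQL Server 2000 instance; the literal values are made up for illustration. Each form avoids month or day names, so it does not depend on the session language.

```sql
-- Unicode literal: the N prefix keeps the characters out of the code page.
SELECT N'Ünïcödé' AS unicode_literal

-- ODBC timestamp escape: interpreted the same way in every session language.
SELECT {ts '2002-09-10 14:30:00'} AS odbc_timestamp

-- CONVERT with an explicit style: 112 is the ISO yyyymmdd form.
SELECT CONVERT(datetime, '20020910', 112) AS from_iso_string

-- Style 120 is the ODBC canonical form, yyyy-mm-dd hh:mi:ss.
SELECT CONVERT(varchar(30), GETDATE(), 120) AS as_odbc_canonical
```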
Data flow
• Data Transformation…
[Diagram: DTS Data Pump with transforms (Xforms) and steps. Source (in): OLE DB, ODBC, fixed field, ASCII delimited. Destination (out): OLE DB, ODBC, fixed field, ASCII delimited, HTML page, replication publication]
Client Flow
• Session language (syslanguages)
– Precedence
  • Set by SET LANGUAGE
  • Set by connection attribute
  • Set by user record in syslogins
– Cultural behavior
  • Language of error messages
  • Date format, month name
  • Day of week and abbreviations
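A minimal sketch of the session-language behavior described above, assuming French is installed in syslanguages (it is among the languages shipped with SQL Server 2000); the date literal is arbitrary.

```sql
-- Languages known to the server, with their default date formats.
SELECT name, alias, dateformat
FROM master.dbo.syslanguages

-- Switch the session language; month and day names follow it.
SET LANGUAGE French
SELECT @@LANGUAGE AS session_language,
       DATENAME(month,   {ts '2002-09-10 00:00:00'}) AS month_name,
       DATENAME(weekday, {ts '2002-09-10 00:00:00'}) AS day_name

-- Restore the session language used before the demonstration.
SET LANGUAGE us_english
```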
BCP
– bcp -w : Performs bulk copy operation using Unicode characters.
– bcp -N : Performs the bulk copy operation using the native (database) data types of the data for non-character data, and Unicode characters for character data.
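The same file formats can be loaded from T-SQL with BULK INSERT, whose DATAFILETYPE option has 'widechar' and 'widenative' values corresponding to the bcp -w and -N switches. The sketch below is an assumption-laden illustration: the table dbo.job_titles and the file path are hypothetical.

```sql
-- Load a Unicode character-format file (as produced by bcp -w).
-- dbo.job_titles and the file path are placeholders for illustration.
BULK INSERT db_test.dbo.job_titles
FROM 'C:\data\job_titles_unicode.txt'
WITH (DATAFILETYPE = 'widechar')
```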
XML
• You can specify an output encoding in a URL.
• XML templates can specify an encoding.
• Unicode by default
Full text
• Allows for word or phrase-based indexing of character data.
• Full-text indexing enables the creation and population of the full-text catalogs, which are maintained outside of SQL Server and managed by the Microsoft Search service.
• Full-text search uses the new Transact-SQL predicates (CONTAINS, CONTAINSTABLE, FREETEXT, and FREETEXTTABLE) to query these populated full-text catalogs.
• With a full-text query, you can perform a linguistic search of character data in tables enabled for full-text search.
– A linguistic search operates on words and phrases, unlike the LIKE predicate, which searches for character patterns.
• Manipulate to get what you want
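The difference between the full-text predicates and LIKE can be sketched as follows. The example assumes the jobs table from the column-level collation slide has been enabled for full-text indexing on job_desc and that its catalog is populated; neither is stated in the slides.

```sql
-- Word-based search: matches rows containing words that start with "position".
SELECT job_id, job_desc
FROM jobs
WHERE CONTAINS(job_desc, '"position*"')

-- Meaning-based search: matches on the words of the phrase, not its exact form.
SELECT job_id, job_desc
FROM jobs
WHERE FREETEXT(job_desc, 'new position title')

-- Pattern search for comparison: matches the character sequence anywhere,
-- without using the full-text catalog.
SELECT job_id, job_desc
FROM jobs
WHERE job_desc LIKE '%position%'
```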
Tools / Manageability
• Unicode based
– SQL-DMO
Backup and Restore
• Restore uses the collation of the source database
• Verify the collation is supported on the instance of SQL Server
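One way to make that check, sketched under the assumption that the source database used Arabic_CI_AI_KS and is restored as db_test (both names are taken from earlier slides purely for illustration):

```sql
-- Before restoring: confirm the target instance knows the source collation.
-- An empty result means the collation is not available on this instance.
SELECT name
FROM ::fn_helpcollations()
WHERE name = 'Arabic_CI_AI_KS'

-- After restoring: the database reports the collation of its source.
SELECT CONVERT(sysname, DATABASEPROPERTYEX('db_test', 'Collation'))
```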
Replication
[Diagram labels: Publisher; Distributor; Subscribers; Updating Subscriber (immediate updates); 2PC, RPC]
Analysis Services
• Client tier
– MD ActiveX controls
– MD extension to OLE DB
– Office 2000 interfaces
– 3rd party clients
• OLAP server tier
– Multidimensional data modeling and calculation engine
– Persistent multidimensional cache
[Diagram labels: OLAP Server; OLEDB, ADO, XML/A; OLEDB / ODBC]
Unicode Data Flow in Fringe Areas
• Script usage
– Command line tools
• ISQL utility does not support Unicode input files.
• OSQL -u (Specifies that output_file is stored in Unicode format).
– Query analyzer, save as Unicode / ANSI / OEM.
Resources
• International Features in Microsoft SQL Server 2000 Http://msdn.microsoft.com/library/default.asp
• Arabic Language Support in Microsoft SQL Server 2000 Http://msdn.microsoft.com/library/default.asp
• SQL Server Books Online