LIBICONV – An Interface to Team Developer

By Jean-Marc Gemperle, Technical Support Engineer

November, 2005

Abstract

Introduction

What Is LIBICONV?

Obtaining and Building LIBICONV for Win32

A DLL Interface to Team Developer

Team Developer ICONV Samples and Tests

A Brief Description of the Application

Converting a Hebrew Code Page CP1255 to the UTF-8 Format

Converting the Generated UTF-8 Back to the CP1255 Hebrew Code Page

Chinese ISO-2022-CN-EXT to UTF-8

ISO-8859-1 to DOS 437

ISO-8859-1 to WINDOWS-1250 Cannot Convert

ISO-8859-1 to WINDOWS-1250 Using Translit

Conclusions

Abstract

This technical white paper proposes an interface from GUPTA Team Developer to GNU LIBICONV, allowing a Team Developer programmer to convert documents from one character encoding to another. See http://www.gnu.org/software/libiconv/

Introduction

International text is generally encoded using a country-dependent character encoding. With the Internet, conversion between different encodings has become critical. Conversion is also a problem because some characters that are present in one encoding may not exist in another. For these reasons Unicode was created as a super-encoding standard covering all the others, and it is the default for new text formats such as XML. See http://www.unicode.org/standard/WhatIsUnicode.html

What Is LIBICONV?

Many computers still use traditional character encodings. This is also the case for applications built using Team Developer. Team Developer 2006 will be fully Unicode enabled. However, some applications must be able to convert from one encoding to another. In Team Developer, for example, you may generate XML files using XML Table Windows, the DOM class library, or serialization of UDVs, and you may want to transfer the documents to other applications that use other encodings. GNU LIBICONV is a library that converts from Unicode to traditional encodings and vice versa. LIBICONV is therefore a solution if your applications need to support multiple character encodings and your current system lacks that support. See http://www.gnu.org/software/libiconv/ for details on supported encodings.

Obtaining and Building LIBICONV for Win32

You can download the LIBICONV source files from http://sourceforge.net/projects/gettext. To build LIBICONV you will need Microsoft Visual Studio 6 and both the libiconv-win32 and gettext-win32 sources. First build LIBICONV without NLS, then build GETTEXT, and then build LIBICONV again. Carefully follow the README.woe32 file in each package. These steps should produce the WIN32 binary of the main application, ICONV.EXE, along with its dependencies ICONV.DLL, INTL.DLL, etc. Whether you compile the sources or use the binary package provided, you can easily interface ICONV.EXE to Team Developer using SalLoadApp(). Invoke ICONV.EXE --help for a description of the parameters.
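As a rough illustration of driving the command-line tool from a program, the sketch below builds an argument list for ICONV.EXE and optionally runs it. It is written in Python rather than SAL, the helper names build_iconv_command and run_iconv are hypothetical, and it assumes an iconv executable is on the PATH. The -f and -t options and the //TRANSLIT suffix are those of GNU iconv, which writes the converted text to standard output.

```python
import subprocess


def build_iconv_command(from_code, to_code, from_file, translit=False, exe="iconv"):
    """Return the argument list for one GNU iconv run (hypothetical helper)."""
    # GNU iconv enables transliteration through a suffix on the target encoding.
    target = to_code + "//TRANSLIT" if translit else to_code
    return [exe, "-f", from_code, "-t", target, from_file]


def run_iconv(from_code, to_code, from_file, to_file, translit=False):
    """Run iconv and store its standard output in to_file; True on success.

    Assumption: the executable named in build_iconv_command is on the PATH.
    """
    cmd = build_iconv_command(from_code, to_code, from_file, translit)
    result = subprocess.run(cmd, capture_output=True)
    if result.returncode != 0:
        return False
    with open(to_file, "wb") as out:
        out.write(result.stdout)
    return True
```

A call such as run_iconv("CP1255", "UTF-8", "HEBREW-CP1255.TXT", "OUT.UTF-8") would then play the role that SalLoadApp() plays in a Team Developer application.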

A DLL Interface to Team Developer

This document provides a simple interface from GUPTA Team Developer to LIBICONV. The Dynamic Link Library (DLL) file LIBICONVDLL.dll is a wrapper around the ICONV main() entry point. See the Visual Studio workspace LIBICONVDLL.dsw and the source file td_iconv.c. The wrapper exports three functions to Team Developer:

bOK = iConv(BOOL bBinary, BOOL bTransLit, sFromCode, sFromFile, sToCode, sToFile)

Parameters
  bBinary: Open sFromFile in binary mode.
  bTransLit: Limited support for transliteration, i.e. when a character cannot be represented in the target character set, it can be approximated through one or several similar-looking characters.
  sFromCode: The encoding of the source document.
  sFromFile: The source file to convert.
  sToCode: The target encoding.
  sToFile: The target file.

Return
  bOK is TRUE if the function succeeds and FALSE if it fails.

iConvListEnc(sListOfCode)

Parameter
  sListOfCode: Receives the list of supported encodings.

iConvLastError(sLastError)

Parameter
  sLastError: If iConv fails, receives the error description.

The interface only provides conversion from a source file to a target file. It does not yet support converting a buffer directly in memory, although all the necessary logic is available in the convert() function of the ICONV.C source.
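To make the difference between file-to-file and in-memory conversion concrete, here is a minimal sketch in Python. It does not call LIBICONVDLL.dll; it uses Python's built-in codecs as a stand-in, and the names convert_buffer and convert_file are hypothetical. The file variant mirrors the shape of iConv() (source encoding, source file, target encoding, target file, boolean result) and is simply a thin layer over the buffer variant.

```python
def convert_buffer(data, from_code, to_code):
    """Convert a bytes buffer in memory from one encoding to another."""
    return data.decode(from_code).encode(to_code)


def convert_file(from_code, from_file, to_code, to_file):
    """File-to-file conversion in the spirit of iConv(); True on success."""
    try:
        with open(from_file, "rb") as src:
            data = src.read()
        # Convert before opening the target so a failure leaves no output file.
        converted = convert_buffer(data, from_code, to_code)
        with open(to_file, "wb") as dst:
            dst.write(converted)
        return True
    except (OSError, UnicodeError, LookupError):
        # Missing file, unconvertible character, or unknown encoding name.
        return False
```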

Team Developer ICONV Samples and Tests

To run some interesting tests with the Team Developer interface to ICONV, install the supplemental language support from the Languages tab of the Regional and Language Options control panel, and select a Unicode font in Notepad (Windows XP).

You can also test using Windows 2000 as long as you install the needed language in the Control Panel’s Regional option.

A Brief Description of the Application

The Team Developer application that interfaces to LIBICONV can be found in \libiconv\bin\test_iconv.apt and the sample files in \libiconv\bin\Samples. You can type in the name of the file you want to convert or use the ... button to open a File dialog and pick one of the samples provided in the Samples directory. Several file types are supported, for instance *.TXT, UTF-8 and XML.

Here we will open the HEBREW-CP1255.TXT file. This file uses the traditional encoding CP1255, and our goal is to convert it to UTF-8. By default the application selects ISO-8859-1 for both the source and the target as soon as you select an existing file. Note that the combo boxes are type-ahead combo boxes, and that they list the encodings that ICONV supports. Also note that not all of these encodings are distinct: some are aliases of others (e.g. 437 is equivalent to CP437).

The Notepad button lets you view the source file in Notepad, while the DOS button opens a command prompt allowing you to type the filename. The same buttons exist for the converted file, so you can check the differences after conversion in both Windows and DOS. The Del button simply deletes the file that was generated. The Binary check box opens the file in binary mode, and the Translit option enables transliteration, i.e. when a character cannot be represented in the target character set, it can be approximated through one or several similar-looking characters.
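The idea of encoding aliases can be checked quickly outside Team Developer. The sketch below uses Python's codec registry, not the list returned by iConvListEnc, so the set of aliases it knows is Python's and only an analogy to ICONV's; same_encoding is a hypothetical helper name.

```python
import codecs


def same_encoding(name_a, name_b):
    """True if both names resolve to the same codec in Python's registry."""
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name


# "437" and "CP437" are two names for one code page; ISO-8859-1 is a different one.
print(same_encoding("437", "CP437"))
print(same_encoding("CP437", "ISO-8859-1"))
```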

Converting a Hebrew Code Page CP1255 to the UTF-8 Format

Microsoft Windows Notepad cannot properly display the source file, because the operating system used for this test is a United States English version and therefore does not have the Hebrew font installed. Converting the file from CP1255 to UTF-8 allows Notepad to display the Hebrew characters properly. A command prompt using CP437 cannot show this text unless the right code page and the right raster fonts are available.
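What the conversion does at the byte level can be seen on a four-letter example. This sketch uses Python's built-in cp1255 and utf-8 codecs as a stand-in for ICONV; the sample word is an assumption and is not taken from HEBREW-CP1255.TXT.

```python
# The Hebrew word "shalom", written with escapes so the listing stays ASCII.
text = "\u05e9\u05dc\u05d5\u05dd"

# CP1255 stores each Hebrew letter in a single byte.
cp1255_bytes = text.encode("cp1255")

# Converting to UTF-8 goes through Unicode; each Hebrew letter becomes two bytes.
utf8_bytes = cp1255_bytes.decode("cp1255").encode("utf-8")

print(cp1255_bytes.hex())  # f9ece5ed
print(utf8_bytes.hex())    # d7a9d79cd795d79d
```

The UTF-8 file is larger, but any Unicode-aware viewer can interpret it without knowing which legacy code page the text came from.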

Converting the Generated UTF-8 Back to the CP1255 Hebrew Code Page

This test just shows that the resulting OUT.CP1255 file is identical to the HEBREW-CP1255.TXT from our previous test.
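The same round-trip check can be expressed in a few lines. This is a sketch with Python's built-in codecs rather than ICONV, and round_trips is a hypothetical helper; it compares bytes in memory where the test above compares files.

```python
def round_trips(data, code, via="utf-8"):
    """True if data survives code -> via -> code byte for byte."""
    intermediate = data.decode(code).encode(via)
    return intermediate.decode(via).encode(code) == data


# Four Hebrew letters in CP1255, converted to UTF-8 and back.
print(round_trips(b"\xf9\xec\xe5\xed", "cp1255"))
```

Because UTF-8 can represent every character of CP1255, nothing is lost on the way out, which is why the way back reproduces the original bytes.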

Chinese ISO-2022-CN-EXT to UTF-8

This is the same test as above but with a different code page. The C:\libiconv\bin\Samples directory contains other samples such as Japanese snippets. Also, when you obtain the source of libiconv-win32 you will have additional test samples to verify that ICONV is functional.

ISO-8859-1 to DOS 437

This test converts a file from the Windows ANSI-compatible encoding ISO-8859-1 to code page 437. Displaying the input file at the command prompt shows garbage, because DOS cannot display ISO-8859-1 text but can display CP437. Conversely, once the file has been converted, Windows Notepad cannot display the CP437 text correctly.
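The garbage comes from the same byte meaning different characters in the two code pages. The following sketch, using Python's codecs in place of ICONV and a made-up three-letter sample, shows both the misreading and the proper conversion.

```python
latin1_bytes = b"\xe9t\xe9"  # the French word for "summer" in ISO-8859-1

# What a CP437 console shows if the bytes are displayed unconverted:
# 0xE9 is the Greek capital theta in CP437, not e-acute.
misread = latin1_bytes.decode("cp437")

# Proper conversion: decode as ISO-8859-1, re-encode as CP437.
cp437_bytes = latin1_bytes.decode("iso-8859-1").encode("cp437")

print(misread)            # ΘtΘ
print(cp437_bytes.hex())  # 827482
```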

ISO-8859-1 to WINDOWS-1250 Cannot Convert

This test shows that some characters from ISO-8859-1 can’t be converted to the WINDOWS-1250 code page.
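To see which characters are responsible for such a failure, one can test them one at a time. The sketch below uses Python's iso-8859-1 and cp1250 codecs as a stand-in for ICONV; unconvertible is a hypothetical helper and the sample text is an assumption.

```python
def unconvertible(data, from_code, to_code):
    """Return the characters of data that to_code cannot represent."""
    bad = []
    for ch in data.decode(from_code):
        try:
            ch.encode(to_code)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


# In "Senor cafe" with n-tilde and e-acute, only the n-tilde is missing
# from WINDOWS-1250; the e-acute exists in both code pages.
print(unconvertible(b"Se\xf1or caf\xe9", "iso-8859-1", "cp1250"))
```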

ISO-8859-1 to WINDOWS-1250 Using Translit

The “Translit” option tries to approximate the character that could not be represented in the target code page.
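The general idea of approximating a character can be sketched as follows. This is not ICONV's transliteration algorithm: it is a simple substitute technique that strips accents via Unicode decomposition, written in Python, and approximate_encode is a hypothetical name. Characters that have no decomposition fall back to a question mark.

```python
import unicodedata


def approximate_encode(text, to_code):
    """Encode text, replacing unrepresentable characters by their base letter."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(to_code)
        except UnicodeEncodeError:
            # Decompose (n-tilde -> n + combining tilde) and drop the marks.
            decomposed = unicodedata.normalize("NFKD", ch)
            base = "".join(c for c in decomposed if not unicodedata.combining(c))
            # Anything still unrepresentable becomes "?".
            out += base.encode(to_code, errors="replace")
    return bytes(out)


# n-tilde is approximated by "n"; e-acute exists in WINDOWS-1250 and is kept.
print(approximate_encode("Se\u00f1or caf\u00e9", "cp1250"))
```

Only characters that fail to encode are decomposed, so characters the target code page does support keep their accents.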

Conclusions

This white paper covered how to convert documents from one character encoding to another in Team Developer using GNU LIBICONV. All the work done by the ICONV interface to Team Developer can also be done with a simple call to ICONV.EXE. The DLL interface proposed here is only a sample and is given without any guarantee. There are many other useful GNU tools that could be used in Team Developer. There is no real need to rebuild these tools: either a WIN32 port can be found, as is the case with LIBICONV, or one can use CYGWIN, found at http://www.cygwin.com/, and its runtime to execute UNIX tools under WIN32.

Copyright © 2005 Gupta Technologies LLC. GUPTA, the GUPTA logo, and all GUPTA products are licensed or registered trademarks of Gupta Technologies, LLC. All other products are trademarks or registered trademarks of their respective owners. All rights reserved.