
Intelligent Key Prediction by N-grams and Error-correction Rules

Kanokwut Thanadkran, Virach Sornlertlamvanich and Tanapong Potipiti

Information Research and Development Division, National Electronics and Computer Technology Center (NECTEC), Thailand

Outline

Introduction

Related Works

The Approach

Experiments

Conclusions

Introduction

The Thai-English Keyboard System, Problems and Solutions

The Thai-English Keyboard System

174 characters: 83 Thai, 52 English and 39 symbols.

Up to 4 characters per button: 2 languages × 2 sets of characters.

Special keys are used to select the language and to distinguish between the sets of characters.

[Figure: one key cap showing the Thai characters ฤ and ฟ, with the labels English, Thai and No shift.]
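The four-characters-per-button arrangement can be pictured as a small lookup. This is only an illustrative sketch: it assumes the button in the figure is the A key of the common Thai Kedmanee layout (the slide shows only ฤ and ฟ), and the `Button` type and `typed_character` function are hypothetical names.

```python
from typing import NamedTuple


class Button(NamedTuple):
    """One physical button: 2 languages x 2 character sets."""
    english: str
    english_shift: str
    thai: str
    thai_shift: str


# Assumption: the A key on a Kedmanee-style layout.
A_KEY = Button(english="a", english_shift="A", thai="ฟ", thai_shift="ฤ")


def typed_character(button: Button, thai_mode: bool, shift_held: bool) -> str:
    """Return the character produced for the current language mode and shift state."""
    if thai_mode:
        return button.thai_shift if shift_held else button.thai
    return button.english_shift if shift_held else button.english


print(typed_character(A_KEY, thai_mode=True, shift_held=False))
```

The two Boolean arguments correspond to the two special keys the user has to manage by hand: the language-switching key and the shift key.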

The Problems

Two annoyances for Thais typing Thai-English bilingual texts:

Switching between languages with a language-switching key.

Selecting a set of characters with a combination of the shift key and a character key.

Other multilingual users face the same problems.

Related Works

Language Identification and Key Prediction

Language Identification: Microsoft Word 2000

Automatic language identification.

Considers only the surrounding text.

Problems:

No key prediction.

Often detects the wrong language.

Difficult to switch to the proper language.

Key Prediction: Zheng et al.

Automatic word prediction.

Based on a lexical tree.

Word-class n-grams are employed.

Problems:

No language identification.

Requires a large amount of memory.

The Approach

Overview, Language Identification, Key Prediction and Pattern Shortening

The System Overview

There are two main processes in our system:

Automatic language identification.

Automatic Thai key prediction.

[Figure: system flowchart with nodes Key Input, Language?, Thai Key Prediction and Output.]

Language Identification

$$Eprob = \prod_{i} p_E(K_i \mid K_{i-1})$$

$$Tprob = \prod_{i} p_T(K_i \mid K_{i-1})$$
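A minimal sketch of how an English score and a Thai score could be compared for one token. It assumes bi-gram key probabilities estimated with add-one smoothing and a decision rule that picks the language with the higher score over the first m keys; neither the smoothing nor the decision rule is stated on the slides. The toy training strings, the `alphabet_size` value and all function names are hypothetical.

```python
import math
from collections import Counter


def train_bigram(samples, alphabet_size=64):
    """Return a function logp(prev, cur) estimated from key sequences."""
    pair_counts, prev_counts = Counter(), Counter()
    for sample in samples:
        keys = [None] + list(sample)  # None marks the start of a token
        for prev, cur in zip(keys, keys[1:]):
            pair_counts[prev, cur] += 1
            prev_counts[prev] += 1

    def logp(prev, cur):
        # Add-one smoothing so unseen key pairs keep a small probability.
        return math.log((pair_counts[prev, cur] + 1) /
                        (prev_counts[prev] + alphabet_size))

    return logp


def score(logp, token, m):
    """Log of the bi-gram product over the first m keys of the token."""
    keys = [None] + list(token[:m])
    return sum(logp(prev, cur) for prev, cur in zip(keys, keys[1:]))


def identify(token, logp_thai, logp_english, m=6):
    """Pick the language whose bi-gram model scores the token higher."""
    thai = score(logp_thai, token, m)
    english = score(logp_english, token, m)
    return "Thai" if thai >= english else "English"


# Toy data: English words, and key sequences standing in for Thai typing.
logp_english = train_bigram(["the", "then", "there", "key", "keyboard"])
logp_thai = train_bigram(["lkl9", "mpklkl9", "kkklkl9"])

print(identify("there", logp_thai, logp_english))
print(identify("kkklkl9", logp_thai, logp_english))
```

Working in log space turns the product into a sum and avoids underflow on longer tokens.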

Thai Key Prediction

$$\operatorname*{arg\,max}_{c_1, c_2, \ldots} \prod_{i} p(c_i \mid c_{i-1}, c_{i-2}) \cdot p(K_i \mid c_i)$$
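A minimal sketch of the decoding step, assuming the arg max ranges over the characters printed on each pressed button and that p(K_i | c_i) = 1 whenever c_i is on button K_i, so only the tri-gram term needs scoring. Lowercase and uppercase Latin letters stand in for the unshifted and shifted Thai characters. The toy corpus, the add-one smoothing, the Viterbi-style search and all names are hypothetical, not taken from the slides.

```python
import math
from collections import Counter

START = "<s>"


def train_trigram(corpus):
    """Count character tri-grams and their two-character contexts."""
    tri, bi = Counter(), Counter()
    for text in corpus:
        chars = [START, START] + list(text)
        for a, b, c in zip(chars, chars[1:], chars[2:]):
            tri[a, b, c] += 1
            bi[a, b] += 1
    return tri, bi


def log_p(tri, bi, a, b, c, vocab_size=128):
    """Add-one smoothed log p(c | b, a)."""
    return math.log((tri[a, b, c] + 1) / (bi[a, b] + vocab_size))


def candidates(key):
    """Characters on one button: the unshifted and the shifted form."""
    return sorted({key.lower(), key.upper()})


def predict(keys, tri, bi):
    """Most probable character sequence for unshifted key input."""
    # State = (previous character, current character) -> (score, path).
    best = {(START, START): (0.0, [])}
    for key in keys:
        nxt = {}
        for (a, b), (score, path) in best.items():
            for c in candidates(key):
                # p(K_i | c_i) is assumed to be 1, so it adds log(1) = 0.
                s = score + log_p(tri, bi, a, b, c)
                if (b, c) not in nxt or s > nxt[b, c][0]:
                    nxt[b, c] = (s, path + [c])
        best = nxt
    return "".join(max(best.values(), key=lambda v: v[0])[1])


tri, bi = train_trigram([
    "Bangkok is in Thailand",
    "Thailand has Thai keys",
    "keys in Bangkok",
])
print(predict("thailand", tri, bi))
print(predict("bangkok", tri, bi))
```

Keeping only the best path per pair of trailing characters is enough here, because a tri-gram score depends on nothing earlier than the two preceding characters.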

Error-correction Rules

Some character sequences often generate errors.

Error patterns are collected for the error-correction process.

Mutual information is used to optimize the patterns.

[Figure: diagram with stages Training corpus, Tri-gram prediction, Errors from Tri-gram prediction and Error correction rules.]

Error-correction Rule Reduction

Pattern No.   Error Key Sequences   Correct Patterns
1             k[lkl9
2             mpklkl9
3             kkklkl9
4             lkl9
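In the table, the error key sequences of patterns 1 to 3 all end with the sequence of pattern 4. The sketch below illustrates that observation by extracting the longest common suffix of several sequences. It is an assumption that reduction works this way; the slides only say that mutual information is used to optimize the patterns. The function name is hypothetical.

```python
import os


def common_suffix(patterns):
    """Longest key sequence that every pattern ends with."""
    reversed_patterns = [p[::-1] for p in patterns]
    # os.path.commonprefix compares plain strings character by character.
    return os.path.commonprefix(reversed_patterns)[::-1]


print(common_suffix(["k[lkl9", "mpklkl9", "kkklkl9"]))
```

A single shorter rule that covers several longer ones keeps the rule set small, which matters if the rules have to fit on a small device.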

Pattern Shortening

$$Lm(xyz) = \frac{p(xyz)}{p(x)\,p(yz)}$$

$$Rm(xyz) = \frac{p(xyz)}{p(xy)\,p(z)}$$

The Error-correction Process

Finite-automaton pattern matching is applied.

When several patterns match, the longest one is selected.

[Figure: flowchart with nodes Key Input, Finite Automata, Terminal? (No / Yes) and Correction.]
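A minimal sketch of longest-pattern matching with a finite automaton, here built as a trie over key sequences. The rule contents (`"RULE-1"`, `"RULE-4"`) are placeholders, since the correct patterns are not recoverable from the slides, and the function names and the left-to-right scanning strategy are assumptions.

```python
def build_trie(rules):
    """Build a trie from {error key sequence: correction}."""
    root = {}
    for pattern, correction in rules.items():
        node = root
        for key in pattern:
            node = node.setdefault(key, {})
        node[None] = correction  # None marks a terminal state
    return root


def longest_match(trie, keys, start):
    """Return (end, correction) for the longest pattern starting at start."""
    node, best = trie, None
    for end in range(start, len(keys)):
        node = node.get(keys[end])
        if node is None:
            break
        if None in node:  # terminal state reached; remember it, keep going
            best = (end + 1, node[None])
    return best


def find_corrections(trie, keys):
    """Scan the key input and list (start, end, correction) matches."""
    matches, i = [], 0
    while i < len(keys):
        hit = longest_match(trie, keys, i)
        if hit:
            end, correction = hit
            matches.append((i, end, correction))
            i = end
        else:
            i += 1
    return matches


# Placeholder corrections; real rules would hold the corrected characters.
trie = build_trie({"k[lkl9": "RULE-1", "lkl9": "RULE-4", "lkl": "RULE-SHORT"})
print(find_corrections(trie, "xxk[lkl9yy"))
print(find_corrections(trie, "zzlkl9"))
```

The automaton keeps reading after the first terminal state, so a shorter rule such as `lkl` never pre-empts a longer one such as `lkl9` that starts at the same position.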

Experiments

Language Identification and Key Prediction

Language Identification Result

A bi-gram model is employed.

An artificial corpus is used for testing.

The first 6 characters of a token are enough.

m (number of first characters)   Identification Accuracy (%)
3                                94.27
4                                97.06
5                                98.16
6                                99.1
7                                99.11

Key Prediction Corpus Information

25 MB training corpus.

5 MB test corpus.

                    Training Corpus (%)   Test Corpus (%)
Unshift Alphabets   88.63                 88.95
Shift Alphabets     11.37                 11.05

Key Prediction Result

                                         Prediction Accuracy from   Prediction Accuracy from
                                         Training Corpus (%)        Test Corpus (%)
Tri-gram Prediction                      93.11                      92.21
Tri-gram Prediction + Error Correction   99.53                      99.42
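To make the gain from the error-correction rules concrete, the accuracies above can be restated as error rates. The short calculation below is only arithmetic on the reported figures; the function name is ours.

```python
def relative_error_reduction(acc_before, acc_after):
    """Percentage of the original errors removed, given accuracies in percent."""
    err_before = 100.0 - acc_before
    err_after = 100.0 - acc_after
    return 100.0 * (err_before - err_after) / err_before


for corpus, before, after in [("training", 93.11, 99.53), ("test", 92.21, 99.42)]:
    reduction = relative_error_reduction(before, after)
    print(f"{corpus}: {reduction:.1f}% of tri-gram errors removed")
```

On the test corpus the error rate drops from 7.79% to 0.58%, so the rules remove roughly 93% of the errors left by tri-gram prediction alone.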

Conclusions

The Approach and Future Works

The Approach

We have applied the tri-gram model and error-correction rules to intelligent Thai key prediction and English-Thai language identification.

The experiments report about 99 percent accuracy: 99.42% for key prediction on the test corpus and 99.1% for language identification using the first 6 characters.

Future Works

We hope this technique is applicable to other Asian languages and multilingual systems.

Our future work is to apply the algorithm to mobile phones, handheld devices and multilingual input systems.

The End

Questions and Answers
