Using CTW as a language modeler in Dasher Martijn van Veen 05-02-2007 Signal Processing Group Department of Electrical Engineering Eindhoven University of Technology

Page 1: Using CTW as a language  modeler in Dasher

Using CTW as a language modeler in Dasher

Martijn van Veen

05-02-2007

Signal Processing Group

Department of Electrical Engineering

Eindhoven University of Technology

Page 2: Using CTW as a language  modeler in Dasher

Overview

• What is Dasher?
  – And what is a language model?

• What is CTW?
  – And how to implement it in Dasher

• Decreasing the model costs

• Conclusions and future work

Page 3: Using CTW as a language  modeler in Dasher

Dasher

• Text input method

• Continuous gestures

• Language model

• Let’s give it a try! (Dasher)

Page 4: Using CTW as a language  modeler in Dasher

Dasher: Language Model

• Conditional probability for each alphabet symbol, given the previous symbols

• Similar to compression methods

• Requirements:
  – Sequential
  – Fast
  – Adaptive

• Model is trained

• Better compression -> faster text input
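The link between prediction and compression on this slide can be made concrete with a small sketch. The function below is an illustrative assumption, not Dasher's model: it uses a plain adaptive order-k context model with add-one smoothing, and the name `code_length_bits` is hypothetical. It processes the text sequentially, predicts each symbol from the preceding ones, and sums the ideal code length of -log2(p) bits.

```python
import math
from collections import defaultdict


def code_length_bits(text, alphabet, order=2):
    """Ideal code length (bits) of `text` under an adaptive order-`order` model."""
    # counts[context][symbol] = how often `symbol` followed `context` so far
    counts = defaultdict(lambda: defaultdict(int))
    total_bits = 0.0
    for i, sym in enumerate(text):
        ctx = text[max(0, i - order):i]
        seen = counts[ctx]
        n = sum(seen.values())
        # Conditional probability of the symbol given the previous symbols
        # (add-one smoothing is an assumption of this sketch).
        p = (seen[sym] + 1) / (n + len(alphabet))
        total_bits += -math.log2(p)
        # Adaptive: update the statistics only after predicting.
        seen[sym] += 1
    return total_bits


text = "abababababababab"
alphabet = sorted(set(text))
for k in (0, 1):
    print(k, code_length_bits(text, alphabet, order=k) / len(text))
```

On this alternating toy string the order-1 model needs fewer bits per symbol than the order-0 model, which is the sense in which better predictions mean better compression.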

Page 5: Using CTW as a language  modeler in Dasher

Dasher: Language model

• PPM: Prediction by Partial Match

• Predictions by models of different order

• Weight factor for each model
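The idea of combining predictions from models of different order, each with its own weight, can be sketched as follows. This is a deliberately simplified linear blend with fixed weights. Real PPM variants derive the weights from escape probabilities, which this sketch does not attempt. The function name `blended_prediction` and the example weights are hypothetical.

```python
from collections import defaultdict


def blended_prediction(history, alphabet, weights):
    """Mix order-0 .. order-(len(weights)-1) predictions with fixed weights."""
    mixed = {s: 0.0 for s in alphabet}
    for order, w in enumerate(weights):
        # Context = the last `order` symbols of the history.
        ctx = history[max(0, len(history) - order):]
        counts = defaultdict(int)
        for i in range(order, len(history)):
            if history[i - order:i] == ctx:
                counts[history[i]] += 1
        n = sum(counts.values())
        for s in alphabet:
            # Add-one smoothing per order (an assumption of this sketch).
            mixed[s] += w * (counts[s] + 1) / (n + len(alphabet))
    return mixed


pred = blended_prediction("abcabcab", "abc", [0.2, 0.3, 0.5])
print(max(pred, key=pred.get))
```

Because the weights sum to one, the blended values again form a probability distribution over the alphabet.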

Page 6: Using CTW as a language  modeler in Dasher

Dasher: Language model

• Asymptotically PPM reduces to fixed order context model

• But the incomplete model works better!

Page 7: Using CTW as a language  modeler in Dasher

CTW: Tree model

• Source structure in the model, parameters memoryless

• KT estimator: a = number of zeros, b = number of ones
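The formula on this slide did not survive extraction. As a reminder, the standard Krichevsky-Trofimov (KT) estimator assigns probability (a + 1/2) / (a + b + 1) to the next bit being zero after a zeros and b ones. The sketch below uses exact fractions. The function names are hypothetical and not taken from the talk.

```python
from fractions import Fraction


def kt_next_zero(a, b):
    """KT probability that the next bit is 0, given a zeros and b ones so far."""
    return Fraction(2 * a + 1, 2 * (a + b + 1))


def kt_block(bits):
    """KT block probability of a whole bit sequence, built up sequentially."""
    a = b = 0
    p = Fraction(1)
    for bit in bits:
        p0 = kt_next_zero(a, b)
        if bit == 0:
            p *= p0
            a += 1
        else:
            p *= 1 - p0
            b += 1
    return p


print(kt_block([0, 1]))  # 1/8
print(kt_block([0, 0]))  # 3/8
```

The block probability depends only on the counts a and b, not on the order of the bits, which is why a node only needs to remember two counters.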

Page 8: Using CTW as a language  modeler in Dasher

CTW: Context tree

• Context-Tree Weighting: combine all possible tree models up to a maximum depth
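A minimal binary sketch of the weighting recursion, under stated assumptions: each node mixes its own KT block probability with the product of its children's weighted probabilities, with weight 1/2 each, and the bits preceding the sequence are assumed to be zeros. It is a batch computation with exact fractions for small inputs, not the sequential implementation discussed in the talk. All names are hypothetical.

```python
from fractions import Fraction


def kt_block(a, b):
    """KT block probability of any sequence with a zeros and b ones."""
    p = Fraction(1)
    for i in range(a):
        p *= Fraction(2 * i + 1, 2 * (i + 1))
    for j in range(b):
        p *= Fraction(2 * j + 1, 2 * (a + j + 1))
    return p


def ctw_probability(bits, depth):
    """Weighted block probability of `bits` over all tree models up to `depth`."""
    # counts[context] = [zeros, ones]; context lists the most recent bit first.
    counts = {}
    padded = [0] * depth + list(bits)  # assumed all-zero past
    for t in range(depth, len(padded)):
        for d in range(depth + 1):
            ctx = tuple(padded[t - k] for k in range(1, d + 1))
            counts.setdefault(ctx, [0, 0])[padded[t]] += 1

    def weighted(ctx):
        if ctx not in counts:
            return Fraction(1)  # nothing observed in this context yet
        a, b = counts[ctx]
        pe = kt_block(a, b)
        if len(ctx) == depth:
            return pe  # leaf at maximum depth: estimator only
        children = weighted(ctx + (0,)) * weighted(ctx + (1,))
        return Fraction(1, 2) * pe + Fraction(1, 2) * children

    return weighted(())


print(ctw_probability([0, 1, 1, 0, 1, 1], depth=2))
```

Summing `ctw_probability` over all bit sequences of a fixed length gives exactly 1, which is a convenient sanity check that the mixture is a valid coding distribution.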

Page 9: Using CTW as a language  modeler in Dasher

CTW: tree update

Page 10: Using CTW as a language  modeler in Dasher

CTW: Implementation

• Current implementation
  – Ratio of block probabilities stored in each node
  – Efficient but patented

• Develop a new implementation
  – Use only integer arithmetic, avoid divisions
  – Represent both block probabilities as fractions
  – Ensure denominators equal by cross-multiplication
  – Store the numerators, scale if necessary
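The bullet points above can be illustrated with a small integer-only sketch. It is not the implementation from the talk: the function names, the 16-bit budget `MAX_BITS`, and the rule of shifting both numerators by the same amount are assumptions made for illustration. Once both fractions share a denominator, their relative size is carried entirely by the numerators.

```python
MAX_BITS = 16  # hypothetical storage width for a numerator


def common_denominator(n1, d1, n2, d2):
    """Cross-multiply so n1/d1 and n2/d2 share the denominator d1*d2 (no division)."""
    return n1 * d2, n2 * d1, d1 * d2


def rescale(n1, n2, max_bits=MAX_BITS):
    """Shift both numerators right by the same amount until they fit in max_bits."""
    shift = max(0, max(n1, n2).bit_length() - max_bits)
    # Keep each numerator at least 1 so neither probability collapses to zero.
    return max(n1 >> shift, 1), max(n2 >> shift, 1)


# 3/8 and 1/4 become 12/32 and 8/32.
n_a, n_b, denom = common_denominator(3, 8, 1, 4)
print(n_a, n_b, denom)

# Large numerators are scaled together, preserving their 3:1 ratio.
print(rescale(3 << 20, 1 << 20))
```

Shifting both numerators by the same number of bits keeps their ratio approximately intact, using only multiplications, comparisons and shifts.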

Page 11: Using CTW as a language  modeler in Dasher

CTW for Text

• Binary decomposition

• Adjust zero-order estimator
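Binary decomposition turns one multi-symbol prediction into a chain of binary decisions, so that a binary estimator can be used at every step. The sketch below assumes a plain fixed-length, most-significant-bit-first code and a single KT estimator per decision node. In a CTW model each decision node would own a context tree instead, and the decomposition tree used in the talk may differ. The class name is hypothetical.

```python
from fractions import Fraction


class BinaryDecomposition:
    """Symbol probabilities as products of binary decisions (illustrative only)."""

    def __init__(self, bits_per_symbol):
        self.k = bits_per_symbol
        self.counts = {}  # bit prefix (tuple) -> [zeros, ones] seen at that node

    def _bits(self, symbol):
        # Most significant bit first.
        return [(symbol >> (self.k - 1 - i)) & 1 for i in range(self.k)]

    def probability(self, symbol):
        p = Fraction(1)
        prefix = ()
        for bit in self._bits(symbol):
            a, b = self.counts.get(prefix, (0, 0))
            p_zero = Fraction(2 * a + 1, 2 * (a + b + 1))  # KT estimate
            p *= p_zero if bit == 0 else 1 - p_zero
            prefix += (bit,)
        return p

    def update(self, symbol):
        prefix = ()
        for bit in self._bits(symbol):
            self.counts.setdefault(prefix, [0, 0])[bit] += 1
            prefix += (bit,)


model = BinaryDecomposition(bits_per_symbol=2)
model.update(3)
print([model.probability(s) for s in range(4)])
```

The probabilities over all symbols always sum to one, because each decision node splits its probability mass between its two branches.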

Page 12: Using CTW as a language  modeler in Dasher

Results

• Comparing PPM and CTW language models
  – Single file
  – Model trained with English text
  – Model trained with English text and user input

Input file   CTW     PPM     Difference
Book 2       2.632   2.876   8.48 %
NL           4.356   5.014   13.12 %

Input file   CTW     PPM     Difference
GB           2.847   3.051   6.69 %
Book 2       2.380   2.543   6.41 %
Book 2       2.295   2.448   6.25 %

Input file   CTW     PPM     Difference
Book 2       1.979   2.177   9.10 %
NL           2.364   2.510   5.82 %
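The Difference column is consistent with the gain of CTW expressed relative to the PPM figure. The short check below reproduces two of the listed percentages from the CTW and PPM columns. The function name is hypothetical, and the interpretation is inferred from the numbers rather than stated on the slide.

```python
def relative_gain(ctw, ppm):
    """Improvement of CTW over PPM, as a percentage of the PPM value."""
    return 100 * (ppm - ctw) / ppm


# Rows taken from the tables above: (input file, CTW, PPM)
rows = [("Book 2", 2.632, 2.876), ("NL", 4.356, 5.014)]
for name, ctw, ppm in rows:
    print(name, round(relative_gain(ctw, ppm), 2))
```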

Page 13: Using CTW as a language  modeler in Dasher

CTW: Model costs

• What are model costs?

Page 14: Using CTW as a language  modeler in Dasher

CTW: Model costs

• Actual model and alphabet size fixed -> optimize weight factor alpha
  – Per tree -> not enough parameters
  – Per node -> not enough adaptivity
  – Optimize alpha per depth of the tree
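To make the role of a depth-dependent alpha concrete, the sketch below computes the description cost of a binary tree model. It assumes the common convention that a node at depth d contributes weight alpha[d] for "stop here" and 1 - alpha[d] for "split further", and that leaves at the maximum depth cost nothing. The slides do not spell out this convention, and the function name is hypothetical.

```python
import math


def model_cost_bits(leaves, max_depth, alpha):
    """Cost in bits of describing a binary tree model.

    leaves: set of leaf contexts (tuples of bits) forming a complete tree.
    alpha[d]: assumed weight for ending the tree at a node of depth d.
    """
    # Every proper prefix of a leaf is an internal node.
    internal = {leaf[:d] for leaf in leaves for d in range(len(leaf))}
    cost = 0.0
    for node in internal:
        cost += -math.log2(1 - alpha[len(node)])
    for leaf in leaves:
        if len(leaf) < max_depth:  # leaves at maximum depth are free
            cost += -math.log2(alpha[len(leaf)])
    return cost


tree = {(0,), (1, 0), (1, 1)}
print(model_cost_bits(tree, 2, [0.5, 0.5]))  # 3.0
print(model_cost_bits(tree, 2, [0.2, 0.5]))
```

With alpha = 0.5 at every depth, each node above the maximum depth costs exactly one bit. Moving alpha away from 0.5 at some depth makes models that split there cheaper and models that stop there more expensive, or the other way round.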

Page 15: Using CTW as a language  modeler in Dasher

CTW: Model costs

• Exclusion: only use Betas of the actual model

• Iterative process
  – Convergent?

• Approximation: to find the actual model, use Alpha = 0.5

Page 16: Using CTW as a language  modeler in Dasher

CTW: Model costs

• Compression of an input sequence
  – Model costs significant, especially for short sequences
  – No decrease by optimizing alpha per depth?

Symbols   Alpha 0.5   Alpha after exclusion   Without model costs
100       5.73        5.21                    4.94
1,000     4.22        4.07                    3.68
10,000    3.12        3.07                    2.77
100,000   2.33        2.32                    2.13
600,000   1.95        1.95                    1.83

Page 17: Using CTW as a language  modeler in Dasher

CTW: Model costs

Symbols   Alpha 0.5   Alpha after exclusion   Max. probability in root   Without model costs
100       0.8437      0.8117                  0.8113                     0.7022
1,000     0.6236      0.6213                  0.6209                     0.5330
10,000    0.3830      0.3792                  0.3794                     0.3276
100,000   0.2661      0.2652                  0.2647                     0.2389
600,000   0.2248      0.2242                  0.2241                     0.2098

• Maximize probability in the root, instead of the probability per depth
  – Exclusion based on alpha = 0.5 almost optimal

Page 18: Using CTW as a language  modeler in Dasher

CTW: Model costs

Results in Dasher scenario:

• Trained model
  – Negative effect if no user text is available

Language   Alpha 0.5   Alpha after exclusion
GB         2.01        2.04
NL         4.34        4.36

• Trained with concatenated user text
  – Small positive effect if user text is added to the training text and is very similar to it

Language   Alpha 0.5   Alpha after exclusion
GB         2.30        2.28
NL         4.12        4.13

Page 19: Using CTW as a language  modeler in Dasher

Conclusions

• New CTW implementation
  – Only integer arithmetic
  – Avoids patented techniques
  – New decomposition tree structure

• Dasher language model based on CTW
  – Predictions 6 percent more accurate than PPM-D

• Decreasing the model costs
  – Only an insignificant decrease possible with our method

Page 20: Using CTW as a language  modeler in Dasher

Future work

• Make CTW suitable for MobileDasher
  – Decrease memory usage
  – Decrease number of computations

• Combine language models
  – Select locally best model, or weight models together

• Combine languages in 1 model
  – Models differ in structure or in parameters?

Page 21: Using CTW as a language  modeler in Dasher


Thank you for your attention

Ask away!

Page 22: Using CTW as a language  modeler in Dasher

CTW: Implementation

• Store the numerators of the block probabilities