Open-Tamil Text Processing Library A. Muthiah, T. Shrinivasan, M. Annamalai 13th Tamil Internet Conference – 2014, Puducherry, India [email protected]

Open-Tamil text processing library

Open-Tamil text processing library 13th INFITT-2014, Puducherry, India

Page 1: Open-Tamil text processing library

Open-Tamil

Text Processing Library

A. Muthiah, T. Shrinivasan, M. Annamalai

13th Tamil Internet Conference – 2014, Puducherry, India

[email protected]

Page 2: Open-Tamil text processing library

INFITT-2014 2

Process Tamil Text

Page 3: Open-Tamil text processing library

Installing

● Python package
  – Python Package installer (pip)
  – https://pypi.python.org/pypi/Open-Tamil/

● GitHub collaboration
  – Open-Tamil core repo
  – https://github.com/arcturusannamalai/open-tamil/

● Social blogs
  – http://ezhillang.wordpress.com/

Page 4: Open-Tamil text processing library

Access

1. Tamil Letters
2. Vowels
3. Consonants

Page 5: Open-Tamil text processing library

Access

• Tamil Letters
• Vowels
• Consonants
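
The transcript does not preserve the library calls shown on this slide. As a rough illustration of what "accessing" vowels and consonants means, here is a standard-library sketch that builds both lists from Unicode code points. It is not Open-Tamil's implementation, and the names `PULLI`, `VOWELS`, `CONSONANT_BASES` and `CONSONANTS` are hypothetical.

```python
# Stdlib-only sketch; these names are illustrative, not Open-Tamil's API.
PULLI = "\u0bcd"  # TAMIL SIGN VIRAMA, marks a pure consonant

# The 12 Tamil vowels, U+0B85..U+0B94 with the unassigned gaps skipped.
VOWELS = [chr(cp) for cp in (
    0x0B85, 0x0B86, 0x0B87, 0x0B88, 0x0B89, 0x0B8A,
    0x0B8E, 0x0B8F, 0x0B90, 0x0B92, 0x0B93, 0x0B94,
)]

# The 18 native consonant bases in traditional order (Grantha letters omitted).
CONSONANT_BASES = [chr(cp) for cp in (
    0x0B95, 0x0B99, 0x0B9A, 0x0B9E, 0x0B9F, 0x0BA3,
    0x0BA4, 0x0BA8, 0x0BAA, 0x0BAE, 0x0BAF, 0x0BB0,
    0x0BB2, 0x0BB5, 0x0BB4, 0x0BB3, 0x0BB1, 0x0BA9,
)]

# A pure consonant is the base letter followed by the pulli.
CONSONANTS = [base + PULLI for base in CONSONANT_BASES]

if __name__ == "__main__":
    print(" ".join(VOWELS))
    print(" ".join(CONSONANTS))
```

Building the lists from code points keeps the example independent of the editor's font and input method.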

Page 6: Open-Tamil text processing library

Calculate Length of words

Try the open-tamil library, which you can install from pip:

$ pip install open-tamil

Here's how to use it:

import tamil
letters_list = tamil.utf8.get_tamil_letters(u"நசநேதாஷமாக")
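
Counting code points with `len()` overstates the length of a Tamil word, because vowel signs and the pulli are separate code points. The sketch below shows the general idea of counting letters instead, using only the standard library. It is an approximation for illustration, not Open-Tamil's own algorithm, and `split_letters` is a hypothetical helper name.

```python
import unicodedata


def split_letters(text):
    """Group each base character with the combining marks that follow it."""
    letters = []
    for ch in text:
        # Tamil vowel signs and the pulli have Unicode category Mc or Mn.
        if letters and unicodedata.category(ch) in ("Mn", "Mc"):
            letters[-1] += ch
        else:
            letters.append(ch)
    return letters


word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"  # the word "Tamil" written in Tamil script

if __name__ == "__main__":
    print(len(word))                 # number of code points: 5
    print(len(split_letters(word)))  # number of letters: 3
```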

Page 7: Open-Tamil text processing library

Sort

Page 8: Open-Tamil text processing library

Word Frequency
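
The slide's own example is not preserved in the transcript. A minimal sketch of word-frequency counting, assuming words are separated by whitespace and using only the standard library, might look like this. The function name `word_frequency` and the sample text are hypothetical, and this is not Open-Tamil's API.

```python
from collections import Counter


def word_frequency(text):
    """Count whitespace-separated words in the text."""
    return Counter(text.split())


# Two sample words, written with escapes so the code points are unambiguous.
tamil_word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"  # "Tamil"
mozhi_word = "\u0bae\u0bca\u0bb4\u0bbf"        # "language"

text = " ".join([tamil_word, mozhi_word, tamil_word])
freq = word_frequency(text)

if __name__ == "__main__":
    for word, count in freq.most_common():
        print(word, count)
```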

Page 9: Open-Tamil text processing library

Text-to-IPA

Page 10: Open-Tamil text processing library

Text-to-IPA

Page 11: Open-Tamil text processing library

Font Conversion

Page 12: Open-Tamil text processing library

Font Conversion Types

1. anjal
2. bamini
3. boomi
4. dinakaran
5. dinamani
6. dinathanthy
7. kavipriya
8. murasoli
9. mylai
10. nakkeeran
11. roman
12. tab
13. tam
14. tscii
15. pallavar
16. indoweb
17. koeln
18. libi
19. oldvikatan
20. webulagam
21. Diacritic
22. Shreelipi
23. Softview
24. Tace
25. Vanavil

Page 13: Open-Tamil text processing library

To unicode

Page 14: Open-Tamil text processing library

ngram
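
The slide content for n-grams is not preserved in the transcript. As an illustration of the concept only, the following standard-library sketch forms letter-level n-grams of a Tamil word, treating a base character plus its vowel sign or pulli as one letter. The helper names `split_letters` and `letter_ngrams` are hypothetical and this is not Open-Tamil's implementation.

```python
import unicodedata


def split_letters(text):
    """Group each base character with the combining marks that follow it."""
    letters = []
    for ch in text:
        if letters and unicodedata.category(ch) in ("Mn", "Mc"):
            letters[-1] += ch
        else:
            letters.append(ch)
    return letters


def letter_ngrams(word, n=2):
    """Return every run of n consecutive letters in the word."""
    letters = split_letters(word)
    return ["".join(letters[i:i + n]) for i in range(len(letters) - n + 1)]


word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"  # three letters

if __name__ == "__main__":
    print(letter_ngrams(word, 2))
```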

Page 15: Open-Tamil text processing library

Transliterate

Page 16: Open-Tamil text processing library

Reverse Words
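
Reversing a Tamil word code point by code point detaches vowel signs from their consonants, so the reversal has to work on whole letters. The transcript does not show how the library does this. The sketch below is a standard-library approximation of the idea, with hypothetical helper names `split_letters` and `reverse_letters`.

```python
import unicodedata


def split_letters(text):
    """Group each base character with the combining marks that follow it."""
    letters = []
    for ch in text:
        if letters and unicodedata.category(ch) in ("Mn", "Mc"):
            letters[-1] += ch
        else:
            letters.append(ch)
    return letters


def reverse_letters(word):
    """Reverse a word letter by letter, keeping marks on their base."""
    return "".join(reversed(split_letters(word)))


word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

if __name__ == "__main__":
    print(reverse_letters(word))  # letters reversed, marks still attached
    print(word[::-1])             # naive reversal, marks come before their base
```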

Page 17: Open-Tamil text processing library

On Screen Keyboard for Tamil 99

➔ jQuery
➔ jQuery UI based
➔ Free to use on web
➔ e.g. Www.Urbantamil.com

Page 18: Open-Tamil text processing library

Open-source

➔ Multi-licensed
  ➔ MIT, and other OSS

➔ Multi-language
  ➔ C, Python, JavaScript, C

Page 19: Open-Tamil text processing library

Current Users

1. Websites:
   1. Ezhil Language
   2. UrbanTamil

2. Installs on Python
   1. 1000+ downloads on PIP

Page 20: Open-Tamil text processing library

Contributors

Muthu, Shrini, Sathi and Arulalan.

Page 21: Open-Tamil text processing library

Demo

Page 22: Open-Tamil text processing library

Thanks

Contributors, Comments welcome!
Thanks for your attention.