
hara-san's research


Associating Gaze Information with Human Reading Strategies

Gaze behavior

NLP technologies

Reading strategies

Text optimization

Figure: gaze trace over the text fragment "attractions threatening their very existence?", showing fixations (●), the saccades between them, and one word marked as skipped.

・Clues: word surface, POS, word length, frequency, etc.
・Prediction with 0.95 similarity to observed data (for the distribution across readers)
・Regardless of individuality / instability: a general reading strategy

Figure: CRF input and labels for the sentence "The man will have to".

Features of input sequence:

Position             -2      -1      0       1       2
Word                 The     man     will    have    to
POS                  DT      NN      MD      VB      TO
Length               0.07    0.07    0.06    0.06    0.08
Trigram surprisal    -       -       0.38    0.48    0.61
Screen position      (2,3)   (2,3)   (3,3)   (3,3)   (3,3)

Label: fixation / skip (shown together with the previous label)
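The windowed input shown for "The man will have to" can be sketched in code as follows. This is only an illustration under assumptions: the feature names, the padding marker, and the `window_features` helper are hypothetical, and the poster does not say how the features were actually encoded. The values are copied from the example, with `None` standing for the "-" surprisal entries.

```python
from typing import Dict, List

# Values taken from the example sentence; None marks the "-" entries
# where no trigram surprisal is given.
TOKENS: List[dict] = [
    {"word": "The",  "pos": "DT", "length": 0.07, "surprisal": None, "screen": (2, 3)},
    {"word": "man",  "pos": "NN", "length": 0.07, "surprisal": None, "screen": (2, 3)},
    {"word": "will", "pos": "MD", "length": 0.06, "surprisal": 0.38, "screen": (3, 3)},
    {"word": "have", "pos": "VB", "length": 0.06, "surprisal": 0.48, "screen": (3, 3)},
    {"word": "to",   "pos": "TO", "length": 0.08, "surprisal": 0.61, "screen": (3, 3)},
]


def window_features(tokens: List[dict], i: int, window: int = 2) -> Dict[str, object]:
    """Collect features for positions -window..+window around token i.

    Hypothetical encoding: each key is "<relative position>:<feature name>".
    """
    feats: Dict[str, object] = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if j < 0 or j >= len(tokens):
            # Position falls outside the sentence: emit a padding marker.
            feats[f"{offset}:pad"] = True
            continue
        tok = tokens[j]
        feats[f"{offset}:word"] = tok["word"].lower()
        feats[f"{offset}:pos"] = tok["pos"]
        feats[f"{offset}:length"] = tok["length"]
        if tok["surprisal"] is not None:
            feats[f"{offset}:surprisal"] = tok["surprisal"]
        feats[f"{offset}:screen"] = "%d,%d" % tok["screen"]
    return feats


# One feature dict per word; a sequence labeler would pair each dict
# with a fixation/skip label.
features = [window_features(TOKENS, i) for i in range(len(TOKENS))]
```

The previous label is deliberately absent from these dicts: in a linear-chain CRF the dependency on the preceding label is normally modeled by the transition between adjacent labels rather than by an input feature.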

Prediction of word fixations/skips by readers

Optimization of comma-placement

・For smoothing human reading

Figure: comma-placement pipeline.
Input: (comma-less) text
Components: CRF model-based comma predictor + rule-based comma filter
Labels in the diagram: linguistic features, gaze features, human annotation, CRF model
Output: comma distribution for readability
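One plausible reading of the comma-placement diagram is a two-stage pipeline: a predictor proposes comma positions for comma-less text, and a rule-based filter then discards some of them. The sketch below assumes that ordering. Everything in it is hypothetical: `toy_predictor` is a fixed stand-in for the CRF model-based predictor, and the minimum-gap rule in `min_gap_filter` is an invented example, since the poster does not list the actual rules.

```python
from typing import Callable, List, Sequence

Predictor = Callable[[Sequence[str]], List[int]]
FilterRule = Callable[[Sequence[str], int, List[int]], bool]


def place_commas(tokens: Sequence[str], predict: Predictor, keep: FilterRule) -> List[int]:
    """Return indices i such that a comma is placed after tokens[i]."""
    accepted: List[int] = []
    for i in sorted(predict(tokens)):
        # The filter sees the candidate and the commas accepted so far.
        if keep(tokens, i, accepted):
            accepted.append(i)
    return accepted


def render(tokens: Sequence[str], commas: List[int]) -> str:
    """Join tokens into text, attaching a comma after each accepted index."""
    marked = set(commas)
    return " ".join(t + "," if i in marked else t for i, t in enumerate(tokens))


def toy_predictor(tokens: Sequence[str]) -> List[int]:
    # Stand-in for a trained model: fixed candidates for the demo sentence.
    return [3, 4]


def min_gap_filter(tokens: Sequence[str], i: int, accepted: List[int], gap: int = 2) -> bool:
    # Hypothetical rule: no comma after the last token, and no comma
    # fewer than `gap` tokens after the previously accepted comma.
    if i >= len(tokens) - 1:
        return False
    return not accepted or i - accepted[-1] >= gap


tokens = "When the rain stops the man will have to leave".split()
commas = place_commas(tokens, toy_predictor, min_gap_filter)
text = render(tokens, commas)
```

Here the candidate after "stops" is kept and the one immediately following it is rejected by the gap rule, so `text` is "When the rain stops, the man will have to leave".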