DMML Fel p(5h54 0=0 - Chennai Mathematical Institute · 2020. 2. 20. · DMML, 20 Fel 2020 5h57:...

Preview:

Citation preview

DMML , 20 Fel 2020 5h57 : p(5h54 0=0.607 =p ,PCSHSTI 0--0-50) = pz

P ,Epa#Pitpz

-

0.450.85

Bottleneck if supervised learning - need for beetledtraining data

Volume t Timeliness

Bootstrapping - Semi Supervised learning

i. EM - Expectation MaximizationText classification - Naive Bayes

P (wi Icj)P (dilcj)

P(wife;) ⇐Occ - f wi in cj

Nik = # of occurrences- if wi in dkAll words in Cj O if die#Cj

=E Nik

1 ' fpedueg

dues. Instead den nuepldulei)-

-

E E new qq.eu E neupcduki)weEV duECj du

Semi Supervised leaning

Supervised case - compute Plwilcj) , Pldukj )

Gwen a new d

compute Pki Id) for all coF- Fractional assignments

arg Yax P (Cild) of categories thatwe discard after

ang mate

Semi Supernallabel

, say ,10% of documents

"

randomlyselected

"

tName Bayes model

Ifractional topic allocation per document

classI Now p (dulcie) makes sense

?

New estimates if Plwilcj ), P (dug)

Anther example-

MN1ST digit imagest-

training @o) test data

datalabel only 50

Cluster remain 950 using these50 as centroids

- labels are inherited from centroid

Use distance from cenhnd for logistic regression

50 labelled point IMb→ tooo labelled pawsqooo

I 1Modal Model

I ↳emeralds '' AttawayTest data

accuracy"

nearly"

point

At Nearest 20% Az

pts in each eludes

A, C Azz E Az

↳ Model Az