1
Polynomial Time Probabilistic Learning of a Subclass of Linear Languages
with Queries
Yasuhiro TAJIMA, Yoshiyuki KOTANI
Tokyo Univ. of Agri. & Tech.
2
This talk…
• Probabilistic learning algorithm of a subclass of linear languages with membership queries
• Learning via queries + special examples → probabilistic learning, using translation algorithms:
– representative sample → random examples
– equivalence query → random examples
3
Motivations
A simple deterministic grammar (SDG) has at most one rule $A \to a\beta$ for every pair of $A \in N$ and $a \in \Sigma$
⇒ learning algorithm for SDG from
• membership queries
• a representative sample
⇒ for linear languages
[Figure: inclusion diagram of the language classes CFLs, SDLs, Linear, Regular]
(Tajima et al. 2004)
4
Linear grammar
A context-free grammar is a linear grammar if every rule is of the form
$A \to aBc$ or $A \to a$
($A, B$ : nonterminal; $a, c$ : terminal)
Any linear grammar can be written in RL-linear form, s.t. every rule is of the form
$A \to aB$, $A \to Bc$, or $A \to a$
and
$A \to aB,\ A \to aC \Rightarrow B = C$,
$A \to Ba,\ A \to Ca \Rightarrow B = C$
5
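Strict-deterministic linear grammar
An RL-linear grammar is strict-deterministic linear (Strict-det) if, for any pair of rules $A \to u$, $A \to v$ ($|u| = |v| = 2$),
$u = aB,\ v = cD$ or $u = Ba,\ v = Dc$ for some $a, B, c, D$
(that is, $A$ has only left linear rules, or only right linear rules)
Ex) $L = \{a^i b^i\} \cup \{a^i c^i\}$ ($i \ge 1$)
$S \to aA$,
$A \to Bb$, $A \to Cc$, $A \to b$, $A \to c$,
$B \to aD$, $D \to Bb$, $D \to b$,
$C \to aE$, $E \to Cc$, $E \to c$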
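The strict-det condition can be checked mechanically. The following is a minimal sketch, assuming the example grammar reads S → aA, A → Bb | Cc | b | c, B → aD, D → Bb | b, C → aE, E → Cc | c, with uppercase letters as nonterminals and lowercase letters as terminals. The names `RULES`, `shape`, `is_strict_det` and `generate` are illustrative and do not come from the talk; the determinism test (same terminal, same nonterminal) is the assumed RL-linear side condition.

```python
# Hypothetical encoding of the example grammar (an assumed reading of the slide).
RULES = {
    "S": ["aA"],
    "A": ["Bb", "Cc", "b", "c"],
    "B": ["aD"],
    "D": ["Bb", "b"],
    "C": ["aE"],
    "E": ["Cc", "c"],
}


def shape(rhs):
    """Classify a rule body as 'right' (aB), 'left' (Ba) or 'term' (a)."""
    if len(rhs) == 1 and rhs.islower():
        return "term"
    if len(rhs) == 2 and rhs[0].islower() and rhs[1].isupper():
        return "right"
    if len(rhs) == 2 and rhs[0].isupper() and rhs[1].islower():
        return "left"
    raise ValueError(f"not an RL-linear rule body: {rhs!r}")


def is_strict_det(rules):
    """True if no nonterminal mixes left and right linear rules, and a
    (side, terminal) pair never leads to two different nonterminals."""
    for bodies in rules.values():
        sides = {shape(b) for b in bodies} - {"term"}
        if len(sides) > 1:
            return False
        successor = {}
        for b in bodies:
            kind = shape(b)
            if kind == "term":
                continue
            terminal, nonterminal = (b[0], b[1]) if kind == "right" else (b[1], b[0])
            if successor.setdefault(terminal, nonterminal) != nonterminal:
                return False
    return True


def generate(rules, start="S", max_len=6):
    """Enumerate every terminal string of length <= max_len derivable from start."""
    found = set()
    stack = [("", start, "")]  # sentential form: prefix, nonterminal, suffix
    while stack:
        pre, nt, suf = stack.pop()
        for body in rules[nt]:
            kind = shape(body)
            if kind == "term":
                if len(pre) + len(suf) + 1 <= max_len:
                    found.add(pre + body + suf)
            elif len(pre) + len(suf) + 2 <= max_len:
                # A length-2 rule adds one terminal and keeps one nonterminal,
                # which will contribute at least one more terminal.
                if kind == "right":
                    stack.append((pre + body[0], body[1], suf))
                else:
                    stack.append((pre, body[0], body[1] + suf))
    return found
```

Under these assumptions, `generate(RULES)` lists the short members of the example language, which is a quick way to compare a reconstructed grammar against the stated $L$.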
6
Deterministic linear grammar
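A linear grammar is deterministic linear (DL)
if every rule is of the form
$A \to aBu$ or $A \to \varepsilon$
($u, v \in \Sigma^*$, $A, B \in N$, $a \in \Sigma$)
and
$A \to aBu,\ A \to aCv \Rightarrow B = C,\ u = v$
Theorem : Strict-det [relation symbol not recoverable] DL
Theorem (de la Higuera, Oncina 2002) : DL is identifiable in the limit from polynomial time and data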
7
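MAT learning (Angluin 1987)
[Figure: the learner, holding a hypothesis $G_h$ with $L_h = L(G_h)$, exchanges queries with the teacher, who knows the target language $L_t$]
• membership query: "$w \in L_t$?" → yes or no, i.e. $(w, \{0, 1\})$
• equivalence query: hypothesis $G_h$, "$L_t = L(G_h)$?" → yes, or a counterexample $w \in L(G_t) \triangle L(G_h)$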
8
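PAC learning (Valiant 1984)
PAC : Probably Approximately Correct
$D$ : probability distribution on $\Sigma^*$
target concept $L_t$
examples: $w \in L_t$ (positive) or $u \in \Sigma^* \setminus L_t$ (negative)
learning algorithm → hypothesis $L_h$
$L_h$ is PAC if $\Pr(P(L_t \triangle L_h) \le \varepsilon) \ge 1 - \delta$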
9
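Equivalence query ⇒ PAC learning algorithm (Angluin [1987])
If the $i$-th hypothesis $G_i$ is consistent with
$n_i = \frac{1}{\varepsilon}\left(\ln\frac{1}{\delta} + (\ln 2)(i+1)\right)$
examples, then it is PAC
[Figure: hypotheses $G_i$, $G_{i+1}$, ... tried in turn; a hypothesis that is not consistent with its $n_i$ examples (×) is replaced by the next one, which is checked against $n_j$ examples]
If there is a consistent hypothesis ⇒ PAC learnable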
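The sample size per simulated equivalence query is simple arithmetic. A minimal sketch, assuming the formula $n_i = \frac{1}{\varepsilon}(\ln\frac{1}{\delta} + (\ln 2)(i+1))$ and rounding up to an integer (the rounding and the function name `examples_for_query` are assumptions, not taken from the slide):

```python
import math


def examples_for_query(i, epsilon, delta):
    """Number of random examples standing in for the i-th equivalence query.

    With n_i examples, a hypothesis whose error exceeds epsilon survives with
    probability at most (1 - epsilon)^n_i <= delta / 2^(i+1); summing over
    i = 0, 1, 2, ... keeps the total failure probability below delta.
    """
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((math.log(1 / delta) + math.log(2) * (i + 1)) / epsilon)
```

The count grows only linearly in the query index $i$, so a polynomial number of equivalence queries translates into a polynomial number of examples.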
10
Probabilistic learning with queries
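[Figure: the learning algorithm draws examples $(w, \{0, 1\})$ from an example oracle, where $w \in \Sigma^*$ occurs with probability $\Pr(w)$ under the distribution $D$ on $\Sigma^*$; it asks membership queries about the target language $L_t$ (answer: yes or no) and outputs a hypothesis $G_h$]
$\Pr(P(L_t \triangle L(G_h)) \le \varepsilon) \ge 1 - \delta$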
11
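Representative sample for a Strict-det
$G = (N, \Sigma, P, S)$ : Strict-det
$Q \subseteq L(G)$ : representative sample (RS)
⇔ $\forall (A \to \alpha) \in P,\ \exists w \in Q$ s.t.
$S \Rightarrow^* xA\gamma \Rightarrow x\alpha\gamma \Rightarrow^* w$
for some $x \in \Sigma^*$, $\gamma \in (N \cup \Sigma)^*$
All rules are used to generate Q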
12
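Example :
$G = (N = \{S, A, B, C, D, E\},\ \Sigma = \{a, b, c\},\ P,\ S)$
$P = \{ S \to aA,$
$A \to Bb,\ A \to Cc,\ A \to b,\ A \to c,$
$B \to aD,\ D \to Bb,\ D \to b,$
$C \to aE,\ E \to Cc,\ E \to c \}$
then
$Q = \{ab, ac, aaabbb, aaaccc\}$ is a representative sample (RS)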
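Whether a finite set is a representative sample can be tested by collecting the rules used to derive each of its strings. A minimal sketch, assuming the example grammar reads S → aA, A → Bb | Cc | b | c, B → aD, D → Bb | b, C → aE, E → Cc | c and Q = {ab, ac, aaabbb, aaaccc}; the helper names are illustrative. It follows only the first derivation it finds for each string, which is sufficient here but would need to explore all derivations for an ambiguous grammar.

```python
# Assumed reading of the example grammar: (left-hand side, right-hand side) pairs.
RULES = [
    ("S", "aA"),
    ("A", "Bb"), ("A", "Cc"), ("A", "b"), ("A", "c"),
    ("B", "aD"), ("D", "Bb"), ("D", "b"),
    ("C", "aE"), ("E", "Cc"), ("E", "c"),
]


def derivation(rules, nonterminal, s):
    """Return the rules used in one derivation nonterminal =>* s, or None."""
    for lhs, rhs in rules:
        if lhs != nonterminal:
            continue
        if len(rhs) == 1:
            # Terminal rule A -> a
            if s == rhs:
                return [(lhs, rhs)]
        elif rhs[0].islower():
            # Right linear rule A -> aB: strip the first letter
            if len(s) >= 2 and s[0] == rhs[0]:
                rest = derivation(rules, rhs[1], s[1:])
                if rest is not None:
                    return [(lhs, rhs)] + rest
        else:
            # Left linear rule A -> Ba: strip the last letter
            if len(s) >= 2 and s[-1] == rhs[1]:
                rest = derivation(rules, rhs[0], s[:-1])
                if rest is not None:
                    return [(lhs, rhs)] + rest
    return None


def is_representative_sample(rules, start, sample):
    """True if every string is in L(G) and together they use every rule."""
    used = set()
    for w in sample:
        steps = derivation(rules, start, w)
        if steps is None:
            return False  # w is not generated by the grammar
        used.update(steps)
    return used == set(rules)
```

Dropping the two long strings from Q leaves the recursive rules D → Bb and E → Cc unused, so the shorter set is not an RS for this grammar.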
13
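Rule occurring probability
$G_t$ : a target grammar
$D$ : a probability distribution on $\Sigma^*$ for an example $(w, \{0, 1\})$, where $w$ is drawn with probability $\Pr(w)$
$\varepsilon$ : error parameter
$\delta$ : confidence parameter
$|P_t|$ : the size of the target grammar's rules
For every rule $A \to \alpha$, define
$Z(A \to \alpha) = \{ w \in \Sigma^* \mid S \Rightarrow^*_{G_t} \alpha_1 A \alpha_2 \Rightarrow_{G_t} \alpha_1 \alpha \alpha_2 \Rightarrow^*_{G_t} w \text{ for some } \alpha_1, \alpha_2 \in (N \cup \Sigma)^* \}$
14
$\Pr(A \to \alpha) = \sum_{w \in Z(A \to \alpha)} \Pr(w)$
is the rule occurring probability, i.e. the probability that $A \to \alpha$ appears in the derivation of an example $(w, \{0, 1\})$ drawn from $D$ on $\Sigma^*$
$\Pr(A \to \alpha)$ is the probability that
• $w \in L(G_t)$, and
• $A \to \alpha$ is used in the derivation $S \Rightarrow^*_{G_t} w$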
15
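Let $d = \min\{\Pr(A \to \alpha) \mid (A \to \alpha) \in P_t\}$
Suppose $m \ge \frac{1}{d} \log \frac{|P_t|}{\delta}$
The set of $m$ examples contains a set of RS with probability at least $1 - \delta$
Proof: "Some rule does not appear in the derivations of the $m$ examples" occurs with probability at most
$|P_t| (1 - d)^m \le |P_t| e^{-dm} \le \delta$
[Figure: $m$ examples drawn from $\Sigma^*$ according to $D$, containing an RS]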
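The proof step can be written out as a union bound. This is a sketch under two assumptions that the slide does not state explicitly: the $m$ examples are drawn independently from $D$, and $\log$ denotes the natural logarithm.

```latex
\begin{align*}
\Pr[\text{some rule of } P_t \text{ is used by none of the } m \text{ examples}]
  &\le \sum_{(A \to \alpha) \in P_t} \bigl(1 - \Pr(A \to \alpha)\bigr)^m \\
  &\le |P_t| \, (1 - d)^m \;\le\; |P_t| \, e^{-dm}, \\
|P_t| \, e^{-dm} \le \delta
  &\iff m \ge \frac{1}{d} \ln \frac{|P_t|}{\delta}.
\end{align*}
```

The first inequality sums the failure probability of each rule, the second uses $\Pr(A \to \alpha) \ge d$ for every rule, and the third uses $1 - d \le e^{-d}$.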
16
We can conclude that
1. Equivalence query can be replaced by
$n_i = \frac{1}{\varepsilon}\left(\ln\frac{1}{\delta} + (\ln 2)(i+1)\right)$ random examples
2. Representative sample can be replaced by
$m \ge \frac{1}{d} \log \frac{|P_t|}{\delta}$ random examples
17
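[Figure: the probabilistic learning algorithm with queries wraps the query learning algorithm. Membership queries are passed to the membership oracle and the query response is returned. The representative sample is replaced by $m$ random examples from the example oracle, and each equivalence query is replaced by a consistency check against $n$ random examples (positive and negative).]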
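The replacement of an equivalence query by a consistency check can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the hypothesis is modeled as a membership predicate, the example oracle as a function returning a labeled pair, and the sample size uses the rounded-up formula for $n_i$. The toy target `in_target` ($a^i b^i$ or $a^i c^i$, $i \ge 1$) and all function names are hypothetical.

```python
import math
import random


def in_target(w):
    """Toy target language: a^i b^i or a^i c^i with i >= 1 (assumed example)."""
    n = len(w) // 2
    return n >= 1 and len(w) == 2 * n and w[:n] == "a" * n and w[n:] in ("b" * n, "c" * n)


def make_example_oracle(pool, rng):
    """Example oracle drawing uniformly from a finite pool (a stand-in for D)."""
    def draw():
        w = rng.choice(pool)
        return w, in_target(w)
    return draw


def sampled_equivalence_query(hypothesis, draw_example, i, epsilon, delta):
    """Stand-in for the i-th equivalence query.

    Draws n_i labeled examples and checks the hypothesis against each.
    Returns None when all examples are consistent ("yes"), otherwise the
    first misclassified string, which serves as the counterexample.
    """
    n_i = math.ceil((math.log(1 / delta) + math.log(2) * (i + 1)) / epsilon)
    for _ in range(n_i):
        w, label = draw_example()
        if hypothesis(w) != label:
            return w
    return None
```

A call such as `sampled_equivalence_query(h, make_example_oracle(pool, random.Random()), i, 0.1, 0.05)` would then take the place of asking the teacher; a returned string is handled exactly like a teacher's counterexample.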
18
Learning algorithm via queries and RS
$M_h = \{(u, v, w) \mid uvw \in RS\}$
$T$ : a set of strings over $\Sigma$
while (finish == 0) begin
    make nonterminals $N_h$ from $M_h$, $T$
    make rules $P_h$ and hypothesis $G_h$
    if (equivalence query for $G_h$ responds "yes")
        output $G_h$, finish = 1
    else
        update $T$ by the counterexample $w$
end
19
Making nonterminals
$M_h = \{(u, v, w) \mid uvw \in RS\}$
$(u, v, w) \equiv_T (x, y, z) \iff MEM(uvw) = MEM(xyz)$
then
$N_h = M_h / \equiv_T$
$A_T(u, v, w)$ : a nonterminal = the equivalence class that contains $(u, v, w)$
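Grouping triples into equivalence classes can be sketched with membership queries. The sketch below rests on assumptions that the slide does not confirm: two triples are treated as equivalent when, for every test string $t \in T$, the membership answers for $u\,t\,w$ agree; the middle part $v$ is required to be non-empty; and the toy oracle `in_target` ($a^i b^i$ or $a^i c^i$) stands in for MEM. All names are hypothetical.

```python
def in_target(w):
    """Toy membership oracle: a^i b^i or a^i c^i with i >= 1 (assumed example)."""
    n = len(w) // 2
    return n >= 1 and len(w) == 2 * n and w[:n] == "a" * n and w[n:] in ("b" * n, "c" * n)


def split_triples(sample):
    """All (u, v, w) with u + v + w in the sample and v non-empty."""
    triples = set()
    for s in sample:
        for i in range(len(s)):
            for j in range(i + 1, len(s) + 1):
                triples.add((s[:i], s[i:j], s[j:]))
    return sorted(triples)


def make_nonterminals(triples, tests, member):
    """Partition triples by their membership signature over the test strings.

    Assumed relation: (u, v, w) ~ (x, y, z) iff member(u + t + w) equals
    member(x + t + z) for every t in tests.
    """
    classes = {}
    for u, v, w in triples:
        signature = tuple(member(u + t + w) for t in tests)
        classes.setdefault(signature, []).append((u, v, w))
    return list(classes.values())
```

With the sample {ab, ac} and tests {a, b, ab}, the contexts ("", "b") and ("", "c") fall into the same class; adding the test string aab separates them, which illustrates how enlarging $T$ after a counterexample refines the set of nonterminals.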
20
Making rules
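Make all rules as follows, except for those not consistent with query results
$P_{CFG} = \{ A_T(u, avb, w) \to a\,A_T(ua, vb, w),$
$A_T(u, avb, w) \to A_T(u, av, bw)\,b,$
$A_T(u, a, w) \to a \}$
Select a hypothesis $P_h \subseteq P_{CFG}$ randomly
$G_h = (N_h, \Sigma, P_h, A_T(\varepsilon, w, \varepsilon))$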
21
Exact learning of strict-det
• Strict-det is polynomial time exact learnable via
– membership queries, and
– a representative sample (RS)
c.f. [Angluin(1980)] for regular sets
The learning algorithm overview:
[Figure: RS → possible rules → a set of Strict-det grammars (SD; not bounded by a polynomial) → choose one randomly and ask an equivalence query → witnesses delete incorrect rules → the correct hypothesis]
22
Conclusions
• Strict-det linear languages are probabilistically learnable with queries in polynomial time
Future works
• Identification from polynomial time and data (teachability)
• RS → Correction queries
23
24
Theorem
Strict-det linear languages are
polynomial time probabilistic learnable with membership queries
25
Simple Deterministic Languages
• A context-free grammar (CFG) $G = (N, \Sigma, P, S)$ in 2-standard Greibach normal form is a Simple Deterministic Grammar (SDG) iff $A \to a\beta$ ($\beta \in N^*$, $|\beta| \le 2$) is unique for every $A \in N$ and $a \in \Sigma$
• A Simple Deterministic Language (SDL) is the language generated by an SDG
26
Representative sample for an SDG
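$G = (N, \Sigma, P, S)$ : SDG
$Q \subseteq L(G)$ : representative sample (RS)
⇔ $\forall (A \to a\beta) \in P,\ \exists w \in Q$ s.t.
$S \Rightarrow^* xA\gamma \Rightarrow xa\beta\gamma \Rightarrow^* w$
for some $x \in \Sigma^*$, $\gamma \in N^*$
All rules are used to generate Q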
27
Example :
$G = (N = \{S, A, B, C\},\ \Sigma = \{a, b, c\},\ P,\ S)$
$P = \{ S \to aA,\ S \to cC,$
$A \to aAB,\ A \to b,$
$B \to b,\ C \to cCB,\ C \to b \}$
then
$Q = \{ab, aabb, ccbb\}$ is a representative sample (RS)
28
PAC learning
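(Valiant 1984)
Target language : $L_t$
Hypothesis language : $L(G_h)$
Probability distribution : $P$ on $\Sigma^*$
A PAC learning algorithm outputs $G_h$ such that
$\Pr(P(L_t \triangle L(G_h)) \le \varepsilon) \ge 1 - \delta$
where
$P(L_t \triangle L(G_h)) = \sum_{w \in L_t \triangle L(G_h)} P(w)$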
29
Query learning of SDLs
• SDLs are polynomial time learnable via membership queries and a representative sample
[Figure: the learner receives a representative sample at the beginning, asks the teacher membership queries "$w \in L_t$?" (answer: yes / no), and outputs $G_h$; the teacher knows the target language $L_t = L(G_t)$]
representative sample : a special finite subset of $L_t$
(Tajima 2000)
30
Learning model
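[Figure: the learner receives a representative sample at the beginning, asks the teacher membership queries "$w \in L_t$?" (answer: yes / no), and outputs $G_h$; the teacher knows the target language $L_t = L(G_t)$]
representative sample : a special finite subset of $L_t$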