Binary Classification, Part 4
AdaBoost [Freund & Schapire, 1997]

$\mathcal{G}$: base classifiers ("weak learners") $g: \mathcal{X} \to \{\pm 1\}$.
Data: $(X_1, Y_1), \ldots, (X_n, Y_n)$ i.i.d. on $\mathcal{X} \times \{\pm 1\}$.

Goal: return $f_n \in \mathrm{conv}(\mathcal{G})$ and use $\mathrm{sgn}(f_n(x))$ to classify, with
$$f_n(x) = \frac{\sum_{k=1}^K \lambda_k g_k(x)}{\sum_{k=1}^K \lambda_k},$$
where $K$ is a fixed number of iterations, and $\lambda_k \ge 0$ and $g_k \in \mathcal{G}$ are determined iteratively from the data.
Note: can equivalently use $\mathrm{sgn}\big(\sum_k \lambda_k g_k(x)\big)$ to classify.
AdaBoost
init $w^{(1)} = (w^{(1)}_1, \ldots, w^{(1)}_n)$ with $w^{(1)}_i = 1/n$
for $k = 1, \ldots, K$ do:
    $g_k = \arg\min_{g \in \mathcal{G}} \epsilon_k(g)$, where $\epsilon_k(g) = \sum_i w^{(k)}_i \mathbb{1}\{Y_i \ne g(X_i)\}$
    $\epsilon_k = \epsilon_k(g_k)$
    // standing assumption: $\epsilon_k < 1/2$
    $\lambda_k = \frac{1}{2} \log \frac{1 - \epsilon_k}{\epsilon_k}$; since $\epsilon_k < 1/2$, $\frac{1-\epsilon_k}{\epsilon_k} > 1$, so $\lambda_k > 0$
    $w^{(k)} \to w^{(k+1)}$ update:
        $w^{(k+1)}_i = \frac{w^{(k)}_i\, e^{-\lambda_k Y_i g_k(X_i)}}{Z_k}$, where $Z_k$ = normalization
        ($e^{-\lambda_k Y_i g_k(X_i)} = e^{-\lambda_k}$ if $Y_i = g_k(X_i)$; $= e^{\lambda_k}$ if $Y_i \ne g_k(X_i)$)
return $f_n(x) = \frac{\sum_{k=1}^K \lambda_k g_k(x)}{\sum_{k=1}^K \lambda_k}$
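A minimal Python sketch of this loop, assuming decision stumps as the base class $\mathcal{G}$ and NumPy-array data (the stump learner, the data format, and the extra returned quantities are illustrative assumptions, not part of the notes):

import numpy as np

def best_stump(X, y, w):
    # Exhaustive search over stumps g(x) = sign * (+1 if x_j > thr else -1),
    # minimizing the weighted error eps_k(g) = sum_i w_i 1{y_i != g(x_i)}.
    best_err, best_params = np.inf, None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] > thr, 1, -1)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best_params = err, (j, thr, sign)
    return best_err, best_params

def adaboost(X, y, K):
    n = len(y)
    w = np.full(n, 1.0 / n)                          # w^(1)_i = 1/n
    stumps, lambdas, eps_list = [], [], []
    for _ in range(K):
        eps, (j, thr, sign) = best_stump(X, y, w)    # g_k = argmin_g eps_k(g)
        eps = float(np.clip(eps, 1e-12, 0.5 - 1e-12))  # keep lambda_k finite and > 0
        lam = 0.5 * np.log((1 - eps) / eps)          # lambda_k = (1/2) log((1-eps_k)/eps_k)
        pred = sign * np.where(X[:, j] > thr, 1, -1)
        w = w * np.exp(-lam * y * pred)              # up-weight the points g_k got wrong
        w = w / w.sum()                              # divide by the normalizer Z_k
        stumps.append((j, thr, sign)); lambdas.append(lam); eps_list.append(eps)
    def f_n(Xnew):
        # f_n(x) = sum_k lambda_k g_k(x) / sum_k lambda_k  (a point of conv(G))
        out = sum(lam * s * np.where(Xnew[:, j] > thr, 1, -1)
                  for lam, (j, thr, s) in zip(lambdas, stumps))
        return out / sum(lambdas)
    return f_n, lambdas, eps_list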
Preview: $\hat{L}_n(f_n) \le \prod_{k=1}^K 2\sqrt{\epsilon_k(1-\epsilon_k)}$ (empirical error of the boosted classifier).
Key analysis tool: $\phi_\gamma(u) = \min\big(1, (1 - u/\gamma)_+\big)$, which is $\frac{1}{\gamma}$-Lipschitz, satisfies $0 \le \phi_\gamma \le 1$, and $\mathbb{1}\{u \le 0\} \le \phi_\gamma(u) \le \mathbb{1}\{u \le \gamma\}$.
[sketch of the ramp $\phi_\gamma$: equal to 1 for $u \le 0$, to 0 for $u \ge \gamma$, linear in between]
(The margin-based result used below is due to Koltchinskii & Panchenko, 2002.)
Lemma: $\frac{1}{n}\sum_{i=1}^n \exp\Big(-Y_i \sum_{k=1}^K \lambda_k g_k(X_i)\Big) = \prod_{k=1}^K Z_k = \prod_{k=1}^K 2\sqrt{\epsilon_k(1-\epsilon_k)}$.

Proof: by the weight update, $e^{-\lambda_k Y_i g_k(X_i)} = \frac{w^{(k+1)}_i Z_k}{w^{(k)}_i}$, so
$$\exp\Big(-Y_i \sum_k \lambda_k g_k(X_i)\Big) = \prod_{k=1}^K e^{-\lambda_k Y_i g_k(X_i)} = \prod_{k=1}^K \frac{w^{(k+1)}_i Z_k}{w^{(k)}_i} = \frac{w^{(K+1)}_i}{w^{(1)}_i} \prod_{k=1}^K Z_k = n\, w^{(K+1)}_i \prod_{k=1}^K Z_k.$$
Averaging over $i$ and using $\sum_i w^{(K+1)}_i = 1$:
$$\frac{1}{n}\sum_{i=1}^n \exp\Big(-Y_i \sum_k \lambda_k g_k(X_i)\Big) = \Big(\sum_i w^{(K+1)}_i\Big) \prod_{k=1}^K Z_k = \prod_{k=1}^K Z_k,$$
where $Z_k = \epsilon_k e^{\lambda_k} + (1-\epsilon_k) e^{-\lambda_k} = 2\sqrt{\epsilon_k(1-\epsilon_k)}$ (can be shown from the definitions). $\square$
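Spelling out the "from the definitions" step, with $\lambda_k = \frac{1}{2}\log\frac{1-\epsilon_k}{\epsilon_k}$ as chosen in the algorithm:
$$Z_k = \sum_i w^{(k)}_i e^{-\lambda_k Y_i g_k(X_i)} = \epsilon_k e^{\lambda_k} + (1-\epsilon_k) e^{-\lambda_k} = \epsilon_k \sqrt{\tfrac{1-\epsilon_k}{\epsilon_k}} + (1-\epsilon_k)\sqrt{\tfrac{\epsilon_k}{1-\epsilon_k}} = 2\sqrt{\epsilon_k(1-\epsilon_k)}.$$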
This gives the preview: since $\mathbb{1}\{u \le 0\} \le e^{-u}$ and $\mathrm{sgn}(f_n(X_i)) \ne Y_i$ exactly when $Y_i \sum_k \lambda_k g_k(X_i) \le 0$,
$$\hat{L}_n(f_n) \le \frac{1}{n}\sum_{i=1}^n e^{-Y_i \sum_k \lambda_k g_k(X_i)} = \prod_{k=1}^K 2\sqrt{\epsilon_k(1-\epsilon_k)}.$$
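A quick empirical check of this inequality, continuing the adaboost sketch above (the synthetic data and $K = 20$ are illustrative choices only):

import numpy as np  # continues the sketch above

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

f_n, lambdas, eps_list = adaboost(X, y, K=20)
train_err = np.mean(np.sign(f_n(X)) != y)                       # hat L_n(f_n)
bound = np.prod([2 * np.sqrt(e * (1 - e)) for e in eps_list])   # prod_k 2 sqrt(eps_k (1 - eps_k))
print(f"training error {train_err:.3f} <= bound {bound:.3f}")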
By the Koltchinskii-Panchenko adaptive margin bound, w.p. $\ge 1-\delta$, simultaneously for all $f \in \mathrm{conv}(\mathcal{G})$ and all $\gamma \in (0, 1]$:
$$P(Y f(X) \le 0) \le \frac{1}{n}\sum_{i=1}^n \mathbb{1}\{Y_i f(X_i) \le \gamma\} + \frac{4}{\gamma}\,\mathcal{R}_n(\mathcal{G}) + c\sqrt{\frac{\log\log_2(2/\gamma)}{n}} + c\sqrt{\frac{\log(1/\delta)}{n}}.$$
Fix $\gamma \in (0, 1]$.
Empirical margin term, for $f_n = \sum_k \lambda_k g_k / \sum_k \lambda_k$:
$$\frac{1}{n}\sum_{i=1}^n \mathbb{1}\{Y_i f_n(X_i) \le \gamma\} = \frac{1}{n}\sum_{i=1}^n \mathbb{1}\Big\{Y_i \sum_k \lambda_k g_k(X_i) \le \gamma \sum_k \lambda_k\Big\}$$
$$\le \frac{1}{n}\sum_{i=1}^n \exp\Big(\gamma \sum_k \lambda_k - Y_i \sum_k \lambda_k g_k(X_i)\Big) = e^{\gamma \sum_k \lambda_k} \prod_{k=1}^K Z_k = e^{\gamma \sum_k \lambda_k} \prod_{k=1}^K 2\sqrt{\epsilon_k(1-\epsilon_k)},$$
with $\gamma$ to be chosen appropriately below.
Combining with the margin bound, w.p. $\ge 1-\delta$:
$$L(\mathrm{sgn}(f_n)) \le e^{\gamma \sum_k \lambda_k} \prod_{k=1}^K 2\sqrt{\epsilon_k(1-\epsilon_k)} + \frac{4}{\gamma}\,\mathcal{R}_n(\mathcal{G}) + c\sqrt{\frac{\log\log_2(2/\gamma)}{n}} + c\sqrt{\frac{\log(1/\delta)}{n}}.$$
When is the first term small? Since $e^{\gamma \lambda_k} = \big(\frac{1-\epsilon_k}{\epsilon_k}\big)^{\gamma/2}$,
$$e^{\gamma \sum_k \lambda_k} \prod_{k=1}^K 2\sqrt{\epsilon_k(1-\epsilon_k)} = \prod_{k=1}^K 2\sqrt{\epsilon_k^{1-\gamma}(1-\epsilon_k)^{1+\gamma}}.$$
Note each factor is $< 1$ once the weighted errors are bounded away from $1/2$: if $\epsilon_k \le \frac{1}{2} - \gamma_0$ for all $k$ (weak learning with edge $\gamma_0 > 0$) and $\gamma < \gamma_0$, then
$$2\sqrt{\epsilon_k^{1-\gamma}(1-\epsilon_k)^{1+\gamma}} \le \sqrt{(1-2\gamma_0)^{1-\gamma}(1+2\gamma_0)^{1+\gamma}} < 1,$$
so the empirical margin term is at most $e^{-cK}$ for some $c = c(\gamma_0, \gamma) > 0$, i.e. it decays exponentially in the number of rounds $K$.
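A quick numeric sanity check of the per-round factor (the edge $\gamma_0 = 0.1$, the margin levels $\gamma$, and $K = 200$ below are example values only, not from the notes):

import numpy as np

def per_round_factor(eps, gamma):
    # 2 sqrt(eps^(1-gamma) (1-eps)^(1+gamma)): one factor of the empirical margin bound
    return 2 * np.sqrt(eps ** (1 - gamma) * (1 - eps) ** (1 + gamma))

gamma0 = 0.1                      # assumed weak-learning edge: eps_k <= 1/2 - gamma0
eps = 0.5 - gamma0                # worst case allowed by the assumption
for gamma in (0.0, 0.05, 0.09):   # margin levels strictly below gamma0
    f = per_round_factor(eps, gamma)
    print(f"gamma={gamma:.2f}: factor={f:.4f}, after K=200 rounds: {f ** 200:.2e}")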
Neural Nets
Linear classifiers: $x \mapsto \mathrm{sgn}(\langle w, x \rangle)$, $w, x \in \mathbb{R}^d$.
$\mathcal{F} = \{x \mapsto \langle w, x \rangle : \|w\| \le B\}$
($\|w\| = \big(\sum_j w_j^2\big)^{1/2}$, the Euclidean norm.) Without a restriction on $w$ the class is too rich; with $\|w\| \le B$, compute the empirical Rademacher complexity $\hat{\mathcal{R}}_n(\mathcal{F})$:
$$\hat{\mathcal{R}}_n(\mathcal{F}) = \mathbb{E}_\epsilon\Big[\sup_{\|w\| \le B}\Big|\frac{1}{n}\sum_{i=1}^n \epsilon_i \langle w, X_i \rangle\Big|\Big] = \mathbb{E}_\epsilon\Big[\sup_{\|w\| \le B}\Big|\Big\langle w, \frac{1}{n}\sum_{i=1}^n \epsilon_i X_i \Big\rangle\Big|\Big] = \frac{B}{n}\,\mathbb{E}_\epsilon\Big\|\sum_{i=1}^n \epsilon_i X_i\Big\|$$
(Cauchy-Schwarz: $|\langle w, v \rangle| \le \|w\|\,\|v\|$, with equality for $w$ proportional to $v$)
$$\le \frac{B}{n}\sqrt{\mathbb{E}_\epsilon\Big\|\sum_{i=1}^n \epsilon_i X_i\Big\|^2} = \frac{B}{n}\sqrt{\sum_{i,j}\langle X_i, X_j \rangle\,\mathbb{E}[\epsilon_i \epsilon_j]} = \frac{B}{n}\sqrt{\sum_{i=1}^n \|X_i\|^2}$$
(Jensen for the first step; $\mathbb{E}[\epsilon_i \epsilon_j] = \delta_{ij}$ for the last). So $\hat{\mathcal{R}}_n(\mathcal{F}) \le \frac{B}{n}\sqrt{\sum_i \|X_i\|^2}$.
E.g. if $X_1, \ldots, X_n$ are elements of a ball of radius $R$ centered at 0, then
$$\hat{\mathcal{R}}_n(\mathcal{F}) \le \frac{BR}{\sqrt{n}} \quad \text{(dimension free)}.$$
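A small Monte Carlo check of this bound (the dimension, radius, norm bound, sample size, and number of sign draws below are arbitrary choices):

import numpy as np

rng = np.random.default_rng(0)
n, d, B, R = 200, 50, 1.0, 2.0

X = rng.normal(size=(n, d))
X = R * X / np.linalg.norm(X, axis=1, keepdims=True)   # place each X_i on the sphere of radius R

# For fixed signs eps, the sup over ||w|| <= B has the closed form (B/n) || sum_i eps_i X_i ||
# (Cauchy-Schwarz), so hat R_n(F) is the expectation of that norm; estimate it by Monte Carlo.
trials = 2000
eps = rng.choice([-1.0, 1.0], size=(trials, n))
estimate = np.mean(B / n * np.linalg.norm(eps @ X, axis=1))
print("Monte Carlo estimate of hat R_n(F):", round(estimate, 4))
print("bound B R / sqrt(n):               ", round(B * R / np.sqrt(n), 4))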
Add a nonlinearity $\sigma: \mathbb{R} \to \mathbb{R}$ with $\sigma(0) = 0$, $L$-Lipschitz.
E.g. $\sigma(u) = \tanh(u) = \frac{e^u - e^{-u}}{e^u + e^{-u}}$.
Classifier: $x \mapsto \mathrm{sgn}\big(\sigma(\langle w, x \rangle)\big)$.
$\mathcal{F}_\sigma = \{x \mapsto \sigma(\langle w, x \rangle) : \|w\| \le B\}$
$$\hat{\mathcal{R}}_n(\mathcal{F}_\sigma) = \hat{\mathcal{R}}_n(\sigma \circ \mathcal{F}_{\mathrm{lin}}) \le 2L\,\hat{\mathcal{R}}_n(\mathcal{F}_{\mathrm{lin}}) \quad \text{(contraction principle)} \quad \le \frac{2LBR}{\sqrt{n}}.$$
neural nets
$\sigma: \mathbb{R} \to \mathbb{R}$, $w \in \mathbb{R}^m$ ($m \in \mathbb{N}$). Node: $\nu_{\sigma, w}: \mathbb{R}^m \to \mathbb{R}$, $\nu_{\sigma, w}(u) = \sigma\Big(\sum_{j=1}^m w_j u_j\Big)$.
Given $h_1, \ldots, h_m: \mathbb{R}^d \to \mathbb{R}$:
$$\nu_{\sigma, w}(h_1, \ldots, h_m)(x) := \nu_{\sigma, w}\big(h_1(x), \ldots, h_m(x)\big) = \sigma\Big(\sum_{j=1}^m w_j h_j(x)\Big).$$
[diagram: the values $h_1(x), \ldots, h_m(x)$ feed through edge weights $w_1, \ldots, w_m$ into a single node applying $\sigma$]
Deep neural nets: define recursively.
$\mathcal{G}$: base class of functions $g: \mathcal{X} \to \mathbb{R}$, $\mathcal{X} \subseteq \mathbb{R}^d$.
Fix $\ell \in \mathbb{N}$ layers; activations $\sigma_1, \ldots, \sigma_\ell: \mathbb{R} \to \mathbb{R}$ with $\sigma_j(0) = 0$ and $\sigma_j$ $L_j$-Lipschitz; weight budgets $B_1, \ldots, B_\ell > 0$.
$\mathcal{F}_0 = \mathcal{G}$ (given). For $j = 1, \ldots, \ell$:
$$\mathcal{F}_j = \Big\{x \mapsto \sigma_j\Big(\sum_m w_m f_m(x)\Big) : f_m \in \mathcal{F}_{j-1},\ \sum_m |w_m| \le B_j\Big\}.$$
$\mathcal{F}_\ell$ = class of all $\ell$-layer nets with activations $\sigma_1, \ldots, \sigma_\ell$ and weight-size constraints $B_1, \ldots, B_\ell$.
Question: $\mathcal{R}_n(\mathcal{F}_\ell) \le\ ?$
Explicitly:
$$\mathcal{F}_1 = \Big\{x \mapsto \sigma_1\Big(\sum_m w_m g_m(x)\Big) : g_1, \ldots, g_m \in \mathcal{G},\ \sum_m |w_m| \le B_1\Big\},$$
$$\mathcal{F}_2 = \Big\{x \mapsto \sigma_2\Big(\sum_m w_m f_m(x)\Big) : f_1, \ldots, f_m \in \mathcal{F}_1,\ \sum_m |w_m| \le B_2\Big\}, \ \text{etc.}$$
[diagram: a two-layer net; each node of layer $j$ combines functions from $\mathcal{F}_{j-1}$ with $\ell_1$-bounded weights $\|w\|_1 \le B_j$]
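A small Python sketch of this recursive construction (the ReLU activation, layer widths, random weights, and coordinate-map base class are illustrative assumptions; the only properties taken from the notes are $\sigma_j(0) = 0$, Lipschitz $\sigma_j$, and $\|w\|_1 \le B_j$ per node):

import numpy as np

def relu(u):
    # sigma(0) = 0 and 1-Lipschitz, as required of the activations sigma_j
    return np.maximum(u, 0.0)

def make_node(sigma, w, fs):
    # nu_{sigma,w}(f_1,...,f_m)(x) = sigma( sum_j w_j f_j(x) )
    return lambda x: sigma(sum(wj * fj(x) for wj, fj in zip(w, fs)))

def layer(sigma, B, prev_fns, width, rng):
    # Build `width` nodes over functions from the previous layer F_{j-1},
    # each weight vector rescaled so that ||w||_1 <= B_j.
    nodes = []
    for _ in range(width):
        w = rng.normal(size=len(prev_fns))
        w = w * (B / np.abs(w).sum())
        nodes.append(make_node(sigma, w, prev_fns))
    return nodes

rng = np.random.default_rng(0)
d = 5
F0 = [lambda x, j=j: x[j] for j in range(d)]        # base class G: coordinate maps (an assumption)
F1 = layer(relu, 1.5, F0, width=4, rng=rng)
F2 = layer(relu, 1.5, F1, width=3, rng=rng)
f = layer(relu, 1.5, F2, width=1, rng=rng)[0]       # one element of F_3 (a 3-layer net)
print(f(rng.normal(size=d)))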
Naive bound: use the contraction principle. Since $\mathcal{F}_j \subseteq \sigma_j \circ \big(B_j \cdot \mathrm{absconv}(\mathcal{F}_{j-1})\big)$,
$$\mathcal{R}_n(\mathcal{F}_\ell) \le 2 L_\ell B_\ell\, \mathcal{R}_n(\mathcal{F}_{\ell-1}) \le \cdots \le \Big(\prod_{j=1}^{\ell} 2 L_j B_j\Big) \mathcal{R}_n(\mathcal{G}).$$
Say $L_1 = \cdots = L_\ell = 1$: the factor is $\prod_j 2 B_j = 2^\ell \prod_j B_j$, so each $B_j$ needs to be $\le 1/2$ for a bound that does not blow up with the depth $\ell$.
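As a concrete instance of the blow-up (numbers chosen purely for illustration): with $\ell = 10$ layers and $L_j = B_j = 1$ for all $j$, the naive bound multiplies $\mathcal{R}_n(\mathcal{G})$ by $\prod_{j=1}^{10} 2 L_j B_j = 2^{10} = 1024$, even though each node's weight budget is only 1.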
Next time: [Golowich, Rakhlin & Shamir, 2017]; also see [Bartlett, Foster & Telgarsky, 2017].