
Online Max-Margin Weight Learning

with Markov Logic Networks

Tuyen N. Huynh and Raymond J. Mooney

Machine Learning Group, Department of Computer Science, The University of Texas at Austin

Star AI 2010, July 12, 2010

Outline

- Motivation
- Background
  - Markov Logic Networks
  - Primal-dual framework
- New online learning algorithm for structured prediction
- Experiments
  - Citation segmentation
  - Search query disambiguation
- Conclusion

Motivation

Most existing weight learning methods for MLNs work in the batch setting:
- They need to run inference over all the training examples in each iteration
- They usually take a few hundred iterations to converge
- They cannot fit all the training examples in memory

Conventional solution: online learning

Background

Markov Logic Networks (MLNs)

- An MLN is a weighted set of first-order formulas
- A larger weight indicates a stronger belief that the clause should hold
- Probability of a possible world (a truth assignment to all ground atoms) x:

$$P(X = x) \;=\; \frac{1}{Z} \exp\Big(\sum_i w_i\, n_i(x)\Big)$$

where $w_i$ is the weight of formula $i$ and $n_i(x)$ is the number of true groundings of formula $i$ in $x$.

[Richardson & Domingos, 2006]

2.5  Center(i,c) => InField(Ftitle,i,c)
1.2  InField(f,i,c) ^ Next(j,i) ^ ¬HasPunc(c,i) => InField(f,j,c)
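To make the formula concrete, here is a minimal sketch (my illustration, not part of the talk) that evaluates P(X = x) from formula weights and precomputed true-grounding counts; the worlds and counts are made up, and real MLNs never enumerate Z explicitly.

```python
import math

def world_prob(weights, counts_per_world, world):
    """P(X = x) = (1/Z) * exp(sum_i w_i * n_i(x))."""
    def score(x):
        # sum_i w_i * n_i(x): weighted number of true groundings of each formula in x
        return sum(w * n for w, n in zip(weights, counts_per_world[x]))
    # Partition function Z; in real MLNs this sum over all possible worlds is
    # intractable and is never computed explicitly.
    Z = sum(math.exp(score(x)) for x in counts_per_world)
    return math.exp(score(world)) / Z

# Hypothetical example: the two clause weights from the slide (2.5 and 1.2)
# and three made-up possible worlds with invented true-grounding counts.
weights = [2.5, 1.2]
counts_per_world = {"x1": [2, 1], "x2": [1, 3], "x3": [0, 0]}
print(world_prob(weights, counts_per_world, "x1"))
```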

Existing discriminative weight learning methods for MLNs:

- Maximize the Conditional Log Likelihood (CLL) [Singla & Domingos, 2005], [Lowd & Domingos, 2007], [Huynh & Mooney, 2008]
- Maximize the margin, the log ratio between the probability of the correct label and that of the closest incorrect one [Huynh & Mooney, 2009]

Online learning

An online learner processes one example per round, incurring loss $c_t(w_t)$ on round $t$. The goal is low regret, i.e. a small cumulative loss relative to the best fixed weight vector in hindsight:

$$\mathrm{Regret}(T) \;=\; \sum_{t=1}^{T} c_t(w_t) \;-\; \min_{w \in \mathcal{W}} \sum_{t=1}^{T} c_t(w)$$
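As a toy illustration (mine, not the talk's), the sketch below measures this regret for a simple online learner on a one-dimensional problem with squared losses; the labels, the learner, and the small candidate grid standing in for W are all made up.

```python
def regret(online_weights, losses, candidate_ws):
    """Regret(T) = sum_t c_t(w_t) - min_{w in W} sum_t c_t(w)."""
    online_loss = sum(c(w_t) for w_t, c in zip(online_weights, losses))
    best_fixed_loss = min(sum(c(w) for c in losses) for w in candidate_ws)
    return online_loss - best_fixed_loss

# Toy example: squared losses c_t(w) = (w - y_t)^2, a learner that predicts the
# running mean of the labels seen so far, and a coarse grid standing in for W.
ys = [1.0, 0.0, 1.0, 1.0]
losses = [lambda w, y=y: (w - y) ** 2 for y in ys]
online_weights = [0.0] + [sum(ys[:t]) / t for t in range(1, len(ys))]
candidate_ws = [i / 10 for i in range(11)]
print(regret(online_weights, losses, candidate_ws))
```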

Primal-dual framework [Shalev-Shwartz et al., 2006]

- A general, recently developed framework for deriving low-regret online algorithms
- Rewrites the regret bound as an optimization problem (the primal problem), then considers the dual of that primal problem
- Provides a condition that guarantees an increase in the dual objective in each step
- Algorithms satisfying this condition are Incremental-Dual-Ascent (IDA) algorithms; subgradient methods are one example

Primal-dual framework (cont.)

- Proposed a new class of IDA algorithms called Coordinate-Dual-Ascent (CDA) algorithms:
  - The CDA update rule only optimizes the dual w.r.t. the last dual variable
  - The CDA update rule has a closed-form solution
- CDA algorithms have the same computational cost as subgradient methods but increase the dual objective more in each step, so they converge to the optimal value faster

Primal-dual framework (cont.)


CDA algorithms for max-margin structured prediction


Max-margin structured prediction

- Joint scoring function: $f(x, y; w) = w^{T} \phi(x, y)$
- Prediction: $h(x; w) = \arg\max_{y \in \mathcal{Y}} w^{T} \phi(x, y)$
- Margin: $\gamma(x, y; w) = w^{T} \phi(x, y) - \max_{y' \in \mathcal{Y} \setminus \{y\}} w^{T} \phi(x, y')$
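A minimal sketch (my illustration, not from the talk) of these three quantities when the output space is small enough to enumerate; the feature map phi, the candidate outputs, and the weights are placeholders.

```python
def score(w, phi, x, y):
    # f(x, y; w) = w^T phi(x, y)
    return sum(wi * fi for wi, fi in zip(w, phi(x, y)))

def predict(w, phi, x, Y):
    # h(x; w) = argmax_{y in Y} w^T phi(x, y)
    return max(Y, key=lambda y: score(w, phi, x, y))

def margin(w, phi, x, y, Y):
    # gamma(x, y; w) = w^T phi(x, y) - max_{y' != y} w^T phi(x, y')
    runner_up = max(score(w, phi, x, y2) for y2 in Y if y2 != y)
    return score(w, phi, x, y) - runner_up

# Made-up example with three candidate outputs and a 2-dimensional feature map.
# In structured prediction Y is exponentially large, so the argmax is computed
# by MPE inference rather than by enumeration as done here.
phi = lambda x, y: [x * y, float(y == 2)]
Y = [0, 1, 2]
w = [0.5, 1.0]
print(predict(w, phi, 3, Y), margin(w, phi, 3, 2, Y))
```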

Steps for deriving new CDA algorithms

1. Define the regularization and loss functions
2. Find the conjugate functions
3. Derive a closed-form solution for the CDA update rule

1. Define the regularization and loss functions


Label loss function

1. Define the regularization and loss functions (cont.)


2. Find the conjugate functions


2. Find the conjugate functions (cont.)

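As a worked example of the kind of conjugate computation step 2 requires (a standard convex-analysis fact, not copied from the slides): for an L2 regularizer $f(w) = \frac{\sigma}{2}\lVert w\rVert_2^2$, the Fenchel conjugate is

$$f^{*}(\theta) \;=\; \max_{w}\Big(\langle \theta, w\rangle - \frac{\sigma}{2}\lVert w\rVert_2^2\Big) \;=\; \frac{1}{2\sigma}\lVert \theta\rVert_2^2 ,$$

obtained by setting the gradient $\theta - \sigma w$ to zero, i.e. at $w = \theta/\sigma$.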

3. Closed-form solution for the CDA update rule

Optimization problem:

Solution:

CDA algorithms for max-margin structured prediction

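The closed-form update itself appears in the paper rather than in this text, so the following is only a rough sketch (my reconstruction, not the authors' exact CDA rule) of a PA-1-style max-margin update from the general family discussed here: after predicting with MPE inference, the weights move along the feature difference with a closed-form step size capped by a constant C. The feature vectors, label loss, and C are placeholders.

```python
def pa1_style_update(w, phi_true, phi_pred, label_loss, C):
    """One PA-1-style closed-form update (a stand-in for the paper's CDA rule).

    w          : current weight vector (list of floats)
    phi_true   : feature vector phi(x, y) of the gold output
    phi_pred   : feature vector phi(x, y_hat) of the predicted output
    label_loss : label loss of y_hat against y, e.g. Hamming loss
    C          : regularization / aggressiveness constant
    """
    delta = [a - b for a, b in zip(phi_true, phi_pred)]            # phi(x,y) - phi(x,y_hat)
    slack = label_loss - sum(wi * di for wi, di in zip(w, delta))  # hinge-style violation
    if slack <= 0:                                                 # margin already satisfied
        return w
    sq_norm = sum(d * d for d in delta) or 1e-12
    tau = min(C, slack / sq_norm)                                  # closed-form step size
    return [wi + tau * di for wi, di in zip(w, delta)]

# Made-up one-step example.
w = [0.0, 0.0]
print(pa1_style_update(w, phi_true=[1.0, 0.0], phi_pred=[0.0, 1.0], label_loss=2.0, C=1.0))
```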

Experiments


Citation segmentation


CiteSeer dataset [Lawrence et al., 1999], [Poon and Domingos, 2007]

1,563 citations, divided into 4 research topics

Each citation is segmented into 3 fields: Author, Title, Venue

Used the simplest MLN in [Poon and Domingos, 2007], similar to a linear-chain CRF:

Next(j,i) ^ !HasPunc(c,i) ^ InField(c,+f,i) => InField(c,+f,j)

Experimental setup

Systems compared:
- MM: the max-margin weight learner for MLNs in the batch setting [Huynh & Mooney, 2009]
- 1-best MIRA [Crammer et al., 2005]
- Subgradient [Ratliff et al., 2007]
- CDA1/PA1
- CDA2

Experimental setup (cont.)

- 4-fold cross-validation
- Metric (CiteSeer): micro-average F1 at the token level
- Used exact MPE inference (Integer Linear Programming) for all online algorithms and approximate MPE inference (LP relaxation) for the batch one
- Used Hamming loss as the label loss function
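For concreteness, a small sketch (mine, not the talk's) of the Hamming label loss and of token-level F1 for a single field; micro-averaging simply pools the true/false positive and negative counts over all citations before computing F1. The field names follow the slides; the token sequences are made up.

```python
def hamming_loss(gold, pred):
    """Number of token positions whose predicted field differs from the gold field."""
    return sum(g != p for g, p in zip(gold, pred))

def token_f1(gold, pred, field):
    """Token-level F1 for one field; micro-averaging pools these counts over citations."""
    tp = sum(g == field and p == field for g, p in zip(gold, pred))
    fp = sum(g != field and p == field for g, p in zip(gold, pred))
    fn = sum(g == field and p != field for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Made-up example with the three fields from the slides.
gold = ["Author", "Author", "Title", "Title", "Venue"]
pred = ["Author", "Title", "Title", "Title", "Venue"]
print(hamming_loss(gold, pred), token_f1(gold, pred, "Title"))
```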

Average F1


Average training time in minutes


Microsoft web search query dataset


Used the cleaned-up dataset created by Mihalkova & Mooney [2009]

Contains thousands of search sessions in which an ambiguous query was asked

Goal: disambiguate a search query based on previous related search sessions

Used 3 MLNs proposed in [Mihalkova & Mooney, 2009]

Experimental setup

Systems compared:
- Contrastive Divergence (CD) [Hinton, 2002]: used in [Mihalkova & Mooney, 2009]
- 1-best MIRA
- Subgradient
- CDA1/PA1
- CDA2

Metric: Mean Average Precision (MAP), which measures how close the relevant results are to the top of the rankings
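A small sketch (my illustration) of Mean Average Precision over ranked result lists: per query, precision is taken at each rank where a relevant result appears and averaged, and MAP averages these values over queries; the rankings and relevance sets below are made up.

```python
def average_precision(ranking, relevant):
    """Mean of precision@k taken at each rank k where a relevant result appears."""
    hits, precisions = 0, []
    for k, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(rankings, relevant_sets):
    """MAP: the mean of per-query average precision."""
    aps = [average_precision(r, rel) for r, rel in zip(rankings, relevant_sets)]
    return sum(aps) / len(aps)

# Made-up example with two queries.
rankings = [["d1", "d2", "d3"], ["d4", "d5", "d6"]]
relevant = [{"d1", "d3"}, {"d5"}]
print(mean_average_precision(rankings, relevant))
```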

MAP scores


Conclusion


- Derived CDA algorithms for max-margin structured prediction
  - They have the same computational cost as existing online algorithms but increase the dual objective more in each step
- Experimental results on two real-world problems show that the new algorithms generally achieve better accuracy and have more consistent performance

Thank you!


Questions?
