- Slide 1
- Zhenghua Li, Jiayuan Chao, Min Zhang, Wenliang Chen ({zhli13, minzhang, wlchen}@suda.edu.cn; china_cjy@163.com), Soochow University, China. Coupled Sequence Labeling on Heterogeneous Annotations (POS tagging)
- Slide 2
- An interesting problem in our mind: multiple labeled datasets exist with different annotation guidelines or formulations (heterogeneous annotations). How can we effectively utilize such data? How can we train a model with heterogeneous data?
- Slide 3
- An interesting problem in our mind: can we combine CTB and PD to train a better model?
- Slide 4
- Challenges: How can we capture the structure/tag correspondences between the two guidelines? They are usually context-dependent and hard to represent with rules. The datasets (PD/CTB) are typically non-overlapping, so it is difficult to build a model that automatically learns the correspondences.
- Slide 5
- Previous work: guide-feature based methods (stacked learning). Word segmentation and POS tagging (Jiang+ 09; Sun & Wan 12; Jiang+ 12; Gao+ 14); dependency parsing (Li+ 12); constituent treebank conversion (Zhu+ 11; Jiang+ 13)
- Slide 6
- Guide-feature based methods. [Diagram: PD data (tags like /n) trains Tagger (PD); CTB data (tags like /NR) trains Tagger (CTB)]
- Slide 7
- Guide-feature based methods. [Diagram: Tagger (PD) first tags the CTB data; its predicted PD tags (e.g. /NR (n)) are then used as extra guide features when training Tagger (CTB)]
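The guide-feature (stacked learning) idea on this slide can be sketched as follows: a PD-trained tagger first tags the sentence, and its output tags are appended as extra features when training or decoding the CTB tagger. The feature names and the `pd_tags` values are illustrative, not taken from the paper:

```python
def extract_features(words, i, pd_tags):
    """Basic features for position i, plus the PD-side prediction
    as an extra guide feature (the core of stacked learning)."""
    return {
        "word": words[i],
        "prev_word": words[i - 1] if i > 0 else "<s>",
        "guide_pd_tag": pd_tags[i],  # extra guide feature from Tagger (PD)
    }

words = ["上海", "浦东"]
pd_tags = ["ns", "ns"]  # hypothetical output of a PD-trained tagger
feats = [extract_features(words, i, pd_tags) for i in range(len(words))]
```

The CTB tagger is then trained on CTB data with these enriched feature vectors, which is why the method needs two rounds of training and decoding.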
- Slide 8
- The problem with guide-feature based methods: the methodology is not simple/elegant, requiring two rounds of training/decoding (although it is very effective and robust across different problems, and very simple to implement). The source data is not fully exploited and does not directly contribute to training: the final target model never directly learns from the source sentences. (Prof. Haifeng Wang, Baidu)
- Slide 9
- This work: directly learn from two non-overlapping datasets with heterogeneous annotations. Step 1: bundle the tags from both schemes (Cartesian product). Step 2: learn with ambiguous labeling. CTB (/NR) + PD (/n) => a unified model: Tagger (CTB & PD)
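The "learn with ambiguous labeling" step can be sketched as maximizing the marginal likelihood of all bundled tag sequences consistent with the observed one-side annotation. The notation below is a standard formulation of this objective and may differ in detail from the paper's:

```latex
\mathcal{L}(\theta) =
  \sum_{(\mathbf{x},\, \mathcal{Y}(\mathbf{x}))}
  \log \sum_{\mathbf{y} \in \mathcal{Y}(\mathbf{x})}
  p(\mathbf{y} \mid \mathbf{x}; \theta)
```

where, for a CTB-annotated sentence x, the ambiguous label set Y(x) contains every bundled tag sequence whose CTB side matches the gold tags (the PD side is left free), and vice versa for PD sentences.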
- Slide 10
- The big picture. [Diagram: CTB data (/NR) and PD data (/n) are merged into the bundled tag space CTB+PD (/NR_n); Tagger (CTB+PD) is trained with ambiguous labeling. Test sentence => Output: /NR_n /VV_v]
- Slide 11
- Illustration of bundled tags
- Slide 12
- How to create bundled tags?
- Slide 13
- Mapping functions (Qiu+ 13): a set of bundled tags that includes all possible symmetric mappings between the two annotation schemes, e.g. NN => n, vn, an, v (and similarly for NR, NT, etc.)
- Conversion accuracy (PD => CTB): significantly better than baselines (+2.6, +3.3)
- Slide 35
- Using converted PD: slight accuracy decrease, but much more efficient (+0.9, +0.7)
- Slide 36
- Conclusions: We propose a coupled CRF model for utilizing multiple heterogeneous labeled datasets. It can effectively learn the implicit mappings between annotations, without the need for a manually designed mapping function. It is effective on both one-side POS tagging and POS conversion/transfer tasks. We have partially annotated 1,000 sentences for POS tag conversion evaluation.
- Slide 37
- Future directions: Annotate more data with both CTB and PD tags, and investigate the coupled model with a small amount of such annotation as extra training data. Propose a more principled and theoretically sound method for merging multiple training datasets. Address the efficiency issue. Word segmentation guidelines also differ, which is ignored in this work.
- Slide 38
- Thanks for your time! Questions? Code, newly annotated data, and other resources are released at http://hlt.suda.edu.cn/~zhli for non-commercial usage.
- Slide 39
- Work going on Our approach is also effective on the word
segmentation task. Adapt our approach to dependency parsing.
- Slide 40
- Coupled model used for conversion: constrained decoding. For PD => CTB conversion, the search space is constrained by the PD-side tags.
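The constrained-decoding idea can be sketched as filtering the bundled tag space at each token so that only tags whose PD side matches the given PD annotation survive; the decoder then searches over these tags alone. The function and tag strings below are illustrative, not from the released code:

```python
def allowed_bundled_tags(bundled_tags, pd_tag):
    """Keep only bundled tags whose PD side equals the given PD tag.

    Bundled tags are assumed to be strings of the form 'CTB_PD',
    e.g. 'NR_n' (CTB tag NR bundled with PD tag n).
    """
    return [t for t in bundled_tags if t.rsplit("_", 1)[1] == pd_tag]

bundled = ["NR_n", "NN_n", "VV_v", "NR_nr"]
# For a token annotated /n on the PD side, the decoder may only choose:
allowed = allowed_bundled_tags(bundled, "n")
```

Because the PD side is fixed, decoding reduces to choosing the best CTB tag for each token, which is both faster and what makes the coupled model usable as a converter.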
- Slide 41
- The big picture (conversion). [Diagram: Tagger (CTB+PD), trained with ambiguous labeling on CTB (/NR) and PD (/n) data over the bundled tag space (/NR_n), converts PD tags. Test sentence with PD tags fixed: /?_n /?_v => Output: /NR_n /VV_v]
- Slide 42
- Data annotation
- Slide 43
- Domain adaptation: previous studies suggest that directly combining out-of-domain and in-domain training data does not lead to an optimal model.