
The 2010 Secretary’s Annual Report on ISO/TC37/SC4 “Language resource management” 2010-08-15


http://www.tc37sc4.org/

Contents

Membership
Working Groups
Thematic Domain Groups
Task Forces
Project Items
On-going Ballots
Meetings
Issues and Proposals

Membership

Organization
P-members (23 -> 24)
O-members (9 -> 8)
Liaisons (12 -> 11)
Key contact persons

Organization

Chairperson
Body: AFNOR (France)
Name: Romary, Laurent

Secretary
Body: KATS (Korea)
Name: Choi, Key-Sun

P-members (24)

1. AENOR (Spain)
2. AFNOR (France)
3. ANSI (USA)
4. ASI (Austria)
5. BSI (United Kingdom)
6. CSK (Korea, DPR)
7. DIN (Germany)
8. DS (Denmark)
9. GOST R (Russian Federation)
10. JISC (Japan)
11. KATS (Korea, Rep.)
12. MSA (Malta)
13. NEN (Netherlands)
14. NSAI (Ireland)
15. PKN (Poland)
16. SABS (South Africa)
17. SAC (China)
18. SCC (Canada)
19. SIS (Sweden)
20. SN (Norway)
21. TISI (Thailand)
22. TSE (Turkey)
23. UNI (Italy)
24. UNMZ (Czech Rep.)

O-members (8) and liaisons (11)

O-members:
1. ASRO (Romania)
2. BDS (Bulgaria)
3. DSSU (Ukraine)
4. ISS (Serbia)
5. NBN (Belgium)
6. SFS (Finland)
7. STAMEQ (Vietnam)
8. SUTN (Slovakia)

Liaisons:
1. ISO/IEC JTC 001/SC 29
2. ISO/TC 46/SC 09
3. ISO/TC 184/SC 04
4. ELRA
5. Infoterm
6. LISA
7. OMG
8. TEI
9. TERMNET
10. UIC
11. UNESCO

Key persons

(Miss) Hyojin Won, International Standards Support, KSA

Jenny Pellaux, ISOCS-TPM

Working Groups

WG 01 “Basic descriptors and mechanisms for language resources” Convenor: Nancy Ide

WG 02 “Representation schemes” Convenor: Kiyong Lee

WG 03 “Multilingual information representation” Convenor: Nasredine Semmar

WG 04 “Lexical resources” Convenor: Nicoletta Calzolari

WG 05 “Workflow of Language resource management” Convenor:

Thematic Domain Groups

Status: ad hoc
Established in May 2004, Lisbon
Triple function:
(1) Liaison to ISOCat
(2) Incubator for new work item proposals
(3) Working with international groups, e.g. ISA with IWCS, LREC, FLaReNet

TDG 01 Metadata: Peter Wittenburg
TDG 02 Morphosyntactic data categories: Gil Francopoulo
TDG 03 Semantic content representation: Harry Bunt
◦ Activity 01 Discourse relations: Koiti Hasida
◦ Activity 02 Dialogue acts: Harry Bunt
◦ Activity 03 Referential structures and links: Laurent Romary
◦ Activity 04 Logico-semantic relations: Scott Farrar
◦ Activity 05 Temporal entities and relations: Kiyong Lee
◦ Activity 06 Semantic roles and argument structures: Thierry Declerck
TDG 04 Syntactic data categories: Thierry Declerck
TDG 05 Machine readable dictionary: Monte George
TDG 06 Multilingual Ontology: Koiti Hasida
TDG 07 Lexical semantics: Monica Monachini

Task Forces

Task Force for the Harmonization of Principles (TFH)

Convenor: Nancy Ide

Task Force for Terminology Coordination (TFTC)

Convenor and liaison to TC 37/TCG: Alex C. Fang

Project Items

14 active project items: WG 01 (4), WG 02 (9), WG 03 (1)

3 unregistered project items

2 published ISO standards
◦ ISO 24610-1:2006 “Language resource management - Feature Structures - Part 1: Feature Structure Representation (FSR)”
◦ ISO 24613:2008 “Language resource management - Lexical Markup Framework (LMF)”

WG 01: BASIC DESCRIPTORS AND MECHANISMS FOR LANGUAGE RESOURCES

Convenor: Nancy Ide
4 projects

WG 01-01: WD 24610-1 “Language resource management - Feature structures – Part 1: Feature structure representation (FSR)”
• Project leaders: Kiyong Lee, Gerald Penn
• Revision of ISO 24610-1:2006 Feature Structures Part 1: Feature structure representation (FSR:2006)
• Joint work with TEI: Lou Burnard

WG 01-02: FDIS 24610-2 “Language resource management - Feature Structures - Part 2: Feature Systems Declaration (FSD)”
• Project leaders: Kiyong Lee, Gerald Penn

WG 01-03: DIS 24612 “Language resource management - Linguistic Annotation Framework (LAF)”
• Project leader: Nancy Ide

WG 01-04: DIS 24619 “Language resource management - Persistent identification and access in language technology applications (PID)”
• Project leader: Daan Broeder

WG 02: REPRESENTATION SCHEMES

Convenor: Kiyong Lee
9 projects

WG 02-01: DIS 24611 “Language resource management - Morphosyntactic annotation framework (MAF)”
• Project leader: Eric de la Clergerie

WG 02-02: DIS 24614-1 “Language resource management - Word segmentation of Text – Part 1: Basic concepts and general principles (WordSeg-1)”
• Project leader: SUN Maosong

WG 02-03: WD 24614-2 “Language resource management - Word Segmentation of Text – Part 2: Word Segmentation for Chinese, Japanese and Korean (WordSeg-2)”
• Project leaders: SUN Maosong, Key-Sun Choi, Hitoshi Isahara

WG 02-04: FDIS 24615 “Language resource management - Syntactic annotation framework (SynAF)”
• Project leader: Thierry Declerck

WG 02-05: DIS 24617-1 “Language resource management - Semantic Annotation Framework – Part 1: Time and events (SemAF/Time, ISO-TimeML)”
• Project leader: Kiyong Lee
• Editors: James Pustejovsky (chair), Branimir Boguraev, Harry Bunt, Nancy Ide, Kiyong Lee
• (Cancellation date: 2010-10-13)

WG 02-06: DIS 24617-2 “Language resource management - Semantic Annotation Framework – Part 2: Dialogue acts (SemAF/Dacts)”
• Project leader: Harry Bunt
• Editors: Harry Bunt (chair), Jan Alexandersson, Jean Carletta, Jae-woong Choe, Volha Petukhova, Alex C. Fang, Koiti Hasida, Andrei Popescu-Belis, Claudia Soria, David Traum

WG 02-06: NP 24617-3 “Language resource management - Semantic Annotation Framework – Part 3: Named entities (SemAF/NE)”

Project leader: Gil Francopoulo

WG 02-07: NP 24617-4 “Language resource management - Semantic Annotation Framework – Part 4: Semantic roles (SemAF/SRL)”

Project leader: Martha Palmer

WG 02-08: NP 24617-5 “Language resource management - Semantic Annotation Framework – Part 5: Discourse structures (SemAF/DS)”

Project leader: Gil Francopoulo

WG 02-09: PWI 24617-6 “Language resource management - Semantic Annotation Framework – Part 6: Space (SemAF/ISO-Space)”

Project leader: James Pustejovsky

WG 03: MULTILINGUAL INFORMATION REPRESENTATION

Convenor: Nasredine Semmar
1 project

WG 03-01: DIS 24616 “Language resource management - Multilingual information framework (MLIF)”
◦ Project leader: Samuel Cruz-Lara
◦ Limit date: 2010-10-15

WG 04: LEXICAL RESOURCES

Convenor: Nicoletta Calzolari
1 project

WG 04-01: ISO 24613 Lexical Markup Framework (LMF)
◦ Project leaders: Monte George, Gil Francopoulo
◦ Status: ISO International Standard 2008

Unregistered PWI

ISO NP 24620 (OMG) “Language resource management – Simplified natural languages – Part 1: Basic concepts and general principles (simpL-1)”
◦ Project leaders: Thierry Declerck, Sung-Kwon Choi
◦ Editor: Doug Lawrence

ISO NP 2462x “Language resource management – Segmentation rules eXchange (SRX)”
◦ Proposed project leader: Arle Lommel

ISO PWI 2462x (OMG) “Language resource management – Temporal Vocabulary”
◦ Proposed project leader: Mark Linehan

On-going Ballots

Ballot                      End date
NP 2462x SRX                2010-08-08
FDIS 24614-1 WordSeg-1      2010-09-05
NP 24617-4 SemAF-SRL        2010-09-14
FDIS 24615 SynAF            2010-10-02
NP 24617-5 SemAF-DS         2010-10-17
DIS 24614-2 WordSeg-2       2010-10-26
DIS 24617-2 SemAF-Dacts     2010-12-30

Meetings

Meetings 2009

2009-09-14/16: Tilburg, The Netherlands
◦ WG 2: MAF, SynAF, SemAF-Dacts

2009-09-24/26: Fragrant Hill Hotel, Beijing, China
◦ WG 2 WordSeg-1/2 editorial meeting

2009-11-01/05: Brandeis, Waltham, MA, USA
◦ WG 1-2, FLaReNet, SILT

Meetings 2010

2010-01-15/20: City University of Hong Kong
◦ WG 1, WG 2, WG 3, WG 4, ISOCat, ISA-5, ICGL 2010

2010-03-20/22: Beijing Xijiao Hotel, Beijing, China
◦ WG 2 WordSeg-2 editorial meeting

2010-05-17/21: Valletta, Malta
◦ Tutorial + LRT workshop + WG 2 + TDG 3, LREC 2010

2010-08-15/20: Dublin, Ireland
◦ TC 37 and SCs annual meetings

2010-10-13/15: DIN, Berlin, Germany
◦ TDG 1 + WG 2 + WG 4

Meetings 2011

2011-01-10/11: Oxford, United Kingdom
◦ WG 2 + ISA-6, IWCS 2011 (2011-01-12/14)

2011-05: to be discussed

2011-08-14/19: Seoul Palace Hotel, Seoul, South Korea
◦ TC 37 + SCs meetings

2011-10: to be discussed

Issues and Proposals

Cross-institutional collaborations

ISO/TC 37/SC 4
• generic models for LR management
• target expert groups with wide international coverage
• stability - consensus

TEI – Text Encoding Initiative
• reference XML vocabularies
• specification infrastructure (ODD)
• back office format for ISO documents
• reactivity - larger community

W3C
• dedicated application profile for web-based applications
• articulation with other web-based standards (e.g. web services)
• industry-based requirements
• bridge to various industries, e.g. localization

Consequences for SC 4

Work on a wide coverage of language resource levels
◦ Ex.: Systematicity of SemAF components (Time, Space, Dialogue acts, Named entities, Discourse structures, Semantic roles)

Articulate SC 4 standards with industry standards
◦ Ex.: MLIF with XLIFF, TMX, SMIL

Avoid maintaining XML formats as ISO standards
◦ Ex.: SynAF. Tiger or TEI can be good serialisations

Proposal for wiki

Website of TC37/SC4
– Purpose
  • To give information to the experts
  • To communicate with users of the standards
  • To show feasible solutions based on standards
– Maintenance
  • Convenors and project leaders will post the information
– Idea collection stage
  • Organization of wiki
– Please access: http://swrc.kaist.ac.kr/isotc37wiki/
  • id: WikiSysop (case-sensitive) pw: isowiki$&14

Practical problems

PWI -> NP stage
(1) Working draft
(2) Editorial or consulting group
Management: co-PLs
Editorial: DIS -> FDIS stage
(1) Producing documents in MS Word format (ODD)
(2) Figures
Volume control on each document

Thank you.