Today’s Plan

• 9:00am–9:15am: Sort out questions regarding refund forms
• 9:15am–9:45am: Finalise today’s agenda
• 9:45am–10:45am: Breakout Groups Session 1
• 10:45am–11:15am: Coffee + recalibration at 11
• 11:15am–12:15pm: Breakout Groups Session 2
• 12:15pm–1:45pm: Lunch
• 1:45pm–2:45pm: Report-backs
• 2:45pm–3:00pm: Wrap-up

Yesterday’s Shifting Sentiment #1

[Chart: one axis running 0 to 10, the other 0 to 6; series: Round 1, Round 2, Round 3]

Yesterday’s Shifting Sentiment #2

[Chart: Round 1, Round 2, Round 3 on one axis, scale 0 to 16 on the other; series: 0-2, 3-6, 7-10]

Yesterday’s Main Themes #1

• General Methodological Concerns:
  – Reasons to be cautious [Scott, Moore, Di Eugenio, McCoy, Green]
• Frameworks for Evaluation:
  – Frameworks for evaluation [Paris, Stent]
  – GENEVAL [Belz and Reiter]
  – Building on RAGS [Mellish + Scott]
• Resources:
  – A Wiki [Stent]
  – Shared data resources [Walker]

Yesterday’s Main Themes #2

• Shared Tasks:
  – DUC or other existing STECs as a host [McKeown]
  – Referring Expression Generation [Gatt, Viethen]
  – Virtual Environments [Koller et al.]
  – Question Generation [Rus + Graesser]
  – Textual Variation [McDonald]

Today’s Goals

• Observation:
  – Opinion is spread between those who favour a Shared Task (one or many) and those who are cautious
• Hypothesis #1:
  – Those who favour a Shared Task will push ahead anyway
• Hypothesis #2:
  – Those who are cautious are the best people to identify the relevant desiderata

Possible Working Groups

• Desiderata Development
• Evaluation
• Shared Resources
• Shared Task #1: Text to Text (perhaps via DUC?)
• Shared Task #2: Referring Expression Generation
• Shared Task #3: Data to Text
• Shared Task #4: Virtual Environments

Target Outcome

• A document
• How far can we get today?
  – A set of PowerPoint files that plans the sections of the document to one bullet point per subsubsection or paragraph
  – Report-back on planned content to the assembled group
• After the workshop:
  – Flesh out the text

Task 1: Desiderata

• Main Sections:
  – What’s unique about NLG Evaluation
  – Approaches and Frameworks
  – A ‘due diligence checklist’: a clear enumeration of the things you should consider if you are going to pursue a Shared Task in NLG

Tasks 2 through N: Shared Task Proposals

• Main Sections:
  – A definition of the Shared Task and its type
  – The aims of the Shared Task: what it will achieve
  – The subcommunity it seeks to engage, and how it will do this
  – Whether the task will involve evaluation and how
  – The resources required, and how they will be obtained
  – A plan of execution
  – [Later] How this Shared Task addresses the desiderata

Today’s Plan

• 9:00am–9:15am: Sort out questions regarding refund forms
• 9:15am–9:45am: Finalise today’s agenda
• 9:45am–10:45am: Breakout Groups Session 1
• 10:45am–11:15am: Coffee + recalibration
• 11:15am–12:15pm: Breakout Groups Session 2
• 12:15pm–1:15pm: Lunch
• 1:15pm–2:45pm: Report-backs
• 2:45pm–3:00pm: Wrap-up

Possible Working Groups

• Desiderata Development [5]
• Shared Task #1: Text to Text (perhaps via DUC?) [3]
• Shared Task #2: Referring Expression Generation [5]
• Shared Task #4: Virtual Environments [5]

New Thoughts and Considerations

• Is this working?
• Is the spec broken or can it be refined?
• What have you thought of that you think the others could benefit from in working out their plan?

Next Steps

• One or two people in each group commit to making sure there is follow-through.
• Each group commits to writing up its section of the report and delivering a draft to MW + RD by June 1st.
• On June 1st, all sections are redistributed to all workshop participants.
• All can send comments on any part to MW + RD by mid-June.
• In mid-June, comments are collated and redistributed to the authors of sections.
• Revised versions are returned to MW + RD by 15th July.
• The report is distributed to the community, with input to be received by mid-August.
• Comments are taken on board and the final report is made available to the community at the end of August.

The Responsible Adults

• Group 1: Donia Scott and Cécile Paris
• Group 2: Marilyn Walker and Amanda Stent
• Group 3: Albert Gatt
• Group 4: Johanna Moore and Alex Koller

Report-Backs

• Desiderata Development
• Shared Task #1: Text to Text
• Shared Task #2: Referring Expression Generation
• Shared Task #3: Virtual Environments
