Writing Analytics for Epistemic Features of Student Writing #icls2016 talk

Writing Analytics for Epistemic Features of Student Writing

Simon Knight @sjgknight

www.sjgknight.com

Simon Knight, University of Technology Sydney
Laura Allen, Arizona State University
Karen Littleton, Open University
Dirk Tempelaar, Maastricht University

Acknowledgements
• Arizona State University (Laura Allen)
• Open University (Karen Littleton & Bart Rienties)
• Maastricht University (Dirk Tempelaar & team)
• Rutgers University (Chirag Shah & Matthew Mitsui)

Epistemic Cognition

Sites of epistemic cognition: Situations

a parent is attempting to understand information around childhood vaccinations;

Public domain image from https://en.wikipedia.org/wiki/File:Fluzone_vaccine_extracting.jpg

Sites of epistemic cognition: Situations

a voter wants to investigate the plausibility of a politician’s climate change denial;

By Twm CC-By-NC-ND https://www.flickr.com/photos/twmlabs/29463820/

Sites of epistemic cognition: Situations

someone seeking to lose weight wishes to investigate the merits of diet versus regular foodstuffs or supplements.

Public domain image from https://commons.wikimedia.org/wiki/File:%22Miracle_Cure!%22_Health_Fraud_Scams_%288528312890%29.jpg

Sites of epistemic cognition: Activities

The information seeker requires more than just the ability to read content; they must make complex decisions about where to look for information, which sources to select (and corroborate), and how to synthesise (sometimes competing) claims from across sources.

Rouet [39] – students should be taught:
• Skill of integration
• Skill of sourcing
• Skill of corroboration

“reading literacy is understanding, using, reflecting on and engaging with written texts, in order to achieve one’s goals, to develop one’s knowledge and potential, and to participate in society.” (OECD, 2013, p. 9).

Sites of epistemic cognition: Activities

“epistemological beliefs are a lens for a learner's views on what is to be learnt” (Bromme, 2009)

The Lens of Epistemic Beliefs

“exploring students’ thought processes during online searching allows examination of personal epistemology not as a decontextualized set of beliefs, but as an activated, situated aspect of cognition that influences the knowledge construction process” (Hofer, 2004, p. 43).

The Lens of Epistemic Beliefs: Activities

• Certainty – static to tentative & evolving
• Simplicity – discrete to holistic
• Source – external to constructed by self
• Justification – authority to evaluation of knowledge (Mason, Boldrin, & Ariasi, 2009)

Epistemic Cognition

• Certainty – static to tentative & evolving
• Simplicity – discrete to holistic
• Source – external to constructed by self
• Justification – authority to evaluation of knowledge (Mason, Boldrin, & Ariasi, 2009)
• source, corroborate, and integrate claims – key facets of literacy for mature internet use (Rouet, 2006, p. 177)

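[Figure: three documents making the claims (A, C), (B, ¬A) and (B), shown first without source information and then with it: (A, C) by x (2002); (B, ¬A) by y (2014); (B) by Gov (n.d.)]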

MD-TRACE & epistemic cognition relationships (Bråten et al., 2011)

Facet of cognition: less adaptive vs. more adaptive

• Simplicity
– Less adaptive: Accumulation of facts, prefer simple sources
– More adaptive: Integrated, downplay simple sources

• Certainty
– Less adaptive: Single document sourcing
– More adaptive: Corroboration, represents complex perspectives and views showing the diversity of angles

• Source
– Less adaptive: Emphasizes own opinion, differentiates between sources less
– More adaptive: Emphasizes source characteristics, distinguishes between source trustworthiness

• Justification
– Less adaptive: Emphasizes authority, less corroboration
– More adaptive: Emphasizes use of argument schema and combination of corroboration and authority

Sites of epistemic cognition: Products

• Written outputs (summaries, reports, tests, etc.)
• Cognitive process (think aloud)
• Problem navigation (pages viewed, searches made, etc.)
• Help seeking & collaborative dialogue
• Implicit/explicit assessments of source-trust

Learning Analytics

• Increasing technology use:
– foregrounds some learning needs around complexity of literacy
– affords opportunity for research & feedback

Current Study

Study Design

• ~1100 Maastricht 1st year business & economics students

Participants

Study Design

• Maastricht study credit• Coagmento terms• + wider research consent

Consent & ethics

Study Design: Participants
• ~1100 Maastricht 1st years
• ~250 – individual (software failure)
• ~250 – collaborative but discarded data (software failure)
• ~600 – collaborative & data used

Levels of Description:

Products (salient learning indicators from a task)

Situation

Task (a mapping of task to activity)

Activity

Tasks
• Two collaborative tasks facilitated by a browser add-on
• ‘Warm up’ task – fact retrieval
• One group provided with documents; the second group searches on the web
• “A review of the best supported claims around the risks” of a substance (herbicide or food supplement)

[Figure: document map, with topic areas Urine, Health and Agriculture]
• Friends of the Earth: Press release (urine presence)
• FoE: Commissioned report (‘scientific’) (-ve)
• Science-Literacy website: Refutation (+ve)
• Farmer’s Weekly: Reprints (+ve)
• Related peer-review publication (limited risk)
• Peer-review publication: Health danger
• Reuters: Reprints main claims
• Blogger: Critiques journal & author
• Peer-review publication (limited health risks)
• Peer-review review of literature (limited risk to health or plants)
• Peer-review of lit (limited risk; control suggestions)

Coagmento Tool

• Chat
• Foreground searches
• Share ‘snippets’
• Etherpad
• Tracks pageviews & copy (/ctrl+c)

Figure 3.3: Coagmento Screenshots (from top: 3.3.1 A full screen display from a browser window; 3.3.2 The toolbar element; 3.3.3 Sidebar with Chat displayed; 3.3.4 Sidebar with Snippets displayed)

Sites of Epistemic Cognition
• Situation
– ‘best supported claims around the risks of x’ as a government advisor
• Activity
– Multiple document literacy
• “Products”
– Process data, written output, survey items
• Units
– Collaborative pairs, with both snapshot (survey, product assessment) & dynamic (process, chat analysis) analyses

Output document indicators

• 1-3 score on:
– Topic coverage
– Source diversity (largely ‘3’)
– Source quality/evaluation
– Synthesis
• 1-12 total score
• Peer/self/diagnostic assessment
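The rubric above can be sketched as a small data structure. This is only an illustration: the slide does not say how the total is formed, so the sketch assumes a simple unweighted sum of the four facets, and the class and field names are hypothetical. Under that assumption, four facets scored 1 to 3 give totals between 4 and 12.

```python
from dataclasses import dataclass

# Facet names taken from the rubric slide; the identifiers are hypothetical.
FACETS = ("topic_coverage", "source_diversity", "source_quality", "synthesis")


@dataclass(frozen=True)
class RubricScore:
    """One scored output document, with each facet rated 1, 2 or 3."""

    topic_coverage: int
    source_diversity: int
    source_quality: int
    synthesis: int

    def __post_init__(self):
        # Reject anything outside the 1-3 scale described on the slide.
        for facet in FACETS:
            value = getattr(self, facet)
            if value not in (1, 2, 3):
                raise ValueError(f"{facet} must be 1, 2 or 3, got {value!r}")

    @property
    def total(self) -> int:
        # Assumption: the total is an unweighted sum of the four facets.
        return sum(getattr(self, facet) for facet in FACETS)


# Invented example scores, for illustration only.
example = RubricScore(topic_coverage=2, source_diversity=3,
                      source_quality=1, synthesis=2)
print(example.total)
```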

[Figure: Outcome, made up of Source Diversity, Source Quality/Evaluation, Synthesis and Topic coverage (Peer assess?)]

Product Textual Indicators

Analysis of written outputs for implicit/explicit sourcing and trustworthiness evaluations (e.g. Anmarkrud, Bråten, & Strømsø, 2014; Bråten, Braasch, Strømsø, & Ferguson, 2014)

[Table: documents (Doc A, Doc B, Doc C) against rank (= 1, = 2, = 3)]

Product Textual Indicators

• Goldman, Lawless, Pellegrino and Gomez (2012) identified three clusters of students from their written outputs: satisficers, who selected few sources; selectors, who selected many sources but did not connect them; and synthesisers, who selected sources and integrated them.

[Figure: example outputs drawing on Doc A, Doc B and Doc C]
• Satisficer: Lots of text A
• Selector: a list of Text C, Text A, Text B
• Synthesiser: A ¬ B, C supports B but…

Product Textual Indicators

Hastings, P., Hughes, S., Magliano, J. P., Goldman, S. R., & Lawless, K. (2012). Assessing the use of multiple sources in student essays. Behavior Research Methods, 44(3), 622–633. http://doi.org/10.3758/s13428-012-0214-0

[Figure: an output text compared against Doc A, Doc B and Doc C]

“A quotation from text A”, followed by some paraphrased text B. Some key language is copied from text A drawing inference between A and B…

No shared lang
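The idea of tracing language in a student text back to its sources can be illustrated with a rough sketch. This is not the method of Hastings et al. (2012); it assumes a much simpler approach, namely the share of the output's word trigrams that also occur in each source. The function names and the sample texts are invented for illustration.

```python
import re


def word_ngrams(text, n=3):
    """Return the set of lower-cased word n-grams in a text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_language(output, sources, n=3):
    """For each source, the fraction of the output's n-grams found in it."""
    out_grams = word_ngrams(output, n)
    if not out_grams:
        return {name: 0.0 for name in sources}
    return {
        name: len(out_grams & word_ngrams(text, n)) / len(out_grams)
        for name, text in sources.items()
    }


# Invented sample texts, not taken from the study materials.
sources = {
    "Doc A": "glyphosate was detected in urine samples from several countries",
    "Doc B": "the review found limited risk to human health",
    "Doc C": "farmers rely on herbicides to control weeds",
}
output = ("One report says glyphosate was detected in urine samples, "
          "but a review found limited risk.")

scores = shared_language(output, sources)
for name, score in scores.items():
    print(name, round(score, 2))
```

A score of zero for a source corresponds to the "no shared language" case in the figure. Exact n-gram matching catches copied wording but not paraphrase, which would need a different measure.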

Why Study Writing?

Graham, 2006; MacArthur, Graham, & Fitzgerald, 2016

Writing skills are important for success in our school, workplace, and personal lives

Geiser & Studley, 2001; Light, 2001; Powell, 2009; Sharp, 2007

By ccarlstead CC-By https://www.flickr.com/photos/cristic/359572656/

Natural Language Processing

Bird, Klein, & Loper, 2009; Crossley, Allen, Kyle, & McNamara, 2014

What aspects of text can we analyze?
• Words
• Sentences
• Paragraphs

For example, see Allen et al. (2015)

Coh-Metrix
• Analyzes texts on a variety of dimensions
• Cohesion, syntax, readability, etc.
• Utilized to explore properties of natural language

Words serve as proxies for actions, skills, interactions, emotions, thoughts…

NLP tools calculate numerous indices related to the characteristics of language

Words

Syntax

Reasoning

Affect

Identified a number of individual differences related to proficiency on writing assessments

• Vocabulary (Allen et al., 2014)
• Motivation (Pajares et al., 2001; 2003)
• Strategy Knowledge (Roscoe & McNamara, 2013)
• Working Memory (Kellogg, 2008)

Natural Language Processing

Analysis of the language produced by humans

Uses:
• Various statistical techniques
• Various sources of information in language

In order to:
• Understand language
• Respond to the “speaker” appropriately

Bird, Klein, & Loper, 2009; Crossley, Allen, Kyle, & McNamara, 2014

NLP can inform… Understanding…

Essay Quality

Self-Explanation Quality

Freewriting Quality

Paragraph Classification

Grammar and Mechanics

Product measures of essay quality

External Linguistic Text Features (not discussed here)

TAACO: Internal Linguistic Text Features

Text Analyses

There’s a lot of text on the following slides - sorry

Product Textual Indicators - Qualitative

Across the rubric facets, variations in outcome were characterized by, for example:
• Synthesis: Lists vs. integration
• Topic Coverage: Sparse keywords/tight subtopic focus vs. range of themes & keywords
• Source Diversity: ‘One best’ article vs. multiple sources
• Source Quality: Uncritical citation of claims, even where claims disagreed, vs. identification, critique & connection of source quality & disagreement

Product Textual Indicators (MDP only)
TAACO:
• basic indices (‘information’ indicator)
– Tokens, word types, type-token ratios
• sentence overlap (local cohesion)
– All, content (e.g. topic), and function (e.g. rhetorical) word overlap
• paragraph overlap (global cohesion)
– Overlap at paragraph level per sentence level
• connectives (local cohesion)
– basic connectives, sentence linking connectives, and reason and purpose connectives
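To make the index families above concrete, here is a minimal sketch of three of them. It is not TAACO and will not reproduce TAACO's values: the tokenizer, the use of Jaccard overlap between adjacent sentences, and the short connective list are all simplifying assumptions chosen for illustration.

```python
import re

# Hypothetical, deliberately tiny connective list (an assumption, not TAACO's).
CONNECTIVES = {"and", "but", "because", "so", "therefore", "however"}


def tokenize(text):
    """Lower-case word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Naive sentence split on ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def type_token_ratio(text):
    """Basic index: distinct word types divided by total tokens."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def adjacent_sentence_overlap(text):
    """Local cohesion: mean Jaccard word overlap of adjacent sentences."""
    sentences = [set(tokenize(s)) for s in split_sentences(text)]
    pairs = list(zip(sentences, sentences[1:]))
    if not pairs:
        return 0.0
    return sum(len(a & b) / len(a | b) for a, b in pairs if a | b) / len(pairs)


def connective_incidence(text):
    """Connectives as a proportion of all tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(token in CONNECTIVES for token in tokens) / len(tokens)


# Invented sample text, for illustration only.
sample = ("Glyphosate is a herbicide. The herbicide is widely used. "
          "However, some reports question its safety.")
print(type_token_ratio(sample))
print(adjacent_sentence_overlap(sample))
print(connective_incidence(sample))
```

A paragraph-level (global cohesion) variant would apply the same overlap calculation to adjacent paragraphs instead of adjacent sentences.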

Product Textual Indicators – TAACO to Rubric
Exploratory analysis
• Synthesis – global cohesion
• Topic Coverage – basic indices
• Source Diversity – basic indices + local cohesion
• Source Quality – reason & purposive connectives

Product Textual Indicators (MDP only)
Low to moderate correlations (.1-.4 range) of indices to scores on rubric facets
• Synthesis:
– -ve association to basic indicators (i.e. longer texts synthesised less)
– +ve association to sentence & connective indicators (i.e. more synthesis related with local but not global cohesion in these texts)
– No sig association to paragraph level indices (perhaps due to thematic shifts & copy-pasting)

Product Textual Indicators (MDP only)
Low to moderate correlations (.1-.4 range) of indices to scores on rubric facets
• Topic coverage:
– +ve association to lexical diversity (rather than n of words)
– -ve association to local sentence cohesion & connectives, indicating that higher topic scores perhaps tended to involve more ‘listing’ of claims from sources, with less integration of those claims on a local level (a feature observed in the scoring exercise)
• Source diversity:
– Similar to topic coverage, with stronger associations to logical connectives (linking sources for similar claims)

Product Textual Indicators (MDP only)
Low to moderate correlations (.1-.4 range) of indices to scores on rubric facets
• Source quality:
– +ve association to lexical diversity (or information given) in the type/token ratio (§1)
– +ve association (as in synthesis) to sentence overlap (§2), indicating that local cohesion was being built (suggesting local argumentation focused on specific topics)
– But also +ve association to paragraph overlap (§4), indicating that those who evaluated tended to build a cohesive argument through their text, making purposeful connections (§3) between sentences
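The associations reported on these slides relate each text index to a rubric facet score. The slides do not say which correlation coefficient was used, so the sketch below assumes Pearson's r; for ordinal 1-3 rubric scores a rank correlation such as Spearman's may be more appropriate. The function name and the toy data are invented for illustration and are not results from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy values only: a type-token ratio per text and its facet score (1-3).
type_token_ratios = [0.40, 0.50, 0.60, 0.70]
facet_scores = [1, 2, 2, 3]
print(pearson_r(type_token_ratios, facet_scores))
```

With only a handful of texts a coefficient like this is not interpretable; the sketch shows the calculation, not a finding.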

Future Directions

• Analysis of source documents

• Collaborative contribution

• Other measures of ep-cog comparison with writing

By Andrea_44 CC-By https://www.flickr.com/photos/andrea_44/2680944871/

Thank you (and questions)

Acknowledgements:
• Arizona State University (Laura Allen)
• Open University (Karen Littleton & Bart Rienties)
• Maastricht University (Dirk Tempelaar & team)
• Rutgers University (Chirag Shah & Matthew Mitsui)

@sjgknight http://sjgknight.com/

MDP

For this task, you will be researching a chemical used in herbicide (Roundup) called Glyphosate.

Your task is to act as an advisor to an official within the science ministry. You are advising an official on the issues below. The official is not an expert in the area, but you can assume they are a generally informed reader. They are interested in the best supported claims in the documents. Produce a summary of the best supported claims you find and explain why you think they are. Note you are not being asked to “create your own argument” or “summarise everything you find” but rather, make a judgement about which claims have the strongest support.

A colleague has already found a number of documents for you to process with your partner, you should use these to extract the best supported claims (without using the internet to find further material).

You should:
• Read the questions/topic areas provided, these will require you to find information and arguments in the documents to present the best supported of these, you should decide with your partner which are best as you read.
• Group information together by using headings in the Editor
• You should work with your partner to explain why the claims you’ve found are the best available
• You should spend about 45 minutes on this task

A review is coming up for the license of Glyphosate, the official would like to know the best supported claims around its risks.

A colleague has collected some documents, available from the

ICLS presentation notes
• Each of the three presenters will give a 25 minute talk followed by a 5 minute discussion. The chair is responsible for keeping times and for creating the conditions for productive discussions. Since people will be moving between sessions, it is important that everyone keeps to the time allocated in the program.

The computer (a desktop PC) in each of our conference rooms is connected to the internet and has the Windows 7 operating system and Microsoft Office 2013 suite installed. You need to bring your presentation files on a PC-formatted USB stick if you want to use this computer.

If you are going to use a Mac (Apple device) or any other device, please remember to bring along the necessary adapter (e.g. mini Display port to VGA adaptor) so that you can project your presentation on our system via a VGA port.
