[Richard Hudson. Draft November 2020. For The Routledge Handbook of International
Research on Writing, Vol. II, edited by Rosalind Horowitz.]
Computational measures of linguistic maturity

Abstract: Previous research points to a number of quantitative measures of linguistic maturity, some
applied to vocabulary (e.g. type-token ratios) or lexical morphology (e.g. Latinate affixes) and others
to syntax, and all sufficiently superficial to be applied by a computer system. Some features that are
linked to maturity are very general (e.g. use of subordinate clauses) while others are very specific
(e.g. choice of particular prepositions). The paper outlines some of the known measures, and argues
that a computer system which can apply such measures quickly could be a valuable tool both for
teaching writing and for assessing it.
1. Introduction

Maturity is a reasonable name for the target of all education, and linguistic maturity for the target of
language education, so this is the term I use in this paper. In the specific context of how to teach
and assess writing, linguistic maturity means the ability to write like a skilled and experienced adult
writer, including the ability to vary style according to the demands of context. The research question
is how best to teach and measure the maturity of an individual learner as they move from the
immaturity of a five-year-old to the relative maturity of a school leaver (who of course still has a long
way to travel in pursuit of further maturity).
The claim of this chapter is that there are a number of relatively superficial features of
writing which are sensitive to maturity (in this sense). These features can be identified and counted
by a computer, so they could in principle be used in assessing writing for both formative and
summative purposes. I say ‘in principle’ because I am aware that teachers of language arts may feel
uncomfortable with the idea of handing responsibility over to a computer, but I hope to make a
sufficiently strong case to persuade them that the idea is worth pursuing. My suggestion is not that
writing can be taught by a computer, but that computers can take over some of the lower-level
work, leaving the human teacher with more time to handle the more challenging – and possibly
more interesting – tasks.
The research method is essentially comparative: comparing the writing of children and of
more mature writers, so linguistic features are the dependent variables, with the independent
variables fixed by other independent measures such as age and examination gradings. There has
already been a great deal of research along these lines (Stormzand & O’Shea, 1924; Loban, 1963;
Hunt, 1965; O’Donnell et al., 1967; Harpin, 1976; Yerrill, 1977; Kress, 1979; Wilkinson et al., 1983;
Perera, 1984, 1990; Allison et al., 2002; Green et al., 2003). This body of research is informative and
often insightful about selected features, but it doesn’t provide the systematic, detailed and
quantitative information that would be needed in a definitive description of linguistic maturity.
The relevant research (Hudson, 2009) suggests two kinds of application which could build on
the same implementation. In summative assessment of a piece of writing (e.g. an exam answer), the
computer could produce one or more figures for its superficial maturity which might then be
combined with a human marker’s assessment of its more abstract qualities such as interest,
originality and coherence. And in formative assessment, the same system could be applied to any
piece of work produced by a learner, giving not only an overall maturity grade but also detailed
analysis of how this grade might be improved. This advice would go well beyond the familiar
recommendation of ‘using more adjectives’ and might even encourage children to pay more
attention to the linguistic features of their writing – its vocabulary, its grammar, its spelling and its
punctuation.
I am aware of one such system: the ‘computer tutor’ called HARRY (Holdich et al., 2004;
Holdich & Chung, 2003), whose reported progress is very promising. Its aim is not just to provide
feedback to children, but to encourage them to revise and edit their texts – something which they
rarely do under normal circumstances, in contrast with mature writers who do a great deal of
rewriting. HARRY includes a wide range of tools, including one (called CHECK TEXT) which builds on a
number of the quantitative measures mentioned below.
The assumption underlying my proposal is that children will write on computers. I am
aware that our education system is still in a transition period and that, at least in the UK, children
still produce a great deal of written work by hand, and even have to hand-write in public
examinations. However, judging by the startling changes of the last few decades it won’t be long
before every child in countries like the UK will have easy access to a computer keyboard and will use
this as the preferred means of writing both in the classroom and in examinations. My suggestions
will become more relevant as this time draws nearer.
2. Vocabulary

The most familiar measure of vocabulary maturity is probably the type-token ratio, which measures
diversity of vocabulary by dividing the number of ‘types’ – distinct dictionary words – by the number
of ‘tokens’ – the running words in the current text. For example, the previous sentence contains 38
tokens but the word the occurs six times, and various other words are repeated, leaving only 26
distinct types; its type-token ratio is therefore 26/38 = 0.68. If every token had belonged to a distinct type,
the ratio would have been 1; and at the other extreme a string of 38 repetitions of the same word
would have a ratio of 1/38 = 0.03, so a high ratio suggests a broad vocabulary, which is related in an
obvious way to maturity. It’s true that type-token ratios offer technical difficulties, and in particular
the fact that the length of the text being measured affects the outcome (the longer the text, the
lower the ratio – at one extreme, a one-word text must have a ratio of 1.0!); but there are also
technical solutions (Chipere et al., 2001; Covington & McFall, 2010).
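To illustrate how mechanical the calculation is, here is a minimal Python sketch of the simple ratio and of one of the technical solutions just cited, the moving-average type-token ratio (MATTR) of Covington and McFall. The function names, tokeniser and sample text are invented for the example; a real system would need a proper tokeniser.

```python
# Sketch: type-token ratio and a moving-average variant (MATTR).
# Helper names are hypothetical; tokenisation is deliberately naive.

def tokens(text):
    """Lower-cased alphabetic word tokens."""
    return [w.strip(".,;:!?()\"'").lower()
            for w in text.split() if any(c.isalpha() for c in w)]

def ttr(words):
    """Simple type-token ratio: distinct types / running tokens."""
    return len(set(words)) / len(words) if words else 0.0

def mattr(words, window=50):
    """Moving-average TTR: average the TTR of every window of fixed
    length, which removes the effect of overall text length."""
    if len(words) <= window:
        return ttr(words)
    scores = [ttr(words[i:i + window]) for i in range(len(words) - window + 1)]
    return sum(scores) / len(scores)

sample = "the cat sat on the mat and the dog sat on the rug"
print(round(ttr(tokens(sample)), 2))  # 8 types / 13 tokens = 0.62
```

The moving-average version gives comparable scores for texts of very different lengths, which is exactly the correction needed before comparing writers across age groups.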
Calculating type-token ratios can easily be done mechanically; indeed, as of late 2020 there
is at least one website (https://www.usingenglish.com/resources/text-statistics/) that offers this
service, along with many other measures based on superficial features such as the average sentence
length and the average length of words. Scepticism is appropriate, but the ultimate test for such
measures is whether they ‘work’ in the sense of distinguishing mature from immature writing; and
many of them do turn out to correlate closely with maturity, measured in terms of age. For example,
consider Figure 1, which shows how the average type-token ratio of a text varies with the age of its
writer (Durrant & Brenchley, 2018). In this diagram, the simple type-token ratio is corrected to
reduce the effects of text length, and age is measured in school years (so Year 2 is age 6-7 and Year 11
is 15-16). The effect of age is obvious, as is also the effect of text type, with literary texts showing
more mature writing than non-literary.
Figure 1: Corrected type-token ratio by age
It’s equally easy for a computer to take a piece of writing and calculate a score to show how
commonplace or rare the vocabulary is, as in Figure 2 (Durrant & Brenchley, 2018). In order to
produce this figure, the computer assigned each word a score for its rarity, as measured by how
often it occurs in a gigantic collection of adult writing: the rarer the word, the
higher its score. The computer calculated a score for each piece of writing, and then averaged them
across the groups of writers. Once again, the effect of age is obvious, and once again, the scores
obviously depend on whether the text is classified as literary or non-literary. Another noteworthy
feature of both these graphs is how the genre difference increases with age, showing how children
gradually learn the important skill of fitting their writing to the task at hand.
Figure 2: Word-rarity by age
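The rarity scoring described above is equally mechanical. The sketch below assumes a frequency list derived from a large adult corpus; the per-million figures, the default for unknown words and all names here are invented placeholders.

```python
# Sketch: word-rarity scoring against an assumed adult-corpus frequency list.
from math import log10

corpus_freq = {  # assumed occurrences per million words of adult writing
    "the": 60000, "cat": 90, "sat": 40, "perambulated": 0.1,
}

def rarity(word, freq=corpus_freq, default=0.5):
    """Higher score for rarer words: negative log of relative frequency."""
    return -log10(freq.get(word, default) / 1_000_000)

def text_rarity(words):
    """Average rarity across all tokens in a text."""
    return sum(rarity(w) for w in words) / len(words)

print(rarity("perambulated") > rarity("the"))  # True: rarer word, higher score
```

Averaging these per-word scores over a whole text gives exactly the kind of figure plotted against age in Figure 2.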
Another superficial property of words is their spelling, where it’s easy for a computer to run
a spell-check and give a spelling score for each piece of work. This might be presented simply as a
percentage of words that are correctly spelt, but the familiar objection is that this kind of scoring
discourages lexical ambition – the use of less familiar words that carry the risk of an error. To
counter this effect the scoring could take account of the difficulty of the words used, perhaps giving
two spelling scores: correct spelling of all words, and correct spelling of rarer words. However
spelling was handled, it would provide useful information for any marker or teacher because spelling
has been shown to be a good predictor of composition – good spellers tend to be good writers
(Daffern et al., 2017).
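A two-score scheme of the kind just suggested might be sketched as follows; the dictionary and the set of 'common' words are tiny stand-ins for a real spell-check lexicon and a frequency cut-off.

```python
# Sketch: the two-tier spelling score suggested above - one score over all
# words, one over rarer words only, so lexical ambition is not penalised.

DICTIONARY = {"the", "cat", "necessary", "receive", "separate"}
COMMON = {"the", "cat"}  # words frequent enough not to count as 'rare'

def spelling_scores(words):
    """Return (overall, rare-word) proportions of correctly spelt words."""
    correct = [w for w in words if w in DICTIONARY]
    rare = [w for w in words if w not in COMMON]
    rare_correct = [w for w in rare if w in DICTIONARY]
    overall = len(correct) / len(words) if words else 1.0
    rare_score = len(rare_correct) / len(rare) if rare else 1.0
    return overall, rare_score

words = ["the", "cat", "necessary", "recieve"]  # one rare-word error
print(spelling_scores(words))  # (0.75, 0.5)
```

A writer who risks rare words and misspells a few would score lower on the second figure but could still be credited for ambition on the first.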
3. Lexical morphology

On the borderline between vocabulary and grammar lies lexical morphology (aka derivational
morphology), in which distinct dictionary words are morphologically related (as in farmer – farm or
derivational – derivation – derive). This is an area of growth in young writers, as can be seen in Figure
3. This shows the percentage of children (in a group of 247) who used lexical morphology correctly in
a sample of writing; the children were in Year 3 or 4 (i.e. aged 8 or 9) and they were tested in the
autumn and then again the following spring (Green et al., 2003). Once again there is an obvious
effect of age which could be measured quite easily by a computer.
Figure 3: Percentage of children using lexical morphology correctly
Lexical morphology is a relatively easy part of grammar to teach as it generally doesn’t need
much context (other than the context of the students’ existing knowledge of English). The obvious
tool for teaching it is a table such as Table 1, where the pattern revealed in the first rows should be
enough for pupils to complete the remaining ones. This kind of focused activity serves three
purposes: raising awareness of lexical morphology, enriching students' existing lexical networks and
expanding their vocabulary. It's reasonable to expect such teaching to raise a student's score for
lexical morphology.
verb      adjective
do        doable
read      readable
rely      ______
______    suitable

Table 1: Verbs and adjectives in -able
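Identifying such derivations by computer is straightforward for the regular cases. The sketch below handles the -able pattern of Table 1, including the spelling adjustment in rely – reliable; the verb list and function name are placeholders for a real lexicon.

```python
# Sketch: recognising regular -able derivations of the kind shown in Table 1.
# The verb list is a toy stand-in; real morphology needs fuller spelling rules.

KNOWN_VERBS = {"do", "read", "rely", "suit", "accept"}

def able_base(adjective):
    """Return the verb a regular -able adjective is derived from, or None."""
    if not adjective.endswith("able"):
        return None
    stem = adjective[:-4]
    if stem in KNOWN_VERBS:
        return stem                      # do-able, read-able
    if stem.endswith("i") and stem[:-1] + "y" in KNOWN_VERBS:
        return stem[:-1] + "y"           # reli-able -> rely
    return None

print(able_base("readable"))   # read
print(able_base("reliable"))   # rely
print(able_base("table"))      # None - not a derivation at all
```

Counting such recognised derivations in a piece of writing would give exactly the kind of lexical-morphology score discussed above.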
4. General syntax

What I mean by ‘general syntax’ is the syntactic structure of a sentence as analysed in terms of
general categories such as ‘noun’ or ‘subordinate clause’. This was the territory of grammar teaching
before it was abandoned (in the English-speaking world, though not elsewhere) in the middle of the
twentieth century. A long series of quantitative research projects has supported what every teacher
and parent knows, which is that sentence structure is an area of significant growth in children’s
writing.
The most obvious quantitative measure of sentence structure is the average length of
sentences. This was investigated manually as early as 1924 in texts produced (in the USA) across a
broad age span, from 4th grade (age 10) to final-year undergraduates, with the results shown in
Figure 4 (Stormzand & O’Shea, 1924, p. 19). In this graph, g 4 means ‘Grade 4’, hs 1 means the first
(Freshman) year of high school, and uni 1 means the first year of university. In spite of evident
problems in calculating sentence lengths where sentence punctuation is itself fragile, the trend is
clear enough, and very easily calculated by computer.
Figure 4: Sentence length by age
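The measure itself is trivial to compute, assuming the sentence punctuation can be trusted; a sketch, with a deliberately naive sentence splitter:

```python
# Sketch: average sentence length in words, splitting on end punctuation.
# Real children's writing would need more robust sentence detection.
import re

def avg_sentence_length(text):
    """Mean number of word tokens per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)

text = "The dog barked. It was loud. Everyone in the street woke up at once today."
print(avg_sentence_length(text))  # (3 + 3 + 9) / 3 = 5.0
```

The fragility of sentence punctuation in young writers, noted above, is precisely why the splitter is the weak point of this measure rather than the arithmetic.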
The same study in 1924 produced the figures in Figure 5, in which the sentences in each text
are classified as simple, compound or complex – a very crude profile of its structure which modern
grammarians reject, but which at the time was popular in education. The database was very small,
including only 500 sentences from grade-level schools, which may explain the odd dip for compound
sentences in Grade 6, but overall the trend is clear: as age increases, fewer sentences qualify as
simple and more qualify as complex, with compound sentences (the dotted line) showing a slight
increase.
Figure 5: Simple or compound or complex sentences by age
The 1924 study was the first of many quantitative studies of children’s writing, all done by
hand and painstakingly counted and analysed, including one carried out in England in 1976 (Harpin,
1976). This time the body of data was much larger – 800,000 words produced by 290 children in
years 1, 2, 3 and 4 (i.e. aged from 5 to 9). One of the interesting trends to emerge from this study
was the change in the types of subordinate clauses that children use as shown in Figure 6 (Harpin,
1976, p. 71). Harpin distinguishes three types: noun, adjective (i.e. relative) and adverb, as in (1) to
(3).
(1) noun: I think that you’re wrong.
(2) relative: They’re the ones who did it.
(3) adverb: I’ll come when I’m ready.
When counted as a proportion of all subordinate clauses, noun clauses decrease with age while
adjective clauses increase, while adverb clauses show little change with age.
Figure 6: Types of subordinate clause in children’s writing.
Clearly this graph is much more informative for teachers than if it had shown a general
increase in the use of subordinate clauses without distinguishing the different types. It suggests a
teaching strategy: focus on relative clauses as a more significant growth point than either noun
clauses or adverb clauses. Incidentally, two other studies have confirmed the trend for relative
clauses to increase in writing at the expense of noun clauses (Perera, 1984, p. 233).
Interestingly, the research data seem to point to a complicated separation of writing and
speech, with maturity defined differently in each. A study of children’s speech (Perera, 1984, p. 134)
reported the figures shown in Figure 7, where the proportions for noun clauses and adverb clauses
are similar to those for writing, but where relative clauses head in the opposite direction: with
increasing age, relative clauses increase in writing, but they decrease in speech. If further research
confirms this trend, it will confirm the widely-shared view that speech and writing develop
separately, but it won’t directly affect the teaching of writing as such.
Figure 7: Types of subordinate clauses in children's speech
The studies reported so far in this section are indicative of what can be done in quantitative
syntax by giving a numerical score to a piece of writing for various general parameters such as the
length of sentences or the amount of subordination. But the studies were all carried out by hand,
with a researcher producing a syntactic analysis of a text, sentence by sentence. My general claim in
this chapter is that computers can now provide the kind of linguistic analysis that used to come from
humans; this claim is as true of general syntax as of vocabulary. In the computer world, the analysis
of sentence structure is called ‘parsing’ (a word which used to be used for the more limited analysis
of single words, and which derives from the Latin pars orationis, ‘part of speech’).
When I google (in late 2020) for <online sentence parser>, I find four systems that will
automatically parse any sentence that I type in. My test sentence is (4), which is a typical sentence
written by a fourteen-year-old in an English test.
(4) The end paragraph is different from the rest of the article because it changed to a more
friendly tone of writing.
The end paragraph is different from the rest of the article …
… because it changed to a more friendly tone of writing.
All of the online systems produced a reasonable analysis for this sentence, give or take various
theoretical issues to do with their assumptions about the nature of sentence structure. But the most
important point is that they did this task very fast – in some cases, almost immediately, while others
took a second or two. None of the systems produced a user-friendly presentation of the analysis, but
they all gave an analysis similar to the one in Figure 8, which happens to be almost the same as the
one I would give as a grammarian (Hudson, 2010). And the main point of this analysis is that it
could easily be used for identifying such general syntactic categories as subordinate clauses – in this
case, because it changed to a more friendly tone of writing.
Figure 8: A syntactic analysis of a typical sentence.
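Once a parse is available, counting subordinate clauses reduces to counting clause-introducing relations. The sketch below uses a hand-written parse in a minimal (token, head-index, relation) format; the relation labels and the abridged parse itself are assumptions for illustration, and a real system would take both from the parser's output.

```python
# Sketch: reading subordinate clauses off a dependency-style parse.
# The relation labels stand for adverb, relative and noun clauses.

SUBORDINATE = {"advcl", "relcl", "ccomp"}

def count_subordinate(parse):
    """Count clause-introducing relations in a (token, head, rel) parse."""
    return sum(1 for _tok, _head, rel in parse if rel in SUBORDINATE)

# Abridged hand-written parse of sentence (4):
# "The end paragraph is different ... because it changed ..."
parse = [
    ("paragraph", 1, "nsubj"),
    ("is",       -1, "root"),
    ("different", 1, "acomp"),
    ("because",   5, "mark"),
    ("it",        5, "nsubj"),
    ("changed",   1, "advcl"),   # the subordinate clause of sentence (4)
]
print(count_subordinate(parse))  # 1
```

With relation labels of this kind, the same pass could also separate the three clause types of Figure 6 rather than merely counting subordination overall.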
Of course these systems make mistakes. One of the many challenges of parsing is that
English words are very ambiguous, in the sense that a given word-form may have numerous
meanings and belong to many different word classes – one classic case being round, which may be a
noun (a round of drinks), a verb (He’ll round the corner soon), an adjective (a round table), an adverb
(came round) or a preposition (He’s coming round the corner). This being so, the computer can’t
simply look up every word in its internal dictionary and then build up the structure from there: it has
to take account of the syntactic context of each word (which, of course, consists of other words that
are equally ambiguous) – a very complex process which humans manage easily but which used to
defeat computers. Take this sentence (from the Marx Brothers), which is the ultimate test of a
computer’s ability to solve the problem:
(5) Time flies like an arrow and fruit flies like a banana.
All of the online systems fail on this sentence, but then, so does the human brain – that’s the whole
point of the joke. The and in the middle tempts us all to treat both halves of the sentence in the
same way, and this temptation is reinforced by the repeated word-forms flies, like and a(n).
Computer systems for parsing are considered successful if they work most of the time, with just (say)
5% errors – which of course explains why Google Translate generally works, but sometimes fails
miserably.
My point is that even if a computer analysis was only 95% reliable it would provide valuable
information about a piece of writing for three people: examiner, teacher and author. The examiner
would be able to leave judgements on the syntactic maturity of the writing to the computer in order
to concentrate on higher and more abstract issues of originality and organisation. The teacher could
build the computer’s analysis into a more informative formative assessment of the writing. And the
author (i.e. the student) would get instant feedback on the writing either immediately after finishing
it, or maybe even while writing. If the computer could suggest changes which would raise the
maturity rating, then the student could implement these on the spot, and grow as a writer in the
process – the perfect teaching arrangement for building new writing skills.
5. Particular syntax

The difference between what I am calling ‘general syntax’ and ‘particular syntax’ is a matter of
granularity. Whereas general syntax deals in general categories such as ‘subordinate clause’ or
‘adverb’, particular syntax deals in much more narrowly-defined categories such as ‘relative clause
introduced by whose’ (… the man whose house I visited) or ‘the syntax of the verb think’ (think him
honest, think him to be honest, etc.). Particular syntax is the area of grammar where syntax meets
vocabulary – and indeed it would be easy to argue that these things are really just matters of
vocabulary, though I doubt if this argument would be particularly productive as there’s very little
evidence for any real boundary between syntax and vocabulary.
Leaving theoretical issues aside, the difference between general and particular is merely a
matter of degree. There is a hierarchy of categories in grammar, ranging from the most general
(‘word’ and maybe ‘phrase’) to the most particular (individual words, or even particular uses of
individual words). One path down this hierarchy is shown here:
word > verb > auxiliary verb > BE > modal BE
The verb BE is an auxiliary verb, and has a number of different uses that need to be distinguished, such as the ‘modal BE’ followed by to as in (6).
(6) You are to hand in your essays tomorrow.
The modal use of BE is part of syntax, with its own particular properties; for instance, like the
ordinary modal verbs (will, can, may and so on) it is always finite so we can’t use it in (7), where the
‘*’ shows that it is ungrammatical.
(7) *Being to hand our essays in tomorrow is a pain.
On the other hand, unlike the other modal verbs, it has a 3rd-person singular form in -s:
(8) She is to hand in her essay tomorrow.
So the modal BE really is unique – the only word that has this particular combination of properties.
The point of this digression into a tiny corner of English grammar is that there are a great
many such corners which make up mature knowledge of English grammar. This is where most of the
learning on the way to maturity takes place, and where teaching is probably most helpful. Once a
class of children have some kind of metalanguage for talking about syntax, the teacher can guide
them through an exploration of tiny areas of grammar which are intellectually manageable within
the vast maze of English grammar. This exploration will be part instruction, part self-discovery and
part research.
To take another example, consider one of the grammatical growth-points of adolescence:
the choice of prepositions. Why in January but on Sunday and at five o’clock? Here native speakers
have a firm grasp of the facts, but probably aren’t aware of the underlying principles. And what
about tired of but fed up with? Here their grasp of the facts may be shakier, and a useful
classroom activity would be to explore a range of adjectives and the prepositions they take (such as
sick, bored, interested, anxious, concerned, surprised, afraid).
We know a certain amount about the growth of children’s knowledge of particular grammar.
To take yet another area of particular grammar, consider words used in expressions of time, such as
then, sometimes, while or once. We all know that then enters children’s speech and writing long
before once, as in (9).
(9) Once I had found the key, it was easy.
However, we can be more precise, thanks to a careful study by the late Katharine Perera of recorded
speech and written work by the same children aged 8, 10 and 12 (Perera, 1990). She found that the
only time words used by 8-year-olds in speaking were then and when; in writing they
also used a handful of others, including after, but both when and after were always followed by
a finite verb such as played, as in (10).
(10) When/after he came we played together.
In mature writing, both these words can be followed by a non-finite verb such as playing in (11).
(11) When/after playing together we had to be careful.
By mapping the time words used in her data against age and speech or writing Perera was able to
construct a developmental hierarchy of such words, ranging from the universal then and when to
once and a number of other words which were only found in the writing of 12-year-olds.
For a final example of particular syntax, consider the subordinating conjunction although.
This is highly relevant to teaching because we know that children take a long time to reach a full
mature understanding of the contrast inherent in its meaning. Summarising five independent
studies, Perera concludes “that nine is the earliest age at which a rudimentary understanding of
although can be expected and that comprehension is not fully established by fifteen.” (Perera, 1984,
p. 144). Words like although clearly deserve explicit discussion in class.
Returning to our main theme of computer analysis, particular grammatical patterns like
those discussed above are easy to identify by computer once the more general and abstract patterns
have been identified. For instance, given a sentence analysis which shows subordinate clauses, it is
easy to ask for any occurrence of before that introduces such a clause. The trouble, of course, is that
the more specifically we define a pattern, the less often we may expect it to occur, so there’s not
much point in scoring student writing for how often it uses a very specific pattern such as once with
a following verb. The absence of this pattern in a piece of writing isn’t in itself evidence of immature
writing; even mature writers might write hundreds of pages without seeing the need to write this
particular word.
But to offset this obvious problem, we can note the very large number of patterns that are
known to grow with maturity. Suppose we had a large pool of such features – say, a thousand of
them – and our computer could identify all of them when they were present. In that case, it would
be reasonable to count the mature features in a text as a measure of maturity. A text that had, say,
half a dozen such features would probably impress a human examiner or teacher, and would receive
appropriate credit from the computer. Indeed, one possible aid for markers would be to run scripts
through the computer first, with mature features highlighted so that they are visible at a glance
before the human marker has read a word of the script.
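A minimal sketch of such a feature pool, with three invented surface patterns standing in for the imagined thousand; real features would mostly be defined over parsed structure rather than raw text.

```python
# Sketch: scoring a text against a pool of maturity-marking patterns.
# These three regexes are illustrative stand-ins for a researched pool.
import re

MATURE_FEATURES = {
    "non-finite 'when/after'": r"\b(?:when|after)\s+\w+ing\b",
    "'although' clause":       r"\balthough\b",
    "'whose' relative":        r"\bwhose\b",
}

def mature_feature_count(text):
    """How many of the pooled maturity features appear in the text."""
    return sum(bool(re.search(p, text, re.I))
               for p in MATURE_FEATURES.values())

text = "Although tired, she kept going. When writing, she paused often."
print(mature_feature_count(text))  # 2
```

The same dictionary of patterns could drive the highlighting suggested above: each match is both a point towards the maturity score and a span to mark on the script.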
A computer system that was sensitive to the particular syntax of maturation would also be a
particularly valuable teaching tool. Most obviously, it would provide formative assessment for the
teacher, highlighting areas of weakness for most of the students in a class – either areas where they
made mistakes (as with the choice of prepositions), or areas of grammar that they simply didn’t use,
suggesting some combination of ignorance and uncertainty. The teacher could use this information
for choosing topics to address in class.
But even more helpfully, the system could be used to guide students as they wrote. At the
very least, it could help them to avoid errors (again, the choice of prepositions is an obvious case),
but we could aim much higher, with maturity as the target. The system could work like the Microsoft
grammar-checker, but much better, suggesting mature alternatives as each sentence grows. It might
‘know’ thousands of little markers of syntactic maturity, from mature time words to mature relative
clauses, and at the end of every sentence it might be able to suggest ways of making it better. For
example, suppose the child wrote (12).
(12) A man came to the door and he rang the bell.
The system could help in different ways.
At one extreme, it might suggest concrete alternatives such as (13) to (15), from which the
writer could select if they agreed the alternative was better.
(13) The man came to the door and rang the bell.
(14) The man who came to the door rang the bell.
(15) Coming to the door, the man rang the bell.
At the other extreme, the system could use grammatical terminology to suggest abstract alternative
structures, such as (16) to (18), which present the same changes as those that produced (13) to (15),
but in a much more challenging way.
(16) Omit the subject of the second clause, because it has the same referent as the first clause’s
subject.
(17) Turn one of the clauses into a finite subordinate clause.
(18) Turn one of the clauses into a non-finite subordinate clause.
In between these two extremes no doubt there are plenty of mixed approaches. The choice could be
made either by the teacher or by the student, and all being well, the student would mature to the
point where the system had no improvements to suggest.
6. Conclusion

The conclusion to which these arguments lead is that the research community should now be
turning its attention to developing more sophisticated ways of using computers in both promoting
and assessing the maturity of children’s writing. A caveat is that computers can obviously only be
applied to writing that is stored on a computer, so a computer system is irrelevant so long as
children handwrite their compositions and exam answers (as they still do in the UK). Assuming that
the world is inexorably moving towards more use of computers in classroom teaching and
examinations, we now have a short window of opportunity to prepare for the time when every
piece of writing produced by a school child could be processed by a computer.
A computer system as described here could, in principle, offer a number of different services
to different people:
For a learner, it could offer instant feedback on a growing piece of writing, whenever the
learner is ready for feedback – during a sentence, at the end of a sentence or at the end of
the entire text. The feedback could take a number of different forms, from a global score for
the maturity of the vocabulary or the syntax, to specific suggestions for alternative wordings.
This feedback would prepare learners for the harsh reality of mature writing, with all its
rewriting. In contrast, learners rarely re-write (Holdich et al., 2004).
For a teacher, it could offer formative assessment in the form of one or more scores for the
linguistic maturity of the writing, covering both its vocabulary and its grammar. Since the
features that show maturity vary with age and ability, it should be possible to produce
benchmark scores to guide teachers in identifying learners who need help.
For an examiner, it could deal with all the low-level features of presentation, and do so in an
objective way without interference from the content. This would lighten the examiner’s
workload, and leave them with the important task of assessing the higher-level features
such as coherence and originality. However, since low-level features such as spelling actually
predict higher-level quality, the computer could provide the examiner with a predicted
grade. The examiner could reject this prediction in the light of other evidence, but it might
provide a helpful starting point.
I would like to be able to finish by announcing that the research world of linguistics is ready
to provide whatever information is needed, including a reliable list of the distinctive
characteristics of linguistic maturity. Unfortunately that’s not quite so, but we’re not far off. I believe
the same is true of research in computational linguistics.
References
Allison, P., Beard, R., & Willcocks, J. (2002). Subordination in Children’s Writing. Language and
Education, 16(2), 97–111. https://doi.org/10.1080/09500780208666822
Chipere, N., Malvern, D., Richards, B., & Duran, P. (2001). Using a Corpus of School Children’s
Writing to Investigate the Development of Vocabulary diversity. In P. Rayson, A. Wilson, T.
McEnery, A. Hardie, & S. Khoja (Eds.), Proceedings of the Corpus Linguistics 2001 Conference
(pp. 126–133). Centre for Computer Corpus Research on Language.
Covington, M. A., & McFall, J. D. (2010). Cutting the Gordian Knot: The Moving-Average Type–Token
Ratio (MATTR). Journal of Quantitative Linguistics, 17(2), 94–100.
https://doi.org/10.1080/09296171003643098
Daffern, T., Mackenzie, N., & Hemmings, B. (2017). Predictors of writing success: How important are
spelling, grammar and punctuation? Australian Journal of Education, 61, 75–87.
Durrant, P., & Brenchley, M. (2018). Development of vocabulary sophistication across genres in
English children’s writing. Reading and Writing, 32, 1927–1953.
https://doi.org/10.1007/s11145-018-9932-8
Green, L., McCutchen, D., Schwiebert, C., Quinian, T., Eva-Wood, A., & Juelis, J. (2003).
Morphological Development in Children’s Writing. Journal of Educational Psychology, 95,
752–761.
Harpin, W. (1976). The second “R”. Writing development in the junior school.
Holdich, C. E., & Chung, P. (2003). A “computer tutor” to assist children develop their narrative
writing skills: Conferencing with HARRY. International Journal of Human-Computer Studies.
https://doi.org/10.1016/S1071-5819(03)00086-7
Holdich, C. E., Chung, P., & Holdich, R. (2004). Improving children’s written grammar and style:
Revising and editing with HARRY. Computers and Education, 42, 1–23.
Hudson, R. (2009). Measuring maturity. In R. Beard, D. Myhill, M. Nystrand, & J. Riley (Eds.), SAGE
Handbook of Writing Development (pp. 349–362). Sage.
Hudson, R. (2010). An Introduction to Word Grammar. Cambridge University Press.
Hunt, K. (1965). Grammatical structures written at three grade levels. National Council of Teachers of
English.
Kress, G. (1979). Conjoined Sentences in the Writing of 7 to 9 Year Old Children. UEA Papers in
Linguistics, 10, 1–18.
Loban, W. (1963). The Language of Elementary School Children. A study of the use and control of
language and the relations among speaking, reading, writing and listening. National Council
of Teachers of English.
O’Donnell, R. C., Griffin, W. G., & Norris, R. C. (1967). Syntax of kindergarten and elementary school
children: A transformational analysis. National Council of Teachers of English.
Perera, K. (1984). Children’s writing and reading. Analysing classroom language. B. Blackwell in
association with A. Deutsch.
Perera, K. (1990). Grammatical differentiation between speech and writing in children aged 8 to 12.
In R. Carter (Ed.), Knowledge About Language and the curriculum (pp. 216–233). Hodder and
Stoughton.
Stormzand, M., & O’Shea, M. (1924). How Much English Grammar? An Investigation of the
Frequency of Usage of Grammatical Constructions in Various Types of Writing together with
a Discussion of the Teaching of Grammar in the Elementary and the High School. Warwick
and York. https://babel.hathitrust.org/cgi/pt?id=mdp.39015009312466&view=1up&seq=9
Wilkinson, A., Barnsley, G., Hanna, P., & Swan, M. (1983). More Comprehensive Assessment of
Writing Development. Language Arts (NCTE), 60, 871–881.
Yerrill, K. (1977). A consideration of the later development of children’s syntax in speech and writing:
A study of parenthetical, appositional and related items. Newcastle upon Tyne.