CHAPTER 7: CONCLUSIONS AND AREAS FOR FURTHER STUDY

CHAPTER 7: CONCLUSIONS AND AREAS FOR FURTHER …dwb5.unl.edu/Diss/Kokensparger/Chapter7.pdf · CHAPTER 7: CONCLUSIONS AND AREAS FOR FURTHER ... the LMS was completing and submitting





Summary of Findings

The purpose of this study was to explore the relationship between students’

writing samples and their LMS usage patterns, and to investigate their potential

predictive value towards a specific student’s usage patterns within the course LMS. How

do what and how students write in submitted assignments relate to their usage patterns in

an LMS? Do specific types of writing (for example, word length of assignments) relate

more directly to LMS usage? Does a specific course type (file/assignment-dominant,

assignment/file-dominant, module/assignment-dominant, or file/module-dominant) relate

more directly to LMS usage? What does modeling tell us about how students’ writing

samples connect to how they engage themselves with an LMS in a hybrid course?

Based on the literature provided for similar studies in tutorial software programs

(Cocea & Weibelzahl, 2009), it was decided to use the Fast Clicker (FC) and High Times

Out (HTO) LMS usage patterns to emulate patterns presented in the literature by these

same phenomena. Since there was no intention of tying these patterns together with

performance data, the patterns were used as means to illustrate pattern training and

detection, and treated throughout the study as having neutral value. Therefore, the terms

“engagement” and “disengagement,” though applied liberally in the literature in regard to

the absence and presence (respectively) of these patterns among the subjects, were of no

applicability in this study.



Outlier and quartile rankings were determined for each student in each course and

each writing assignment selected for each course, generating correlation tables to display

relationships between the page views (PV) session metrics and URL token categorical

rankings overall, and for each course type (FA, AF, MA, and FM). Correlation tables

were also generated for writing sample (WS) metrics and linguistic categorical rankings

against the entire PV dataset as well as tables for each WS assignment type (A, an

average word count less than 1000; B, an average word count between 1000 and 2000; C,

an average word count between 2000 and 3500; and D, an average word count greater

than 3500).

After that, decision trees were generated for both the Fast Clicker (FC) usage

pattern and the High Times Out (HTO) usage pattern. In all, 72 decision trees were

generated, 36 per usage pattern, which included WS-only and ALL versions of the entire data

set of records, each of the four course types (FA, AF, MA, and FM), and each of the four

writing sample classes (WA, WB, WC, and WD). For these 18 combinations of decision

trees for each LMS usage pattern examined (FC and HTO), both full and pruned versions

of all decision trees were generated, giving a total of 36 per pattern, and 72 overall.
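
The tree count above can be checked by enumerating the combinations. The following is a minimal sketch in Python; the label strings (for example "entire", "full", and "pruned") are illustrative names, not identifiers taken from the study.

```python
from itertools import product

# Illustrative labels only; the study's own file or variable names are not known.
patterns = ["FC", "HTO"]                      # the two LMS usage patterns
feature_sets = ["WS-only", "ALL"]             # writing-sample-only vs. all features
subsets = ["entire", "FA", "AF", "MA", "FM",  # whole data set and four course types
           "WA", "WB", "WC", "WD"]            # four writing sample classes
prunings = ["full", "pruned"]                 # full and pruned versions of each tree

# One tuple per decision tree.
trees = list(product(patterns, feature_sets, subsets, prunings))

combos_per_pattern = len(feature_sets) * len(subsets)
per_pattern = {p: sum(1 for t in trees if t[0] == p) for p in patterns}

print(combos_per_pattern)  # 18
print(per_pattern)         # {'FC': 36, 'HTO': 36}
print(len(trees))          # 72
```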

After the decision trees were generated, they served as the basis for a discussion of

how they might be used in a typical hybrid course to identify students who have

tendencies, through their writing samples, to adopt the FC and HTO LMS usage patterns.

The generation of decision trees for other patterns of interest, the effect of the patterns on

student learning, as well as the systematic embodiment of these decision trees in software

for the actual prediction of these patterns, were left for future research.



Interpretation of Findings and Suggestions for Further Research

What evidence do the findings provide to answer these questions? Student

LMS usage is largely dictated by the instructor and the course, and instructors generally

structure courses the same way in LMSs, no matter how much the content and level of

courses differ. They generally do what they’re comfortable doing in the LMS. The page

views (PV) metrics, which include not only the total page views (TPV) but also the

course page views (CPV), and some calculated values based upon these page views as

they exist in the date and time matrix (called session metrics) are shown in Chapter 4, which

reports the means, medians, and standard deviations of these session metrics over all of

the courses.

The average of the total page views means for all students in all courses was

872.7, whereas the median of all students over all courses was 588.5, which is

considerably less than the average. This indicates that those students who had ranked on

the high side of their peers in total page views for the semester tended to have higher

values than those who ranked on the low side. This is not surprising, since the number of

page views that a student could have in a semester is virtually unlimited, whereas there is a lower

limit: zero. For students who wish to exist within and pass these hybrid courses, 0 page

views are not an option. Minimally, they should have some minimum number of total

page views for the semester that they would need to accumulate in the process of

submitting assignments and perhaps minimally participating in graded discussions.

It should be noted that the total page views (TPV) category is highly dependent

upon the number of courses in which the student is enrolled that semester that used the

LMS. Some students will have a large difference between their TPV levels and their



CPV (course page views) levels. As such, the CPV levels are more effective at revealing,

in a raw manner, an individual student’s participation in the LMS aspects of that

particular course. In this study, the mean of means for all students in course page views

was 416.2. The median of all medians produced by the CPV values over all courses was

341.5. The standard deviation over all of these values for CPV is 94.5, which shows

much less variation than the TPV standard deviation of 202.2. This isn’t surprising,

because every student was enrolled in the study course, but some students were also enrolled in

other courses that used the LMS, which inflated the TPV value but not the CPV value.

The other page views session metrics of interest here are the AMC (average

minutes per click) and RSD (repeated sessions days) values, mostly because they most

affected which students were pre-classified with Fast Clicker and High Times Out

patterns, respectively. In the AMC category, both the mean of means and the median of

medians were 0.5. This finding provides an excellent baseline for evaluating students

against this pattern: The more a student is below 0.5 in her AMC metric, the more that

student looks like a Fast Clicker in her LMS usage.

The RSD mean of means was 17.7, but the median of medians was only 11.75.

Unfortunately, this value does not give the apparent baseline that was observed in the

AMC metric above, and provided less confidence in finding a meaningful pattern as the

study continued. However, since High Times Out patterns were drawn from higher

values of RSD, the same general conclusion might be drawn regarding this pattern: The

higher the value that a student has for her RSD metric, the more that student looks like a

High Times Out student in her LMS usage. The only element that is missing in this

finding is a solid number with which to determine what is meant by a “high” value for



RSD. As shall be seen, this concern will be nullified through preprocessing the values

for each student into rankings against the values for that student’s course peers.

Among the page views URL token categorical dimensions (Assig, Conve, Files,

Grade, Modul, Topic, View, and Wiki), the mean of means produces a virtual tie for

dominance between two categories. Assignments (Assig, 26.6) and Files (Files, 24.5)

clearly lead all other categories. In itself, this does not say much about the students, but

actually more about the courses that were included in the study. For these courses, the

predominant activity in the LMS was completing and submitting assignments and

viewing files to access course content. With this finding, it was not surprising that A or F

ended up as one of the top two values of all four course types (FA, AF, MA, and FM).
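
As an illustration of how a two-letter course type code could follow from the category percentages, the sketch below takes the two highest category means and joins their initials. This is an assumed reading of the course type labels, not the study's documented procedure; the function name `course_type` is hypothetical, and only the four category means reported in this chapter are used.

```python
def course_type(category_means):
    """Join the initials of the two highest-valued URL token categories.

    Assumption: a course type code such as "FA" names the two dominant
    categories in descending order. The study may have derived it differently.
    """
    top_two = sorted(category_means, key=category_means.get, reverse=True)[:2]
    return "".join(name[0] for name in top_two)

# Means of means reported in this chapter (the other four categories are not given).
aggregate = {"Assig": 26.6, "Files": 24.5, "Modul": 10.7, "Grade": 10.4}

print(course_type(aggregate))  # AF
```

Applied to the aggregate values, the sketch yields an assignment/file-dominant profile, which is consistent with Assignments and Files leading all other categories.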

Even in an LMS like Canvas™, which is rich with tools and options, most hybrid

courses fall into one of about four basic categories, mostly differentiated by how content

is delivered. Distance courses will likely be different, as course discussions and

socialization are borne by the LMS and do not happen in the brick and mortar classroom.

So URL token categories such as Topic and View and perhaps Conversations will

become more prominent in the scenario of the distance course.

The surprising result is the virtual tie for third place of prominence between

Modules (Modul, 10.7) and Grade (10.4). One might expect the Grade value to be higher

among courses that primarily used the LMS as a grade book to report grades to

students. However, all of the courses included in this study provided some content via

the LMS, in the Files, Modules, or Wiki areas. Therefore, the high value for Grades is

somewhat mystifying, until one considers that Grades is the one element that is common

to all courses. All of the courses, no matter how they were designed, generally made



grades available to students. So a mean of means should naturally reflect this reality to a

certain extent.

The Modules category (Modul) was lower than the Files and Assignments

categories simply because fewer courses were designed to deliver content via modules

than those which were designed to deliver content via Files. Another surprising result is the

consistently low value of the Wiki category. The Wiki pages inside the Canvas™ LMS

are provided for instructors to generate pages to deliver course content directly to the

students within the course. It is apparent that none of the instructors in the study courses

were using that feature to any significant extent. As this LMS was still being rolled out at

the time of data collection, it is quite possible that the courses were still being

migrated from the old LMS into the new one, and instructors were more likely to simply

link to files (even HTML files) brought over from the old LMS than to create new Wiki

pages to distribute course content. Perhaps as this LMS becomes more mature in this

institution, the value of the Wiki category in the page views will increase to prominence.

Among the descriptive statistics from writing samples, the metrics cannot be

directly compared to the page views metrics because there are more writing samples than

there are courses. In all, there were 27 writing assignments captured among 14 courses,

averaging almost two writing assignments per course (some had three collected, others

only one). Therefore, it is prudent to view the descriptive statistics of these writing

samples separately from the course descriptive statistics.

The mean of means among all writing samples for total word count (WC in the

table) is 1528. The median of medians is 1067. This median is well below the mean,

again due to the relatively unlimited upper bound and the rock solid lower bound (it



would not be prudent for a student to turn in an assignment with a word count of 0, but

may be prudent to turn one in with a word count of 5000). The average words per

sentence (WPS) metric is a difficult one because it is already a mean when reported

for each student. A mean WPS among students in a course is therefore already a mean of

means, and the descriptive statistic reported at the beginning of this chapter is a mean of

means of means. However, for what it’s worth, this value was 28.1 among means

(averages) and 24 in the median of medians, which is a pattern that is consistent with the

rest of our data. The percentage of six-letter or greater words in the writing samples,

percentage of dictionary words used in the writing samples, and number of function

words used in writing samples, were all consistent between their means of means,

medians of medians, and standard deviations.

In the writing samples categorical values, which are drawn from percentages of

words used in writing samples that were included within the various categories (and some

could be in more than one category), the use of cognitive-mechanic (cogmech) words

trumped all other categories with a mean of means of 16.6. The next closest category

was the prepositional word use category (preps, 14.8), followed by a trio of categories

that had similar values (verb, 12.4; relativ, 11.5, and pronoun, 10.8). In all of these

categories, the means of means were very close in value to the medians of medians,

though curiously the medians were always slightly lower than the means. The

article category appeared to be an exception, but its median was also slightly lower when viewed at a higher precision.

Focusing on descriptive statistics alone, future research could be done

in observing more closely the balance of percentages within any specific course offering.

As there are a number of parameters that can be used (this study chose 8 from among 25



PV URL token categories), a serious study in this area may include more such categories,

or perhaps new ones as they are added as new features in the LMS.

Interpretation of Correlational Results

Three separate correlational tests were conducted. The key to these tests was

found in the procedure created to rank students into low outliers (0), quartiles (1 through

4, with 1 being the lowest 25%), and high outliers (5). This ranking system allowed

students in one course, for example, which required a writing assignment of 3000 words,

to be compared in terms of word count to students in a course which only required a

writing assignment of 750 words. Had analysis been done on the raw word count values

among these students, they would have simply been grouped according to the courses

within which they were enrolled. By ranking the students among their peers (for

example, a student submitting an assignment with 2500 words in the first example would

rank similarly to a student submitting an assignment with 600 words in the second

example), it was possible then to do much of this correlational analysis and to compare

the page views and writing sample features of all 366 students in the study and various

subsets of those students. This system also gave some credence to outliers in various

categories, but did not allow the outliers to inordinately affect the results in those

categories where they existed. As will be apparent in the decision tree analysis,

sometimes being an outlier in a specific category was a chief discriminator in

classification of that student.
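
The ranking scheme described above can be sketched as follows. The chapter does not restate how outliers were identified, so this sketch assumes the common 1.5 × IQR fence rule; the function name `rank_among_peers`, the `fence` parameter, and the word counts are hypothetical and serve only to illustrate the idea.

```python
import statistics

def rank_among_peers(values, fence=1.5):
    """Rank each value against its course peers on a 0-5 scale.

    0 = low outlier, 1-4 = quartile (1 is the lowest 25%), 5 = high outlier.
    Assumption: outliers lie beyond `fence` times the interquartile range.
    """
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - fence * iqr, q3 + fence * iqr
    ranks = []
    for v in values:
        if v < low_fence:
            ranks.append(0)
        elif v > high_fence:
            ranks.append(5)
        elif v <= q1:
            ranks.append(1)
        elif v <= q2:
            ranks.append(2)
        elif v <= q3:
            ranks.append(3)
        else:
            ranks.append(4)
    return ranks

# Hypothetical word counts for a 3000-word and a 750-word assignment.
course_a = [2400, 2500, 2700, 2900, 3000, 3100, 3200, 3400, 5200]
course_b = [580, 600, 680, 720, 750, 780, 800, 850, 1300]

ranks_a = rank_among_peers(course_a)
ranks_b = rank_among_peers(course_b)
print(ranks_a)  # [1, 1, 1, 2, 2, 3, 3, 4, 5]
print(ranks_b)  # [1, 1, 1, 2, 2, 3, 3, 4, 5]
```

In this illustration the 2500-word submission in the first course and the 600-word submission in the second both receive a rank of 1, so the two students can be compared even though their raw word counts differ by a factor of four.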

The first set of correlational tests, the page views session metrics (TPV, CPV,

etc.), were tested for relationships with the page views URL token categories (Assig,



Conve, etc.). The table and a brief analysis were provided in Chapter 4. What is perhaps

most striking about this table are the strong relationships that occur between the students

pre-classified as Fast Clicker students through their low average minutes per click (AMC)

rankings, and their page views category rankings. Certainly one would expect a strong

relationship between AMC rankings and other session metrics, such as TAM (total

accumulated minutes). But the categorical values are based on percentages, not

frequencies, so by chance they should not be directly correlated with the session metrics

according to instrument design. Yet strong relationships, positive and negative, existed

within this table, some significant at a level that rounds to p = 0.000.

With 364 degrees of freedom, a relatively low Pearson’s r coefficient value

becomes significant, as with this number of samples in any given correlation test it would

be truly difficult for chance to produce an erroneous relationship.
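
To make the point about sample size concrete, the smallest significant coefficient can be worked out from the standard relationship between Pearson’s r and the t statistic. The two-tailed 0.05 level is assumed here for illustration only; the significance levels actually used are those reported in Chapter 4.

```latex
t = \frac{r\sqrt{df}}{\sqrt{1 - r^{2}}}
\quad\Longrightarrow\quad
r_{\text{crit}} = \frac{t_{\text{crit}}}{\sqrt{t_{\text{crit}}^{2} + df}}
\approx \frac{1.967}{\sqrt{1.967^{2} + 364}} \approx 0.10
```

Under that assumption, any coefficient with an absolute value above roughly 0.10 would reach significance with 364 degrees of freedom.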

One of the striking results of this test is that, in general, students who rank higher

than their peers in visiting Assignments in the LMS have the lowest rankings in nearly all

session metrics. Activities that are associated with the process of preparing and

submitting assignments do not lend themselves to a high number of clicks in the LMS.

In almost opposite (though equally significant) relationship to the PV session

metrics is the Grade category. A student’s percentage of visits to the Grade area of the

LMS is generally positive in relation to the session categories. Therefore, students who

have the highest rankings in the Grade category also have the highest rankings in total

page views (TPV), course page views (CPV), and the other session metrics. The only

exception is Grade’s negative significant correlation to the average minutes per session

(AMS) and average minutes per click (AMC), where higher rankings in Grade checking



activity meant lower rankings in average minutes per session and average minutes per

click. This is logically understandable, as it stands to reason that students who are doing

non-productive (purely from a work production perspective) activities tend to spend less

time in the LMS as they quickly log in to check grades and then log back out again.

In viewing those students pre-classified as Fast Clicker (FC) students, the only

positive significant correlations with page views URL token categories are with

Conversations and Grades. As noted above, grade-checking activity inside the LMS does

not produce work in terms of assignments nor consume course content. The only other

category that is not obvious in terms of work producing and content consumption is

Conversations, where a student composes and reads messages to and from the instructors

or other students in the course. This category, too, is positively correlated with Fast

Clicker students, though this is not as intuitively logical as the Grades result.

Composing conversations (internal email-like messages) takes time, and does not seem to

fit with the Fast Clicker approach. However, reading messages is a very quick and click-

laden process. With this in mind, the interpretation is that students with higher rankings

in Conversations often simply read their messages, but rarely compose them. It is the

only scenario that makes sense in this context.

The only other result from this PV Aggregate table that is of great interest concerns the

Modules category (Modul). This category is negatively correlated with ACS (average

clicks per session) and with the Fast Clicker classification, but positively correlated with

TAM (total accumulated minutes), AMS (average minutes per session) and AMC

(average minutes per click). From these results it could be interpreted that students who

rank high in accessing modules (in those courses which use them to provide course



content) proceed through a session at a much slower rate than those who rank low in

Modules-related activity. This means that the Modules category can be associated, more

than any of the other categories, with course content. Modules activity is course content

consumption activity, which is antithetical to Fast Clicker LMS usage patterns.

The High Times Out classification of students showed much lower correlation

with page views URL token categorical rankings, though it joined Fast Clicker students

in being positively correlated with grade activity (showing that students who checked

their grades a lot sometimes logged in more than once a day at various times in the day to

do so) and negatively correlated with assignments activity (showing that students who

spent time viewing, producing, and submitting assignments tended to do so in one sitting

and did not have as many repeated logins during a given day).

Therefore, without even testing the writing sample data against the LMS usage

data, some patterns of LMS use emerged which provided answers to this study’s research

questions. These patterns could be further explored in future work within this area,

especially since all parameters lay within the page views data freely available to all

instructors and administrators within the LMS. This study only used session metrics that

were generated from the time-date stamps provided through the page views data.

Matrices of day and time were also processed for each student, but were not used since

they had little relevance towards the Fast Clicker or High Times Out patterns that were

explored in this study. These matrices are a whole other area of exploration that could be

undertaken in future work in this field.

The second set of correlational tests, the Page Views Course Types Aggregates,

was conducted primarily to explore one of the sub-questions of the study, “Does a



specific course type (file/assignment-dominant, assignment/file-dominant,

module/assignment-dominant, or file/module-dominant) reveal itself more readily in its

corresponding LMS usage?” The FA (Files--Assignments) course type revealed strong

negative correlations between the TPV (total page views) metric and the two primary

categories of this course type, Assignments and Files. Again, Grades and Files had

similarly strong correlations, but with opposite polarity to one another. As

with the PV Aggregate analysis of the entire dataset of records, students who were ranked

high in visits to course content areas were ranked low in visits to their grade books. The

AF (Assignments--Files) Course Aggregate correlational analysis confirmed the results

from the FA course test, with some small variations. The MA (Modules--Assignments)

Course Aggregate correlational analysis also confirmed the antithetical relationship

between visits to the grade book and content areas, but in this case the strongest negative

correlations were in the Assignments category, with a sprinkling of significant negative

correlations in the Modules category, as one might expect. Finally, the FM (Files--

Modules) Course Aggregate correlational analysis confirmed the results as mentioned

above, but with two glaring differences: First, the category with the strongest significant

correlations was not one of the primary categories for that course type: it was the

Assignments category. Secondly, Grades and Assignments actually matched polarity in

one page views session metric, that of AMS (average minutes per session), where they

were both negatively correlated.



Type   FC -- Neg                FC -- Pos        HTO -- Neg   HTO -- Pos

FA     None                     None             None         None

AF     Assignments              Conversations    Files        None

MA     Module                   Conversations,   None         None

FM     Files, Modules, View,    None             None         None

Table 7.1: Course Type Correlations by Category with FC and HTO Usage Patterns

Table 7.1, above, displays the results in terms of correlating with the two example

usage patterns for this study, Fast Clicker (FC) and High Times Out (HTO). It is

apparent from the table that none of these course types provided solid relationships

between the page views URL token categories and the High Times Out pattern, but three

of the four provided solid relationships between those categories and the Fast Clicker

usage pattern.

Future work in this area could include an expansion of how the datasets might be

sub-divided, such as by course requirement type (Gen-Ed, elective, major-required,

major-elective, etc.) or by class level (freshman, sophomore, junior, senior), or any other

way that might make sense. Each of these different views of the courses would provide

their own correlations among the parameters and would also provide their own decision

trees to produce tendencies as has been done in this study with the two subgroups

discussed herein.

The third set of correlational tests, the Writing Sample Metrics Aggregate

Analysis, finally begins to focus upon the primary research question of the study, “How



do what and how students write in submitted assignments relate to their usage patterns in

an LMS?”

As the page views course type analyses gave insight into the Fast Clicker usage

pattern, the writing samples seem to give the most insight into the High Times Out usage

pattern. SIXLev (student ranking in usage percentage of

words of six letters or more) was positively correlated with the HTO usage pattern,

and the rankings for use of prepositions and relativity words were both negatively correlated

with that pattern. The only category correlating with the Fast Clicker usage pattern

was a positive one in the student’s use of verbs in writing samples.

The student’s use of verbs in writing was also positively correlated with that

student’s visit to the Grades area of the LMS, but negatively correlated to the student’s

visit to the Files area. This supports and confirms the antithetical relationship between

viewing the grade book and viewing content areas. What these students’ use of verbs in

writing has to do positively with checking the grade book and negatively with viewing

files is another area for further study.

The final set of correlational tests was conducted with classifications of writing

assignments by word count, in an attempt to answer the research question regarding

writing sample type and its effect on recognizing patterns in LMS use. As detailed

above, the writing samples were classified as A through D, with A being assignments

with average word count of less than 1000, B with 1000 to 2000, C with 2000 to 3500,

and D with 3500 or above.
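
The classification by word count can be sketched as a simple lookup. The chapter does not say which class an average of exactly 1000 or 2000 words falls into, so the half-open intervals below are an assumption, and the function name `writing_class` is hypothetical.

```python
def writing_class(avg_word_count):
    """Map an assignment's average word count to writing sample class A-D.

    Assumption: each boundary value belongs to the higher class,
    so an average of exactly 1000 words is treated as Class B.
    """
    if avg_word_count < 1000:
        return "A"
    if avg_word_count < 2000:
        return "B"
    if avg_word_count < 3500:
        return "C"
    return "D"

# Assignment lengths mentioned in this chapter's examples.
print(writing_class(750))   # A
print(writing_class(3000))  # C
```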

In Class A writing assignments, the use of six-letter or greater words (SIXLev)

and the use of prepositions were the writing categories with the highest number of



correlations with page views URL token categories. The FC usage pattern correlated

negatively with the use of six-letter or greater words, while the HTO usage pattern

correlated positively with the same category (SIXLev), but negatively with prepositions.


In Class B writing assignments, the relativity word use category (relLev) becomes

prominent, with 10 significant correlations in the 19 categories. However, it only

correlates with one of the two usage patterns examined in this study, negatively, with the

HTO pattern. Relativity words usage is also positively correlated with visits to the

Assignments area of the LMS, and negatively with the View area. What is most

interesting about this class of writing assignments is what is missing: Grade and

Conversations have no significant correlations with any of the writing metrics or

categories in Class B assignments.

In Class C and Class D writing assignments, most of the significant correlations

fall away as the degrees of freedom drop. In Class C writing, only HTO

correlates with the use of prepositions, but as this is the only significant correlation

(except for RSDLevs, the session metric closely associated with HTO by design), it is suspected of

being spurious. In Class D, there are a few more significant correlations, but none

affecting the FC or HTO LMS usage patterns into which this study seeks some insight.

One final set of correlational tests was performed with “binned” students, or with

only sets of students that were pre-classified as FC, HTO, or Both (FC and HTO). These

tests were performed to determine if there were specific relationships between writing

sample metrics and categories and page views metrics and URL token categories within

these groups of students. The plan was to use any correlations that were produced in



these tests to inform and guide the construction of decision trees in the next set of


In the Bin All group (all FC, HTO, and/or Both pre-classified students), a number of

significant correlations were generated by the tests. The dominant writing categories

were SIXlev (six-letter or greater) word use, functional words used (FUNlev), pronoun

use, and cognitive-mechanical word use. What is surprising with this test is the

emergence of dominant categories that have not shown up as dominant (except for

SIXlev) in previous correlational tests. Also surprising in this Bin All table is that the

visits to URLs where the students viewed content (such as streaming videos) were

significantly correlated with five writing categories: positively with SIXlev (which

was also positively correlated with most of the page views session metrics) and

negatively with DIClev, FUNlev, pronouns, and relativity word use.

The binned FC students also had a number of significant correlations, especially

in FUNlev, pronoun use, and prepositions use. The View category was the only page

views URL token category with a significant relationship to the LMS usage pattern.

In the HTO and Both bins, the correlations drop out considerably, suggesting that

binning these two categories for this pattern is unproductive.
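
The binning step described above can be sketched as follows. This is an illustration only: it assumes each student carries two pre-classification flags, that the FC, HTO, and Both bins are mutually exclusive, and that Bin All is their union. The function name and student identifiers are hypothetical.

```python
from collections import defaultdict


def bin_students(flags):
    """Group students into FC, HTO, Both, and All bins.

    flags maps a student id to a pair (is_fc, is_hto).
    Assumption: a student showing both patterns goes only into "Both".
    """
    bins = defaultdict(list)
    for student_id, (is_fc, is_hto) in sorted(flags.items()):
        if is_fc and is_hto:
            label = "Both"
        elif is_fc:
            label = "FC"
        elif is_hto:
            label = "HTO"
        else:
            continue  # students showing neither pattern are not binned
        bins[label].append(student_id)
        bins["All"].append(student_id)
    return dict(bins)


# Hypothetical pre-classification flags for four students.
example = {
    "s1": (True, False),
    "s2": (True, True),
    "s3": (False, True),
    "s4": (False, False),
}
for name, members in bin_students(example).items():
    print(name, members)
```

Correlational tests would then be run separately on the students in each bin.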

In summary, the correlational tests do show that there are a number of significant

relationships between what students write and submit in their writing assignments, and

how they use their course LMS. As this study is not designed to show causality,

inferences were not made regarding these correlations. However, the most significant

result of these findings is that they open many avenues for future research, some of

which may examine the issue of causality or at least explore each of the significant

relationships more closely.

Decision Trees

Decision trees were generated to produce tendencies toward both the FC and HTO

LMS usage patterns, as a way to look at the relationships discovered in the correlation

mining from a different perspective. Although decision trees are generally used for

predicting outcomes in test data based on the training data, the number of parameters

involved in this study makes the use of the generated decision trees as predictive tools

less than adequate. However, the value of these decision trees is squarely centered upon

the eliciting of the metrics and categories that tend to arise again and again among these

72 trees. As such, it was helpful to create Wordle™ images that represented the

prevalence of metrics and categories as they arose within the 36 decision trees generated

for each of the two patterns.
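
The tally behind such an image can be sketched in a few lines. The sketch assumes each tree is a nested dictionary of binary splits and that every split node counts once toward its parameter; the study's actual tree format and counting rule are not given here, and the two small trees are invented for illustration.

```python
from collections import Counter


def split_features(node):
    """Yield the parameter name at every internal (split) node of one tree."""
    if not isinstance(node, dict):
        return  # leaves are plain labels and carry no split parameter
    yield node["feature"]
    yield from split_features(node.get("left"))
    yield from split_features(node.get("right"))


def feature_frequencies(trees):
    """Count how often each parameter is used as a split across all trees."""
    counts = Counter()
    for tree in trees:
        counts.update(split_features(tree))
    return counts


# Two hypothetical trees; parameter names are borrowed from the text.
trees = [
    {"feature": "cogLev", "left": "FC",
     "right": {"feature": "SIXlev", "left": "not FC", "right": "FC"}},
    {"feature": "SIXlev",
     "left": {"feature": "Module", "left": "FC", "right": "not FC"},
     "right": "not FC"},
]
for name, count in feature_frequencies(trees).most_common():
    print(name, count)
```

The resulting counts are the kind of weights a word-cloud tool uses to size each term.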

FC Metrics and Categories Frequencies in Generated Decision Trees

Among the decision trees generated to produce tendencies of students fitting the

Fast Clicker LMS usage pattern, the Wordle™ image (Figure 7.1) shown below

represents the metrics and categories that were included in the trees, with their number of

inclusions making them more prominent (in terms of size) in the display.

Figure 7.1: Wordle™ Diagram of FC Decision Tree Discriminators

A quick glance at the image reveals five parameters that are the most prevalent

among the branches of all 36 trees that were generated for the FC pattern: cogLev,

SIXlev, DIClev, Module, and artLev. Two of the parameters were from writing sample

metrics (SIXlev and DIClev), two were from writing sample categories (cogLev and

artLev), and one was from the page views URL token categories (Module). This provides

a good balance between parameters in all three major datasets. It is of interest to also

note that Grades, Files, and Assignments, despite how prominent they were in the

descriptive statistics and correlations, were not prevalent enough in the decision trees to

be prominently displayed in the diagram.

Future work with students in the Fast Clicker pattern should include further

investigation into those five specific parameters, and their actual predictive value for this

Fast Clicker LMS usage pattern.

HTO Metrics and Categories Frequencies in Generated Decision Trees

Among the decision trees generated to produce tendencies of students fitting the

High Times Out LMS usage pattern, the Wordle™ image shown below, in Figure 7.2,

represents the metrics and categories that were included in the trees, with their number of

inclusions making them more prominent (in terms of size) in the display.

Figure 7.2: Wordle™ Diagram of HTO Decision Tree Discriminators

A quick glance of the image presents more parameters of prominence than the FC

pattern image, which is understandable as this usage pattern showed more variation

among the parameters during the correlation tests. Therefore, a more thoughtful

investigation of the diagram is required. One parameter dominated all others: prepsLev.

Three parameters were of secondary prominence: Grade, SIXlev, and artLev. The other

parameters gradually diminished in prominence, indicating that they were liberally and

variously distributed among the decision trees generated for the HTO pattern. It was

interesting that one of the four highest parameters was from writing sample metrics

(SIXlev), two were from writing sample categories (prepsLev and artLev), and one was

from the page views URL token categories (Grade). The SIXlev and artLev parameters

were also prominent in the FC decision trees. It is also of interest to note that the Grade

category did appear as prominent in these trees, but again, Files, Modules, and

Assignments, despite how prominent they were in the descriptive statistics and

correlation tests, were not prevalent enough in the decision trees to be prominently

displayed in the diagram.

Future work with students in the High Times Out pattern should include further

investigation into those four specific parameters, and their actual predictive value for this

High Times Out LMS usage pattern.

Sample Test Class Results from Decision Trees

Finally, in running data through the generated decision trees from a sample class

of students taught in the same LMS during the Spring 2013 semester, these results were


In the FC Decision Tree Tendency Composite Table (Table 6.5), several

tendencies stood out as consistently high among scores from all three writing assignments

in the sample course. Four students, from among the 15, were identified as having

tendencies towards the Fast Clicker LMS usage pattern based upon the features of their

writing samples. Two of those students were shown to have composite tendencies by two

different decision trees. Perhaps it is more interesting to note the decision trees that

provided the composites than the students who were tentatively classified. The

instruments were F4 and F6, which are the DecTreeFC_AF_All and

DecTreeFC_WA_All, respectively. This suggests that future work in building predictive

models (rather than the exploratory ones here) should focus on decision trees generated

with both WS and PV parameters (All) and on subsets of students based on course type

(AF) and writing assignment type (WA), rather than on the entire dataset of students from

all classes and exhibiting all writing types.

In the HTO Decision Tree Tendency Composite Table (Table 6.10), only two

students were classified consistently over all three writing assignments, and neither was

classified by more than one tree. However, the two trees that classified the two students

were the same versions of the trees as the FC trees mentioned above. This confirms that,

for both patterns, exploration should be centered on decision trees generated from a

combination of both the PV and WS data, and on subsets gathered from the same specific

course types and writing sample assignment classes.
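
The idea of a composite tendency can be sketched as follows, assuming each tree returns a yes/no classification per student per writing assignment and that a composite means agreement across every assignment. The actual scoring in Tables 6.5 and 6.10 may differ; the function name, student identifiers, and all values below are invented for illustration and are not results from the study.

```python
def composite_tendencies(predictions):
    """Find students whom a tree flagged on every writing assignment.

    predictions maps tree name -> student id -> list of booleans,
    one per writing assignment. Returns student id -> sorted list of
    the tree names that flagged that student on all assignments.
    """
    composites = {}
    for tree_name, by_student in predictions.items():
        for student_id, flags in by_student.items():
            if flags and all(flags):
                composites.setdefault(student_id, []).append(tree_name)
    return {sid: sorted(names) for sid, names in composites.items()}


# Hypothetical classifications over three writing assignments.
predictions = {
    "DecTreeFC_AF_All": {"S01": [True, True, True], "S02": [True, False, True]},
    "DecTreeFC_WA_All": {"S01": [True, True, True], "S03": [True, True, True]},
}
for student_id, tree_names in sorted(composite_tendencies(predictions).items()):
    print(student_id, tree_names)
```

A student listed under more than one tree corresponds to the case described above of a composite tendency shown by two different decision trees.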

Summary of Conclusions

The conclusions may be summarized in this manner:

• There is a relationship between how a student writes and how a student

uses an LMS. This relationship can be represented by some of the general

patterns produced in the study, such as the tendency for students who visit

the Grades and Conversations (messaging) parts of the LMS to spend less

time and fewer clicks in the content areas of the courses, such as in Files and


• Because this relationship exists, there is great potential for building

predictive models of LMS usage based on student writing samples (e.g.,

with Decision Trees). These models were explored and discussed, and

show promise for future studies in this area.

• The writing characteristics that are the best ones to use for this modeling

will vary somewhat according to the way the course is built and managed

in the LMS as well as the type and length of the writing sample involved.