Source: csis.pace.edu/~ctappert/it691-projects/jigar.pdf

+

Research Seminar Fall ’15

December 1, 2015

Good Afternoon!

Mr. Jigar Jadav

+ Mobile Learning & Big Data Analysis on Student Generated Data

Outline:
1. Introduction
2. Current Work
3. Challenges & Literature
4. Data Analytics Lifecycle
5. Text Mining
6. Contribution, Conclusion & Future Work

+ The Big Question:

Can data analysis (specifically, term frequency analysis) of student-generated data provide insight into mobile learning in a K-12 setting?

+ Introduction

■ Mobile Learning: Mobile learning can simply be defined as learning that is supported or delivered by a handheld or mobile device. (Hutchison et al., 2012)

■ Data Analysis: Data analysis is the process of transforming raw data into usable information. (Statistics Canada, 5th edition, 2009)

+ Current Work: 1. Mobile Learning Theories and Assessment (TPACK model)

The TPACK (technological pedagogical content knowledge) framework describes how teachers’ understanding of educational technologies and their pedagogical content knowledge (PCK) interact with one another to produce effective teaching with technology. (Koehler & Mishra, 2009)

+ Current Work:

1. Data Analysis in Other Fields:
■ Marketing
■ Finance
■ Sports
■ Advertising
■ Medicine
■ Customer Service

2. Data Analysis in Education:
■ Only qualitative data analysis has been done in mobile learning
■ No extensive research with quantitative data using machine learning and Big Data algorithms

+ Challenges & Literature:

■ Privacy: Student data is sensitive. Getting access to it is a long, painstaking process that requires multiple approvals.

■ Literature: Not enough literature is available on data/text analytics related to K-12 education.

+ Data Analytics Lifecycle:

Reference: EMC Education Services, 2015

+ Text Mining with Dummy Data in RStudio

• STEPS:
o Loading Texts (skip)
o Preprocessing
    Stage the Data
o Explore the data
    Word Frequency
    Plot Word Frequencies
o Relationships Between Terms
    Term Correlations
    Word Clouds!
o Clustering by Term Similarity
    Hierarchical Clustering
    K-means clustering

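+ Preprocessing (Very Time Consuming)

• Remove numbers, capitalization, common words, and punctuation, and otherwise prepare your texts for analysis.

• Removing punctuation:
    • Generic:
        library(tm)
        docs <- tm_map(docs, removePunctuation)
    • Customized:
        # Replace selected characters with a space instead of deleting them.
        # content_transformer() (tm 0.6 and later) keeps each element a tm document.
        toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
        docs <- tm_map(docs, toSpace, "/")
        docs <- tm_map(docs, toSpace, "@")
        docs <- tm_map(docs, toSpace, "\\|")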
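The difference between the generic and the customized approach is easiest to see on a single string. The following is a minimal base R sketch, not the presenter's code: the sample text is made up, and the `[[:punct:]]` pattern only approximates what `removePunctuation` does.

```r
# Toy input (hypothetical): separators that glue words together.
x <- "reading/writing teacher@school math|science"

# Generic removal deletes the characters, which fuses neighbouring words.
generic <- gsub("[[:punct:]]", "", x)

# Customized replacement swaps the same characters for spaces instead.
custom <- gsub("[/@|]", " ", x)

print(generic)
print(custom)
```

Replacing separators with spaces before the generic pass keeps terms such as "reading" and "writing" from being counted as one token.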

+ Preprocessing Continued:

• Removing numbers:
    docs <- tm_map(docs, removeNumbers)
• Converting to lowercase:
    # tolower is a base R function, so wrap it to keep tm documents intact (tm 0.6 and later)
    docs <- tm_map(docs, content_transformer(tolower))
• Other clean-ups performed:
    o Removing particular words
    o Stripping unnecessary whitespace from your documents
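The remaining clean-ups can be sketched on one string in base R. This is an illustration only: the sentence and the two-word stop list are invented. In `tm`, the corresponding transformations would be `removeWords` and `stripWhitespace`.

```r
# Toy document (hypothetical) after punctuation has already been removed.
doc <- "The 3 students   used 2 iPads in the  classroom"

# Tiny illustrative stop list; a real analysis would use a much longer one.
stop_words <- c("the", "in")

clean <- tolower(doc)                                   # lowercase
clean <- gsub("[[:digit:]]+", "", clean)                # drop numbers
pattern <- paste0("\\b(", paste(stop_words, collapse = "|"), ")\\b")
clean <- gsub(pattern, "", clean, perl = TRUE)          # drop particular words
clean <- trimws(gsub("\\s+", " ", clean))               # collapse whitespace
print(clean)
```

Whitespace is stripped last because every earlier step leaves gaps where the removed tokens used to be.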

+ Preprocessing: Stage the Data

• Create a document term matrix
    dtm <- DocumentTermMatrix(docs)
    ## A document-term matrix (6 documents, 2197 terms)
    ##
    ## Non-/sparse entries: 3867/9315
    ## Sparsity           : 71%
    ## Maximal term length: 40
    ## Weighting          : term frequency (tf)
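To make the summary figures concrete, here is a small base R sketch that builds a document-term matrix by hand and computes the same quantities. The three documents are invented, and this is not how `DocumentTermMatrix` is implemented internally. The last line applies the same arithmetic to the counts reported above.

```r
# Three toy documents (hypothetical), already preprocessed.
docs <- c(d1 = "data code data", d2 = "code analysis", d3 = "data research")

tokens <- strsplit(docs, " ", fixed = TRUE)
terms <- sort(unique(unlist(tokens)))

# Rows are documents, columns are terms, cells are raw term counts (tf).
dtm <- t(sapply(tokens, function(tok) table(factor(tok, levels = terms))))

non_sparse <- sum(dtm > 0)    # cells holding at least one occurrence
sparse <- sum(dtm == 0)       # empty cells
sparsity <- round(100 * sparse / length(dtm))

print(dtm)
cat("Non-/sparse entries:", non_sparse, "/", sparse, "\n")
cat("Sparsity:", sparsity, "%\n")

# The same arithmetic applied to the counts in the summary above.
slide_sparsity <- round(100 * 9315 / (3867 + 9315))
```

The reported counts are consistent with each other: 3867 + 9315 = 13182 = 6 × 2197 cells, of which about 71% are empty.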

+ Explore the data: Word Frequencies

• There are a lot of terms, so let’s check out some of the most and least frequently occurring words.

• Least:
    freq[head(ord)]
    ##          absent     abstractfre abstractionedit         accentu
    ##               1               1               1               1
    ##        accommod       accompani
    ##               1               1
• Most:
    freq[tail(ord)]
    ## research   qualit      qda  analysi     code     data
    ##      105      126      128      130      203      302
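The slides use `freq` and `ord` without showing how they were built. A plausible reading, stated here as an assumption, is that `freq` holds the column sums of the document-term matrix and `ord` is the ordering of those sums. The sketch below uses a made-up count matrix in base R. With a `tm` object the first step would presumably be `colSums(as.matrix(dtm))`.

```r
# Toy document-term matrix (hypothetical counts), documents in rows.
m <- matrix(c(2, 1, 0, 0,
              0, 1, 1, 0,
              1, 0, 0, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("d1", "d2", "d3"),
                            c("data", "code", "analysi", "research")))

freq <- colSums(m)   # total count of each term over all documents
ord <- order(freq)   # term positions, least frequent first

least <- freq[head(ord, 2)]
most <- freq[tail(ord, 2)]
print(least)
print(most)
```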

+ Another View of Word Frequencies

    wf <- data.frame(word=names(freq), freq=freq)
    head(wf)
    ##              word freq
    ## data         data  302
    ## code         code  203
    ## analysi   analysi  130
    ## qda           qda  128
    ## qualit     qualit  126
    ## research research  105

+ Plot Word Frequencies

• Words that appear more than 50 times
    library(ggplot2)
    p <- ggplot(subset(wf, freq>50), aes(word, freq))
    p <- p + geom_bar(stat="identity")
    p <- p + theme(axis.text.x=element_text(angle=45, hjust=1))
    p   # display the plot

+ Relationships Between Terms: Term Correlations

• Analysis of a particular term: identify the words that most highly correlate with that term
• If words always appear together, then correlation=1.0
    findAssocs(dtm, c("question", "analysi"), corlimit=0.98)
    ##         question analysi
    ## across      0.99    0.99
    ## flow        0.99    0.99
    ## format      0.99    0.99
    ## group       0.99    0.99
    ## less        0.99    0.99
    ## matter      0.99    0.98
    ## multipl     0.99    0.98
    ## eight       0.98    0.98
    ## still       0.98    0.98
    ## time        0.98    0.98
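The idea behind a term correlation can be sketched in base R as the Pearson correlation between two columns of the count matrix, taken across documents. The counts below are invented, and the sketch illustrates the concept rather than reproducing `findAssocs` itself.

```r
# Toy counts (hypothetical): rows are documents, columns are terms.
m <- cbind(question = c(1, 2, 3, 4),
           format   = c(2, 4, 6, 8),   # rises in step with "question"
           time     = c(4, 1, 3, 2))

# Pearson correlation of every term with "question" across documents.
assoc <- cor(m)[, "question"]
assoc <- round(assoc[names(assoc) != "question"], 2)
print(assoc)
```

With only a handful of documents, as in the six-document example above, each correlation rests on very few data points, so values close to 1.0 are probably best read with some caution.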

+ Relationships Between Terms: Word Clouds!

• People are generally strong at visual analytics. That is part of the reason that word clouds have become so popular.
• Plot words that occur at least 20 times
    library(wordcloud)
    library(RColorBrewer)   # provides brewer.pal()
    set.seed(142)
    wordcloud(names(freq), freq, min.freq=20, scale=c(5, .1), colors=brewer.pal(6, "Dark2"))

+ Clustering by Term Similarity: Hierarchical Clustering

• First calculate the distance between words, then cluster them according to similarity
    # dtmss is not defined on these slides
    d <- dist(t(dtmss), method="euclidean")
    fit <- hclust(d=d, method="ward.D")   # "ward" was renamed "ward.D" in R 3.1.0
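A toy run of the same two calls shows what the clustering produces. The count matrix is invented. `dtmss` is not defined in the slides. It is presumably the document-term matrix after very sparse terms were dropped (`tm` offers `removeSparseTerms` for that), but that is an assumption.

```r
# Toy term counts (hypothetical): rows are documents, columns are terms.
m <- cbind(data   = c(9, 8, 9),
           code   = c(8, 9, 8),
           absent = c(0, 1, 0),
           flow   = c(1, 0, 1))

# Transpose so that each term becomes one observation.
d <- dist(t(m), method = "euclidean")
fit <- hclust(d, method = "ward.D")

# Cut the tree into two groups of terms.
groups <- cutree(fit, k = 2)
print(groups)
```

Terms with similar counts across documents ("data" and "code") end up in one group, and the rare terms in the other.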

+ Cluster Dendrogram with Groups

    plot.new()
    plot(fit, hang=-1)
    groups <- cutree(fit, k=5)   # "k=" defines the number of clusters you are using
    rect.hclust(fit, k=5, border="red")   # draw dendrogram with red borders around the 5 clusters

+ Clustering by Term Similarity: K-means clustering

• Cluster words into a specified number of groups
    library(fpc)
    library(cluster)   # provides clusplot()
    # dtmss is not defined on these slides
    d <- dist(t(dtmss), method="euclidean")
    kfit <- kmeans(d, 2)
    clusplot(as.matrix(d), kfit$cluster, color=TRUE, shade=TRUE, labels=2, lines=0)
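The grouping step can be tried on a toy matrix in base R. Note one deliberate difference from the slide: the slide passes the distance matrix `d` to `kmeans`, while this sketch clusters the term count vectors directly, which is the more common usage. The counts, the seed, and `nstart` are illustrative choices.

```r
# Toy term counts (hypothetical): rows are documents, columns are terms.
m <- cbind(data   = c(9, 8, 9),
           code   = c(8, 9, 8),
           absent = c(0, 1, 0),
           flow   = c(1, 0, 1))

set.seed(1)                                    # k-means starts from random centers
kfit <- kmeans(t(m), centers = 2, nstart = 10) # one row per term after transposing
print(kfit$cluster)
```

Cluster labels (1 or 2) are arbitrary and can swap between runs, so only the co-membership of terms is meaningful.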

+ Contribution, Conclusion & Future Work

■ Contribution:

Test mobile learning models like TPACK with quantifiable data

Apply text analytics and machine learning algorithms to mine student data

+ Contribution, Conclusion & Future Work

■ Conclusion: potential in analyzing this data

Real-time feedback for teachers & parents to provide evidence for time-on-task analysis for individual students

Better monetary decision making by the school district

+ Contribution, Conclusion & Future Work

■ Future Work:

A new web filter architecture with built-in data mining tools for K-12 education

Create/modify algorithms specifically for text analytics of student data

+ References

■ Koehler, M. J., & Mishra, P. (2009). What is technological pedagogical content knowledge? Contemporary Issues in Technology and Teacher Education, 9(1), 60-70.
■ Hutchison, A., Beschorner, B., & Schmidt-Crawford, D. (2012). Exploring the use of the iPad for literacy learning. The Reading Teacher, 66(1), 15-23.
■ Statistics Canada. (2009). Statistics Canada Quality Guidelines (5th ed.). http://unstats.un.org/unsd/dnss/docs-nqaf/Canada-12-539-x2009001-eng.pdf
■ EMC Education Services. (2015). Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data. John Wiley & Sons.
■ Williams, G. (2011). Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery. Springer Science & Business Media.