Advanced Artificial Intelligence

Educational Data Mining

Chung-Ang University, KwanHee Kim

2019.06.07

Good afternoon, everyone. My name is Kwanhee Kim. The topic of this presentation is "Educational Data Mining". Let's start the presentation.

Introduction

Educational data mining (EDM) is the area of research that applies data mining, machine learning, and statistics to information generated in educational environments.

Introduction

The data studied in EDM is created by people learning in educational environments. EDM refers to the technology and research designed to automatically extract meaning from these large data repositories.

Introduction

For example, a learning management system (LMS) is a software application for the administration, documentation, tracking, reporting, and delivery of educational courses, training programs, or learning and development programs.

Introduction

A learning management system can analyze data about the classes students take and their final grades. Based on this analysis, it can provide insight into the learning environment.

Introduction

An intelligent tutoring system records data each time a learner submits an answer to a question. These answers can be analyzed to identify the status of the learner's learning.

Introduction

[Figure: diagram with the labels Real World (Teacher, Student), Sensor 1 ... Sensor k, Non-Text Data, Text Data, Joint Mining of Non-Text and Text, Multiple Predictors (Features), Predictive Model, Predicted Values of Real World Variables, and Change the World.]

As a similar example, Big Data built from real-world student data helps students analyze their situation and make optimal decisions.

Introduction

Big Data creates both challenges and opportunities for education

Introduction

Education supplies the workforce for developing innovative Big Data technology and applications. Big Data supplies technology for scaling up and improving the quality of education.

Challenges for education: educate many data scientists and engineers quickly and affordably.

Opportunities for education: leverage Big Data technology to scale up and improve education.

Big Data and education are mutually beneficial -> Integration!

Introduction - MOOCs

MOOCs are the foundation for these opportunities and challenges. A MOOC (massive open online course) is an online course aimed at unrestricted participation and open access via the web. An important property of a MOOC is that anyone can take it anytime, anywhere.

Introduction - MOOCs

Interest in MOOCs increased exponentially from 2004 to 2015. Search volume for MOOCs can also be viewed by country.

Data from Google Trends

Introduction - MOOCs

MOOCs raise many issues to consider, such as the reliability of assessment and the varied backgrounds and needs of students. From a big data perspective, courses need real student data sets and programming assignments that can be graded automatically. If EDM and artificial intelligence are combined and more research is done, MOOCs are expected to drive innovative growth in the educational field.

General challenges in MOOCs

Variable student background

Variable student needs

Reliability of assessment

Special challenges to "big data"

Programming assignments are essential: variable student resources & background

Availability of interesting real-world data sets

Automated grading of programming assignments

CAPT

This part covers computer-assisted pronunciation training, which we call CAPT. It is a representative application of educational data mining.

CAPT

Used alongside existing instruction, CAPT can improve English learning.

[Figure: computer-assisted pronunciation training in education (school or university) serves to improve pronunciation quality, to enhance English oral skills, and to inspire a passion for learning English.]

CAPT

Why use CAPT? Conventional pronunciation teaching is effective, but it has the disadvantage of being costly.

CAPT

Look at the picture above. Most English pronunciation classes depend on native speakers. Finding native speakers is very expensive and time-consuming.

CAPT

Most people feel ashamed of their immature pronunciation.

CAPT

CAPT enables users to self-correct their pronunciation, so it eliminates the difficulty of finding tutors who are native speakers. The system is also helpful to users who feel uncomfortable in oral presentations.

CAPT

Here is a simple CAPT procedure. First, the user speaks a sample sentence.

[Figure: the user says "I wear blue close today"; the computer considers "Close? Cloth?" and asks "Do you mean cloth?"]

CAPT

Next, the computer senses the user's pronunciation.

CAPT

As shown in the figure, the computer can use the user's environment and context to infer the intended word by analogy.

CAPT

Finally, the computer provides feedback to the user.

CAPT

Another example is a pronunciation training application for mobile devices.

However, conventional CAPT systems cannot run on such lightweight devices, because they require significant computational resources.

CAPT

[Figure labels: word recommendation technique, recommendation function]

In conventional pronunciation systems, users practice the same words or sentences repeatedly until their pronunciation is acceptable. This repetition bores users. To avoid this situation, a CAPT system should provide diverse and effective feedback.

[Figure: a user repeating "cloth" again and again and growing tired]

CAPT

A person who mispronounces a word is likely to mispronounce other words with a similar pronunciation. The CAPT system takes advantage of this tendency.

[Figure: the words "vase" and "valance"]

CAPT

For example, suppose that a person mispronounces the words "vase" and "valance". Then there is a high probability that the person will also mispronounce the word "various".

CAPT

In this case, our system can recommend the word "various" to the user as feedback.

CAPT

A phoneme is one of the units of sound that distinguish one word from another in a particular language. The slide splits the word "umbrella" into eight units, /U/m/b/r/e/l/l/a/. (Strictly, the double "l" is a single sound, so a phonemic transcription such as /ʌ m b r ɛ l ə/ has seven phonemes.)

CAPT

To accomplish this, the pronunciation of a word is represented as a bag of phonemes. Using this bag of phonemes, the relationship between phonemes and mispronounced words is determined.

[Figure: bag of phonemes model]

CAPT

Let's look at how the process works. Step 1: the user pronounces test words displayed on the smartphone. In this example, the test word set consists of three words: "vase", "let", and "valance".

CAPT

Step 2: the system checks how the spoken words are actually recognized. In this case, "vase" is recognized as "base" because of mispronunciation. Likewise, "valance" is recognized as "balance".
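Step 2 can be sketched in a few lines of Python. This is only an illustration: it assumes a speech recognizer has already returned text for each spoken word (the recognizer itself is not shown), and it treats any mismatch with the target word as an unacceptable pronunciation, encoded as 1. The function name `pronunciation_results` is hypothetical.

```python
def pronunciation_results(targets, recognized):
    """Return 1 (unacceptable) where the recognized word differs from the target, else 0."""
    return [0 if t == r else 1 for t, r in zip(targets, recognized)]


# Words from the slide example: "vase" is heard as "base", "valance" as "balance".
targets = ["vase", "let", "valance"]
recognized = ["base", "let", "balance"]

results = pronunciation_results(targets, recognized)
print(results)  # [1, 0, 1]
```

A real system would likely use a softer acceptability criterion than exact string equality; equality is used here only to keep the sketch short.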

CAPT

Step 3: the system represents the pronunciation of each word as a bag of phonemes and generates a dataset based on the bag of phonemes model.

[Figure: bag of phonemes model]
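The bag of phonemes encoding can be sketched as follows. The phoneme inventory and the pronunciations below are assumptions made up for illustration (loosely ARPAbet-style symbols); the slides do not list the actual inventory, and `bag_of_phonemes` is a hypothetical helper name.

```python
# Hypothetical phoneme inventory and pronunciations, for illustration only.
PHONEMES = ["v", "b", "l", "ey", "eh", "ae", "ah", "n", "s", "t"]
PRONUNCIATIONS = {
    "vase": ["v", "ey", "s"],
    "let": ["l", "eh", "t"],
    "valance": ["v", "ae", "l", "ah", "n", "s"],
}


def bag_of_phonemes(word):
    """Binary vector: 1 if the phoneme occurs in the word's pronunciation, else 0."""
    present = set(PRONUNCIATIONS[word])
    return [1 if p in present else 0 for p in PHONEMES]


# One row per word; order and repetition of phonemes are discarded.
dataset = {word: bag_of_phonemes(word) for word in PRONUNCIATIONS}
for word, vector in dataset.items():
    print(word, vector)
```

Because the representation is a bag, "vase" and any other word made of the same phonemes would get the same vector, which is what lets the later correlation step work per phoneme rather than per word.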

CAPT

Step 4: the system assesses the correlation between phonemes and mispronunciation. This shows that the two phonemes /v/ and /s/ are highly correlated with mispronunciation, because they occur in the error words "vase" and "valance".

CAPT

Put simply, the user mispronounced the "v" in "vase" and the "v" in "valance". The "s" sound also occurs at the end of both "vase" and "valance".

[Figure label: incorrect pronunciation (units of phonemes)]

CAPT

The user accurately pronounced the phonemes marked with a green circle.

[Figure label: correct pronunciation (units of phonemes)]

CAPT

A word is made up of several phonemes. Each phoneme of a pronounced word has a test result value called "R". R is 0 when the user pronounces the phoneme accurately and 1 when the user pronounces it incorrectly.

R = testing result value (1 or 0)

CAPT

Step 5: the system assigns a selection probability to each word. The slide shows an example of word recommendation based on correlation analysis. A circle with a thick line indicates that the phoneme is strongly correlated with mispronunciation.

CAPT

The phonemes /v/ and /s/ are included in three words: "valance", "various", and "vase". Based on this, the system assigns a selection probability to each word in order to recommend a word set.

CAPT

Identify the phonemes of the words the user pronounces. The phonemes extracted from the user's pronunciation are compared with the phonemes of the candidate words, and the selection probability is used to estimate the word.

[Figure: word assumed to be pronounced by the user: "valance". Incorrectly pronounced phonemes: v / s (red). Correctly pronounced phonemes: æ / l / ə / n (green circles).]

CAPT

The word "valance" contains five phonemes, while the word "variable" contains four phonemes.

[Figure label: number of phonemes included (5 and 4)]

CAPT

Step 6: the system recommends the selected words. The slide shows that two words, "valance" and "various", were selected because of their high selection probability.

CAPT

Finally, Step 7: these two words are used to train the user's pronunciation in the practice phase. The system plays a native speaker's pronunciation for the user. After listening, the user pronounces the given words again.

Test word set

A total of 700 words were selected according to how frequently they are used in English education in Korean elementary schools.

[Figure label: collected 700 words]

Test word set

Why Korean elementary schools? Because the purpose of this system is pronunciation correction. If unknown words were included, the user would mispronounce them simply because they are unfamiliar.

Word recommendation method

Now I will show you the word recommendation method. Here are sample words and their bags of phonemes.

Word recommendation method

Look at the "bag of phonemes" section. The system creates a bag of phonemes and test results to evaluate the importance of each phoneme. Each word is represented as a vector of phonemes, where 1 indicates that the corresponding phoneme is included in the pronunciation of the word and 0 indicates that it is not.

Word recommendation method

Let's look at the word "around". Its vector has the value 1 for the phonemes "a" and "d" because they occur in its pronunciation.

Word recommendation method

After the pronunciation test for each word, the test result is also encoded: 1 for unacceptable and 0 for acceptable. In other words, we can infer that the user's pronunciation of the word "around" was inaccurate.

Word recommendation method

Let W denote a set of n words.

Each word $w_i$ in W can be represented by the occurrence of d phonemes in its pronunciation.

Here d is the number of phonemes in the phoneme set.

Word recommendation method

The words pronounced by the user can therefore be summarized as a set of n binary (0/1) phoneme vectors, each of length d. Together with the test results, this makes it possible to check whether the phonemes of a word are pronounced correctly.

Word recommendation method

If the j-th element of $w_i$ is 1, written $w_{i,j} = 1$, the j-th phoneme is required for the pronunciation of $w_i$. Otherwise, $w_{i,j}$ is set to 0.

Word recommendation method

As above, the word "around" is displayed as follows, and a green circle indicates a value of 1.

[Figure: W0 is the word "around", and W0(0,0) = 1; i indexes the words and j indexes the phonemes.]

Word recommendation method

From the point of view of each phoneme, the data can be expressed as Formula 1.

In Formula 1, $p_j$ (where $1 \le j \le d$; the symbol name is assumed here) is the column vector of the j-th phoneme: its i-th element is 1 if the j-th phoneme occurs in the word $w_i$.

Word recommendation method

Next, Formula 2 is the test result vector T, whose i-th element represents whether the pronunciation of $w_i$ was acceptable or unacceptable.
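One way to write the two vectors described here is shown below. The formula images from the slides are not reproduced in this text, so the symbols $p_j$, $w_{i,j}$, and $t_i$ are assumed names chosen to match the verbal description: a phoneme column over the n words, and a 0/1 test result per word (1 for unacceptable).

```latex
\[
p_j = \left( w_{1,j},\; w_{2,j},\; \dots,\; w_{n,j} \right)^{\top}, \qquad 1 \le j \le d
\]
\[
T = \left( t_1,\; t_2,\; \dots,\; t_n \right)^{\top}, \qquad t_i \in \{0, 1\}
\]
```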

Word recommendation method

We employed the Pearson correlation coefficient, which is one of the most widely used statistical measures. It calculates the correlation between two variables.

Word recommendation method

Pearson's correlation coefficient is the covariance of the two variables divided by the product of their standard deviations.

Word recommendation method

The Pearson correlation coefficient measures the linear correlation between two variables and takes a value in [-1, 1].

When the Pearson correlation coefficient of X and Y is 1, the points (X, Y) lie on a straight line with positive slope when drawn on the coordinate plane. A value of 0 means no linear correlation between the two variables.

[Figure: two scatter plots, amount of beer sold versus temperature and amount of coffee sold versus temperature.]

Word recommendation method

$\operatorname{cov}(p_j, T)$ denotes the covariance between $p_j$ and $T$, where $p_j$ is the column vector of the j-th phoneme.

$\operatorname{var}(p_j)$ and $\operatorname{var}(T)$ are the variances of $p_j$ and $T$, respectively.
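Written out, the Pearson correlation between a phoneme column and the test results takes the standard form below. The symbol $p_j$ for the j-th phoneme column and the element names $w_{i,j}$ and $t_i$ are assumed; $C$ and $T$ follow the slides. The bars denote means over the n words.

```latex
\[
C(p_j, T) = \frac{\operatorname{cov}(p_j, T)}{\sqrt{\operatorname{var}(p_j)\,\operatorname{var}(T)}},
\qquad
\operatorname{cov}(p_j, T) = \frac{1}{n} \sum_{i=1}^{n} \left( w_{i,j} - \bar{p}_j \right)\left( t_i - \bar{T} \right)
\]
```

The coefficient is undefined when either variance is zero, for example when a phoneme occurs in every test word or in none of them.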

Word recommendation method

The covariance measures how two random variables change together.

The covariance is positive when the two variables change in the same direction, that is, when one variable grows as the other grows and shrinks as the other shrinks.

Conversely, when one variable becomes smaller as the other becomes larger, the covariance is negative.
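The sign behaviour of the covariance can be checked with a small sketch. The numbers are invented for illustration; they only mimic the temperature versus beer and coffee sales pairing from the scatter-plot slide.

```python
def covariance(x, y):
    """Population covariance of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


# Hypothetical daily observations, invented for illustration.
temperature = [10, 20, 30]
beer_sold = [100, 200, 300]    # rises with temperature
coffee_sold = [300, 200, 100]  # falls as temperature rises

cov_beer = covariance(temperature, beer_sold)
cov_coffee = covariance(temperature, coffee_sold)
print(cov_beer > 0, cov_coffee < 0)  # True True
```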

Word recommendation method

The variance is a statistic that indicates how widely the data are spread. It estimates how far the data spread from their expected value (mean).

Word recommendation method

The lower and upper bounds of $C(p_j, T)$ are $-1 \le C(p_j, T) \le 1$, where $p_j$ is the column vector of the j-th phoneme. The value is 1 in the case of a perfect correlation and -1 in the case of a perfect inverse correlation.

Word recommendation method

The C value can be interpreted as follows.

-1: correct pronunciation

1: inaccurate pronunciation

Word recommendation method

Here is an example data set after performing the pronunciation test. The example data set shows that this user mispronounced all of the words that include /g/.

Word recommendation method

In this case, the degree of correlation between /g/ and the test results is calculated to be 1. Therefore, the presence of /g/ is perfectly correlated with unacceptable pronunciation.
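A minimal sketch of this calculation, assuming a made-up four-word test set in which exactly the words containing /g/ were mispronounced. The word list and results are hypothetical (the slide's actual data set is not reproduced here), and Pearson's coefficient is implemented directly from its definition.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov / sqrt(var_x * var_y)  # raises ZeroDivisionError if a column is constant


# Hypothetical test words: "grove", "glass", "cone", "cat".
has_g = [1, 1, 0, 0]         # 1 if the word contains /g/
has_k = [0, 0, 1, 1]         # 1 if the word contains /k/
test_results = [1, 1, 0, 0]  # 1 = unacceptable pronunciation

c_g = pearson(has_g, test_results)
c_k = pearson(has_k, test_results)
print(c_g, c_k)  # 1.0 -1.0
```

With these inputs /g/ lines up exactly with the failures and /k/ with the successes, so the two coefficients sit at the opposite ends of the [-1, 1] range.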

Word recommendation method

The phoneme /k/ is correlated with acceptable pronunciation. The example indicates that, to create a recommended word set tailored to a user, phonemes with a high degree of correlation need to be included more frequently than those with a low degree of correlation.

Word recommendation by selection probability

The phonemes of the words that the user mispronounces are identified. The system then finds words containing those phonemes and recommends them to the user more often. Users can improve their pronunciation through this function.

Word recommendation by selection probability

U is the set of correlation values calculated previously, that is, the vector of results from the formulas above. In summary, its j-th element is the correlation between the j-th phoneme of the words and the test results of the user's pronunciation.

Word recommendation by selection probability

Finally, the selection probability for $w_i$ is calculated by the formula on the slide.

Y is a normalized importance bounded in [0, 1].

Word recommendation by selection probability

In this example, let v be set to 1 for simplicity. Then the importance value for the word "grove" is calculated by the formula on the slide.

Word recommendation by selection probability

The example shows that the word "grove" is assigned a higher importance than the word "cone", because "grove" includes phonemes with a higher degree of correlation with unacceptable pronunciation. As a result, "grove" is selected 41 times more frequently than "cone" when composing the recommended word set.
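Selection by probability can be sketched as weighted random sampling. This is not the formula from the slides, which is not reproduced in this text: the sketch simply assumes each word already has a non-negative importance value, normalizes those values so they sum to 1, and samples with replacement. The importance numbers are made up so that their ratio matches the 41-to-1 example, and `selection_probabilities` and `recommend` are hypothetical names.

```python
import random


def selection_probabilities(importance):
    """Normalize non-negative importance values so that they sum to 1."""
    total = sum(importance.values())
    return {word: value / total for word, value in importance.items()}


def recommend(importance, k, seed=None):
    """Draw k words (with replacement) in proportion to their selection probability."""
    rng = random.Random(seed)
    probs = selection_probabilities(importance)
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=k)


# Hypothetical importance values; only the 41:1 ratio echoes the slide example.
importance = {"grove": 41.0, "cone": 1.0}
probs = selection_probabilities(importance)
practice_set = recommend(importance, k=10, seed=0)
print(probs)
print(practice_set)
```

Sampling rather than always taking the top-ranked words keeps some variety in the practice set, which fits the earlier point about avoiding repetitive drills.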

Word recommendation by selection probability

Applying the formula from the previous slide to all n registered words makes it possible to extract the words with high values. These extracted words are composed of phonemes that the user pronounces incorrectly.

Word recommendation by selection probability

If you practice these words repeatedly, you can gain confidence in your English pronunciation and improve your skills.

Q&A

Thank you. Do you have any questions?
