AUTOMATED DISCOVERY OF SOCIAL NETWORKS IN ONLINE LEARNING COMMUNITIES
Anatoliy Gruzd [email protected]
Dissertation Defense, April 1, 2009


Page 1: AUTOMATED DISCOVERY OF SOCIAL NETWORKS IN ONLINE LEARNING COMMUNITIES

AUTOMATED DISCOVERY OF SOCIAL NETWORKS IN ONLINE LEARNING COMMUNITIES

Anatoliy Gruzd [email protected]

Dissertation Defense, April 1, 2009

Page 2

Online Social Networks

http://www.visualcomplexity.com/vc

• Email networks

• Forum networks

• Blog networks

• Friends’ networks on MySpace, Facebook, etc

• Networks of like-minded people on

Page 3

Users’ contributions and networks are growing daily!

• Usenet newsgroups: 4.6 terabytes of text *daily*
• Blogs: 900,000 new blogs *daily*
• Emails: 100 billion emails *daily*

Source: IDC white paper, “The Diverse and Exploding Digital Universe,” sponsored by EMC, March 2008.

Page 4

Users’ contributions and networks are growing daily!


• What are the group’s interests and priorities?

• How and why does one online community emerge and another die?

• How do people agree on common practices and rules in an online community?

• How are knowledge and information shared among group members?

Page 5

© kelleyw

Automated Discovery of Social Networks

Page 6

Automated Discovery of Social Networks

• Research Goal
– Use computers to discover online social networks automatically

• Case Study
– Discussion forums in online classes

Page 7

Research Questions

• Extracting Social Networks from Forum Postings
Question 1: What content-based features of postings help to uncover nodes and ties between group members?

Page 8

Extracting Social Networks from Forum Postings
Approach 1: Chain Network (Reply-to)

Example posting
Posting header: FROM: Sam; REFERENCE CHAIN: Gabriel
Content: “Nick, Gina and Gabriel: I apologize for not backing this up with a good source, but I know from reading about this topic that …”

Source: Posting header
Method: Connects a sender to the previous poster in the thread
Discovered Tie(s): Sam -> Gabriel
Possible Missing Connections:
• Sam -> Nick
• Sam -> Gina
• Nick <-> Gina
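
The chain-network rule above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the `postings` structure, the `sender` and `reference_chain` field names, and the convention that the most recent poster comes last in the chain are all assumptions made for the example.

```python
from collections import Counter


def chain_ties(postings):
    """Count directed reply-to ties (sender -> previous poster).

    Each posting is a dict with a 'sender' and a 'reference_chain' list.
    Both field names, and the "most recent poster last" ordering, are
    assumptions of this sketch.
    """
    ties = Counter()
    for post in postings:
        chain = post["reference_chain"]
        if not chain:
            continue  # thread-starting posting: there is no previous poster
        previous = chain[-1]
        if previous != post["sender"]:  # ignore replies to oneself
            ties[(post["sender"], previous)] += 1
    return ties


postings = [
    {"sender": "Gabriel", "reference_chain": []},
    {"sender": "Sam", "reference_chain": ["Gabriel"]},
]
print(chain_ties(postings))
```

Because only the posting header is used, Nick and Gina never appear in the result, even though Sam addresses them by name in the message body.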

Page 9

Extracting Social Networks from Forum Postings
Approach 2: Name Network

Example posting
FROM: Ann
“Steve and Natasha, I couldn't wait to see your site. I knew it was going to [be] awesome!”

Method:
• Connect the sender to people mentioned in the message
• Connect people whose names co-occur in the same message(s)

Discovered Tie(s):
• Ann -> Steve
• Ann -> Natasha
• Steve <-> Natasha
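
The two name-network rules can be illustrated for a single posting as follows. This is a rough sketch under the assumption that the names in the message have already been extracted; the function name `name_ties` and its return format are hypothetical.

```python
from itertools import combinations


def name_ties(sender, mentioned):
    """Return (directed, undirected) ties for one posting.

    directed:   sender -> each person named in the message.
    undirected: each pair of people whose names co-occur in the message.
    """
    people = sorted(set(mentioned) - {sender})
    directed = [(sender, person) for person in people]
    undirected = list(combinations(people, 2))
    return directed, undirected


directed, undirected = name_ties("Ann", ["Steve", "Natasha"])
print(directed)    # [('Ann', 'Natasha'), ('Ann', 'Steve')]
print(undirected)  # [('Natasha', 'Steve')]
```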

Page 10

Extracting Social Networks from Forum Postings
Approach 2: Name Network

Step 1. Automatically find all personal names in the postings

• Compare each word from the posting against a dictionary of all names collected from the US Census data

• Find names that are NOT in the name dictionary (e.g., international names, informal names and nicknames) using contextual and structural information about words, such as
– Capitalization
– Context words
– Position in text
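
A toy version of Step 1 might look like the sketch below. It is only an illustration of the idea: `NAME_DICTIONARY` is a tiny hypothetical stand-in for the census-derived dictionary, `GREETINGS` is an assumed list of context words, and the single fallback rule (a capitalized word right after a greeting) stands in for the richer capitalization, context, and position features named on the slide.

```python
import re

# Tiny hypothetical stand-in for the census-derived name dictionary.
NAME_DICTIONARY = {"ann", "steve", "sam", "gina", "nick"}
# Assumed context words that often precede a name in a posting.
GREETINGS = {"hi", "hello", "dear", "thanks"}


def find_names(text):
    """Return candidate personal names in a posting, in order of appearance."""
    tokens = re.findall(r"[A-Za-z]+", text)
    found = []
    for i, token in enumerate(tokens):
        capitalized = token[0].isupper()
        in_dictionary = token.lower() in NAME_DICTIONARY
        # Fallback for names missing from the dictionary:
        # a capitalized word directly after a greeting word.
        after_greeting = i > 0 and tokens[i - 1].lower() in GREETINGS
        if capitalized and (in_dictionary or after_greeting):
            if token not in found:
                found.append(token)
    return found


print(find_names("Hi Natasha, I agree with Steve and Gina."))
```

Here "Natasha" is absent from the sample dictionary and is picked up only through the greeting rule, which mirrors the case of names that a dictionary lookup alone would miss.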

Page 11

Extracting Social Networks from Forum Postings
Approach 2: Name Network

Step 2. Connect a sender of the posting to all names discovered in the previous step

EXAMPLE
From: [email protected] (= Wilma)
Reference Chain: [email protected], [email protected]

“Hi Dustin, Sam and all, I appreciate your posts from this and last week […]. I keep thinking of poor Charlie who only wanted information on “dogs”. […] Cheers, Wilma.”

Discovered ties:
• Wilma – Dustin
• Wilma – Sam
• Wilma – Charlie
• Dustin – Sam – Charlie

Challenges to overcome:
– One person can have many names
– Many people can have the same name
– Names can belong to students in the class and outsiders

Solution:
– Name alias resolution
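
Step 2 with a simple form of alias resolution could be sketched as below. Everything specific here is assumed for illustration: the `ALIASES` table, the `example.edu` addresses, and the choice to treat "Charlie" as a name that does not resolve to anyone on the class roster. The slides do not describe the actual resolution algorithm.

```python
# Hypothetical alias table: every known name form maps to one identifier.
# An email address is used as the identifier, as the slides indicate.
ALIASES = {
    "wilma": "wilma@example.edu",
    "dustin": "dustin@example.edu",
    "sam": "sam@example.edu",
}


def connect_sender(sender_id, names, aliases=ALIASES):
    """Link the sender to each resolved person named in a posting.

    Returns (ties, unresolved): ties are (sender, person) pairs, and
    unresolved holds names with no match in the alias table.
    """
    ties, unresolved = [], []
    for name in names:
        person = aliases.get(name.lower())
        if person is None:
            unresolved.append(name)  # possibly an outsider or an unknown alias
        elif person != sender_id and (sender_id, person) not in ties:
            ties.append((sender_id, person))  # skip self-references (signature)
    return ties, unresolved


ties, unresolved = connect_sender(
    "wilma@example.edu", ["Dustin", "Sam", "Charlie", "Wilma"]
)
print(ties)
print(unresolved)
```

Resolving names to one identifier per person addresses the "one person, many names" challenge; the "many people, same name" challenge would need extra context that this sketch does not model.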

Page 12

Research Questions

• Extracting Social Networks from Forum Postings

• Evaluating Name Networks
Question 2: How are the proposed name networks similar to or different from networks derived from other methods?

Page 13

Evaluating Name Networks

[Diagram: the Name Network and the Chain Network (both built from forum postings) and the Self-Reported Network (built from a survey) are compared with one another: Name Network vs. Chain Network, and each of them vs. the Self-Reported Network.]

Comparison Procedure:
• QAP correlations
• Exponential random graph models (p* models)
• Manual exploration using network visualization
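
For readers unfamiliar with QAP (quadratic assignment procedure) correlation, the generic idea can be sketched as follows: correlate the off-diagonal cells of two adjacency matrices, then relabel the nodes of one matrix at random many times to see how often a correlation at least as large arises by chance. This is a textbook-style sketch, not the software used in the study; the function names, the example matrices, and the one-sided p-value are assumptions.

```python
import random


def off_diagonal(matrix, order):
    """Flatten the off-diagonal cells after relabeling nodes by `order`."""
    n = len(matrix)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(n) if i != j]


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def qap_correlation(a, b, permutations=1000, seed=0):
    """Observed correlation plus a one-sided permutation p-value."""
    rng = random.Random(seed)
    nodes = list(range(len(a)))
    fixed = off_diagonal(b, nodes)
    observed = pearson(off_diagonal(a, nodes), fixed)
    at_least = 0
    for _ in range(permutations):
        rng.shuffle(nodes)  # relabel rows and columns of `a` together
        if pearson(off_diagonal(a, nodes), fixed) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (permutations + 1)


# Made-up 4-person adjacency matrices, for illustration only.
name_net = [[0, 1, 1, 0], [1, 0, 1, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
chain_net = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
observed, p_value = qap_correlation(name_net, chain_net)
print(round(observed, 3), round(p_value, 3))
```

Permuting rows and columns together keeps each network's structure intact while breaking the node-to-node correspondence, which is why QAP is preferred over an ordinary significance test for dyadic data.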

Page 14

Evaluating Name Networks
Data collection

Dataset
Classes: 6
School year: Spring 2008
Duration of each class: 15 weeks
No. of students per class: 15 – 28
Data source: bulletin board messages; online questionnaire
Response rate: 54%-86% (63%)

[Bar charts: No. of all postings per class (Class #1 to Class #6, axis 0 to 2000); No. of students per class (Class #1 to Class #6, axis 0 to 30)]

Page 15

Evaluating Name Networks
Online Questionnaire

[ Based on C. Haythornthwaite’s 1999 LEEP study protocol ]

Sample questions by section:

Section 1. Students’ perceived social structures
I learned a lot about the subject matter from this person …
0 - never; 1 - rarely; 2 - for some of the course; 3 - during most of the course; 4 - throughout the whole course

Section 2. Influential members of the class
Indicate five students who you consider most important or influential in this class regarding each of the following types of interaction: (1) Providing information; (2) Promoting discussion; (3) Giving help; (4) Making class fun

Section 3. Interactions in the class as a whole
I felt that the class worked together …

Page 16

Evaluating Name Networks
Example: YouTube comments

[Network visualizations: Chain Network (fewer connections) vs. Name Network (more connections)]

Page 17

Evaluating Name Networks
Results from Online Learners Dataset

Name networks provide on average 40% more information about social ties in a group as compared to chain networks.

Name Network vs. Chain Network: QAP correlation ~ 0.5

“New” info (considering only the 40%):
• 70%: Thread-starting posting
• 30%: A subsequent posting in the thread
• 82%: An addressee has not posted to the thread
• 18%: An addressee is not the most recent poster

Page 18

Evaluating Name Networks
Results from Online Learners Dataset

Structurally, the name and self-reported networks are far more similar.

Based on p* models, the self-reported network is almost twice as likely to share the same ties with the name network as with the chain network.

[Network visualizations: Chain Network, Name Network, and Self-Reported Network (friends’ network for one of the classes)]

Page 19

Research Questions

• Extracting Social Networks from Forum Postings

• Evaluating Name Networks

• Identifying Social Relations in Name Networks
Question 3: What types of social relations do name networks include?

Page 20

Identifying Social Relations in Name Networks
Results

• The following social relations were found by the “name network” method

Learning ● Collaborative Work ● Help

Page 21

Identifying Social Relations in Name Networks
Results

• The following social relations were found by the “name network” method

Learning ● Collaborative Work ● Help

– Postings that show attention to subject matter discussed by someone else

“… it made me think of the faceted catalogs' display that Karen posted”

Page 22

Identifying Social Relations in Name Networks
Results

• The following social relations were found by the “name network” method

Learning ● Collaborative Work ● Help

– Organizing group work, taking a leadership role

“Some quick poking around shows that Steve and myself are here in Champaign, [...] and Nicole is in Chicago. [...] does anyone have a strong desire to be our contact person to the administrators”

Page 23

Identifying Social Relations in Name Networks
Results

• The following social relations were found by the “name network” method

Learning ● Collaborative Work ● Help

– A reference to an event or interaction that happened outside the bulletin board

“Anne and I have been corresponding via e-mail and she reminded me that we should be having discussion here”

Page 24

Identifying Social Relations in Name Networks
Results

• The following social relations were found by the “name network” method

Learning ● Collaborative Work ● Help

– Postings that ask others for help

“[Instructor’s name] if you see this posting would you please clarify for us”

Page 25

Using the results in the learning context

• Identify students who might need extra attention/help from the instructor
• Discover if lectures or other class materials were unclear
• Identify peer-help
• Find active group members who often take a leadership role in a group

[Diagram: network of Students, the Instructor, and a Group Leader]

Page 26

Contributions of the Research

1. Development of a novel approach (name network) for content-based, automated discovery of social networks from threaded discussions in online communities and a framework for evaluating this new approach

– The “name network” method can be used
• to transform even unstructured Internet data into social network data;
• where more traditional methods for data collection on social networks such as surveys are too costly or not possible.

Page 27

Contributions of the Research (cont.)

2. Empirical comparison of name networks to chain and self-reported networks using data collected from 6 online classes

3. Demonstration that the proposed automated approach for collecting social network data is a viable alternative to the costly and time-consuming collection of self-reported networks

4. Demonstration of how name networks can be used to study online classes and assess collaborative learning

5. Development of the ICTA web-based system for content and network analysis (http://textanalytics.net)

Page 28

http://TextAnalytics.net

Page 29

Limitations

• The ‘name network’ method

– is more expensive computationally than the ‘chain network’ method

– uses an email address as a unique identifier of a participant

– relies only on postings that include personal names (on average only about 25-30% of all postings)

Page 30

Future Research

• Study other types of online communities
• Study online communities using multiple data sources such as forums, chats, wikis, etc.
• Develop automated techniques to identify types of social relations and social roles

Page 31

AUTOMATED DISCOVERY OF SOCIAL NETWORKS IN ONLINE LEARNING COMMUNITIES

Anatoliy Gruzd [email protected]

April 1, 2009

Contributions
• Developed the Name Network method and evaluated it in the context of e-learning
• Identified types of social relations in Name Networks
• Developed ICTA – a web-based system for content and network analysis (http://textanalytics.net)