Twitter 2 Day 32 - 11/10/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University


Course organization

10-Nov-2014 NLP, Prof. Howard, Tulane University

http://www.tulane.edu/~howard/LING3820/

The syllabus is under construction.

http://www.tulane.edu/~howard/CompCultEN/

Chapter numbering

3.7. How to deal with non-English characters

4.5. How to create a pattern with Unicode characters

6. Control

Open Spyder


Twitter

Review


Get an app account

The first thing to do is to sign up for a Twitter account at twitter.com, if you don’t already have one. Then point your browser at Twitter Apps and log in with your new account credentials. At the top right corner, click on the Create New App button. In the form that opens up, give your new app any name you want, describe it as “computational culture with Twitter”, use Tulane’s URL “http://www.tulane.edu/” as the website, click the button to agree with the Developer Agreement, and click on Create your Twitter application.

On the next page, select API Keys from the menu. On the Application settings page, for the time being, you can keep the access level at Read-only. Scroll down the page and click on create my access token. You will get a confirmation message at the top of the page. You may want to click the reset link to hurry the process along.

There are now four crucial pieces of information that you will need to make note of: API key, API secret, Access token and Access token secret. Since these are long and unwieldy strings, you should copy and paste them into some handy place immediately.
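One possible "handy place" for the four strings is a set of environment variables, so that they stay out of any script you share. The following is only a minimal sketch under that assumption: the variable names in NAMES and the helper load_credentials() are hypothetical and are not part of tweepies.py.

```python
import os

# Hypothetical environment variable names; any names would do.
NAMES = [
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
]

def load_credentials(env=os.environ):
    """Return the four credentials as a dict, or raise if any is missing."""
    missing = [name for name in NAMES if not env.get(name)]
    if missing:
        raise KeyError("Missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in NAMES}
```

After setting the four variables in your shell, calling load_credentials() returns a dict whose values can be used wherever the strings themselves are needed.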


Tweepy installation

In the Terminal:

$ pip install -U tweepy
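To check from Python whether the installation worked, one option is to ask the import system whether it can find the package. This is a small sketch, not part of the course script; the helper name is_installed() is made up for illustration.

```python
import importlib.util

def is_installed(package_name):
    """Return True if Python can locate a top-level package with this name."""
    return importlib.util.find_spec(package_name) is not None

print("tweepy installed:", is_installed("tweepy"))
```

If this prints False in Spyder, pip may have installed Tweepy into a different Python than the one Spyder is running.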


I emailed you the script tweepies.py

§10 Twitter


logon()

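def logon():
    import tweepy
    # Paste the four strings from your Twitter app page here.
    API_KEY = 'your_info_here'
    API_SECRET = 'your_info_here'
    ACCESS_TOKEN = 'your_info_here'
    ACCESS_TOKEN_SECRET = 'your_info_here'
    # Build the OAuth handler from the app's key and secret.
    key = tweepy.OAuthHandler(API_KEY, API_SECRET)
    # Attach the access token so requests are made on behalf of your account.
    key.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
    return key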


The other functions of tweepies.py

1. stream2screen(num, terms)

2. stream2var(num, terms)

3. stream2file(num, terms)

4. json2screen(num, terms)

5. json2screenpretty(num, terms)

6. dict2screen(num, terms)

7. dict2var(num, terms)


Usage

>>> from tweepies import stream2screen

>>> stream2screen(20, ['#KickMe'])


What they do

1. stream2screen(num, terms)

2. stream2var(num, terms)

3. stream2file(num, terms)

4. json2screen(num, terms)

5. json2screenpretty(num, terms)

6. dict2screen(num, terms)

7. dict2var(num, terms)
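The slides do not show the bodies of these functions. Judging only from their names, the json and dict variants presumably differ in how a tweet is represented: as a raw JSON string, as indented JSON, or as a Python dictionary. The sketch below illustrates those three representations with the standard json module. The sample tweet is made up and far shorter than real Twitter JSON, and nothing here is taken from tweepies.py.

```python
import json

# A made-up, heavily trimmed tweet; real Twitter JSON has many more fields.
raw = '{"id": 1, "text": "Hello #KickMe", "user": {"screen_name": "example_user"}}'

# Raw JSON: one unbroken string, hard to read by eye.
print(raw)

# Dictionary: json.loads() turns the string into nested Python dicts.
tweet = json.loads(raw)

# Pretty JSON: json.dumps() with indent lays the same data out on several lines.
pretty = json.dumps(tweet, indent=2, sort_keys=True)
print(pretty)

# Once the tweet is a dict, individual fields can be looked up by key.
print(tweet["text"])
print(tweet["user"]["screen_name"])
```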


Quiz

Task: can you find a group of words that will distinguish two Twitter topics?

How to do it

1. Collect 500+ tweets from two trending topics into different variables.

2. Run each through a FreqDist to find frequent words that may be unique to each topic (filter out the stop words).

3. Use these key words in a ConditionalFreqDist to show how well they would work in identifying or classifying each topic.
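The steps above can be sketched on a toy scale as follows. To keep the example free of downloads, collections.Counter stands in for NLTK's FreqDist (which is itself a Counter subclass), and a dictionary of Counters stands in for ConditionalFreqDist. The tweets, the topic names and the short stop list are all made up for illustration; with real data you would use your collected tweets and NLTK's stop word list.

```python
from collections import Counter

# Tiny hand-made stop list for illustration only.
STOP_WORDS = {"the", "a", "is", "to", "and", "of", "in", "for", "on", "at", "rt"}

def tokenize(tweet):
    """Lowercase a tweet, split on whitespace, strip punctuation at word edges."""
    words = (w.strip(".,!?:;\"'()") for w in tweet.lower().split())
    return [w for w in words if w]

def content_words(tweets):
    """Return every token from the tweets that is not a stop word."""
    return [w for t in tweets for w in tokenize(t) if w not in STOP_WORDS]

# Step 1: made-up tweets standing in for the 500+ collected per topic.
topics = {
    "football": ["The Saints win at the Superdome!",
                 "Saints fans love the win",
                 "RT touchdown for the Saints"],
    "weather": ["Rain in the forecast for Monday",
                "The storm is bringing rain",
                "Storm warning and heavy rain"],
}

# Step 2: one frequency distribution per topic.
freq = {topic: Counter(content_words(tweets)) for topic, tweets in topics.items()}

def unique_frequent(topic, other, n=3):
    """Most frequent words of one topic that never occur in the other."""
    return [w for w, _ in freq[topic].most_common() if w not in freq[other]][:n]

key_words = unique_frequent("football", "weather") + unique_frequent("weather", "football")

# Step 3: conditional counts, topic by key word.
for topic in topics:
    print(topic, {w: freq[topic][w] for w in key_words})
```

A key word is useful for classification when its count is high under one topic and zero, or close to it, under the other.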


More Twitter

Next time
