When Recommendation
Systems Go Bad
Evan Estola, 3/31/17
About Me
● Evan Estola
● Staff Machine Learning Engineer, Data Team Lead @ Meetup
● @estola
Meetup
● Do more
● 270,000 Meetup Groups
● 30 Million Members
● 180 Countries
Why Recs at Meetup are Hard
● Cold Start
● Sparsity
● Lies
Recommendation Systems: Collaborative Filtering
Recommendation Systems: Rating Prediction
● Netflix prize
● How many stars would user X give movie Y
● Ineffective!
Recommendation Systems: Learning To Rank
● Treat recommendations as a supervised ranking problem
● Easy mode:
○ Positive samples - joined a Meetup
○ Negative samples - didn’t join a Meetup
○ Logistic Regression, use output/confidence for ranking (see the sketch below)
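A minimal sketch of this "easy mode" setup, assuming scikit-learn; the feature names and numbers are hypothetical, not Meetup's real features:

# Minimal learning-to-rank sketch. Each row is one (member, Meetup group)
# pair; feature names and data are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Label 1 = the member joined the group, 0 = shown but did not join.
X_train = np.array([
    [0.9, 2.1, 1.0],   # e.g. [interest overlap, distance (km), friends in group]
    [0.1, 8.0, 0.0],
    [0.7, 1.5, 2.0],
    [0.2, 9.5, 0.0],
])
y_train = np.array([1, 0, 1, 0])

model = LogisticRegression()
model.fit(X_train, y_train)

# Rank candidate groups for a member by the model's confidence
# (predicted probability of joining), highest first.
candidates = np.array([
    [0.8, 3.0, 1.0],
    [0.3, 6.0, 0.0],
    [0.6, 2.0, 3.0],
])
scores = model.predict_proba(candidates)[:, 1]
print(np.argsort(-scores))   # candidate indices, best first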
You just wanted a kitchen scale, now Amazon thinks you’re a drug dealer
● “Black-sounding” names 25% more likely to be served an ad suggesting a criminal record
● Fake profiles, track ads
● Career coaching for “200k+” executive jobs ad
● Male group: 1852 impressions
● Female group: 318 impressions
● Twitter bot
● “Garbage in, garbage out”
● Responsibility?
“In the span of 15 hours Tay referred to feminism as a "cult" and a "cancer," as well as noting "gender equality = feminism" and "i love feminism now." Tweeting "Bruce Jenner" at the bot got similar mixed response, ranging from "caitlyn jenner is a hero & is a stunning, beautiful woman!" to the transphobic "caitlyn jenner isn't a real woman yet she won woman of the year?"”
Tay.ai
Know your data
● Outliers can matter
● The real world is messy
● Some people will mess with you
● Not everyone looks like you
○ Airbags (designed around male crash-test dummies, putting women at greater risk)
● More important than ever with more impactful applications
○ Example: Medical data
Keep it simple
● Interpretable models
● Feature interactions
○ Using features against someone in unintended ways
○ Work experience is good up until a point?
○ Consequences of location?
○ Combining gender and interests?
● When you must get fancy, combine grokable models
Ensemble Model, Data Segregation
[Diagram] Model 1 is trained on Interests, Searches, Friends, and Location. Model 2 is trained on Gender, Friends, and Location. The final model takes only Model 1's prediction and Model 2's prediction as inputs and produces the final prediction.
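A sketch of the segregated ensemble in the diagram, assuming scikit-learn and fabricated data; the point is the structure, not the numbers. The sensitive feature (gender) lives only in Model 2, and the final model sees nothing but the two sub-model predictions, so gender can never interact with interests or searches inside a single model:

# Data-segregation ensemble sketch (hypothetical features and labels).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
interests = rng.random((n, 2))        # interest / search features
friends = rng.random((n, 1))
location = rng.random((n, 1))
gender = rng.integers(0, 2, (n, 1)).astype(float)  # sensitive feature
y = rng.integers(0, 2, n)             # joined / didn't join (fake labels)

# Model 1: interests, searches, friends, location.
X1 = np.hstack([interests, friends, location])
model1 = LogisticRegression().fit(X1, y)

# Model 2: gender, friends, location -- sensitive data segregated here.
X2 = np.hstack([gender, friends, location])
model2 = LogisticRegression().fit(X2, y)

# Final model: sees only the two sub-model predictions.
stacked = np.column_stack([
    model1.predict_proba(X1)[:, 1],
    model2.predict_proba(X2)[:, 1],
])
final = LogisticRegression().fit(stacked, y)

In a real system the sub-model predictions fed to the final model would come from held-out folds to avoid leaking training labels into the combiner; this sketch only shows the segregation structure.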
Diversity Controlled Testing
● CMU - AdFisher
○ Crawls ads with simulated user profiles
● Same technique can work to find bias in your own models!
○ Generate Test Data
■ Randomize sensitive feature in real data set
○ Run Model
■ Evaluate for unacceptable biased treatment
● Florian Tramèr
○ FairTest
https://research.google.com/bigpicture/attacking-discrimination-in-ml/
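One hedged sketch of the "randomize the sensitive feature" test from the list above, assuming a fitted scikit-learn-style classifier; all names are hypothetical. Because the sensitive column is randomly permuted, the other features are independent of group membership, so any remaining score gap between groups measures the model's direct response to that feature:

# Diversity-controlled test sketch: permute the sensitive column, re-score,
# and compare mean scores between the (randomly assigned) groups.
import numpy as np

def diversity_test(model, X, sensitive_col, n_trials=100, seed=0):
    rng = np.random.default_rng(seed)
    gaps = []
    for _ in range(n_trials):
        X_rand = X.copy()
        X_rand[:, sensitive_col] = rng.permutation(X_rand[:, sensitive_col])
        scores = model.predict_proba(X_rand)[:, 1]
        group = X_rand[:, sensitive_col] == 1
        gaps.append(scores[group].mean() - scores[~group].mean())
    # A mean gap far from zero is evidence of biased treatment.
    return float(np.mean(gaps)), float(np.std(gaps))

# Usage, with any fitted binary classifier `model` and feature matrix X:
#   gap_mean, gap_std = diversity_test(model, X, sensitive_col=0)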
Human Problems
● Auto-ethics
○ Defining un-ethical features
○ Who decides to look for fairness in the first place?
● By restricting or removing certain features, aren't you sacrificing performance?
● Isn't it actually adding bias if you decide which features to put in or not?
● If the data shows that there is a relationship between X and Y, isn't that your ground truth? Isn't that sub-optimal?
It’s always a human problem
● “All Models are wrong, but some are useful”
● Your model is already biased
Bad Features
● Not all features are ok!
○ ‘Time travelling’: features that only exist after the outcome being predicted
■ Rating a movie => watched the movie
■ Cancer surgery => had cancer
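A minimal sketch of one guard against time travelling, assuming a hypothetical feature log where each value records when it became known; features that postdate the prediction moment are dropped:

# "Time travel" guard sketch: keep only features whose values were known
# before the moment the prediction would actually have been made.
from datetime import datetime

def features_known_at(feature_log, prediction_time):
    """feature_log maps feature name -> (value, timestamp when known)."""
    return {
        name: value
        for name, (value, known_at) in feature_log.items()
        if known_at < prediction_time
    }

log = {
    "watched_trailer": (1, datetime(2017, 1, 5)),
    "rated_movie":     (5, datetime(2017, 2, 1)),  # only exists after watching
}
# Predicting on Jan 10 whether the user will watch: the rating is excluded.
print(features_known_at(log, datetime(2017, 1, 10)))  # {'watched_trailer': 1}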
Misguided Models
● “It’s difficult to make predictions, especially about the future”
○ Offline performance != Online performance
○ Predicting past behavior != Influencing behavior
○ Example: Clicks vs. buy behavior in ads
Asking the right questions
● Need a human
○ Choosing features
○ Choosing the right target variable
■ Value-added ML
“Computers are useless, they can only give you answers”
Bad Questions
● Questionable real-world applications
○ Screen job applications
○ Screen college applications
○ Predict salary
○ Predict recidivism
● Features?
○ Race
○ Gender
○ Age
Correlating features
● Name -> Gender
● Name -> Age
● Grad Year -> Age
● Zip -> Socioeconomic Class
● Zip -> Race
● Likes -> Age, Gender, Race, Sexual Orientation...
● Credit score, SAT score, College prestigiousness...
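One way to surface such proxies, sketched here with fabricated data: try to predict the sensitive attribute from the supposedly innocent features. If that works well above chance, the features encode the attribute, and a model trained on them can discriminate even after the attribute itself is removed:

# Proxy-feature check sketch (synthetic data; names are hypothetical).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 1000
race = rng.integers(0, 2, n)              # sensitive attribute
zip_code = race * 0.8 + rng.random(n)     # zip correlates with race (by construction)
other = rng.random(n)                     # an unrelated feature
X = np.column_stack([zip_code, other])

acc = cross_val_score(LogisticRegression(), X, race, cv=5).mean()
print(f"sensitive attribute predictable with accuracy {acc:.2f}")
# Accuracy well above 0.5 means zip code acts as a proxy for race.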
At your job...
Not everyone will have the same ethical values, but you don't have to accept 'optimality' as an argument against doing the right thing.
You know racist computers are a bad idea
Don’t let your company invent racist computers
@estola