Think: Bing It On! Compares Bing to Google

Week 10 fraud copy


Page 1: Week 10 fraud copy

Think: Bing It On!

Compares Bing to Google

Page 2: Week 10 fraud copy
Page 3: Week 10 fraud copy

How would you design this?
Tell me:

Page 4: Week 10 fraud copy

Me?
And I’m guessing:

Hypothesis: Students in Toronto do not prefer one SE to another.

Page 5: Week 10 fraud copy

How?
100 Senecans will be surveyed by 10 paid surveyors.
Asked to compare two frames with fonts, colours and text sizes randomized.
Search terms Senecans choose.
Choose frame they like best: Google or Bing.
Results not revealed to participants.

Page 6: Week 10 fraud copy

Why?
Identify sample and population I’m trying to sample.
Removing my bias by asking surveyors.
Surveyors will not know how survey is designed.
“Double blind”

Page 7: Week 10 fraud copy

Why 100?
10 is too few.
1000 is too many.

For sufficiently large n, the distribution of $\hat{p}$ will be closely approximated by a normal distribution with the same mean and variance.[1] Using this approximation, it can be shown that around 95% of this distribution's probability lies within 2 standard deviations of the mean. Because of this, an interval of the form

$$\left(\hat{p} - 2\sqrt{\frac{0.25}{n}},\ \hat{p} + 2\sqrt{\frac{0.25}{n}}\right)$$

will form a 95% confidence interval for the true proportion. If this interval needs to be no more than W units wide, the equation

$$4\sqrt{\frac{0.25}{n}} = W$$

can be solved for n, yielding[2][3] $n = 4/W^2 = 1/B^2$ where B is the error bound on the estimate, i.e., the estimate is usually given as within ± B. So, for B = 10% one requires n = 100, for B = 5% one needs n = 400, for B = 3% the requirement approximates to n = 1000, while for B = 1% a sample size of n = 10000 is required. These numbers are quoted often in news reports of opinion polls and other sample surveys.

“Sample Size Determination”
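The rule of thumb quoted above, n = 1/B^2, is easy to check numerically. The following is a minimal sketch, not part of the original slides; the function names are made up for illustration, and it rounds up to the next whole respondent, so B = 3% comes out slightly above the "approximately 1000" in the quote.

```python
import math


def sample_size_for_error_bound(error_bound: float) -> int:
    """Smallest whole n satisfying n >= 1 / B^2 (the quoted rule of thumb).

    error_bound is B as a fraction, e.g. 0.10 for +/- 10%.
    """
    if not 0 < error_bound < 1:
        raise ValueError("error_bound must be a fraction between 0 and 1")
    # Round first so floating-point noise (99.99999999999999) does not
    # push the ceiling up or down by one.
    return math.ceil(round(1 / error_bound ** 2, 6))


def error_bound_for_sample_size(n: int) -> float:
    """Inverse of the rule: B = 1 / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 1 / math.sqrt(n)


if __name__ == "__main__":
    for bound in (0.10, 0.05, 0.03, 0.01):
        print(f"B = {bound:.0%}: n = {sample_size_for_error_bound(bound)}")
```

Run the other way, `error_bound_for_sample_size(100)` gives the error bound a survey of 100 people can claim under this rule.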

Page 8: Week 10 fraud copy

Say that works
60 prefer Bing
40 prefer Google

What does that mean?

Page 9: Week 10 fraud copy

I have no idea!
Well, sort of.

60% (±5%, p=.05) prefer Bing to Google

You tell me, what does that mean?
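One way to start answering that question is to ask how often a 60/40 split would turn up if the hypothesis were true and each of the 100 people picked a frame by coin flip. The sketch below is an assumption about how one might analyse it, not the analysis behind the slide; the function names are hypothetical. It also computes the half-width of the conservative interval quoted on the sample-size slide.

```python
from math import comb, sqrt


def upper_tail(successes: int, trials: int) -> float:
    """Exact P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    favourable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favourable / 2 ** trials


def conservative_half_width(trials: int) -> float:
    """Half-width 2 * sqrt(0.25 / n) of the quoted 95% interval."""
    return 2 * sqrt(0.25 / trials)


if __name__ == "__main__":
    one_sided = upper_tail(60, 100)
    # Doubling is valid here because Binomial(n, 0.5) is symmetric.
    two_sided = min(1.0, 2 * one_sided)
    print(f"P(at least 60 of 100 by chance)     = {one_sided:.4f}")
    print(f"P(a split at least this lopsided)   = {two_sided:.4f}")
    print(f"Conservative half-width at n = 100  = {conservative_half_width(100):.2f}")
```

The two-sided figure is the relevant one for the hypothesis as stated, since "do not prefer one SE to another" would be contradicted by a lopsided result in either direction.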

Page 10: Week 10 fraud copy

Maybe nothing?
Maybe something?

Page 11: Week 10 fraud copy

Look: that was as easy as it gets!

Population identification, sample size calculation, double blinding, within two standard deviations, after stripping CSS—all that before I do the statistics

Which I can’t understand!

Page 12: Week 10 fraud copy

Good methodology
● Design your experiment beforehand
● Run the experiment according to design
● Without peeking
– Or changing
● Collect all data
● Interpret all data
● Make all data available
● Analyze data according to good analysis principles.

Page 13: Week 10 fraud copy

Ducklings
You have no idea how to do this.

No idea.
Neither do I.

Page 14: Week 10 fraud copy

Questions
How many people do you need to survey?
How do you test them?
Double blind?
Blind?
What do you ask them?

Page 15: Week 10 fraud copy

You have to do this
● It’s too easy to fool yourself

Page 16: Week 10 fraud copy

Let’s review
Publish or perish?

Who perishes? And where do they publish?

Page 17: Week 10 fraud copy

Journals
What are the most prestigious journals in the world?
How do you know?

Page 18: Week 10 fraud copy

Impact factor
Nature
Proceedings of the National Academy of Sciences
Science
Physical Review Letters
Journal of the American Chemical Society
Physical Review B
Journal of Biological Chemistry
Applied Physics Letters
New England Journal of Medicine
Cell

(Eigenfactor.org data for 2011, most recent available)

Page 19: Week 10 fraud copy

Roughly
Number of in-citations
Number of out-citations

Page 20: Week 10 fraud copy

But?
Top-ranked are mostly medicine with some physics
No computers in top 100

Bioinformatics: 68

Page 21: Week 10 fraud copy

Get published
Or get fired.

Science, Nature, Cell, NEJM, JAMA

You get ‘tenure’—never fired, made for life.

Page 22: Week 10 fraud copy

Yoshitaka Fujii
● Japanese researcher in anaesthesiology
– Worked in Canada too
● Published 212 papers in 20 years (about one a month)

(Hmmmmm).

Page 23: Week 10 fraud copy

You’ll never guess
He made them up.

● 172 are demonstrably false.

Page 24: Week 10 fraud copy

As an aside:
● Retractions still need work:
– Of Fujii’s first ten articles on Google Scholar (GS)
● 4 were clearly retracted
● 1 was less clearly retracted
● 5 were not labelled as retracted

Page 25: Week 10 fraud copy

Jan Hendrik Schön
● Nano-physics genius!
– Won $100,000 as best young scientist
– Published, at his best, one paper every eight days
● Including in Science and Nature
– The very best journals in the world.

Page 26: Week 10 fraud copy

Now
● He has 10 friends on Facebook.
– I’m one!
Gave back his PhD.
Disappeared.

Page 27: Week 10 fraud copy

You’ll never guess
● He made all of his data up.

– [Movie time! 35:00]

Page 28: Week 10 fraud copy

So?
● What’s the problem?
● So they lied. Nobody died.
● (Well, probably. Fujii was a doctor.)

Page 29: Week 10 fraud copy

As I see it
● Money
– Millions of dollars
● Reputation

– Bell Labs, universities, colleagues, students

● Work: Reid Chesterfield spent 5 years trying to replicate Schön’s work.

Page 30: Week 10 fraud copy

Mohammad
His supervisor spent months trying to replicate Schön’s work
(That’s hundreds of thousands of dollars)

Page 31: Week 10 fraud copy

Another kind
● Damages to the scientific enterprise:
– Science has to be open to catch cheaters
– But openness makes researchers look bad

Page 32: Week 10 fraud copy

Kinds of fraud
● Fabrication
● Falsification
● Other

Page 33: Week 10 fraud copy

Fraud
“Fabrication of data involves totally inventing a data set, falsification refers to manipulation of equipment or changing data such that the research is not accurately represented in the research report.” (Stroebe, Postmes and Spears)

Page 34: Week 10 fraud copy

Fabrication
● Pretty clear: you make up the data.

Page 35: Week 10 fraud copy

Falsification
● Changing or interpreting the data:

“There is no rigid mathematical definition of what constitutes an outlier; determining whether or not an observation is an outlier is ultimately a subjective exercise.”

Page 36: Week 10 fraud copy

Outliers
● How do you deal with them?
– Bill Gates walks in the room
● Median and mean income?
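A small sketch of the Bill Gates example: one extreme value moves the mean enormously and the median hardly at all. The incomes are invented for illustration, including the billion-dollar figure standing in for the newcomer.

```python
from statistics import mean, median

# Hypothetical annual incomes for five people already in the room.
room = [40_000, 45_000, 50_000, 55_000, 60_000]

# A hypothetical billionaire walks in (the figure is illustrative only).
room_with_outlier = room + [1_000_000_000]

if __name__ == "__main__":
    for label, incomes in (("before", room), ("after", room_with_outlier)):
        print(f"{label:>6}: mean = {mean(incomes):>15,.0f}   "
              f"median = {median(incomes):>10,.0f}")
```

Whether to report the mean, report the median, or drop the newcomer as an outlier is exactly the subjective choice the slide is pointing at: each option is defensible, and each gives a different headline number.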

Page 37: Week 10 fraud copy

(How) Do you eliminate that variation?

Page 38: Week 10 fraud copy

Data picking

● Say you want to show that monkeys flip a coin to heads more often than humans. How do you do it?

● Not investigate. Show

Page 39: Week 10 fraud copy

● 1) Each flip 1 coin 100 times
● 2) Each flip 10 coins 10 times
● 3) Each flip 100 coins 1 time

Page 40: Week 10 fraud copy

Then...

● Re-design your experiment!

Page 41: Week 10 fraud copy

Then...

● Monkeys and humans each flipped 10 coins....
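The trick on these slides can be simulated. In the sketch below (an illustration under stated assumptions, not the lecturer's own code) monkeys and humans both flip fair coins, all three designs are run, and only the design that most favours the monkeys is reported. The design labels and function names are made up for this example.

```python
import random

# (coins per trial, number of trials): every design totals 100 flips.
DESIGNS = {
    "1 coin x 100 times": (1, 100),
    "10 coins x 10 times": (10, 10),
    "100 coins x 1 time": (100, 1),
}


def count_heads(rng: random.Random, coins: int, times: int) -> int:
    """Number of heads in coins * times flips of a fair coin."""
    return sum(rng.random() < 0.5 for _ in range(coins * times))


def run_all_designs(rng: random.Random) -> dict:
    """Monkey heads minus human heads for each design (both groups are fair)."""
    return {
        name: count_heads(rng, coins, times) - count_heads(rng, coins, times)
        for name, (coins, times) in DESIGNS.items()
    }


def compare_honest_and_picked(n_runs: int = 2000, seed: int = 0):
    """Fraction of runs in which monkeys 'win' under each reporting policy."""
    rng = random.Random(seed)
    first_design = next(iter(DESIGNS))
    honest_wins = picked_wins = 0
    for _ in range(n_runs):
        results = run_all_designs(rng)
        honest_wins += results[first_design] > 0   # design fixed in advance
        picked_wins += max(results.values()) > 0   # design chosen afterwards
    return honest_wins / n_runs, picked_wins / n_runs


if __name__ == "__main__":
    honest, picked = compare_honest_and_picked()
    print(f"Monkeys ahead, design fixed beforehand: {honest:.0%}")
    print(f"Monkeys ahead, best of three designs:   {picked:.0%}")
```

Nothing about the coins changes between the two policies; only the freedom to pick the design after seeing the results does.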

Page 42: Week 10 fraud copy

Aha.
● This is (abuse of) methodology

– And why I keep saying it matters!

Page 43: Week 10 fraud copy

Google Scholar vs MAS

What does that tell you?

Page 44: Week 10 fraud copy

Google Scholar vs MAS
● That GS has better searching than MAS
● Or that GS has worse searching than MAS!

Page 45: Week 10 fraud copy

Check this out

[Chart: scatter plot of Column A, x axis 0 to 60, y axis 0 to 1, with a fitted trend line labelled "Linear (Column A)"]

Page 46: Week 10 fraud copy

Clearly
A strong trend:
Decreasing over x
Despite what appear to be sinusoidal variations

Page 47: Week 10 fraud copy

One problem:
Made the data with random numbers
– And a few tricks
● No R value
● Lighten points
● Darken line
● Compress y for sharpness
● Regenerate data if necessary
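The "regenerate data if necessary" trick can be sketched in a few lines: draw pure noise, fit a straight line, and keep redrawing until the line slopes down enough to look like a trend. This is an assumed reconstruction for illustration; the target slope, point count and function names are arbitrary choices, not taken from the slides.

```python
import random


def slope(ys):
    """Least-squares slope of ys against x = 0, 1, ..., len(ys) - 1."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx


def regenerate_until_trend(n_points=60, target_slope=-0.004,
                           seed=0, max_tries=10_000):
    """Redraw uniform random data until the fitted line is steep enough.

    Returns (data, fitted slope, number of attempts needed).
    """
    rng = random.Random(seed)
    for attempt in range(1, max_tries + 1):
        ys = [rng.random() for _ in range(n_points)]
        fitted = slope(ys)
        if fitted <= target_slope:
            return ys, fitted, attempt
    raise RuntimeError("no sufficiently steep data set found")


if __name__ == "__main__":
    data, fitted, attempts = regenerate_until_trend()
    print(f"Slope {fitted:.4f} found after {attempts} attempt(s)")
```

Reporting the correlation coefficient alongside the line would expose the trick, which is presumably why "No R value" heads the list.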

Page 48: Week 10 fraud copy

Also
Choose line of best fit:

Linear? Moving average? Exponential? Log?

Page 49: Week 10 fraud copy

Of course
● That’s not nearly the only way!
– Repeat the whole experiment
– Blinding
– Survey design
– Outlier elimination

● And so on.

Page 50: Week 10 fraud copy

So: It’s easy
It’s so, so easy to cheat!

Let’s do it:

Page 51: Week 10 fraud copy

Google vs Bing
Say you wanted to show that Bing > Google.
How would you?

Page 52: Week 10 fraud copy

Population is, er, everyone!

Sample 1000 in Seattle

Sample young white men in Seattle

Redo sample!

Remove double blind

Remove single blind

10 in a row for Google? Outlier!

Choose best 100 of 1000 in Seattle

Repeat that ‘experiment’ to find the 20th out of 20.
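The last item on this slide works because a result that is "significant at p = .05" turns up by chance about once in twenty tries. A minimal sketch of the arithmetic, assuming each repeat is independent and there is truly no preference (function names are hypothetical):

```python
import random


def chance_of_fluke(repeats: int, alpha: float = 0.05) -> float:
    """P(at least one of `repeats` independent null experiments passes at alpha)."""
    return 1 - (1 - alpha) ** repeats


def simulate_flukes(repeats: int = 20, alpha: float = 0.05,
                    n_runs: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo check: each experiment 'passes' with probability alpha."""
    rng = random.Random(seed)
    hits = sum(
        any(rng.random() < alpha for _ in range(repeats))
        for _ in range(n_runs)
    )
    return hits / n_runs


if __name__ == "__main__":
    print(f"Analytic:  {chance_of_fluke(20):.3f}")
    print(f"Simulated: {simulate_flukes():.3f}")
```

So someone who quietly runs the survey twenty times and publishes only the best one has well over an even chance of a "significant" Bing win with no real effect at all.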

Page 53: Week 10 fraud copy

Why?
● Career pressure
– Publish or perish
– Past glories
● Overconfidence
● Temptation because of irreproducibility

Page 54: Week 10 fraud copy

How do they get caught?
● Data that is too good
● Draw suspicion in publication
● Ratted out by underlings

Page 55: Week 10 fraud copy

Lessons:
● Don’t cheat well
● Don’t cheat much