Power-Law Size Distributions
Principles of Complex Systems | @pocsvox
CSYS/MATH 300, Fall, 2016 | #FallPoCS2016

Prof. Peter Dodds | @peterdodds
Dept. of Mathematics & Statistics | Vermont Complex Systems Center
Vermont Advanced Computing Core | University of Vermont

Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.



Outline

Our Intuition

Definition

Examples

Wild vs. Mild

CCDFs

Zipf’s law

Zipf ⇔ CCDF

Appendix

References


Two of the many things we struggle with cognitively:

1. Probability.
   Ex. The Monty Hall Problem.
   Ex. Daughter/Son born on Tuesday.
   (see next two slides; Wikipedia entry here.)

2. Logarithmic scales.
   On counting and logarithms:
   Listen to Radiolab’s 2009 piece: “Numbers.”
   Later: Benford’s Law.

Also to be enjoyed: the magnificence of the Dunning-Kruger effect.


Homo probabilisticus?

The set up: A parent has two children.

Simple probability question: What is the probability that both children are girls? 1/4...

The next set up: A parent has two children. We know one of them is a girl.

The next probabilistic poser: What is the probability that both children are girls? 1/3...
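The two answers above can be checked by brute-force counting. The following is a minimal sketch, assuming each child is independently a girl or a boy with equal probability and reading "one of them is a girl" as "at least one is a girl"; the helper names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All equally likely (older, younger) combinations: GG, GB, BG, BB.
families = list(product("GB", repeat=2))

def probability(event, given=lambda family: True):
    """P(event | given), found by counting equally likely outcomes."""
    conditioned = [f for f in families if given(f)]
    favourable = [f for f in conditioned if event(f)]
    return Fraction(len(favourable), len(conditioned))

def both_girls(family):
    return family == ("G", "G")

def at_least_one_girl(family):
    return "G" in family

p_unconditional = probability(both_girls)
p_given_a_girl = probability(both_girls, given=at_least_one_girl)
print(p_unconditional)  # 1/4
print(p_given_a_girl)   # 1/3
```

Conditioning removes only the boy-boy family, leaving three equally likely families of which one has two girls.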


Try this one: A parent has two children. We know one of them is a girl born on a Tuesday.

Simple question #3: What is the probability that both children are girls? ?

Last: A parent has two children. We know one of them is a girl born on December 31.

And … What is the probability that both children are girls? ?
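One way to explore these two questions is to enumerate every family explicitly. This is a sketch under stated assumptions, not the slides' own solution: sexes and birth labels (day of the week, or day of the year with 365 equally likely days) are independent and uniform, and the statement is read as "at least one child is a girl with that label". The function name `p_both_girls` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def p_both_girls(n_labels):
    """P(both girls | at least one child is a girl carrying label 0).

    Each child is a (sex, label) pair; all pairs are assumed equally likely.
    """
    children = list(product("GB", range(n_labels)))
    families = list(product(children, repeat=2))
    informed = [f for f in families if ("G", 0) in f]
    both = [f for f in informed if f[0][0] == "G" and f[1][0] == "G"]
    return Fraction(len(both), len(informed))

tuesday = p_both_girls(7)        # label = day of the week
december_31 = p_both_girls(365)  # label = day of the year
print(tuesday, float(tuesday))
print(december_31, float(december_31))
```

Under these assumptions the counts give 13/27 for Tuesday and 729/1459 for December 31. The rarer the extra detail, the closer the answer moves from 1/3 toward 1/2. A different reading of how the information was obtained can change the result.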


Let’s test our collective intuition:

Money ≡ Belief

Two questions about wealth distribution in the United States:

1. Please estimate the percentage of all wealth owned by individuals when grouped into quintiles.
2. Please estimate what you believe each quintile should own, ideally.
3. Extremes: 100, 0, 0, 0, 0 and 20, 20, 20, 20, 20


Wealth distribution in the United States: [12]

“Building a better America—One wealth quintile at a time” Norton and Ariely, 2011. [12]


A highly watched video based on this research is here.


The sizes of many systems’ elements appear to obey an inverse power-law size distribution:

$P(\text{size} = x) \sim c\, x^{-\gamma}$

where $0 < x_{\min} < x < x_{\max}$ and $\gamma > 1$.

$x_{\min}$ = lower cutoff, $x_{\max}$ = upper cutoff.

Negative linear relationship in log-log space:

$\log_{10} P(x) = \log_{10} c - \gamma \log_{10} x$

We use base 10 because we are good people.

Power-law decays in probability: The Statistics of Surprise.
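The straight line in log-log space can be checked numerically. A minimal sketch, writing the law as P(x) = c x^(-γ) and using illustrative values c = 1 and γ = 2.5 (both assumptions, not taken from the slides):

```python
import math

def power_law(x, c=1.0, gamma=2.5):
    """Unnormalized inverse power law P(x) = c * x**(-gamma)."""
    return c * x ** (-gamma)

def log_log_slope(f, x1, x2):
    """Slope of log10 f(x) against log10 x between x1 and x2."""
    rise = math.log10(f(x2)) - math.log10(f(x1))
    run = math.log10(x2) - math.log10(x1)
    return rise / run

# One slope per decade from x = 1 up to x = 10**6.
slopes = [log_log_slope(power_law, 10.0 ** k, 10.0 ** (k + 1)) for k in range(6)]
intercept = math.log10(power_law(1.0))  # log10 c, the value at log10 x = 0
print(slopes)
print(intercept)
```

Every decade gives the same slope, -γ, and the intercept at log10 x = 0 is log10 c. That constancy is what a straight line on log-log axes means.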


Size distributions:

Usually, only the tail of the distribution obeys a power law: $P(x) \sim c\, x^{-\gamma}$ for $x$ large.

Still use term ‘power-law size distribution.’

Other terms:
  • Fat-tailed distributions.
  • Heavy-tailed distributions.

Beware: Inverse power laws aren’t the only ones: lognormals, Weibull distributions, …
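To see why the warning matters, the local log-log slope of a lognormal can be tracked decade by decade. This is a sketch only; the parameters μ = 0 and σ = 3 are illustrative assumptions chosen to give a broad distribution.

```python
import math

def log10_lognormal_pdf(x, mu=0.0, sigma=3.0):
    """log10 of the lognormal density, computed in log space."""
    ln_x = math.log(x)
    ln_p = (-ln_x
            - math.log(sigma * math.sqrt(2 * math.pi))
            - (ln_x - mu) ** 2 / (2 * sigma ** 2))
    return ln_p / math.log(10)

def local_slope(log10_f, x, factor=10.0):
    """Log-log slope of the density between x and factor * x."""
    return (log10_f(x * factor) - log10_f(x)) / math.log10(factor)

decades = [1.0, 1e2, 1e4, 1e6]
slopes = [local_slope(log10_lognormal_pdf, x) for x in decades]
print([round(s, 2) for s in slopes])
```

With these parameters the slope drifts from roughly -1.1 near x = 1 to roughly -2.7 near x = 10^6. Over a narrow range a broad lognormal can therefore pass for a straight line, while a true power law keeps one slope throughout.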


Size distributions:

Many systems have discrete sizes $k$:
  • Word frequency
  • Node degree in networks: # friends, # hyperlinks, etc.
  • # citations for articles, court decisions, etc.

$P(k) \sim c\, k^{-\gamma}$ where $k_{\min} \le k \le k_{\max}$

Obvious fail for $k = 0$. Again, typically a description of distribution’s tail.
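A discrete power law on a finite range can be normalized by direct summation. A minimal sketch, assuming the form P(k) = c k^(-γ) holds over the whole range; the function name and the values γ = 2, k_min = 1, k_max = 1000 are hypothetical.

```python
def discrete_power_law(gamma, k_min, k_max):
    """Return {k: P(k)} with P(k) proportional to k**(-gamma) on k_min..k_max."""
    if k_min < 1:
        # k**(-gamma) diverges at k = 0, the "obvious fail" noted above.
        raise ValueError("k**(-gamma) is undefined at k = 0; need k_min >= 1")
    weights = {k: k ** (-gamma) for k in range(k_min, k_max + 1)}
    total = sum(weights.values())  # equals 1 / c
    return {k: w / total for k, w in weights.items()}

pmf = discrete_power_law(gamma=2.0, k_min=1, k_max=1000)
print(pmf[1], pmf[10], sum(pmf.values()))
```

The ratio pmf[1] / pmf[10] is 10^γ whatever the cutoffs, since the normalization constant cancels.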


The statistics of surprise—words:

Brown Corpus ($\sim 10^6$ words):

rank   word        % q
   1.  the         6.8872
   2.  of          3.5839
   3.  and         2.8401
   4.  to          2.5744
   5.  a           2.2996
   6.  in          2.1010
   7.  that        1.0428
   8.  is          0.9943
   9.  was         0.9661
  10.  he          0.9392
  11.  for         0.9340
  12.  it          0.8623
  13.  with        0.7176
  14.  as          0.7137
  15.  his         0.6886

rank   word        % q
1945.  apply       0.0055
1946.  vital       0.0055
1947.  September   0.0055
1948.  review      0.0055
1949.  wage        0.0055
1950.  motor       0.0055
1951.  fifteen     0.0055
1952.  regarded    0.0055
1953.  draw        0.0055
1954.  wheel       0.0055
1955.  organized   0.0055
1956.  vision      0.0055
1957.  wild        0.0055
1958.  Palmer      0.0055
1959.  intensity   0.0055
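A table of this kind can be produced from any tokenized text. The sketch below uses a hypothetical toy sentence rather than the Brown Corpus, and the function name `rank_frequency` is illustrative.

```python
from collections import Counter

def rank_frequency(words):
    """Return (rank, word, q) rows, where q is the word's share of all tokens in percent."""
    counts = Counter(words)
    total = sum(counts.values())
    ranked = counts.most_common()  # sorted by descending count
    return [(rank, word, 100.0 * n / total)
            for rank, (word, n) in enumerate(ranked, start=1)]

# Hypothetical toy text standing in for a real corpus.
toy = "the cat and the dog and the bird saw the cat".split()
rows = rank_frequency(toy)
for rank, word, q in rows:
    print(f"{rank:>4}.  {word:<6} {q:7.4f}")
```

Tied words receive consecutive ranks here, which matches the long run of distinct ranks sharing q = 0.0055 in the table above.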


Jonathan Harris’s Wordcount: A word frequency distribution explorer.


“Thing Explainer: Complicated Stuff in Simple Words” by Randall Munroe (2015). [10]

Up goer five

The statistics of surprise—words:

First, a Gaussian example:

$P(x)\,\mathrm{d}x = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x-\mu)^2/2\sigma^2}\,\mathrm{d}x$

[Figure: $P(x)$ versus $x$ on linear axes, and $\log_{10} P(x)$ versus $\log_{10} x$ on log-log axes.]

mean $\mu = 10$, variance $\sigma^2 = 1$.

Activity: Sketch $P(x) \sim x^{-1}$ for $x = 1$ to $x = 10^7$.
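To get a feel for the activity, the sketch below tabulates the base-10 logarithm of a Gaussian density next to that of an unnormalized $x^{-1}$ power law. The Gaussian parameters (mean 10, standard deviation 1) are assumed from the slide's figure, and the function names are hypothetical. The logarithms are computed analytically so that the Gaussian does not underflow to zero far from its mean.

```python
import math


def log10_gaussian(x, mu=10.0, sigma=1.0):
    """log10 of the Gaussian density, computed without evaluating exp()."""
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return exponent / math.log(10.0) - math.log10(math.sqrt(2.0 * math.pi) * sigma)


def log10_power_law(x, gamma=1.0):
    """log10 of the unnormalized power law x**(-gamma)."""
    return -gamma * math.log10(x)


print(f"{'x':>10} {'log10 Gaussian':>16} {'log10 x^-1':>12}")
for x in (10.0, 20.0, 100.0, 1e4, 1e7):
    print(f"{x:>10.0f} {log10_gaussian(x):>16.1f} {log10_power_law(x):>12.1f}")
```

On log-log axes the power law is a straight line of slope $-1$ across seven decades, while the Gaussian drops off a cliff within about one decade of its mean.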

The statistics of surprise—words:

Raw ‘probability’ (binned) for Brown Corpus:

[Figure: $N_q$ versus $q$ on linear axes, and $\log_{10} N_q$ versus $\log_{10} q$ on log-log axes.]

$q_w$ = frequency of occurrence of word $w$ expressed as a percentage.
$N_q$ = number of distinct words that have a frequency of occurrence $q$.
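The two quantities defined above can be computed from any tokenized text. The following sketch uses a made-up toy sentence rather than the Brown Corpus, and the function name `frequency_of_frequencies` is an assumption for illustration.

```python
from collections import Counter


def frequency_of_frequencies(words):
    """Return (q_w, n_q).

    q_w maps each word to its percentage frequency.
    n_q is a list of (q, N_q) pairs sorted by q, where N_q is the number
    of distinct words sharing percentage frequency q.
    """
    counts = Counter(words)                 # word -> number of occurrences
    total = sum(counts.values())
    q_w = {w: 100.0 * c / total for w, c in counts.items()}
    # Group by raw count first so that equal frequencies match exactly.
    words_per_count = Counter(counts.values())
    n_q = sorted((100.0 * c / total, n) for c, n in words_per_count.items())
    return q_w, n_q


# Toy text, not real corpus data.
toy_text = "the cat and the dog and the bird saw the fish".split()
q_w, n_q = frequency_of_frequencies(toy_text)
for q, n in n_q:
    print(f"q = {q:5.2f}%  N_q = {n}")
```

Even in this tiny example most distinct words are rare and a single word dominates, which is the shape the linear-axis plot fails to show well.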

The statistics of surprise—words:

Complementary Cumulative Probability Distribution $N_{>q}$:

[Figure: $N_{>q}$ versus $q$ on linear axes, and $\log_{10} N_{>q}$ versus $\log_{10} q$ on log-log axes.]

Also known as the ‘Exceedance Probability.’
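A minimal sketch of the exceedance count follows, assuming the strict inequality suggested by the $N_{>q}$ notation. The data are only the 15 most common Brown Corpus words from the listing earlier in these slides, so the numbers illustrate the mechanics rather than the full distribution; the function name `count_greater` is hypothetical.

```python
from bisect import bisect_right

# Percentage frequencies q of the 15 most common Brown Corpus words.
q_values = [6.8872, 3.5839, 2.8401, 2.5744, 2.2996, 2.1010, 1.0428, 0.9943,
            0.9661, 0.9392, 0.9340, 0.8623, 0.7176, 0.7137, 0.6886]


def count_greater(sorted_values, q):
    """N_{>q}: number of entries in an ascending list strictly greater than q."""
    return len(sorted_values) - bisect_right(sorted_values, q)


ascending = sorted(q_values)
for q in (0.5, 1.0, 2.0, 5.0):
    print(f"q = {q:3.1f}  N_>q = {count_greater(ascending, q)}")
```

Because every observation contributes to the count at all smaller thresholds, the cumulative curve is much smoother than the binned $N_q$, which is one reason it is often preferred for plotting.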

My, what big words you have...

Test capitalizes on word frequency following a heavily skewed frequency distribution with a decaying power-law tail.

The statistics of surprise:

Gutenberg-Richter law

Log-log plot
Base 10
Slope = -1

$N(M > m) \propto m^{-1}$

From both the very awkwardly similar Christensen et al. and Bak et al.: “Unified scaling law for earthquakes” [3, 1]

The statistics of surprise:

From: “Quake Moves Japan Closer to U.S. and Alters Earth’s Spin” by Kenneth Chang, March 13, 2011, NYT:

‘What is perhaps most surprising about the Japan earthquake is how misleading history can be. In the past 300 years, no earthquake nearly that large—nothing larger than magnitude eight—had struck in the Japan subduction zone. That, in turn, led to assumptions about how large a tsunami might strike the coast.’

‘“It did them a giant disservice,” said Dr. Stein of the geological survey. That is not the first time that the earthquake potential of a fault has been underestimated. Most geophysicists did not think the Sumatra fault could generate a magnitude 9.1 earthquake, …’

[Figure: page excerpt from Zhu et al., “Geography and Similarity of Chinese Cuisines,” PLOS ONE, November 2013, Volume 8, Issue 11, e79161. The page includes Figure 1, a map of regional cuisines in China, and Figure 2, the probability distribution $P(k)$ of the number of ingredients per recipe $k$ for the Lu, Chuan, Yue, Su, Min, Zhe, Xiang, and Hui cuisines; all regional cuisines show similar distributions, which have a peak around 10. doi:10.1371/journal.pone.0079161.g002]

“Geography and Similarity of Regional Cuisines in China”
Zhu et al., PLoS ONE, 8, e79161, 2013. [17]

Fraction of ingredients that appear in at least $k$ recipes.

Oops in notation: $P(k)$ is the Complementary Cumulative Distribution $P_{\ge}(k)$

“On a class of skew distribution functions”
Herbert A. Simon, Biometrika, 42, 425–440, 1955. [14]

“Power laws, Pareto distributions and Zipf’s law”
M. E. J. Newman, Contemporary Physics, 46, 323–351, 2005. [11]

[Figure: page 2 of the paper, with histograms of heights of American males and speeds of cars on UK motorways (FIG. 1), and of the populations of all US cities with population of 10 000 or more on linear and logarithmic scales (FIG. 2).]

“Power-law distributions in empirical data”
Clauset, Shalizi, and Newman, SIAM Review, 51, 661–703, 2009. [4]

[Figure: first page of the paper, showing the title, authors, and abstract.]

[Figure: twelve log-log panels, (a) to (l), with horizontal axes labeled word frequency, citations, web hits, books sold, telephone calls received, earthquake magnitude, crater diameter in km, peak intensity, intensity, net worth in US dollars, name frequency, and population of city.]

FIG. 4 Cumulative distributions or “rank/frequency plots” of twelve quantities reputed to follow power laws. The distributions were computed as described in Appendix A. Data in the shaded regions were excluded from the calculations of the exponents in Table I. Source references for the data are given in the text.
(a) Numbers of occurrences of words in the novel Moby Dick by Hermann Melville.
(b) Numbers of citations to scientific papers published in 1981, from time of publication until June 1997.
(c) Numbers of hits on web sites by 60 000 users of the America Online Internet service for the day of 1 December 1997.
(d) Numbers of copies of bestselling books sold in the US between 1895 and 1965.
(e) Number of calls received by AT&T telephone customers in the US for a single day.
(f) Magnitude of earthquakes in California between January 1910 and May 1992. Magnitude is proportional to the logarithm of the maximum amplitude of the earthquake, and hence the distribution obeys a power law even though the horizontal axis is linear.
(g) Diameter of craters on the moon. Vertical axis is measured per square kilometre.
(h) Peak gamma-ray intensity of solar flares in counts per second, measured from Earth orbit between February 1980 and November 1989.
(i) Intensity of wars from 1816 to 1980, measured as battle deaths per 10 000 of the population of the participating countries.
(j) Aggregate net worth in dollars of the richest individuals in the US in October 2003.
(k) Frequency of occurrence of family names in the US in the year 1990.
(l) Populations of US cities in the year 2000.

Size distributions:

Some examples:

Earthquake magnitude (Gutenberg-Richter law): [8, 1] $P(M) \propto M^{-2}$
# war deaths: [13] $P(d) \propto d^{-1.8}$
Sizes of forest fires [7]
Sizes of cities: [14] $P(n) \propto n^{-2.1}$
# links to and from websites [2]

Note: Exponents range in error
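Since quoted exponents carry error, it can help to generate synthetic power-law data with a known exponent and see how well it is recovered. The sketch below draws samples by inverse-transform sampling from a continuous power law with density proportional to $x^{-\gamma}$ for $x \ge x_{\min}$, then applies the standard continuous maximum-likelihood estimator $\hat{\gamma} = 1 + n / \sum_i \ln(x_i / x_{\min})$. The function names, the seed, and the choice $\gamma = 2.1$ (borrowed from the city-size example) are assumptions for illustration.

```python
import math
import random


def sample_power_law(n, gamma, x_min=1.0, rng=None):
    """Draw n samples with density proportional to x**(-gamma) for x >= x_min."""
    rng = rng or random.Random()
    # Inverse of the CCDF (x / x_min)**(-(gamma - 1)); 1 - u lies in (0, 1].
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0))
            for _ in range(n)]


def mle_exponent(samples, x_min=1.0):
    """Continuous maximum-likelihood estimate of gamma, given x_min."""
    n = len(samples)
    return 1.0 + n / sum(math.log(x / x_min) for x in samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = sample_power_law(100_000, gamma=2.1, rng=rng)
gamma_hat = mle_exponent(samples)
print(f"true gamma = 2.1, estimated gamma = {gamma_hat:.3f}")
print(f"largest sample = {max(samples):.3g}, median = {sorted(samples)[len(samples) // 2]:.3g}")
```

Comparing the largest sample with the median gives a concrete sense of how far the tail reaches; rerunning with smaller sample sizes shows the estimate wandering, which is one reason reported exponents come with error bars.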

Size distributions:

More examples:

# citations to papers: [5, 6] $P(k) \propto k^{-3}$.
Individual wealth (maybe): $P(W) \propto W^{-2}$.
Distributions of tree trunk diameters: $P(d) \propto d^{-2}$.
The gravitational force at a random point in the universe: [9] $P(F) \propto F^{-5/2}$. (See the Holtsmark distribution and stable distributions.)
Diameter of moon craters: [11] $P(d) \propto d^{-3}$.
Word frequency: [14] e.g., $P(k) \propto k^{-2.2}$ (variable).
# religious adherents in cults: [4] $P(k) \propto k^{-1.8 \pm 0.1}$.
# sightings of birds per species (North American Breeding Bird Survey for 2003): [4] $P(k) \propto k^{-2.1 \pm 0.1}$.
# species per genus: [16, 14, 4] $P(k) \propto k^{-2.4 \pm 0.2}$.

Size distributions:More examples: # citations to papers: [5, 6] () ∝ −3. Individual wealth (maybe): () ∝ −2. Distributions of tree trunk diameters: ( ) ∝ −2. The gravitational force at a random point in the

universe: [9] () ∝ −5/2. (See the Holtsmarkdistribution and stable distributions.)

Diameter of moon craters: [11] ( ) ∝ −3. Word frequency: [14] e.g., () ∝ −2.2 (variable). # religious adherents in cults: [4] () ∝ −1.8±0.1. # sightings of birds per species (North American

Breeding Bird Survey for 2003): [4] () ∝ −2.1±0.1. # species per genus: [16, 14, 4] () ∝ −2.4±0.2.

Page 78: New PoCS|@pocsvox Power-LawSizeDistributions Power-LawSize · 2016. 9. 6. · PoCS|@pocsvox Power-LawSize Distributions OurIntuition Definition Examples Wildvs.Mild CCDFs Zipf’slaw

PoCS|@pocsvox

Power-Law SizeDistributions

Our Intuition

Definition

Examples

Wild vs. Mild

CCDFs

Zipf’s law

Zipf ⇔ CCDF

Appendix

References

.....

.27 of 53

Size distributions:More examples: # citations to papers: [5, 6] () ∝ −3. Individual wealth (maybe): () ∝ −2. Distributions of tree trunk diameters: ( ) ∝ −2. The gravitational force at a random point in the

universe: [9] () ∝ −5/2. (See the Holtsmarkdistribution and stable distributions.)

Diameter of moon craters: [11] ( ) ∝ −3. Word frequency: [14] e.g., () ∝ −2.2 (variable). # religious adherents in cults: [4] () ∝ −1.8±0.1. # sightings of birds per species (North American

Breeding Bird Survey for 2003): [4] () ∝ −2.1±0.1. # species per genus: [16, 14, 4] () ∝ −2.4±0.2.

Page 79: New PoCS|@pocsvox Power-LawSizeDistributions Power-LawSize · 2016. 9. 6. · PoCS|@pocsvox Power-LawSize Distributions OurIntuition Definition Examples Wildvs.Mild CCDFs Zipf’slaw

Table 3 from Clauset, Shalizi, and Newman [4]:

Basic parameters of the data sets described in section 6, along with their power-law fits and the corresponding p-values (statistically significant values are denoted in bold).

| Quantity | n | ⟨x⟩ | σ | x_max | x_min | α | n_tail | p |
|---|---|---|---|---|---|---|---|---|
| count of word use | 18 855 | 11.14 | 148.33 | 14 086 | 7 ± 2 | 1.95(2) | 2958 ± 987 | 0.49 |
| protein interaction degree | 1846 | 2.34 | 3.05 | 56 | 5 ± 2 | 3.1(3) | 204 ± 263 | 0.31 |
| metabolic degree | 1641 | 5.68 | 17.81 | 468 | 4 ± 1 | 2.8(1) | 748 ± 136 | 0.00 |
| Internet degree | 22 688 | 5.63 | 37.83 | 2583 | 21 ± 9 | 2.12(9) | 770 ± 1124 | 0.29 |
| telephone calls received | 51 360 423 | 3.88 | 179.09 | 375 746 | 120 ± 49 | 2.09(1) | 102 592 ± 210 147 | 0.63 |
| intensity of wars | 115 | 15.70 | 49.97 | 382 | 2.1 ± 3.5 | 1.7(2) | 70 ± 14 | 0.20 |
| terrorist attack severity | 9101 | 4.35 | 31.58 | 2749 | 12 ± 4 | 2.4(2) | 547 ± 1663 | 0.68 |
| HTTP size (kilobytes) | 226 386 | 7.36 | 57.94 | 10 971 | 36.25 ± 22.74 | 2.48(5) | 6794 ± 2232 | 0.00 |
| species per genus | 509 | 5.59 | 6.94 | 56 | 4 ± 2 | 2.4(2) | 233 ± 138 | 0.10 |
| bird species sightings | 591 | 3384.36 | 10 952.34 | 138 705 | 6679 ± 2463 | 2.1(2) | 66 ± 41 | 0.55 |
| blackouts (×10^3) | 211 | 253.87 | 610.31 | 7500 | 230 ± 90 | 2.3(3) | 59 ± 35 | 0.62 |
| sales of books (×10^3) | 633 | 1986.67 | 1396.60 | 19 077 | 2400 ± 430 | 3.7(3) | 139 ± 115 | 0.66 |
| population of cities (×10^3) | 19 447 | 9.00 | 77.83 | 8 009 | 52.46 ± 11.88 | 2.37(8) | 580 ± 177 | 0.76 |
| email address books size | 4581 | 12.45 | 21.49 | 333 | 57 ± 21 | 3.5(6) | 196 ± 449 | 0.16 |
| forest fire size (acres) | 203 785 | 0.90 | 20.99 | 4121 | 6324 ± 3487 | 2.2(3) | 521 ± 6801 | 0.05 |
| solar flare intensity | 12 773 | 689.41 | 6520.59 | 231 300 | 323 ± 89 | 1.79(2) | 1711 ± 384 | 1.00 |
| quake intensity (×10^3) | 19 302 | 24.54 | 563.83 | 63 096 | 0.794 ± 80.198 | 1.64(4) | 11 697 ± 2159 | 0.00 |
| religious followers (×10^6) | 103 | 27.36 | 136.64 | 1050 | 3.85 ± 1.60 | 1.8(1) | 39 ± 26 | 0.42 |
| freq. of surnames (×10^3) | 2753 | 50.59 | 113.99 | 2502 | 111.92 ± 40.67 | 2.5(2) | 239 ± 215 | 0.20 |
| net worth (mil. USD) | 400 | 2388.69 | 4 167.35 | 46 000 | 900 ± 364 | 2.3(1) | 302 ± 77 | 0.00 |
| citations to papers | 415 229 | 16.17 | 44.02 | 8904 | 160 ± 35 | 3.16(6) | 3455 ± 1859 | 0.20 |
| papers authored | 401 445 | 7.21 | 16.52 | 1416 | 133 ± 13 | 4.3(1) | 988 ± 377 | 0.90 |
| hits to web sites | 119 724 | 9.83 | 392.52 | 129 641 | 2 ± 13 | 1.81(8) | 50 981 ± 16 898 | 0.00 |
| links to web sites | 241 428 853 | 9.15 | 106 871.65 | 1 199 466 | 3684 ± 151 | 2.336(9) | 28 986 ± 1560 | 0.00 |

We’ll explore various exponent measurement techniques in assignments.
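As one illustration of an exponent measurement, here is a minimal sketch of the standard continuous maximum-likelihood estimator, $\hat{\alpha} = 1 + n_{\text{tail}} \left[ \sum_i \ln (x_i / x_{\min}) \right]^{-1}$, with approximate standard error $(\hat{\alpha} - 1)/\sqrt{n_{\text{tail}}}$. It is not necessarily the technique used in the assignments. The function names, the synthetic data, and the choice of treating x_min as known are assumptions made for the example.

```python
import math
import random

def mle_exponent(samples, x_min):
    """Continuous MLE for alpha in P(x) ~ x^(-alpha), using only x >= x_min."""
    tail = [x for x in samples if x >= x_min]
    n_tail = len(tail)
    log_sum = sum(math.log(x / x_min) for x in tail)
    alpha_hat = 1.0 + n_tail / log_sum
    std_err = (alpha_hat - 1.0) / math.sqrt(n_tail)
    return alpha_hat, std_err, n_tail

def sample_power_law(n, alpha, x_min, rng):
    """Inverse-transform samples from a pure power law (hypothetical helper)."""
    # 1 - rng.random() lies in (0, 1], so the power is always defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

# Synthetic data with a known exponent (assumed values, not from Table 3).
rng = random.Random(0)
data = sample_power_law(20000, alpha=2.5, x_min=1.0, rng=rng)
alpha_hat, std_err, n_tail = mle_exponent(data, x_min=1.0)
print(f"alpha_hat = {alpha_hat:.3f} +/- {std_err:.3f} (n_tail = {n_tail})")
```

With real data, x_min is not known in advance and has to be estimated as well, which is why the table reports an uncertainty on it.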

power-law size distributions

Gaussians versus power-law size distributions:
- Mediocristan versus Extremistan
- Mild versus Wild (Mandelbrot)
- Example: Height versus wealth.

See “The Black Swan” by Nassim Taleb. [15]

Terrible if successful framing: Black swans are not that surprising ...

Turkeys...

From “The Black Swan” [15]

Taleb’s table [15]

Mediocristan/Extremistan
- Most typical member is mediocre/Most typical is either giant or tiny
- Winners get a small segment/Winner-take-almost-all effects
- When you observe for a while, you know what’s going on/It takes a very long time to figure out what’s going on
- Prediction is easy/Prediction is hard
- History crawls/History makes jumps
- Tyranny of the collective/Tyranny of the rare and accidental

Size distributions:

- Power-law size distributions are sometimes called Pareto distributions after Italian scholar Vilfredo Pareto.
- Pareto noted wealth in Italy was distributed unevenly (80–20 rule; misleading).
- Term used especially by practitioners of the Dismal Science.

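Devilish power-law size distribution details:

Exhibit A:
- Given $P(x) = c x^{-\gamma}$ with $0 < x_{\min} < x < x_{\max}$, the mean is ($\gamma \neq 2$):

$$\langle x \rangle = \frac{c}{2-\gamma} \left( x_{\max}^{2-\gamma} - x_{\min}^{2-\gamma} \right).$$

- Mean ‘blows up’ with upper cutoff if $\gamma < 2$.
- Mean depends on lower cutoff if $\gamma > 2$.
- $\gamma < 2$: Typical sample is large.
- $\gamma > 2$: Typical sample is small.

Insert question from assignment 2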
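To make the cutoff dependence concrete, the following sketch evaluates the Exhibit A mean for exponents on either side of 2 while the upper cutoff grows. The function name is hypothetical, and fixing c by normalizing the distribution on [x_min, x_max] is an assumption added here; the slide treats c as given.

```python
def truncated_power_law_mean(gamma, x_min, x_max):
    """Mean of P(x) = c x^(-gamma) on [x_min, x_max], with c set by normalization."""
    if gamma in (1.0, 2.0):
        raise ValueError("gamma = 1 and gamma = 2 need the logarithmic forms")
    # Normalization: c * (x_max^(1-gamma) - x_min^(1-gamma)) / (1 - gamma) = 1
    c = (1.0 - gamma) / (x_max ** (1.0 - gamma) - x_min ** (1.0 - gamma))
    # Exhibit A: <x> = c / (2 - gamma) * (x_max^(2-gamma) - x_min^(2-gamma))
    return c / (2.0 - gamma) * (x_max ** (2.0 - gamma) - x_min ** (2.0 - gamma))

for gamma in (1.5, 2.5):
    for x_max in (1e3, 1e6, 1e9):
        mean = truncated_power_law_mean(gamma, x_min=1.0, x_max=x_max)
        print(f"gamma = {gamma}, x_max = {x_max:.0e}: <x> = {mean:.4g}")
```

For gamma = 1.5 the mean keeps growing with x_max, while for gamma = 2.5 it settles near (gamma - 1)/(gamma - 2) times x_min, which is 3 for these values.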

And in general...

Moments:
- All moments depend only on cutoffs.
- No internal scale that dominates/matters.
- Compare to a Gaussian, exponential, etc.

For many real size distributions: $2 < \gamma < 3$
- mean is finite (depends on lower cutoff)
- $\sigma^2$ = variance is ‘infinite’ (depends on upper cutoff)
- Width of distribution is ‘infinite’
- If $\gamma > 3$, distribution is less terrifying and may be easily confused with other kinds of distributions.

Insert question from assignment 3
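The statement that all moments depend only on cutoffs can be sketched with the same integration used for the mean in Exhibit A. The moment order $k$ is notation introduced here, and the result assumes $P(x) = c x^{-\gamma}$ on $x_{\min} < x < x_{\max}$ with $k + 1 \neq \gamma$.

```latex
\begin{align*}
\langle x^k \rangle
  &= \int_{x_{\min}}^{x_{\max}} x^k \, c \, x^{-\gamma} \, \mathrm{d}x \\
  &= \frac{c}{k+1-\gamma}
     \left( x_{\max}^{k+1-\gamma} - x_{\min}^{k+1-\gamma} \right).
\end{align*}
```

When $k + 1 - \gamma > 0$ the upper cutoff term dominates as $x_{\max}$ grows, and when $k + 1 - \gamma < 0$ the lower cutoff term dominates. For $2 < \gamma < 3$ this puts the mean ($k = 1$) in the second case and the second moment ($k = 2$) in the first.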

Moments

Standard deviation is a mathematical convenience:
- Variance is nice analytically...
- Another measure of distribution width: Mean average deviation (MAD) = $\langle | x - \langle x \rangle | \rangle$
- For a pure power law with $2 < \gamma < 3$: $\langle | x - \langle x \rangle | \rangle$ is finite.
- But MAD is mildly unpleasant analytically...
- We still speak of infinite ‘width’ if $\gamma < 3$.

Insert question from assignment 2
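A rough numerical illustration of the contrast between the two width measures, assuming a pure power law with gamma = 2.5 and x_min = 1 (both values chosen only for the example). The helper names are hypothetical, and the printed numbers depend on the random seed.

```python
import math
import random

def sample_pareto(n, gamma, x_min, rng):
    """Inverse-transform samples from P(x) ~ x^(-gamma) for x >= x_min."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0)) for _ in range(n)]

def std_and_mad(samples):
    """Return (standard deviation, mean absolute deviation about the mean)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / n)
    mad = sum(abs(x - mean) for x in samples) / n
    return std, mad

rng = random.Random(1)
for n in (10**3, 10**4, 10**5):
    std, mad = std_and_mad(sample_pareto(n, gamma=2.5, x_min=1.0, rng=rng))
    print(f"n = {n:>6}: std = {std:8.2f}, MAD = {mad:6.2f}")
```

Across repeated runs one would expect the sample standard deviation to be dominated by the few largest samples and to drift upward with n, while the MAD stays comparatively stable.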

How sample sizes grow...

Given $P(x) \sim c x^{-\gamma}$:
- We can show that after $n$ samples, we expect the largest sample to be $x_1 \gtrsim c' n^{1/(\gamma-1)}$
- Sampling from a finite-variance distribution gives a much slower growth with $n$.
- e.g., for $P(x) = \lambda e^{-\lambda x}$, we find $x_1 \gtrsim \frac{1}{\lambda} \ln n$.

Insert question from assignment 2
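One heuristic route to these scalings is to ask for the size $x_1$ at which only one of the $n$ samples is expected to be at least that large, i.e. $n P_{\geq}(x_1) = 1$. The sketch below applies that condition under two assumptions that are not in the slide: the power law is pure and normalized above a lower cutoff x_min, so the prefactor $c'$ is absorbed into x_min, and the exponential has rate lam. Function names are hypothetical.

```python
import math

def largest_power_law(n, gamma, x_min=1.0):
    """Solve n * P_>=(x1) = 1 for P_>=(x) = (x / x_min)^(-(gamma - 1))."""
    return x_min * n ** (1.0 / (gamma - 1.0))

def largest_exponential(n, lam=1.0):
    """Solve n * exp(-lam * x1) = 1 for an exponential with rate lam."""
    return math.log(n) / lam

for n in (10**2, 10**4, 10**6):
    print(f"n = {n:>8}: power law (gamma = 2.5) x1 ~ {largest_power_law(n, 2.5):10.1f}, "
          f"exponential x1 ~ {largest_exponential(n):5.2f}")
```

Multiplying the number of samples by 100 multiplies the power-law estimate by 100^(2/3) for gamma = 2.5, but only adds ln(100) to the exponential one.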

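Complementary Cumulative Distribution Function:

CCDF:

$$
\begin{aligned}
P_{\geq}(x) &= P(x' \geq x) = 1 - P(x' < x) \\
&= \int_{x'=x}^{\infty} P(x') \, \mathrm{d}x' \\
&\propto \int_{x'=x}^{\infty} (x')^{-\gamma} \, \mathrm{d}x' \\
&= \left. \frac{1}{-\gamma+1} (x')^{-\gamma+1} \right|_{x'=x}^{\infty} \\
&\propto x^{-\gamma+1}
\end{aligned}
$$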

Complementary Cumulative Distribution Function:

CCDF: ≥() ∝ −+1 Use when tail of follows a power law. Increases exponent by one. Useful in cleaning up data.

Page 121: New PoCS|@pocsvox Power-LawSizeDistributions Power-LawSize · 2016. 9. 6. · PoCS|@pocsvox Power-LawSize Distributions OurIntuition Definition Examples Wildvs.Mild CCDFs Zipf’slaw

PoCS|@pocsvox

Power-Law SizeDistributions

Our Intuition

Definition

Examples

Wild vs. Mild

CCDFs

Zipf’s law

Zipf ⇔ CCDF

Appendix

References

.....

.38 of 53

Complementary Cumulative Distribution Function:

CCDF: ≥() ∝ −+1 Use when tail of follows a power law. Increases exponent by one. Useful in cleaning up data.

Page 122: New PoCS|@pocsvox Power-LawSizeDistributions Power-LawSize · 2016. 9. 6. · PoCS|@pocsvox Power-LawSize Distributions OurIntuition Definition Examples Wildvs.Mild CCDFs Zipf’slaw

PoCS|@pocsvox

Power-Law SizeDistributions

Our Intuition

Definition

Examples

Wild vs. Mild

CCDFs

Zipf’s law

Zipf ⇔ CCDF

Appendix

References

.....

.38 of 53

Complementary Cumulative Distribution Function:

CCDF: ≥() ∝ −+1 Use when tail of follows a power law. Increases exponent by one. Useful in cleaning up data.

Page 123: New PoCS|@pocsvox Power-LawSizeDistributions Power-LawSize · 2016. 9. 6. · PoCS|@pocsvox Power-LawSize Distributions OurIntuition Definition Examples Wildvs.Mild CCDFs Zipf’slaw

PoCS|@pocsvox

Power-Law SizeDistributions

Our Intuition

Definition

Examples

Wild vs. Mild

CCDFs

Zipf’s law

Zipf ⇔ CCDF

Appendix

References

.....

.38 of 53

Complementary Cumulative Distribution Function:

CCDF: ≥() ∝ −+1 Use when tail of follows a power law. Increases exponent by one. Useful in cleaning up data.

Page 124: New PoCS|@pocsvox Power-LawSizeDistributions Power-LawSize · 2016. 9. 6. · PoCS|@pocsvox Power-LawSize Distributions OurIntuition Definition Examples Wildvs.Mild CCDFs Zipf’slaw

PoCS|@pocsvox

Power-Law SizeDistributions

Our Intuition

Definition

Examples

Wild vs. Mild

CCDFs

Zipf’s law

Zipf ⇔ CCDF

Appendix

References

.....

.38 of 53

Complementary Cumulative Distribution Function:

CCDF: $P_{\geq}(x) \propto x^{-\gamma+1}$

Use when the tail of $P$ follows a power law.
Increases exponent by one.
Useful in cleaning up data.

[Figure, two log-log panels. PDF: $\log_{10} N_q$ versus $\log_{10} q$. CCDF: $\log_{10} N_{>q}$ versus $\log_{10} q$.]
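The exponent shift can be checked numerically. The following is a minimal sketch, not taken from the slides: it draws synthetic samples from a pure power law, builds the empirical CCDF, and fits a log-log slope. The function names, the choice $\gamma = 2.5$, the sample size, and the cutoff that drops the noisiest 1% of the tail are all assumptions made for illustration.

```python
import math
import random


def sample_power_law(gamma, n, x_min=1.0, seed=0):
    """Draw n samples with density P(x) ~ x^(-gamma) for x >= x_min (gamma > 1)."""
    rng = random.Random(seed)
    # Inverse-transform sampling: P_>=(x) = (x / x_min)^(-(gamma - 1)).
    # 1 - rng.random() lies in (0, 1], so we never raise 0.0 to a negative power.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0)) for _ in range(n)]


def empirical_ccdf(samples):
    """Return (x, fraction of samples >= x) pairs, sorted by increasing x."""
    xs = sorted(samples)
    n = len(xs)
    return [(x, (n - i) / n) for i, x in enumerate(xs)]


def loglog_slope(points):
    """Least-squares slope of log10(y) against log10(x)."""
    lx = [math.log10(x) for x, _ in points]
    ly = [math.log10(y) for _, y in points]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


gamma = 2.5  # assumed exponent of the size distribution
ccdf = empirical_ccdf(sample_power_law(gamma, n=20000))
# Skip the sparsely sampled extreme tail, where the estimate is noisy.
slope = loglog_slope(ccdf[: int(0.99 * len(ccdf))])
print(f"fitted CCDF slope: {slope:.2f}  (theory: {-(gamma - 1):.2f})")
```

A plain least-squares fit is used here only to make the slope visible; it is not a recommended estimator for real data.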


Same story for a discrete variable: $P(k) \sim k^{-\gamma}$.

$$
P_{\geq}(k) = P(k' \geq k) = \sum_{k'=k}^{\infty} P(k') \propto k^{-\gamma+1}
$$

Use integrals to approximate sums.
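As a rough check of the sum-to-integral approximation, the sketch below compares a truncated tail sum of $(k')^{-\gamma}$ with the corresponding integral. The helper names, the value $\gamma = 2.5$, and the truncation point `k_max` are assumptions chosen for illustration; the ratio should approach 1 as $k$ grows.

```python
def tail_sum(k, gamma, k_max=10**6):
    """Truncated version of the sum over k' >= k of (k')^(-gamma)."""
    return sum(kp ** (-gamma) for kp in range(k, k_max + 1))


def tail_integral(k, gamma):
    """Integral of x^(-gamma) from k to infinity, valid for gamma > 1."""
    return k ** (-gamma + 1) / (gamma - 1)


gamma = 2.5  # assumed exponent
for k in (10, 100, 1000):
    s = tail_sum(k, gamma)
    i = tail_integral(k, gamma)
    print(f"k={k:5d}  sum={s:.6e}  integral={i:.6e}  ratio={s / i:.4f}")
```

Both quantities scale as $k^{-\gamma+1}$; the integral only misses a correction that shrinks relative to the leading term as $k$ increases.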


Zipfian rank-frequency plots

George Kingsley Zipf: Noted that various rank distributions have power-law tails, often with exponent −1 (word frequency, city sizes...)

Zipf’s 1949 Magnum Opus: “Human Behaviour and the Principle of Least-Effort” by G. K. Zipf (1949). [18]

We’ll study Zipf’s law in depth...


Zipf’s way:
Given a collection of entities, rank them by size, largest to smallest.
$x_r$ = the size of the $r$th ranked entity.
$r = 1$ corresponds to the largest size.
Example: $x_1$ could be the frequency of occurrence of the most common word in a text.
Zipf’s observation: $x_r \propto r^{-\alpha}$
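The ranking procedure can be sketched in a few lines. This is an illustrative example only: the function name `rank_sizes` and the toy sentence are invented here, and ties are broken alphabetically purely to make the output deterministic.

```python
from collections import Counter


def rank_sizes(tokens):
    """Return [(r, x_r, token)], with rank r = 1 for the most frequent token."""
    counts = Counter(tokens)
    # Sort by decreasing count; break ties alphabetically for a deterministic order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(r, count, token) for r, (token, count) in enumerate(ordered, start=1)]


# Toy text, invented for illustration only.
text = "the cat and the dog and the bird saw the cat"
ranked = rank_sizes(text.split())
for r, x_r, token in ranked:
    print(r, x_r, token)
```

Plotting $\log_{10} x_r$ against $\log_{10} r$ for a real corpus gives the rank-frequency plot; a text this short is far too small to show a power law.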


Size distributions:

Brown Corpus (1,015,945 words):

[Figure, two log-log panels. CCDF: $\log_{10} N_{>q}$ versus $\log_{10} q$. Zipf: $\log_{10} q_i$ versus $\log_{10}$ rank $i$.]

The, of, and, to, a, ... = ‘objects’
‘Size’ = word frequency
Beep: (Important) CCDF and Zipf plots are related...


[Figure: the Brown Corpus CCDF panel again, now beside the Zipf panel drawn with its axes swapped: $\log_{10}$ rank $i$ versus $\log_{10} q_i$.]


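Observe:
$N P_{\geq}(x)$ = the number of objects with size at least $x$, where $N$ = total number of objects.
If an object has size $x_r$, then $N P_{\geq}(x_r)$ is its rank $r$.
So

$$
x_r \propto r^{-\alpha} = \left(N P_{\geq}(x_r)\right)^{-\alpha} \propto x_r^{(-\gamma+1)(-\alpha)}
$$

since $P_{\geq}(x) \sim x^{-\gamma+1}$.
We therefore have $1 = (-\gamma + 1)(-\alpha)$ or:

$$
\alpha = \frac{1}{\gamma - 1}
$$

A rank distribution exponent of $\alpha = 1$ corresponds to a size distribution exponent $\gamma = 2$.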
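The relation between the rank exponent and the size exponent can be verified on idealized data. The sketch below is an assumption-laden illustration, not part of the slides: it constructs sizes so that the object of rank $r$ satisfies $N P_{\geq}(x_r) = r$ exactly for a pure power-law CCDF, then fits the slope of the rank-size plot. The function names and the values of $\gamma$ and $N$ are hypothetical choices.

```python
import math


def ideal_sizes(gamma, n):
    """Sizes x_r for ranks r = 1..n chosen so that n * P_>=(x_r) = r exactly,
    assuming P_>=(x) = x^(-(gamma - 1)) for x >= 1."""
    return [(r / n) ** (-1.0 / (gamma - 1.0)) for r in range(1, n + 1)]


def loglog_slope(xs, ys):
    """Least-squares slope of log10(ys) against log10(xs)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


n = 1000  # assumed number of objects
ranks = list(range(1, n + 1))
for gamma in (2.0, 2.5, 3.0):
    sizes = ideal_sizes(gamma, n)
    alpha = -loglog_slope(ranks, sizes)
    print(f"gamma={gamma}: fitted alpha={alpha:.4f}, 1/(gamma-1)={1 / (gamma - 1):.4f}")
```

Because the sizes are built directly from the inverse CCDF, the fitted slope matches $1/(\gamma - 1)$ up to floating-point error; real data would scatter around this line.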


The Don.
Extreme deviations in test cricket:

[Figure: horizontal axis from 0 to 100.]

Don Bradman’s batting average = 166% of the next best.
That’s pretty solid.
Later in the course: Understanding success: is the Mona Lisa like Don Bradman?


A good eye:


[Figure, two panels for Fall 2014, “Actual” and “Ideal”; horizontal axes from 0 to 100.]


[Figure, four panels: “Actual” and “Ideal”, each for Fall 2013 and Spring 2013; horizontal axes from 0 to 100.]


References

[1] P. Bak, K. Christensen, L. Danon, and T. Scanlon. Unified scaling law for earthquakes. Phys. Rev. Lett., 88:178501, 2002.

[2] A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286:509–511, 1999.

[3] K. Christensen, L. Danon, T. Scanlon, and P. Bak. Unified scaling law for earthquakes. Proc. Natl. Acad. Sci., 99:2509–2513, 2002.

[4] A. Clauset, C. R. Shalizi, and M. E. J. Newman. Power-law distributions in empirical data. SIAM Review, 51:661–703, 2009.


[5] D. J. de Solla Price. Networks of scientific papers. Science, 149:510–515, 1965.

[6] D. J. de Solla Price. A general theory of bibliometric and other cumulative advantage processes. J. Amer. Soc. Inform. Sci., 27:292–306, 1976.

[7] P. Grassberger. Critical behaviour of the Drossel-Schwabl forest fire model. New Journal of Physics, 4:17.1–17.15, 2002.

[8] B. Gutenberg and C. F. Richter. Earthquake magnitude, intensity, energy, and acceleration. Bull. Seism. Soc. Am., 499:105–145, 1942.


[9] J. Holtsmark. Über die verbreiterung von spektrallinien. Ann. Phys., 58:577–, 1919.

[10] R. Munroe. Thing Explainer: Complicated Stuff in Simple Words. Houghton Mifflin Harcourt, 2015.

[11] M. E. J. Newman. Power laws, Pareto distributions and Zipf’s law. Contemporary Physics, 46:323–351, 2005.

[12] M. I. Norton and D. Ariely. Building a better America—One wealth quintile at a time. Perspectives on Psychological Science, 6:9–12, 2011.


[13] L. F. Richardson. Variation of the frequency of fatal quarrels with magnitude. J. Amer. Stat. Assoc., 43:523–546, 1949.

[14] H. A. Simon. On a class of skew distribution functions. Biometrika, 42:425–440, 1955.

[15] N. N. Taleb. The Black Swan. Random House, New York, 2007.

[16] G. U. Yule. A mathematical theory of evolution, based on the conclusions of Dr J. C. Willis, F.R.S. Phil. Trans. B, 213:21–87, 1925.


[17] Y.-X. Zhu, J. Huang, Z.-K. Zhang, Q.-M. Zhang, T. Zhou, and Y.-Y. Ahn. Geography and similarity of regional cuisines in China. PLoS ONE, 8:e79161, 2013.

[18] G. K. Zipf. Human Behaviour and the Principle of Least-Effort. Addison-Wesley, Cambridge, MA, 1949.