Serendipity and strategy in rapid innovation

T. M. A. Fink∗†, M. Reeves‡, R. Palma‡ and R. S. Farr†

†London Institute for Mathematical Sciences, Mayfair, London W1K 2XF, UK
∗Centre National de la Recherche Scientifique, Paris, France
‡BCG Henderson Institute, The Boston Consulting Group, New York, USA

Innovation is to organizations what evolution is to organisms: it is how organisations adapt to changes in the environment and improve [1]. Governments, institutions and firms that innovate are more likely to prosper and stand the test of time; those that fail to do so fall behind their competitors and succumb to market and environmental change [2, 3]. Yet despite steady advances in our understanding of evolution, what drives innovation remains elusive [1, 4]. On the one hand, organizations invest heavily in systematic strategies to drive innovation [5–8]. On the other, historical analysis and individual experience suggest that serendipity plays a significant role in the discovery process [9–11]. To unify these two perspectives, we analyzed the mathematics of innovation as a search process for viable designs across a universe of building blocks. We then tested our insights using historical data from language, gastronomy and technology. By measuring the number of makeable designs as we acquire more components, we observed that the relative usefulness of different components is not fixed: the usefulness curves of different components cross each other over time. When these crossovers are unanticipated, they appear to be the result of serendipity. But when we can predict crossovers ahead of time, they offer an opportunity to strategically increase the growth of our product space. Thus we find that the serendipitous and strategic visions of innovation can be viewed as different manifestations of the same thing: the changing importance of component building blocks over time.

Lego game. Let’s illustrate the idea using Lego bricks. Think back to your childhood days. You’re in a room with two friends, Bob and Alice, playing with a big box of Lego bricks (say, a fire station set). All three of you have the same goal: to build as many new toys as possible. As you continue to play, each of you searches through the box and chooses those bricks that you believe will help you reach this goal. Let’s now suppose each player approaches this differently. Your approach is to follow your gut, arbitrarily selecting bricks that look intriguing. Alice uses what we call a short-sighted strategy, carefully picking Lego men and their firefighting hats to immediately make simple toys. Meanwhile, Bob chooses pieces such as axles, wheels, and small base plates that he noticed are common in more complex models, even though he is not able to use them straightaway to produce new toys. We call this a far-sighted strategy.

[Figure 1. Panels: Language (acquired letters vs. makeable words; components R, F, X), Gastronomy (acquired ingredients vs. makeable recipes; components cayenne, cocoa, lime), Technology (acquired development tools vs. makeable software; components Rails, jQuery UI, Sauce Labs). Usefulness axes are logarithmic; each sector also has a rank panel (1st to 3rd) for the same three components.]

FIG. 1: Products, components and usefulness. (Top) We studied products and components from three sectors. In language, the products are 79,258 English words and the components are the 26 letters. In gastronomy, the products are 56,498 recipes from the databases allrecipes.com, epicurious.com, and menupan.com [12] and the components are 381 ingredients. In technology, the products are 1158 software products catalogued by stackshare.io and the components are 993 development tools used to make them. (Bottom) The usefulness of a component is the number of products we can make that contain it. We find that the relative usefulness of a component depends on how many other components have already been acquired. For each sector, we show the usefulness of three typical components: averaged at each stage over all possible choices of the other acquired components and, for gastronomy, for a particular random order of component acquisition (points).

Who wins. At the end of the day, who will have innovated the most? That is, who will have built the most new toys? We find that, in the beginning, Alice will lead the way, surging ahead with her impatient strategy. But as the game progresses, fate will appear to shift. Bob’s early moves will begin to look serendipitous when he is able to assemble a complex fire truck from his choice of initially useless axles and wheels. It will seem that he was lucky, but we will soon see that he effectively created his own serendipity. What about you? Picking components on a hunch, you will have built the fewest toys. Your friends had an information-enabled strategy, while you relied on chance.

Spectrum of strategies. What can we learn from this? If innovation is a search process, then your component choices today matter greatly in terms of the options they will open up to you tomorrow. Do you pick components that quickly form simple products and give you a return now, or do you choose those components that give you a higher future option value? By understanding innovation as a search for designs across a universe of components, we made a surprising discovery. Information about the unfolding process of innovation can be used to form an advantageous innovation strategy. But there is no one superior strategy. As we shall see, the optimal strategy depends on time (how far along the innovation process we have advanced) and on the sector (some sectors contain more opportunities for strategic advantage than others).

Components and products. Just as the Lego toys are made up of distinct kinds of bricks, we take products to be made up of distinct components. A component can be an object, like a touch screen, but it can also be a skill, like using Python, or a routine, like customer registration. Only certain combinations of components form products, according to some predetermined universal recipe book of products. Examples of products and the components used to make them are shown in Fig. 1. Now suppose that we possess a basket of distinct components, which we can combine in different ways to make products. We have more than enough copies of each component for our needs, so we do not have to worry about running out. There are $N$ possible component types in total, but at any given stage $n$ we only have $n$ of these $N$ possible building blocks. At every stage, we pick a new type of component to add to our basket.

arXiv:1608.01900v4 [physics.soc-ph] 17 Mar 2017

Usefulness. The usefulness of a component is the number of products we can make that contain it [13]. In other words, the usefulness $u_\alpha$ of some component $\alpha$ is how many more products we can make with $\alpha$ in our basket than without $\alpha$ in our basket. As we gather more components, $u_\alpha$ increases or stays the same; it cannot decrease. We write $u_\alpha(n)$ to indicate this dependence on $n$: $u_\alpha(n)$ is the usefulness of $\alpha$ given possession of $\alpha$ and $n-1$ other components, the combined set of components having size $n$. Averaging over all choices of the $n-1$ other components from the $N-1$ that are possible gives the mean usefulness, $\bar{u}_\alpha(n)$.

[Figure 2. Bumps charts for three sectors. Language: 26 letters, labelled E A I R N T O S L C U D M P H G Y B F V K W Z X J Q, axis ticks at 6, 13, 20, 26. Gastronomy: 40 ingredients labelled from egg, wheat, butter, onion to soy_sauce, cumin, oregano, axis ticks at 95, 190, 286, 381. Technology: 40 development tools labelled from Google Analytics, GitHub, jQuery, nginx to Jenkins, Bower, Grunt, axis ticks at 248, 496, 745, 993.]

FIG. 2: Crossovers. The relative usefulness of different components changes as the number of components we possess increases. For example, if you are only allowed six letters, the ones that show up in the most words are a, e, i, o, s, r. For gastronomy and technology, for clarity we only show the 40 components most useful when we have all $N$ components. A pure short-sighted strategy acquires components in the order that they intersect the diagonal, whereas a pure far-sighted strategy acquires them in the order that they intersect a vertical. If there are no crossovers, the strategies are the same.
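As a minimal sketch of this definition (not the authors' code), usefulness can be counted directly from a list of products. The function name `usefulness` and the toy word list are assumptions made for illustration; each product is treated as a set of distinct components.

```python
def usefulness(alpha, products, basket):
    """Count the makeable products that contain `alpha`.

    `products` is a list of frozensets of components; `basket` holds the
    other components already possessed (alpha is added to it here).
    """
    full = set(basket) | {alpha}
    return sum(1 for p in products if alpha in p and p <= full)


# Toy data (hypothetical): words are products, letters are components.
words = ["a", "at", "tea", "eat", "ate", "tar", "rat", "art", "tear"]
products = [frozenset(w) for w in words]

u_small = usefulness("e", products, {"a", "t"})       # tea, eat, ate
u_large = usefulness("e", products, {"a", "t", "r"})  # the above plus tear
print(u_small, u_large)  # 3 4
```

Adding a component to the basket can only add makeable products, which is why the count never decreases.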

Usefulness experiment. To measure the usefulness of different components as the innovation process unfolds and we acquire more components, we did the following experiments. Using data from each of our three sectors, we put a given component $\alpha$ into an empty basket, and then added, one component at a time, the remaining $N-1$ other components, measuring the usefulness of $\alpha$ at every step. We averaged $u_\alpha(n)$ over all possible orders in which to add the $N-1$ components to obtain $\bar{u}_\alpha(n)$. (We explain how in SI B.) We repeated this process for all of the components $\alpha$. Typical results from these experiments are shown in Fig. 1. We find that the mean usefulnesses of different components cross each other as the number of components in our basket increases. As Fig. 1 shows for gastronomy, this is true both for the average over all possible orderings of components (lines) and for a specific random ordering (points).

[Figure 3. Histograms of recipe counts by recipe complexity (1 to 21). Panels A, C, E: small kitchen, 127 ingredients (almond to fenugreek), with 597 recipes in total, 89 containing cocoa and 43 containing cayenne; cocoa is more useful than cayenne. Panels B, D, F: big kitchen, 381 ingredients (almond to zucchini), with 56,498 recipes in total, 4801 containing cocoa and 7950 containing cayenne; cayenne is more useful than cocoa.]

FIG. 3: Why crossovers happen. On the right is a big kitchen with 381 ingredients. On the left is a small kitchen with one-third as many ingredients. In the big kitchen (B), we can make a total of 56,498 recipes. Each bar counts recipes with the same number of ingredients (complexity). When we move to the smaller kitchen (A), the number of makeable recipes shrinks dramatically to 597, or about 1%. But this reduction is far from uniform across different bars. Higher bars shrink more, on average by an extra factor of 3 with each bar. Thus the number of recipes of complexity one (first bar) shrinks about 3-fold; the number of complexity two (second bar) 9-fold, and so on. Of all the recipes in the big kitchen, 4801 contain cocoa (D) and 7950 contain cayenne (F). The cayenne recipes tend to be more complex, containing on average 10.6 ingredients, whereas the cocoa recipes are simpler, averaging 7.2 ingredients. Because higher bars suffer stronger reduction, overall fewer cayenne recipes (0.5%) survive in the smaller kitchen (E) than cocoa recipes (1.8%) (C). Thus cayenne is more useful in the big kitchen, but cocoa is more useful in the small kitchen.
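The experiment above can be approximated in a few lines. Note the substitution: the paper averages over all possible orders (SI B gives the exact expression), whereas this sketch samples random orders. The function name, the trial count and the toy data are assumptions for illustration only.

```python
import random


def mean_usefulness_curve(alpha, products, components, trials=500, seed=0):
    """Estimate the mean usefulness of `alpha` at every stage n = 1..N
    by averaging over randomly sampled acquisition orders."""
    rng = random.Random(seed)
    others = [c for c in components if c != alpha]
    relevant = [p for p in products if alpha in p]
    totals = [0] * len(components)
    for _ in range(trials):
        rng.shuffle(others)
        basket = {alpha}                      # stage n = 1: only alpha
        for stage in range(len(components)):
            if stage > 0:
                basket.add(others[stage - 1])  # acquire one more component
            totals[stage] += sum(1 for p in relevant if p <= basket)
    return [t / trials for t in totals]


# Toy data (hypothetical): words are products, letters are components.
words = ["a", "at", "tea", "eat", "ate", "tar", "rat", "art", "tear"]
products = [frozenset(w) for w in words]
components = sorted(set("".join(words)))      # ['a', 'e', 'r', 't']
curve = mean_usefulness_curve("e", products, components)
print(curve)
```

The first and last entries do not depend on the sampled orders: with only `e` in the basket nothing is makeable, and with all four letters every word containing `e` is.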

Bumps charts. To visualise the relative usefulness of components over time, for each sector we created its “bumps chart” (Fig. 2). These show the rank order of mean usefulness at every stage of the innovation process. We see that the crossovers in Fig. 1 are commonplace, but that some sectors contain more crossovers than others. There are few crossings in language, some in gastronomy and many in technology. This means, for example, that the most useful letters for making words in Scrabble (a basket of seven letters) are nearly the same as the most useful letters for making words with a full basket (26 letters); the key ingredients in a small kitchen (20 ingredients) are moderately different from those in a big one (80 ingredients); the most-used development skills for a young software firm (experience with 40 tools) are significantly different from those for an advanced one (160 tools). We call components that do not cross in time isochronic, like the letters, and those that do anisochronic, like the tools.
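A bumps chart is just a table of ranks per stage. The sketch below builds such a table from mean-usefulness curves and counts how often two components swap order. The curve values are invented for illustration and are not taken from the paper's data; the helper names are likewise assumptions.

```python
def rank_table(curves):
    """Map each component to its rank (1 = most useful) at every stage."""
    names = sorted(curves)
    stages = len(next(iter(curves.values())))
    ranks = {name: [] for name in names}
    for i in range(stages):
        # Ties are broken alphabetically so the result is deterministic.
        ordered = sorted(names, key=lambda c: (-curves[c][i], c))
        for position, name in enumerate(ordered, start=1):
            ranks[name].append(position)
    return ranks


def count_crossovers(rank_a, rank_b):
    """Number of stages at which the relative order of two components flips."""
    signs = [a < b for a, b in zip(rank_a, rank_b)]
    return sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)


# Hypothetical mean-usefulness curves, chosen only to show one crossing.
curves = {
    "cocoa":   [1, 6, 20, 60],
    "cayenne": [0, 2, 25, 90],
    "lime":    [0, 1, 3, 10],
}
ranks = rank_table(curves)
print(ranks)
print(count_crossovers(ranks["cocoa"], ranks["cayenne"]))  # 1
```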

Why crossovers happen. To understand why crossovers happen, let’s have a closer look at how the mean usefulness increases for a single component (Fig. 3). To make a product of complexity $s$, we must possess all $s$ of its distinct components. So making a complex product is harder than making a simple one, because there are more ways that we might be missing a necessary component. We therefore group together the products we can make containing $\alpha$ according to their complexity. That is, the usefulness $u_\alpha(n, s)$ of component $\alpha$ is how many more products of complexity $s$ we can make with $\alpha$ in our basket than without $\alpha$ in our basket. Summing $u_\alpha(n, s)$ over $s$ gives $u_\alpha(n)$. The advantage of this refined grouping is that, by understanding the behaviour of $u_\alpha(n, s)$, we can understand the more difficult $u_\alpha(n)$. Our key result, which we prove in SI B, is that the mean value $\bar{u}_\alpha(n, s)/n^{s-1}$ is approximately constant over all stages of the innovation process. In other words, for two stages $n$ and $n'$,

$$\bar{u}_\alpha(n', s) \simeq \bar{u}_\alpha(n, s)\,(n'/n)^{s-1}. \qquad (1)$$

[Figure 4. Panels A, B, C: usefulness (number of products a component is in, logarithmic axis) versus valence (average complexity of the products a component is in) for letters, ingredients and development tools. Panels D, E, F: total makeable words, recipes and software versus acquired letters, ingredients and development tools; legend entries: far-sighted strategy, impatient strategy, pseudo-random (alphabetical).]

FIG. 4: (ABC) Scatter plots of component usefulness versus component valence for our three sectors. For gastronomy and technology, we only show the top 40 components; the complete set is in SI Fig. 5. (DEF) Both the short-sighted and far-sighted strategies beat a typical random component ordering (here alphabetical), but they diverge from each other only insofar as there are crossings in the bumps charts.

This tells us that the number of products containing $\alpha$ of complexity $s$ grows much faster for higher complexities than for lower complexities. Early on, $\bar{u}_\alpha(n, s)$ will tend to be small for higher complexities, but depending on how far ahead we look, the bigger growth rate can more than compensate for this, as we see in Fig. 3. Summing eq. (1) over size $s$, we find

$$\bar{u}_\alpha(n') \simeq \bar{u}_\alpha(n, 1) + \bar{u}_\alpha(n, 2)\,x + \bar{u}_\alpha(n, 3)\,x^2 + \ldots, \qquad (2)$$

where $x = n'/n$. The growth of the mean usefulness of $\alpha$ strongly depends on the complexity of products containing $\alpha$.
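The step from eq. (1) to eq. (2) can be spelled out as follows. The only ingredient beyond eq. (1) is the statement in the text that summing the complexity-resolved usefulness over $s$ gives the total usefulness, applied here to the mean.

```latex
\begin{align*}
\bar{u}_\alpha(n')
  &= \sum_{s \ge 1} \bar{u}_\alpha(n', s) \\
  &\simeq \sum_{s \ge 1} \bar{u}_\alpha(n, s) \left(\frac{n'}{n}\right)^{s-1} \\
  &= \bar{u}_\alpha(n, 1) + \bar{u}_\alpha(n, 2)\,x + \bar{u}_\alpha(n, 3)\,x^{2} + \ldots,
  \qquad x = \frac{n'}{n}.
\end{align*}
```

Setting $x = 1$ recovers $\bar{u}_\alpha(n)$, and for $x > 1$ the terms of higher complexity are amplified by higher powers of $x$.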

Valence. So far we have only characterised a component by its usefulness: the number of products we can make that contain it. Now we introduce another way of describing a component: the average complexity of the products it appears in. We call this the valence. The valence $v_\alpha$ of component $\alpha$ is the average complexity of the products it appears in at stage $N$, when we have all $N$ components. Think of the valence as the typical number of co-stars a component performs with, plus one. We show the usefulness and valence for each of the components in our three sectors in Fig. 4ABC. More valent components are unlikely to be useful until we possess a lot of other components, so that we have a good chance of hitting upon the ones they need. These are the wheels and axles in our Lego set. On the other hand, less valent components are likely to boost our product space early on, when we have acquired fewer components. These are the Lego men and their firefighting hats. This insight suggests that more valent components will tend to rise in relative usefulness, and less valent components fall. This is verified in our experiments: components on the right of the plots in Fig. 4ABC tend to rise in the bumps charts in Fig. 2, such as onion, tomato, JavaScript and Git; whereas components on the left tend to fall, like cocoa, vanilla, Google Apps and SendGrid.
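Both descriptors can be read off a product list in one pass. The following is a sketch under the assumption that each product is a set of distinct components; the Lego-style products and the function name are hypothetical.

```python
def usefulness_and_valence(alpha, products):
    """Return (usefulness, valence) of `alpha` with the full component set.

    Usefulness is the number of products containing alpha; valence is the
    average complexity (number of distinct components) of those products.
    """
    sizes = [len(p) for p in products if alpha in p]
    if not sizes:
        return 0, None
    return len(sizes), sum(sizes) / len(sizes)


# Hypothetical Lego-style products, each a set of distinct components.
products = [
    frozenset({"man"}),
    frozenset({"man", "hat"}),
    frozenset({"axle", "wheel"}),
    frozenset({"axle", "wheel", "plate"}),
    frozenset({"axle", "wheel", "plate", "man"}),
]
for part in ["man", "hat", "wheel"]:
    print(part, usefulness_and_valence(part, products))
```

In this toy set the man and the wheel are equally useful when everything is available, but the wheel has the higher valence, so it is the one that depends on other parts being present.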

Interpreting crossovers. A crossover in the usefulness of components means that the things that matter most today are not the same as the things that will matter most tomorrow. How we interpret crossovers in practice depends on whether they are unanticipated, and take us by surprise, or anticipated, and can be planned for and exploited. When they are unanticipated, beneficial crossovers can seem to be serendipitous. But when they can be anticipated, crossovers provide an opportunity to strategically increase the growth of our product space. To harness this opportunity, we turn to forecasting component crossovers using the complexity of products containing them.

Short-sighted strategy. To maximise the size of our product space when crossovers are unanticipated, the optimal approach is to acquire, at each stage, the component that is most useful among those remaining. Think of this as a “greedy” approach. It has a geometric interpretation: it is equivalent to acquiring the components in the order that they intersect the diagonal in Fig. 2. At every stage we lock in to a specific component, unaware of the future implications of the choices we make. A component poorly picked is an opportunity lost.
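The greedy rule can be sketched as follows. This is an illustrative reading of the short-sighted strategy applied to a single basket, not the authors' implementation; the products are hypothetical and ties are broken alphabetically by assumption.

```python
def greedy_order(products, components):
    """Acquire, at each stage, the remaining component that is most useful
    right now. Returns the acquisition order and the number of makeable
    products after each acquisition."""
    basket, order, totals = set(), [], []
    remaining = sorted(components)

    def gain(c):
        # Usefulness of c if it were added to the current basket.
        full = basket | {c}
        return sum(1 for p in products if c in p and p <= full)

    while remaining:
        best = max(remaining, key=gain)   # first maximum wins on ties
        remaining.remove(best)
        basket.add(best)
        order.append(best)
        totals.append(sum(1 for p in products if p <= basket))
    return order, totals


# Hypothetical Lego-style products, each a set of distinct components.
products = [
    frozenset({"man"}),
    frozenset({"man", "hat"}),
    frozenset({"axle", "wheel"}),
    frozenset({"axle", "wheel", "plate"}),
    frozenset({"axle", "wheel", "plate", "man"}),
]
order, totals = greedy_order(products, {"man", "hat", "axle", "wheel", "plate"})
print(order)   # ['man', 'hat', 'axle', 'wheel', 'plate']
print(totals)  # [1, 2, 2, 3, 5]
```

The men and hats are picked first because they pay off immediately; the axle is only picked once nothing else offers an immediate gain.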

Far-sighted strategy. Using only information about the products we can already make with our existing components, however, we can forecast the usefulness of our components into the future. Eq. (2) shows us how, and we give an example in SI C. Here the optimal approach is to acquire the component that will be most useful at some later stage $n'$. This also has a geometric interpretation: it is equivalent to acquiring the components in the order that they intersect a vertical at $n'$ in Fig. 2, and thus depends on how far into the future we forecast.
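One way to turn eq. (2) into a selection rule is sketched below. It assumes, as in the SI C example, that the counts observed in the current basket stand in for the mean usefulness by complexity; the function names and the toy products are hypothetical.

```python
from collections import Counter


def forecast(candidate, products, basket, n_prime):
    """Forecast the usefulness of `candidate` at stage n_prime via eq. (2):
    a product of complexity s is weighted by (n'/n)**(s - 1), where n is
    the basket size once the candidate has been added."""
    full = set(basket) | {candidate}
    x = n_prime / len(full)
    by_complexity = Counter(
        len(p) for p in products if candidate in p and p <= full
    )
    return sum(count * x ** (s - 1) for s, count in by_complexity.items())


def far_sighted_pick(products, basket, remaining, n_prime):
    """Pick the remaining component with the largest forecast usefulness."""
    return max(sorted(remaining),
               key=lambda c: forecast(c, products, basket, n_prime))


# Hypothetical products: "a" appears in simple ones, "b" in complex ones.
products = [
    frozenset({"a"}),
    frozenset({"a", "x"}),
    frozenset({"a", "y"}),
    frozenset({"b", "x", "y"}),
    frozenset({"b", "x", "z"}),
]
basket = {"x", "y", "z"}
print(far_sighted_pick(products, basket, {"a", "b"}, n_prime=4))   # a
print(far_sighted_pick(products, basket, {"a", "b"}, n_prime=12))  # b
```

With $n' = n$ the rule reduces to the short-sighted choice; looking further ahead favours the component whose products are more complex.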

Strategy comparison. A short-sighted strategy considers only the usefulness $u_\alpha$, whereas a far-sighted strategy considers both the usefulness $u_\alpha$ and the valence $v_\alpha$. The short-sighted strategy maximises what a potential new component can do for us now, whereas the far-sighted strategy maximises what it could do for us later. Depending on our desire for short-term gain versus long-term growth, we have a spectrum of strategies dependent on $n'$. A pure short-sighted strategy ($n' = n$) and a pure far-sighted strategy ($n' = N$) are compared in Fig. 4DEF. Like the Lego approaches of Bob and Alice, both strategies beat acquiring components in a random order. As our theory predicts, the extent to which the two strategies differ from each other increases with the number of crossovers. For language, they are nearly identical, because there are hardly any crossovers. For gastronomy, short-sighted has a two-fold advantage at first, but later far-sighted wins by a factor of two. For technology, short-sighted surges ahead by an order of magnitude, but later far-sighted is dominant.

Serendipity and strategy. Our research helps resolve the tension between a strategic approach to innovation, which views innovation as a rational process which can be measured and prescribed [3, 4, 7, 8]; and a belief in serendipity and the intuition of extraordinary individuals [9–11]. A strategic approach is seen in firms like P&G and Unilever, which use process manuals and consumer research to maintain a reliable innovation factory [14], and Zara, which systematically scales new products up and down based on real-time sales data. In scientific discovery, “traditional scientific training and thinking favor logic and predictability over chance” [9]. If discoveries are actually made in the way that scientific publications suggest, the path to invention is a step-by-step, rational process. On the other hand, a serendipitous approach is seen in firms like Apple, which is notoriously opposed to making innovation choices based on incremental consumer demands, and Tesla, which has invested for years in its vision of long-distance electric cars [15]. In science, many of the most important discoveries, such as penicillin, heparin, X-rays and nitrous oxide, have serendipitous origins, in contrast to their published step-by-step write-ups [9]. The role of vision and intuition tends to be under-reported: a study of 33 major discoveries in biochemistry “in which serendipity played a crucial role” concluded that “when it comes to ‘chance’ factors, few scientists ‘tell it like it was’” [16, 17].

Serendipity. Writing about The Three Princes of Serendip, Horace Walpole records that the princes “were always making discoveries, by accidents and sagacity, of things they were not in quest of”. Serendipity is the fortunate development of events, and many organizations and researchers stress its importance [9, 10]. Crossovers in component usefulness help us see why. Components which depend on the presence of many others can be of little benefit early on. But as the innovation process unfolds and the acquired components pay off, the results will seem serendipitous, because a number of previously low-value components become invaluable. Thus, what appears as serendipity is not happenstance but the delayed fruition of components reliant on the presence of others. After the acquisition of enough other components, these components flourish. For example, the initially useless axles and wheels were later found to be invaluable to building many new toys. In a similar way, the low value attributed to Fleming’s initial identification of lysozyme was later revised to high value in the years leading to the discovery of penicillin, when other needed components emerged, such as sulfa drugs which showed that safe antiseptics are possible [9]. Interestingly, the word “serendipity” does not have an antonym. But as our bumps charts show, for every beneficial shift in a crossover, there is a detrimental one. Each opportunity for serendipity goes hand-in-hand with a chance for anti-serendipity: the acquisition of components useful now but less useful later. Avoiding these over-valued components is as important as acquiring under-valued ones to securing a large future product space.

Strategy. Our research shows that the most important components (materials, skills and routines) when an organization is less developed tend to be different from when it is more developed. The relative usefulness of components changes over time, in a statistically repeatable way. Recognising how an organization’s priorities depend on its maturity enables it to balance short-term gain with long-term growth. For example, our insights provide a framework for understanding the poverty trap. When a less-developed country imitates a more-developed country by acquiring similar production capabilities [6], it is unable to quickly reap the rewards of its investment, because it does not have in place enough other needed capabilities. This in turn prevents it from further investment in those needed components. Our analysis gives quantitative backing to the “lean start-up” approach to building companies and launching products [18]. Start-ups are wise to employ a short-sighted strategy and release a minimum viable product. Without the resources to sustain a far-sighted approach, they need to quickly bring a simple product to market. On the other hand, firms that can weather an initial drought will see their sacrifice more than paid off when their far-sighted approach kicks in. By tracking how potential new components combine with existing ones, organisations can develop an information-advantaged strategy to adopt the right components at the right time. In this way they can create their own serendipity, rather than relying on intuition and chance.

[1] D. Erwin, D. Krakauer, ‘Insights into innovation’, Science 304, 1117 (2004).

[2] M. Reeves, K. Haanaes, J. Sinha, Your Strategy Needs a Strategy (Harvard Business Review Press, 2015).

[3] C. Weiss et al., ‘Adoption of a high-impact innovation in a homogeneous population’, Phys Rev X, 4, 041008 (2014).

[4] J. McNerney et al., ‘Role of design complexity in technology improvement’, Proc Natl Acad Sci, 108, 9008 (2011).

[5] R. Van Noorden, ‘Physicists make ‘weather forecasts’ for economies’, Nature, 1038, 16963 (2015).

[6] A. Tacchella et al., ‘A new metric for countries’ fitness and products’ complexity’, Sci Rep, 2, 723 (2012).

[7] P. Drucker, ‘The discipline of innovation’, Harvard Bus Rev 8, 1 (2002).

[8] V. Sood et al., ‘Interacting branching process as a simple model of innovation’, Phys Rev Lett, 105, 178701 (2010).

[9] M. Rosenman, ‘Serendipity and scientific discovery’, Res Urban Economics, 13, 187 (2001).

[10] F. Johansson, ‘When success is born out of serendipity’, Harvard Bus Rev 18, 22 (2012).

[11] W. Isaacson, The Innovators: How a Group of Hackers, Geniuses, and Geeks Created the Digital Revolution, (2014).

[12] Y.-Y. Ahn, S. E. Ahnert, J. P. Bagrow, A.-L. Barabási, ‘Flavor network and the principles of food pairing’, Sci Rep 1, 196 (2011).

[13] We make no assumptions about the values of different products, which will depend on the market environment and may change with time. But we can be sure that maximising the number of products is a proxy for maximising any reasonable property of them. A similar proxy is used in evolutionary models, where evolvability is defined as the number of new phenotypes in the adjacent possible (1-mutation boundary) of a given phenotype; see A. Wagner, ‘Robustness and evolvability: a paradox resolved’, Proc Roy Soc B 91, 275 (2008).

[14] B. Brown, S. Anthony, ‘How P&G tripled its innovation success rate’, Harvard Bus Rev 6 (2011).

[15] K. Bullis, ‘How Tesla is driving electric car innovation’, MIT Tech Rev, 8 (2013).

[16] J. Comroe, ‘Roast pig and scientific discovery: Part II’, Am Rev Respir Dis, 115, 853 (1977).

[17] F. Tria et al., ‘The dynamics of correlated novelties’, Sci Rep 4, 5890 (2014).

[18] E. Ries, The Lean Startup, (Portfolio Penguin, 2011).

Online supplementary information (SI)

A. Data

Our three data sets, described in Fig. 1, were obtained as follows. In language, our list of 79,258 common English words is from the built-in WordList library in Mathematica 10. Of the 84,923 KnownWords, we only considered those made from the 26 letters a–z, ignoring case: we excluded words containing a hyphen, space, etc. In gastronomy, the 56,498 recipes can be found in the supplementary material in [12]. In technology, the 1158 software products and the development tools used to make them can be found at the site stackshare.io.

B. Proof of components invariant

Let $\alpha$ be some component. Let $N_1$ be the set of the $N-1$ other possible components not including $\alpha$, $n_1$ be a subset of $n-1$ components chosen from $N_1$, and $s_1$ be a subset of $s-1$ components chosen from $n_1$. The usefulness $u_\alpha(n, s)$ is how many more products of complexity $s$ we can make from the components $n_1$ together with $\alpha$ than from the components $n_1$ alone:

$$u_\alpha(n, s) = \sum_{s_1 \subseteq n_1} \left[ \mathrm{prod}(\alpha \cup s_1) - \mathrm{prod}(s_1) \right],$$

where $\mathrm{prod}(\alpha \cup s_1)$ takes the value 0 if the combination of components $\alpha \cup s_1$ forms no products of complexity $s$ and 1 if $\alpha \cup s_1$ forms one product of complexity $s$. (Occasionally, the same combination of components $\alpha \cup s_1$ forms multiple products: for example, beef, butter and onion together form two distinct recipes of length three. In such cases, $\mathrm{prod}(\alpha \cup s_1)$ takes the value 2 if $\alpha \cup s_1$ forms two products, and so on.) The expected usefulness of component $\alpha$, $\bar{u}_\alpha(n, s)$, is the average of $u_\alpha(n, s)$ over all subsets $n_1 \subseteq N_1$; there are $\binom{N-1}{n-1}$ such subsets. Therefore

$$\bar{u}_\alpha(n, s) = \frac{1}{\binom{N-1}{n-1}} \sum_{n_1 \subseteq N_1} u_\alpha(n, s) = \frac{1}{\binom{N-1}{n-1}} \sum_{n_1 \subseteq N_1} \sum_{s_1 \subseteq n_1} \left[ \mathrm{prod}(\alpha \cup s_1) - \mathrm{prod}(s_1) \right].$$

Consider some particular combination of components $s_1'$. The double sum above will count $s_1'$ once if $s = n$, but multiple times if $s < n$, because $s_1'$ will belong to multiple sets $n_1$. How many? In any set $n_1$ that contains $s_1'$, there are $n-s$ free elements to choose, from $N-s$ other components. Therefore the double sum will count every combination $s_1$ a total of $\binom{N-s}{n-s}$ times, and

$$\bar{u}_\alpha(n, s) = \frac{\binom{N-s}{n-s}}{\binom{N-1}{n-1}} \sum_{s_1 \subseteq N_1} \left[ \mathrm{prod}(\alpha \cup s_1) - \mathrm{prod}(s_1) \right] = \frac{N}{n}\, \frac{\binom{n}{s}}{\binom{N}{s}}\, u_\alpha(N, s).$$

The same must be true when we replace $n$ by $n'$, and therefore

$$\bar{u}_\alpha(n, s)\, \frac{n}{\binom{n}{s}} = \bar{u}_\alpha(n', s)\, \frac{n'}{\binom{n'}{s}}. \qquad (3)$$

When the number of components is big compared to the product size ($n, n' \gg s$), we can approximate $\binom{n}{s}$ and $\binom{n'}{s}$ by $n^s/s!$ and ${n'}^s/s!$; the common factor $1/s!$ cancels, and thus

$$\bar{u}_\alpha(n, s)/n^{s-1} \simeq \bar{u}_\alpha(n', s)/{n'}^{s-1}.$$

For simplicity, we use this approximation in the main manuscript, but we could just as well have used the exact expression in eq. (3).
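The exact relation between the mean usefulness at stage $n$ and the usefulness at stage $N$ can be checked by brute force on a small example. The sketch below enumerates every subset of other components and compares the average with the closed form $(N/n)\binom{n}{s}/\binom{N}{s}\,u_\alpha(N, s)$. The products and function names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def mean_usefulness(alpha, products, components, n, s):
    """Average, over all baskets of n components containing alpha, of the
    number of makeable products of complexity s that contain alpha."""
    others = [c for c in components if c != alpha]
    relevant = [p for p in products if alpha in p and len(p) == s]
    total, count = 0, 0
    for n1 in combinations(others, n - 1):
        basket = set(n1) | {alpha}
        total += sum(1 for p in relevant if p <= basket)
        count += 1
    return Fraction(total, count)


def closed_form(alpha, products, components, n, s):
    """(N/n) * C(n, s) / C(N, s) * u_alpha(N, s)."""
    N = len(components)
    u_full = sum(1 for p in products if alpha in p and len(p) == s)
    return Fraction(N, n) * Fraction(comb(n, s), comb(N, s)) * u_full


# Hypothetical Lego-style products, each a set of distinct components.
products = [
    frozenset({"man"}),
    frozenset({"man", "hat"}),
    frozenset({"axle", "wheel"}),
    frozenset({"axle", "wheel", "plate"}),
    frozenset({"axle", "wheel", "plate", "man"}),
]
components = ["axle", "hat", "man", "plate", "wheel"]
agree = all(
    mean_usefulness(a, products, components, n, s)
    == closed_form(a, products, components, n, s)
    for a in components
    for n in range(1, 6)
    for s in range(1, 6)
)
print(agree)  # True
```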

C. Forecasting crossovers in usefulness

Here we show how we can forecast the usefulness of components at stage $n'$ from information we have at some earlier stage $n$, where $n$ is the number of components we have acquired. As in Fig. 3, we have a set $k$ of 127 ingredients in a small kitchen (almond to fenugreek) and a set $K$ of 381 ingredients in a big kitchen (almond to zucchini).

In the small kitchen, we can make a total of 597 recipes. Of these 597 recipes, 43 contain cayenne, but they are not all equally complex. Two of the 43 recipes contain one ingredient (namely, cayenne itself) and have complexity one; one recipe contains two ingredients and has complexity two; 18 contain three ingredients and have complexity three; and so on. Similarly, 89 of the 597 recipes contain cocoa: six have complexity one; 22 have complexity two; and so on. Using eq. (2), we can write the mean usefulness of these two components as

$$\bar{u}_{\mathrm{ca}}(n' \mid k) \simeq 2 + x + 18x^2 + 12x^3 + 8x^4 + x^5 + x^7 \quad \text{and}$$

$$\bar{u}_{\mathrm{co}}(n' \mid k) \simeq 6 + 22x + 37x^2 + 16x^3 + 8x^4,$$

where $x = n'/127$. As expected, $\bar{u}_{\mathrm{ca}}(n' \mid k)\big|_{x=1} = 43$ and $\bar{u}_{\mathrm{co}}(n' \mid k)\big|_{x=1} = 89$.

In the big kitchen, we can make a total of 56,498 recipes. Of these, 7950 contain cayenne and 4801 contain cocoa. Again using eq. (2),

$$\bar{u}_{\mathrm{ca}}(n' \mid K) \simeq 2 + 19x + 64x^2 + \ldots + 2x^{28} + 2x^{30} \quad \text{and}$$

$$\bar{u}_{\mathrm{co}}(n' \mid K) \simeq 6 + 54x + 195x^2 + \ldots + 2x^{20} + 3x^{21},$$

where $x = n'/381$. As expected, $\bar{u}_{\mathrm{ca}}(n' \mid K)\big|_{x=1} = 7950$ and $\bar{u}_{\mathrm{co}}(n' \mid K)\big|_{x=1} = 4801$.

So far, none of this is surprising. The punchline is that we can estimate the usefulness of components in the big kitchen from what we know about our small kitchen. To do so, we simply evaluate the small-kitchen polynomials at the big-kitchen stage:

$$\bar{u}_{\mathrm{ca}}(n' \mid K)\big|_{n'=381} \simeq \bar{u}_{\mathrm{ca}}(n' \mid k)\big|_{x=3} \simeq 3569 \quad \text{and}$$

$$\bar{u}_{\mathrm{co}}(n' \mid K)\big|_{n'=381} \simeq \bar{u}_{\mathrm{co}}(n' \mid k)\big|_{x=3} \simeq 1485.$$

In log terms (log usefulness being the natural unit of measure) these are accurate to within 11% and 9% of the true values. In particular, this predicts the crossover of cayenne and cocoa in Figure 3.
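The small-kitchen forecast can be reproduced with a short script. The coefficients are the ones quoted in the polynomials above; the function name and the scan for the forecast crossover stage are illustrative additions, not part of the original analysis.

```python
def forecast(counts, n, n_prime):
    """Evaluate eq. (2): counts[s] products of complexity s at stage n,
    each weighted by (n'/n)**(s - 1)."""
    x = n_prime / n
    return sum(c * x ** (s - 1) for s, c in counts.items())


# Small-kitchen recipe counts by complexity, read off the polynomials above.
cayenne = {1: 2, 2: 1, 3: 18, 4: 12, 5: 8, 6: 1, 8: 1}
cocoa = {1: 6, 2: 22, 3: 37, 4: 16, 5: 8}

# At the small-kitchen stage the polynomials return the observed totals.
print(forecast(cayenne, 127, 127), forecast(cocoa, 127, 127))  # 43.0 89.0

# Evaluated at the big-kitchen stage (x = 3).
big_cayenne = forecast(cayenne, 127, 381)
big_cocoa = forecast(cocoa, 127, 381)
print(big_cayenne, big_cocoa)  # 3569.0 1485.0

# First stage at which the forecast for cayenne exceeds that for cocoa.
crossover = next(
    n for n in range(127, 382)
    if forecast(cayenne, 127, n) > forecast(cocoa, 127, n)
)
print(crossover)
```

Cocoa leads at $n' = 127$ and cayenne leads at $n' = 381$, so the forecast crossover necessarily falls between the two kitchens.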

6

[Figure 5 plots omitted. Top panel axes: average complexity of recipes an ingredient appears in (valence) and number of recipes an ingredient appears in (usefulness).]

FIG. 5: (Top) The valence-usefulness scatter plot for all ingredients that are used in two or more recipes (365 of the 381 ingredients). (Bottom) The relative usefulness of different ingredients as the number of ingredients we possess increases, for the 100 ingredients most useful when we have all 381 ingredients.


[Figure 6 plots omitted. Top panel axes: average complexity of software product a tool appears in (valence) and number of software products a tool appears in (usefulness).]

FIG. 6: (Top) The valence-usefulness scatter plot for the 365 technology tools most useful in making software products. (Bottom) The relative usefulness of different tools as the number of tools we possess increases, for the 100 tools most useful when we have all 993 tools.
