24
Index Notes: PA stands for predictive analytics. Page numbers in italics followed by irefer to pages within the Central Tables insert. The nafter a page number refers to an entry that appears in a footnote on that page. A Abbott, Dean, 71, 91, 199, 200, 204, 303 AB testing, 261 Accident Fund Insurance, 7i accommodation bookings, 2i, 8i actuarial approach, 152 advertisement targeting, predictive, 31, 296 advertising. See marketing and advertising Against Prediction: Proling, Policing, and Punishing in an Actuarial Age (Harcourt), 82 AI. See articial intelligence (AI) Airbnb, 2i, 8i Air Force, 18i airlines and aviation, predicting in, 8, 14i, 128 Albee, Edward, 113 Albrecht, Katherine, 53 algorithmic trading. See blackbox trading Allen, Woody, 10 Allstate, 7i, 9, 191 Amazon.com employee security access needs, 12i machine learning and predictive models, 8 Mechanical Turk, 76 personalized recommendations, 5i sarcasm in reviews, 8 American Civil Liberties Union (ACLU), 67, 99 American Public University System, 9, 16i analytics, 16 Analytics Revolution, The (Franks), 305 Ansari X Prize, 187 Apollo 11, 223, 244 Apple, Inc., 115, 223 Apple Mac, 118 Apple Siri, 215, 292 Applied Predictive Analysis (Abbott), 303 Are Orange Cars Really not Lemons?(Elder, Bullard), 143n Argonne National Laboratory, 13i Arizona Petried Forest National Park, 259260 Arizona State University, 9, 16i articial intelligence (AI) Amazon.com Mechanical Turk, 76 mind-reading technology, 8, 19i possibility of, the, 209, 217219, 248 Watson computer and, 213219 Asimov, Isaac, 139 astronomy, 113, 189 313 COPYRIGHTED MATERIAL

3GBINDEX 12/08/2015 1:51:44 Page 313

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:44 Page 313

Index

Notes:

• PA stands for predictive analytics.

• Page numbers in italics followed by “i” refer to pages within the Central Tables insert.

• The “n” after a page number refers to an entry that appears in a footnote on that page.

AAbbott, Dean, 71, 91, 199, 200, 204, 303AB testing, 261Accident Fund Insurance, 7iaccommodation bookings, 2i, 8iactuarial approach, 152advertisement targeting, predictive,

31, 296advertising. See marketing and advertisingAgainst Prediction: Profiling, Policing, and

Punishing in an Actuarial Age(Harcourt), 82

AI. See artificial intelligence (AI)Airbnb, 2i, 8iAir Force, 18iairlines and aviation, predicting in, 8, 14i,

128Albee, Edward, 113Albrecht, Katherine, 53algorithmic trading. See blackbox tradingAllen, Woody, 10Allstate, 7i, 9, 191Amazon.comemployee security access needs, 12imachine learning and predictive models, 8Mechanical Turk, 76

personalized recommendations, 5isarcasm in reviews, 8

American Civil Liberties Union (ACLU),67, 99

American Public University System, 9, 16ianalytics, 16Analytics Revolution, The (Franks), 305Ansari X Prize, 187Apollo 11, 223, 244Apple, Inc., 115, 223Apple Mac, 118Apple Siri, 215, 292Applied Predictive Analysis (Abbott), 303“Are Orange Cars Really not Lemons?”

(Elder, Bullard), 143nArgonne National Laboratory, 13iArizona Petrified Forest National Park,

259–260Arizona State University, 9, 16iartificial intelligence (AI)Amazon.com Mechanical Turk, 76mind-reading technology, 8, 19ipossibility of, the, 209, 217–219, 248Watson computer and, 213–219

Asimov, Isaac, 139astronomy, 113, 189

313

COPYRIG

HTED M

ATERIAL

Page 2: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:45 Page 314

AT&T Research BellKor Netflix Prizeteams, 188–189, 192–197

Australia, 14i, 16i, 106, 120, 178, 191Austria, 122, 193automatic suspect discovery (ASD), 87–101approach to, 88–91arguments for and against, 96–99assumptions about NSA’s use of, 94–95challenges of, 99–100defined, 88example patterns, 91–94privacy issues and, 87–88, 100

automobile insurancecrashes, predicting, 9, 191credit scores and accidents, 120–121driver inattentiveness, 9, 18i, 191fraud predictions for, 11i, 70

Averitt, 18iaviation incidents, 9, 14iAviva Insurance (U.K.), 11iAWK computer language, 111Ayres, Ian, 306

Bbacktesting, 39–41. See also test dataBaesens, Bart, 182bagging (bootstrap aggregating), 200–203Baker, Stephen, 306Baltimore, 11iBangladesh, 140, 167, 174Barbie dolls, 118Bari, Anasse, 304Bayes, Thomas (Bayes Network), 31Beane, Billy, 43Beano, 57Beaux, Alex, 61–62, 63behavioral predictors, 115Bella Pictures, 5iBellKor, 188–189, 193, 195–197BellKor Netflix Prize teams, 188–189,

192–197Ben Gurion University (Israel), 79, 125Bernstein, Peter, 151Berra, Yogi, 268

Berry, J. A., 303Big Bang theory, 4, 113Big Bang Theory, The (TV show), 18Big Brother, 85BigChaos team, 192–197big data

about, xviii, 16, 105, 114The Data Effect and, 115, 135–145, 295“wider” data and, 140–141

billing errors, predicting, 10iblack-box trading, 8i, 24, 38–44Black Swan, The (Taleb), xviii, 168Blue Cross Blue Shield of Tennessee, 7i, 10iBMW, 23, 79BNSF Railway, 13iboard games, predictive play of, 76, 297Bohr, Niels, 13Boire, Richard, 305Bollen, Johan, 107nbooks, about Predictive Analytics, 303–304,

305, 306book titles, testing, 262Bowie, David, 32brain activity, predicting, 19iBrandeis, Louis, 55Brazil, 98breast cancer, predicting, 9i, 10Breiman, Leo, 178, 200Brigham Young University, 10, 10iBritish Broadcasting Corporation, 16i, 18iBrobst, Stephen, 151Brooks, Mel, 38Brynjolfsson, Eric, 111–112buildings, predicting fault in, 13iBullard, Ben, 143nburglaries, predicting, 11i, 67business rules, decision trees and, 161, 164buying behavior, predicting, 3i, 48–49,

121

CCage, Nicolas, 28, 262Canadian Automobile Association, 14iCanadian Tire, 7i, 121

314 Index

Page 3: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:45 Page 315

car crashes and harm, predicting, 9, 10i, 191CareerBuilder, 20i, 191Carlin, George, 212Carlson, Gretchen, 52Carnegie Mellon University, 57, 228, 285cars, “orange lemons,” 104–109, 142–143car services passenger destination,

predicting, 14iCART decision trees, 178–181, 200–203causality, 35, 131–135, 158, 257cell phone industryconsumer behavior and, 120dropped calls, predicting, 8–9, 14i, 204GPS data and location predicting, 2i, 57,

84Telenor (Norway), 5i, 252–254, 281

CellTel (African telecom), 125Centers for Medicare and Medicaid

Services, 11iCerebellum Capital, 8iChabbert, Martin, 185–189, 192, 301Chaffee, Alexander Day, 171, 244CHAID, 178Chaouchi, Mohamed, 304Charlotte Rescue Mission, 15iChase Bankactuarial approach used by, 152churn modeling, 166, 252ndata, learning from, 153–156economic recession and risk, 181–182learning from data, 162–163mergers and growth of, 149, 181microrisks faced by, 149–153mortgage risk decision trees, 163–165,

179–181mortgage risks, predicting, 5i, 7i,

149–150mortgage value estimation, 180–181predictive models and success, 181risks faced by, 149–153, 181Steinberg as leader, 3, 147–148, 153–156,

175–176, 178–181cheating, predicting, 11–12, 18i, 76

check fraud, predicting, 11, 11ichess-playing computers, 76, 222–223Chicago Police Department, 12, 12iChopra, Sameer, 304Chu-Carroll, Jennifer, 244, 246churn modeling, 166, 252–254, 277, 281,

298, 299Cialdini, Robert, 259–260Citibank, 8, 18iCitigroup, 7iCitizens Bank, 11, 11i, 69City of Boston, 15iCity of Chicago, 14i, 15iCity of New York, 12i, 14i, 16icity power, predicting fault in, 13icity regulations, crime predictions and,

11icivil liberties, risks to, 47–48, 80Clarke, Arthur C., 209clicks, predicting, 7, 20i, 25, 29, 262clinical trial recruitment, predicting, 10iClinton, Hillary, 8, 289cloning customers, 156nCoase, Ronald, 167Colbert, Stephen, 53, 283Colburn, Rafe, 74collection agencies, PA for, 10, 13icollective intelligence, 197–199collective learning, 17–18Columbia University, 12, 79, 125, 248Commander Data, 77company networks, predicting fault in,

13iCompeting on Analytics: The New Science of

Winning (Davenport and Harris),37–38, 305

competitions. See PA competitionscomputational linguistics, 210. See also

natural language processing (NLP)computer “intelligence.” See artificial

intelligence (AI)computer programs, decision trees and,

161–162

Index 315

Page 4: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:46 Page 316

computerscreation of, 18–19Deep Blue computer, 76, 222–223human brain versus, 19–20viruses on, predicting, 12i, 73See also artificial intelligence; machine

learning; Watson computer Jeopardy!challenge

computer science, 74Con Edison, 6, 13iconferences, 305Congo massacres, 125consumer behavior, insights on,

118–119contests. See PA competitionsContinental Airlines, 14icontrol sets, 267, 268corporate roll-ups, 195ncouponing, predictive, 6–7Coursera, 153nCox Communications, 4iCraig, Roger, 17i, 225, 302credit cardsfraud detection for, 69payment systems, predicting fault in, 10,

13icredit riskcredit scores and, 149–151typing and, 121–122

Crichton, Michael, 172crime fighting and fraud detectioncheating in board games, 76credit card fraud, 69in financial institutions, 69–71fraud, defined, 68fraudulent transactions, 69IRS tax fraud, 11i, 12, 71, 204machine injustice, 79–80network intrusion, 12i, 72PA application, 70, 297PA for, 8i–9i, 11, 68–77prejudice in, risk of, 81–82spam filtering, 74

crime prediction for law enforcementautomatic suspect discovery (ASD) andNSA, 87–101

cybercrime, 73ethical risks in, 79–82fighting, PA for, 8i–9i, 11, 67judicial decisions and, 77–80, 160–161murder, predicting, 11, 11i, 12i, 18i, 78PA application, 66–67, 297PA examples and insights, 125prediction models for, pros and cons of,83–86

recidivism, 12i, 77–80Uber and behavior, 118

crowdsourcingcollective intelligence and, 197–199Kaggle PA crowdsourcing contests,189–191, 204

noncompetitive crowdsourcing, 190nPA and, 185, 187–191, 197–200, 224

Cruise, Tom, 79, 152customer need, predicting fault in, 14icustomer retention

cancellations and predicting, 4i, 8with churn modeling, 166, 252–254, 277,298

with churn uplift modeling, 277, 299contacting customers and, 119–120feedback and dissatisfaction, 18i

customization, perils of, 30–31cybercrime, 73cybernetics, xvii

DD’Arcy, Aoife, 304data, sharing and using

automatic suspect discovery (ASD) andNSA, 87–101

battle over mining, 56–57choosing what to collect, 6–12, 56–57,249, 254–256

growth and rates of expansion, 109–114policies for, 55

316 Index

Page 5: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:47 Page 317

privacy concerns and, 52–54, 58–60,83–86

data, types and sources ofbig data, xviii, 16, 105, 114buzzwords for, 114consumption and, 6–7employee data, 61fake data, 68–69free public data, 110learning data, 162–163location data, 2i, 57, 84medical data, 85personal data, 54–55, 57social media posts, 9itextual data, 210, 214

data, value ofabout, 4–5learning from data, 153–156machine learning and, 4–5personal data, 54–55, 57as “the new oil,” 115

Dataclysm (Rudder), 306Data Effect,The, 115, 135-145, 295data gluts, 112Data.gov, 110data hustlers and prospectors, 57, 85data mining, 17, 36, 58, 114Data Mining for Managers (Boire), 305Data Mining Techniques (Linoff, Berry), 303data preparation phase, 162–163data science, 16-17, 114nData-sim (Lohr), 306Data Smart (Foreman), 303data storage, 56–57, 112dating websitesattractiveness ratings and success, 127consumer behavior on, 118–119predicting on, 3i, 7, 110, 127, 259,

291–292Davenport, Thomas H., xix, 37–38, 305death predictions, 9i, 10i, 10–11, 84–85,

306“Deathwatch” (Siegel), 306

deception, predicting, 11–12, 18iDecision Management Systems (Taylor), 305decision treesCART decision trees, 178–181, 200–203circle single model, 201–202decision boundaries in, 202getting data from, 175–179in machine learning, 156–162methods competing with, 171–172mortgage risk decision trees, 163–165,180–181

overlearning and assuming, 167–169random forests, 201uplift trees, 275

deduction versus induction, 169–170Deep Blue computer, 76, 222–223DeepQA, 247deliveries, predicting, 14iDelta Financial, 38–41Deming, William Edwards, 4Democratic National Committee (DNC),

283–288Deshpande, Bala, 304Dey, Anindya, 59, 61, 301Dhar, Vasant, 17diapers and beer, behavior and, 117Dick, Philip K., 28differential response modeling. See uplift

modelingdiscrimination, risks of, 81–82disease, predicting, 9iDisraeli, Benjamin, 167divorce, 7, 53–54dolls and candy bars, behavior and, 118Domingoes, Pedro, 305donations and giving, predicting, 15i, 204do-overs, 257–259Dormehl, Luke, 306downlift, 273driver inattentiveness, predicting, 9, 18i, 191driverless cars, 23, 79Drucker, Peter, 149drugs effects and use, predicting, 10i

Index 317

Page 6: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:48 Page 318

“Dry Bones” (song), 114DTE Energy, 8iDubner, Stephen J., 27, 72, 81, 91n, 92, 95Duhigg, Charles, 51–52dunnhumby, 191Dyson, George, 214

EEarthlink.com, 119Echo Nest, 126economic recession, 181–182educationgrades, predicting, 16i, 191guided studying for targeted learning,

225, 299PA for, 8, 15i–17istudent dropout risk, predicting, 9, 16istudent knowledge and performance,

predicting, 17i, 191eHarmony, 7Eindhoven University, 16iEindhoven University, Netherlands, 9Einstein, Albert, 4, 175Elder, John, 8i, 143n, 300, 304about, 23, 39, 43–44“Are Orange Cars Really not Lemons?,”

143nblack-box trading systems, 24, 38–44on employee death predictions, 84on generalization paradox, 204–205in Netflix Prize competition, 192on passion for science, 44–45on power of data, 54risks taken by, 38–41on “vast search,” 140

Elder Research, Inc., 45, 71, 132, 143n, 192elections, crime rates and, 125electoral politicsHillary for America 2016 Campaign, 15imusical taste political affiliation, 125Obama for America 2012 Campaign, 15iObama for America Campaign, 8,

282–288, 286–288

uplift modeling applications for, 277,286–288

voter persuasion, predicting, 15i, 160,282–288

electronic equipment, predicting fault in,13i

Elements of Statistical Learning, The(Hastie, Tibshirani, Friedman), 304

Elie Tahari, 4ie-mail

consumer behavior and addresses for,119–120

government storage of, 113Hotmail.com, 34, 109, 119phishing e-mails, 74spam filtering for, 74

emotionsin blog posts, 107human behavior and, 106–108mood predictions and, 106–108See also human behavior

employee longevity, predicting, 20iemployees and staff

job applications and positions, 20i, 191job performance, predicting, 20i, 204job promotions and retention, 128job skills, predicting, 20iLinkedIn for career predictions, 7, 20iprivacy concerns and data on, 84quitting, predicting, 9, 20i, 58–64,128

Energex (Australia), 6, 16ienergy consumption, predicting, 16iEnsemble Effect, The, 205, 275n, 295Ensemble Experts, 192ensemble models

about, 204automatic suspect discovery (ASD) and,93n

CART decision trees and bagging,178–181, 200–203

collective intelligence in, 199–200complexity in, 204

318 Index

Page 7: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:48 Page 319

crowdsourcing and, 185, 187–191,197–200, 224

generalization paradox and, 204IBMWatson question answering

computer and, 18i, 204IRS (tax fraud), 11i, 12, 71, 204meta-learning and, 193–195Nature Conservancy (donations), 15i, 204Netflix (movie recommendations), 5i, 6,

183, 186, 204Nokia-Siemens Networks (dropped

calls), 14i, 204University of California, Berkeley (brain

activity), 19i, 204for uplift modeling, 275nU.S. Department of Defense (fraudulent

invoices), 11iU.S. Department of Defense Service

(fraudulent invoices), 204U.S. Special Forces (job performance),

20i, 204Ensemble team, 197Epagogix, 8ierectile dysfunction, 10iExperian, 150Exxon Mobil Corp., 115

F“Fab Four” inventors, 178, 200Facebookdata glut on, 112data on, 4, 29, 55fake data on, 55friendships, predicting, 2i, 191happiness as contagious on, 124job performance and profiles on, 20isocial effect of, 124student performance PA contest, 17i

facial recognition, 2iFailure of Risk Management, The

(Hubbard), 151–152false conclusions, avoiding, 104–109,

142–143

false positives (false alarms), 79, 140family and personal life, PA for, 2i–3i, 7, 54,

124Farrell, Colin, 79fault detection for safety and efficiency, PA

for, 13i–14iFederal Trade Commission, 69FedEx, 4i, 9Femto-photography, 112Ferguson, Andrew, 101Ferrucci, David, 217, 243, 246FICO, 10i, 150Fidelity Investments, 275finance and accounting, fraud detection in,

11, 69–71finance websites, behavior on, 118–119financial risk and insurance, PA for, 7i–8i,

11Fingerhut, 4iFinland, 124fire, predicting, 16iFirst Tennessee Bank, 3i, 4iFisher, Ronald, 135, 136nFleming, Alexander, 139flight delays, predicting fault in, 14iFlight Risks, predicting, 8, 47, 58–64Flirtback computer program, 111Florida Department of Juvenile Justice, 12ifMRI brain scans, 19iFoldit, 190nFood and Drug Administration (FDA), 279Fooled by Randomness (Taleb), 136–137, 138,

168Ford Motor Co., 9, 18i, 191forecasting, 17, 284Foreman, John W., 303Formula, The (Dormehl), 306Fox & Friends (TV show), 52Franklin, Benjamin, 29, 75Franks, Bill, 305fraud, defined, 68fraud detection, 48, 61, 68–77, 297. See also

crime fighting and fraud detection

Index 319

Page 8: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:49 Page 320

Freakonomics Radio, 27, 72, 81, 140Freakonomics Radio, 140frequency, 116Friedman, Jerome, 178, 304friendships, predicting, 3i, 191Fukuman, Audrey, 159Fulcher, Christopher, 66Fundamentals of Machine Learning for Predictive

Data Analytics (Kelleher, MacNamee,D’Arcy), 304

fund-raising, predicting in, 16iFurnas, Alexander, 56, 83future, views onhuman nature and knowing about, xxipredictions for 2022, 291–293uncertainty of, 12–13

GGalileo, 109Gates, Bill, 137generalization paradox, 204Ghani, Rayid, 284–285, 302Gilbert, Allen, 99Gladwell, Malcolm, 30–31GlaxoSmithKline (U.K.), 10iGoethe, Johann Wolfgang von, 36Goldbloom, Anthony, 190, 204Gondek, David, 207, 224, 227–231,

234–237, 240n, 245–248, 302Googlemouse clicks, measuring for predictions,

7, 29, 262privacy policies, 84Schmidt, 83, 84searches for playing Jeopardy!, 214self-driving cars, 23, 79spam filtering, 74

Google Adwords, 30, 262Google Flu Trends, 9i, 123Google Page Rank, 199governmentdata storage by, 110fraud detection for invoices, 11i, 71, 204

PA for, 15i–17ipublic access to data, 110See also individual names of U.S. governmentagencies

GPS data, 2i, 57, 84grades, predicting, 16i, 191grant awards, predicting, 16i, 191Greenwald, Glenn, 98Grockit, 17i, 191Groundhog Day (film), 258Grundhoefer, Michael, 267, 268–269

Hhackers, predicting, 12i, 73HAL (intelligent computer), 209–210Halder, Gitali, 59, 61, 301Handbook of Statistical Analysis and Data Mining

Applications (Nisbet, Elder, Miner), 304Hansell, Saul, 182happiness, social effect and, 107, 124Harbor Sweets, 4i, 6Harcourt, Bernard, 82Harrah’s Las Vegas, 4iHarris, Jeanne, 37–38, 305Harvard Medical School, 9Harvard University, 123Hastings, Reed, 197healthcare

death predictions in, 9i, 10–11, 84–85health risks, predicting, 10ihospital admissions, predicting, 10iinfluenza, predicting, 9i, 124insurance companies, predicting, 7i, 9imedical research, predicting in, 124medical treatments, risks for wrongpredictions in, 267–269, 277, 279

medical treatments, testing persuasion in,262

PA for, 9i–10i, 10–11, 124personalized medicine, uplift modelingapplications for, 277, 279

health insurance companies, PA for, 9i–10i,10–11, 84–85

320 Index

Page 9: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:50 Page 321

Hebrew University, 18iHeisenberg, Werner Karl, 262Helle, Eva, 280, 281, 282, 302Helsinki Brain Research Centre, 124Hennessey, Kathleen, 282, 286Heraclitus, 257Heritage Health Prize, 191Heritage Provider Network, 10, 10iHewlett Foundation, 16i, 189, 191Hewlett-Packard (HP)employee data used by, 61financial savings and benefits of PA, 61,

64Global Business Services (GBS), 62quitting and Flight Risks, predicting, 8,

20i, 48, 58–64, 128sales leads, predicting, 5iturnover rates at, 60warranty claims and fraud detection, 11,

11iHillary for America 2016 Campaign, 8, 15iHIV progression, predicting, 9i, 191HIV treatments, uplift modeling for, 279Hollifield, Stephen, 67Holmes, Sherlock, 33, 34, 83, 86Hopper, 8ihormone replacement, coronary disease

and, 133hospital admissions, predicting, 10i, 10–11Hotmail.com, 34, 109, 119House (TV show), 83“How Companies Learn Your Secrets”

(Duhigg), 51–52Howe, Jeff, 187, 189HP. See Hewlett-Packard (HP)Hubbard, Douglas, 109, 151–152human behaviorcollective intelligence, 197–199consumer behavior insights, 119–120emotions and mood prediction, 199–200mistakes, predicting, 11–12social effect and, 11–12, 120, 124

human genome, 113

human languagenatural language processing (NLP), 210,219–221, 227–231

PA for, 18i–19ipersuasion and influence in, 259–260

human resources. See employees and staff

IIBMcorporate roll-ups, 195nDeep Blue computer, 76, 224DeepQA project, 247Iambic IBM AI, 248mind-reading technology, 8natural language processing research, 227sales leads, predicting, 5istudent performance PA contest, 17iT. J. Watson Research Center, 224value of, 222–223See alsoWatson computer Jeopardy!challenge

ID3, 178impact modeling. See uplift modelingImperium, 18i, 191inappropriate comments, predicting, 18iincremental impact modeling. See uplift

modelingincremental response modeling. See uplift

modelingIndia, 125Indiana University, 107Induction Effect, The, 179, 295induction versus deduction, 169–170inductive bias, 170ineffective advertising, predicting, 20iinfidelity, predicting, 122Infinity Insurance, 7i, 16iinfluence. See persuasion and influenceinfluenza, predicting, 9i, 124information technology (IT) systems,

predicting fault in, 13iInnoCentive, 190ninsults, predicting, 18i, 191

Index 321

Page 10: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:50 Page 322

insurance claimsautomobile insurance fraud, predicting,

7i, 10–11, 70, 121, 191death predictions and, 7i, 10–11, 84–85financial risk predicting in, 8ihealth insurance, 9i, 10–11, 84–85life insurance companies, 11life nsurance companies, 7i

Integral Solutions Limited, 195nInternal Revenue Service (IRS), 11i, 12, 71,

106, 204International Conference on Very Large

Databases, 114Iowa State University, 9, 16iiPhone. See Apple SiriIran, 13iIsrael Institute of Technology, 12i

JJapan, 124Jennings, Ken, 226, 239, 240, 246Jeopardy! (TV show). SeeWatson computer

Jeopardy! challengeJevons, William Stanley, 169Jewell, Robert, 248jobs and employment. See employees and staffJones, Chris, 238Journal of Computational Science, 107nJPMorgan Chase. See Chase Bankjudicial decisions, crime prediction and, 79,

160–161Jung, Tommy, 304Jurassic Park (Crichton), 172Just Giving, 15i

KKaggle, 189–191, 204Kane, Katherine, 276Kasparov, Garry, 76, 223KDnuggets, 84, 110Keane, Bil, xxix“keep it simple, stupid” (KISS) principle,

178, 200

Kelleher, John D., 304Khabaza, Tom, 115killing, predicting, 11, 11i, 18iKing, Eric, 248, 269nKiva, 8iKmart, 7knee surgery choices, 124knowledge (for education), predicting, 13iKotu, Vijay, 304Kretsinger, Stein, 132, 133, 189Kroger, 7Kuneva, Meglena, 4, 115Kurtz, Ellen, 78, 82Kurzweil, Ray, 113

Llanguage. See human languageLashkar-e-Taiba, 92law enforcement. See crime prediction for

law enforcementlead poisoning from paint, predicting, 16ilearning

about, 17–21collective learning, 17–18education—guided studying for targetedlearning, 225, 299

learning from data, 162–163memorization versus, 169overlearning, avoiding, 167–169, 175,200, 204

Leinweber, David, 167Leno, Jay, 12Levant, Oscar, 217Levitt, Stephen, 72, 81, 91n, 92, 95Lewis, Michael, 15lies, predicting. See deception, predictingLie to Me (TV show), 12life insurance companies, PA for, 7i, 11Life Line Screening, 4ilift, 166, 266nLindbergh, Charles, 223LinkedIn

friendships, predicting, 3i

322 Index

Page 11: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:51 Page 323

job skills, predicting, 7, 20iLinoff, Gordon S., 303Linux operating systems, 190nLloyds TSB, 5iloan default risks, predicting, 7i, 8ilocation data, 2i, 57, 84logistic regression, 236Lohr, Steve, 306London Stock Exchange, 8iLos Angeles Police Department, 12i, 67Lotti, Michael, 83Loukides, Mike, 17love, predicting, 7, 110–111, 126–127, 259,

291–292Lynyrd Skynyrd (band), 253

MMacDowell, Andie, 258machine learningabout, 4–5, 19–21, 147–148, 156–159,

171–173, 204courses on, 153nin crime prediction, 79–80data preparation phase for, 162–163decision trees in, 156–162induction and, 169induction versus deduction, 169–170learning data, 162–163learning from mistakes in, 156learning machines, building, 153–156overlearning, 167–169, 175, 200, 204predictive models, building with, 36, 41silence, concept of, 20ntesting and validating data, 173–175univariate versus multivariate models,

154–156, 156–157See alsoWatson computer Jeopardy!

challengemachine risk, 79–80MacNamee, Brian, 304macroscopic risks, 182Mac versusWindows users, 118Madrigal, Alexis, 54

Magic 8 Ball toy, 81Mao, Huina, 107nmaritime incidents, predicting, 14imarketing and advertisingbanner ads and consumer behavior, 119mouse clicks and consumer behavior, 7,29, 262

targeting direct marketing, 26, 296marketing modelsdo-overs in, 257–258messages, creative design for, 259–262Persuasion Effect, The, 276quantum humans, influencing, 262–266response uplift modeling, 270–271, 275

marketing segmentation, decision trees and,161–162

marriage and divorce, predicting, 7, 53–54,122

Mars Climate Orbiter, 39–40, 245Martin, Andres D., 160Maryland, crime predictions in, 11, 11i, 78Massachusetts Institute of Technology

(MIT), 227Master Algorithm, The (Domingos), 305Match.com, 3i, 7, 291–292Matrix, The (film), 213McCord, Michael, 231nMcKinsey reports, 190–191McNamara, Robert, 269Mechanical Turk, 76medical claims, fraudulent, 11imedical treatments. See healthcarememorization versus learning, 169Memphis (TN) Police Department, 12, 12i,

67metadata, 91–92, 106, 106nmeta-learning, 193–195Mexican Tax Administration, 71Miami Police Department, 12iMichelangelo, 175microloan defaults, predicting, 16iMicrosoft, 2i, 223Milne, A. A., xxix

Index 323

Page 12: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:52 Page 324

Mimoni (Mexico), 7imind-reading technology, 8, 19iMiner, Gary, 304Minority Report (film), 79Missouri, 79Mitchell, Tom, 57, 170, 177, 178, 234, 304mobile operators. See cell phone industrymoneyballing, concept of, 15, 224–226, 282mood labels, 106–108mood prediction, blogs and, 107mortgage prepays and risk, predicting, 7i,

149–151mortgage risk decision trees, 163–165,

180–181mortgage value estimation, 180–181, 298mouse clicks, predicting, 7, 29, 262movie hits, predicting, 8i, 20imovie recommendations, 6, 183, 186, 204,

298movies, 20iMTV, 21iMultiCare Health System (Washington

State), 10i“multiple comparisons problem”/multiple

comparisons trap”, 140multivariate models, 154–156, 156–157murder, predicting, 11, 11i, 12i, 18i, 78Murray, Bill, 258music, stroke recovery and, 124musical taste, political affliation and, 126Muslims, 81–82

NNaïve Bayes, 31Naked Future, The (Tucker), 306NASAApollo 11, 223, 244Mars Climate Orbiter, 39–40, 245PA contests sponsored by, 187, 189on space exploration, 76

National Insurance Crime Bureau, 69National Security Agency (NSA), 12i,

87–101

National Transportation Safety Board, 9natural language processing (NLP), 210,

219–221, 227–231Nature Conservancy, 15i, 204Nazarko, Edward, 222Nerds on Wall Street (Leinweber), 167Netflix movie recommendations, 5i, 6, 183,

186, 204Netflix Prize

about, 185–187competition and winning, 192crowdsourcing and PA for, 185–187,187–191, 197–200, 223

meta-learning and ensemble models in,193–195, 204

PragmaticTheory team, 188–189,196–197

Netherlands, 16inet lift modeling. See uplift modelingnet response modeling. See uplift modelingnet weight of evidence (NWOE), 272network intrusion detection, 12i, 72New South Wales, Australia, 191New York City Medicaid, 11iNew York State, 11iNew York Times, The, 279Next (Dick), 28, 262Ng, Andrew, 153nNightcrawler (superhero), 54Nineteen Eighty-Four (Orwell), 8599designs, 190nNisbet, Robert, 304“no free lunch” theorem, 170nNokia, 2iNokia-Siemens Networks, 14i, 204nonprofit organizations, PA for, 15i–17iNoonan, Peggy, 286No Place to Hide (Greenwald), 98Northwestern University Kellogg School of

Management, 117nuclear reactors, predicting fault in, 13inull hypothesis, 136nNumerati, The (Baker), 306

324 Index

Page 13: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:52 Page 325

OObama for America 2012 Campaign, 8, 15i,

282–288observation, power of, 33–36Occam’s razor, 178, 200O’Connor, Sandra Day, 160office equipment, predicting fault in, 13iOi (Brasil Telecom), 8ioil flow rates, predicting, 13ioil refinery safety incidents, predicting, 13iOkCupid, 3i, 7, 126–127, 259Oklahoma State University, 9, 16iO’Leary, Martin, 189Olshen, Richard, 1781–800-FLOWERS, 711-sided equality of proportions hypothesis

test, 137nOnline Privacy Foundation, 18i, 191Oogway, xxixopen data movement, 110open question answering, 213–222, 226,

238open source software, 190nOptus (Australia), 4i, 106, 120“orange lemons” (cars), 104–109, 142–143Orbitz, 118Oregon, crime prediction in, 11, 12i, 78organizational learning, 17–18organizational risk management, 28Orwell, George, 85Osco Drug, 117overfitting. See overlearningoverlearning, 167–169, 175, 200, 204Oz, Mehmet, 27

PPA (predictive analytics)about, xxx–xxxi, 15–17, 116assumption about NSA’s use of, 94–96choosing what to predict, 6–12, 55–56,

249, 254–256in crime fighting and fraud detection,

8i–9i

crowdsourcing and, 185, 187–191,197–200, 224

defined, 15, 152in family and personal life, 2i–3iin fault detection for safety and efficiency,13i–14i

in finance and accounting frauddetection, 11, 69–71, 121

in financial risk and insurance, 7i–8iforecasting versus, 17, 284frequently asked questions about,xxii–xxvii

in government, politics, nonprofit, andeducation, 15i–17i

in healthcare, 9i–10i, 10–11, 124in human language understanding,thought, and psychology, 18i–19i

launching and taking action with PA,23–26

in law enforcement and fraud detection,11i–12i

in marketing, advertising, and the Web,4i–6i

“orange lemons” and, 104–109, 142–143overview, xxi–xxiirisk-oriented definition of, 152text analytics, 209, 214in workforce (staff and employees), 20i

PA (predictive analytics) applicationsblack-box trading, 24, 43blog entry anxiety detection, 107board games, playing, 76, 292credit risk, 149–151, 277crime prediction, 66–67, 297customer retention with churn modeling,166, 252–254, 277, 298, 299

customer retention with churn upliftmodeling, 277, 299

defined by, 26education—guided studying for targetedlearning, 226, 299

employee retention, 58–64, 297fraud detection, 70–71, 297

Index 325

Page 14: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:53 Page 326

PA (predictive analytics) applications(continued )

mortgage value estimation, 180, 298movie recommendations, 186, 298network intrusion detection, 72, 297open question answering, 220, 238political campaigning with voter

persuasion modeling, 285, 299predictive advertisement targeting, 31,

296pregnancy prediction, 48, 296–297recidivism prediction for law

enforcement, 78, 298spam filtering, 74, 297targeting direct marketing, 26–27, 296uplift modeling applications, 277See also Central Tables insert

PA (predictive analytics) competitions andcontests

in astronomy and science, 189for design and games, 190nfor educational applications, 189Kaggle crowdsourcing contests, 189–191,

204Netflix Prize, 185–187, 204

PA (predictive analytics) insightsconsumer behavior, 119–120crime and law enforcement, 125finance and insurance, 121miscellaneous, 126–130

Palmisano, Sam, 247Panchoo, Gary, 198Pandora, 21iparole and sentencing, predicting, 11, 12i,

77–78Parsons, Christi, 282, 286PAW (Predictive Analytics World)

conferences, xxii, xxvii, 48–49, 61,70–71, 115, 197–199, 227, 240n, 305

payment processors, predicting fault in, 13iPayPal, 8, 18i, 70penicillin, 139Pennsylvania, 12i

personalization, perils of, 30–31persuasion and influence

observation and, 255–257persuadable individuals, identifying,262–266, 274, 286–288

predicting, 255–256, 255–276scientifically proving persuasion,259–260

testing in business, 262uplift modeling for, 266–267voter persuasion modeling, 285, 299

Persuasion Effect, The, 276, 295persuasion modeling. See uplift modelingPetrified Forest National Park, Arizona,

259–260Pfizer, 10iPhiladelphia (PA) Police Department, 77photographs

caption quality and likability, 129growth of in data glut, 112

Piotte, Martin, 185–189, 301Pitney Bowes, 195n, 273Pittsburgh Science of Learning, 17iPole, Andrew, 48–49, 301police departments. See crime prediction for

law enforcementpolitics, PA for, 15i–17i. See also electoral

politicsPorter, Daniel, 287, 289, 302Portrait Software, 195nPost hoc, ergo propter hoc, 130Power of Habit: Why We Do What We Do in

Life and Business (Duhigg), 52PragmaticTheory team, 188–189, 196–197prediction

benefits of, 3, 28choosing what to predict, 6–12, 55–56,249, 254–256

collective obsession with, xxifuture predictions, 291–293good versus bad, 83–86limits of, 12–14organizational learning and, 17–18

326 Index

Page 15: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:54 Page 327

prediction, effects of and onabout, 12–14, 295Data Effect,The, 115, 135-145, 295Ensemble Effect, The, 205, 275n, 295Induction Effect, The, 179, 295Persuasion Effect, The, 276, 295Prediction Effect, The, xxx–xxxi, 1–21,

26–27, 295prediction markets, 199predictive analytics. See PA (predictive

analytics)Predictive Analytics (Siegel), website of, 303Predictive Analytics and Data Mining

(Kotu, Deshpande), 304Predictive Analytics Applied (training

program), 304Predictive Analytics for Dummies (Bari,

Chaouchi, Jung), 304Predictive Analytics Guide, xxvii, 303Predictive Analytics Times, xxvii, 303, 304,

305, 306Predictive AnalyticsWorld (PAW), xxvii,

305Predictive Analytics World (training

programs), 304Predictive Analytics World (PAW)

conferences, xxii, xxvii, 48–49, 61,70–71, 115, 197–199, 227, 240n,305

Predictive Marketing and Analytics (Strickland),304

predictive modelsabout, 23–24action and decision making, 36–38causality and, 35, 131–135, 158,

257defined, 34, 154–155deployment phase, 32Elder’s success in, 41–45going live, 24–26, 30–31machine learning and building, 36, 41marketing models, 257–267, 270–271,

273–276

observation and, 33–36overlearning and assuming, 167–169personalization and, 30–31response modeling, 255–256, 268–270,270–271

response uplift modeling, 270–271, 275,277, 299

risks in, 38–41univariate versus multivariate, 154–156,156–157

uplift modeling, 266–267See also ensemble models

PredictiveNotes.com, xvi, xxiii, xxvi, 95,103, 147, 191, 249, 306

predictive technology, 3–4. See also machinelearning

predictor variables, 116pregnancy and birth, predictingcustomer pregnancy and buyingbehavior, 7, 48–52, 296–297

premature births, 10, 10iprejudice, risk of, 81–82PREMIER Bankcard, 4i, 7iprescriptive analytics, xviii, 267nprivacy, 47–48, 56–57, 83–86, 186Google policies on, 84insight versus intrusion regarding,60–61

predicted consumer data and, 47–52probability, The Data Effect and,

137–139profiling customers, 156nProgressive Insurance, 70Psych (TV show), xxxpsychologypredictive analysis in, 18ischizophrenia, predicting, 10, 18i

psychopathy, predicting, 18i, 191purchases, predicting, 4i, 48–49, 121p-value, 136n

QQuadstone, 195n, 273, 280

Index 327

Page 16: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:54 Page 328

RRadcliffe, Nicholas, 267Radica Games, 19iRalph’s, 7random forests, 201Rebellion Research, 8irecency, 116recidivism prediction for law enforcement,

11–12, 12i, 77–78, 298recommendation systems, 7, 57, 185–187,

223Reed Elsevier, 5i, 17ireliability modeling, 13iREO Speedwagon (band), 58response modelingdrawbacks of, 268–270examples of, 4itargeted marketing with, 255–256,

270–271, 277, 299response rates, 268, 276–277response uplift modeling, 255–256,

270–271, 275, 277restaurant health code violations, predicting,

16iretail websites, behavior on, 118–119retirement, health and, 122Richmond (VA) Police Department, 12,

12i, 67, 78RightShip, 14iRio Salado Community College, 16irisk management, 28, 137–139, 149–151,

152Riskprediction.org.uk, 9irisk scores, 149–151Risky Business (film), 152Romney, Mitt, 285Rousseff, Dilma, 98Royal Astronomy Society,

189R software, 190nRudder, Christian, 306Russell, Bertrand, 135Rutter, Brad, 241, 247

Ssafety and efficiency, PA for, 13i–14iSafeway, 7sales leads, predicting, 7iSalford Systems, 178“Sameer Chopra’s Hotlist of Training

Resources for Predictive Analytics”(Predictive Analytics Times), 304

Sanders, Bernie, 289Santa Cruz (CA) Police Department, 12i, 67sarcasm, in review, 18iSartre, Jean-Paul, 116SAS, 195nsatellites, predicting fault in, 13isatisficing, 269nSchamberg, Lisa, 108Scheer, Robert, 96schizophrenia, predicting, 10, 18iSchlitz, Don, 238Schmidt, Eric, 83, 84Schwartz, Ari, 86Science magazine, 57scientific discovery, automating, 139–141Seattle Times, 104security levels, predicting, 12iself-driving cars, 23, 79Selfridge, Oliver, 161Semisonic (band), 41sepsis, predicting, 9iSesame Street, 109Sesenbrenner, James, 96Sessions, Roger, 175Shakespeare, William, 169, 209Shaw, George Bernard, 156Shearer, Colin, 110Shell, 13i, 20ishopping habits, predicting, 4i, 48–49, 121sickness, predicting, 9i, 10–11Siegel, Eric, 300, 303, 306, 311silence, concept of, 20nSilver, Nate, 27, 140, 238, 283Simpsons, The (TV show), 247, 254Siri, 202, 215

328 Index

Page 17: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:55 Page 329

Sisters of Mercy Health Systems, 9ismoking and smokershealth problems and causation for, 135motion disorders and, 123, 131–132social effect and quitting, 9, 123

Snowden, Edward, 87, 98Sobel, David, 55social effect, 10–11, 120, 123social media networksdata glut on, 112happiness as contagious on healthcare,

124LinkedIn, 3i, 7, 20iPA for, 3ispam filtering on, 74Twitter, 69, 112YouTube, 112See also Facebook

sociology, uplift modeling applications for,277

SpaceShipOne, 187spam, predicting, 20ispam filtering, 74, 297Spider-Man (film), xviii, 83sporting events, crime rates and, 125sports cars, 119Spotify, 21iSprint, 4iSPSS, 195nstaff behavior. See employees and staffStandard & Poor’s (S&P) 500, 39Stanford University, 9i, 10, 153n, 178staplers, hiring behavior and, 118Star Trek (TV shows and films), 41, 77, 196,

209, 220statistics, 5, 16, 78, 112, 125, 135–145,

136–139, 167, 186StatSoft, 178stealing, predicting, 11–12Steinberg, Dan, 3, 147–148, 149, 153,

175–176, 178–181, 301stock market predictionsblack-box trading systems, 8i, 24, 38–44Standard & Poor’s (S&P) 500, 39

Stone, Charles, 178Stop & Shop, 6street crime, predicting, 11iStrickland, Jeffrey, 304student dropout risks, predicting, 9, 16istudent performance, predicting, 16i, 191suicide bombers, life insurance and, 125Sun Microsystems, 5iSuper Crunchers (Ayres), 306SuperFreakonomics (Levitt and Dubner), 72,

81, 91n, 92, 95supermarket visits, predicting, 4i, 191surgical site infections, predicting, 9iSurowiecki, James, 197, 199Sweden, 124system failures, 13iSzarkowski, John, 112

TTaleb, Nassim Nicholas, xviii, 136–137,

138, 168Talking Heads (band), 52Targetbaby registry at, 49–50couponing predictively, 6customer pregnancy predictions, 3i, 7,48–52, 296–297

privacy concerns in PA, 52, 85product choices and personalizedrecommendations, 6i

purchases and target marketingpredictions, 4i

targeted marketing with response upliftmodeling, 255–256, 270–271, 277,299

targeting direct marketing, 26–27, 296target shuffling, 143ntaxonomies, 161tax refunds, 11i, 12, 71, 204Taylor, James, 305TCP/IP, 73Telenor (Norway), 5i, 252–254, 281Teragram, 195nterrorism, predicting, 11i, 11–12, 81–82

Index 329

Page 18: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:56 Page 330

Tesco (U.K.), 5i, 6test data, 173–175test preparation, predicting, 17itext analytics, 189, 195n, 209Text Analytics World, 305textbooks, 304text data, 209, 214text mining. See text analyticsThey Know Everything about You (Scheer), 96thought and understanding, PA for, 8,

18i–19ithoughtcrimes, 85Tibshirani, Robert, 304Titantic (ship), 130T. J. Watson Research Center, 224tobacco. See smoking and smokersTolstoy, Leo, 122traffic, predicting, 14i, 191training data, 49, 61, 148–149, 162–163,

170, 276. See also learningtraining programs, 304train tracks, predicting fault in, 13iTrammps (band), 130travel websites, behavior on, 118–119Trebek, Alex, 207, 209, 224, 230, 245TREC QA (Text Retrieval Conference—

Question Answering), 222truck driver fatigue, predicting, 18itrue lift modeling. See uplift modelingtrue response modeling. See uplift modelingTTX, 13iTucker, Patrick, 306Tumblr, 112Turing, Alan and the Turing test, 74, 77,

222Twenty Questions game, decision trees and,

162Twilight Zone (TV show), 262Twitter2001: A Space Odyssey (film), 209data glut on, 112fake accounts on, 69mood prediction research via, 107n

person-to-person interactions saved by,106

2degrees (New Zealand), 5ityping, credit risk and, 121–122

UUber, 2i, 118uncertainty principle, 262understanding and thought, predictive

analysis in, 18i–19iunivariate models, 154–156, 156–159University of Alabama, 9, 16iUniversity of Buffalo, 12, 18iUniversity of California, Berkeley, 19i, 204University of Colorado, 125University of Helsinki, 124University of Iowa Hospitals & Clinics, 9iUniversity of Massachusetts, 227University of Melbourne, 16i, 191University of New Mexico, 122University of Phoenix, 16iUniversity of Pittsburgh Medical Center,

10, 10iUniversity of Southern California, 227University of Texas, 227University of the District of Columbia, 101University of Utah, 10, 10i, 15iUniversity of Zurich, 122uplift modeling

customer retention with churn upliftmodeling, 252–254, 276–279, 281,299

downlift in, 273influence across industries, 276–279mechanics and workings of, 266–267,270–275

Obama for America Campaign and, 277,286–288

The Persuasion Effect, and, 276response uplift modeling, 270–271, 275targeted marketing with response upliftmodeling, 255–256, 277

Telenor using, 252, 254, 281

330 Index

Page 19: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:56 Page 331

U.S. Bank using, 6, 7i, 119–120,266–276

uplift trees, 275UPS, 14iU.S. Armed Forces, 12iU.S. Bank, 5i, 6, 119–120, 266–276, 273n,

274n, 275U.S. Department of Defense, 92U.S. Department of Defense Finance and

Accounting Service, 11i, 71, 204U.S. Food and Drug Administration (FDA),

279U.S. government. See governmentU.S. National Institute of Justice, 67U.S. National Security Agency, 113U.S. Naval Special Warfare Command, 20iU.S. Postal Service, 11i, 16i, 71U.S. Social Security Administration, 16iU.S. Special Forces, 20i, 204U.S. Supreme Court, 160–161Utah Data Center, 113

Vvariables. See predictor variables“vast search,” 140Vermont Country Store, 4i, 6Vineland (NJ) Police Department, 12i, 67viral tweets/posts, predicting, 20iVirginia, crime prediction in, 12, 12i, 67, 78Volinsky, Chris, 188voter persuasion, predicting, 15i, 160,

282–288

WWagner, Daniel, 287Walgreens, 57Wall Street Journal, The, 134Walmart, 118Wanamaker, John, 26warranty claim fraud, predicting, 11, 11iwashing machines, fault detection in, 13iWatson, Thomas J., 224, 227nWatson computer Jeopardy! challengeabout, 18i, 24, 207–209

artificial intelligence (AI) and, 217–219candidate answer evidence routines,232

confidence, estimation of, 238–240Craig’s question predictions for, 17i,225

creation and programming of, 207–215,225, 226–227

ensemble models and evidence, 234–235Jeopardy! questions as data for, 207–209,222

language processing and machinelearning, 236–237

language processing for answeringquestions, 18i, 204, 210, 219–234

moneyballing Jeopardy!, 224–226natural language processing (NLP) and,210, 219–221, 227–231

open question answering, 213–222, 226,238, 244

playing and winning, 8, 18i, 24, 204,241–247

praise and success of, 247–248predictive models and predicting answers,226–227, 231–234

predictive models for predicting answers,17i, 18i

predictive models for question answering,17i, 18i, 219–221, 226–227, 231–234

Siri versusWatson, 215–216speed in answering for, 241

Web browsing, behavior and, 120Webster, Eric, 152Wells Fargo, 20iWhiting, Rick, 291Who Wants to Be a Millionaire? (TV show),

199“wider” data, 140–141Wiener, Norbert, xviiWikipedia, 20ieditor attrition predicting, 9, 20i, 191entries as data, 214, 227noncompetitive crowdsourcing in,190n

Index 331

Page 20: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:57 Page 332

Wilde, Oscar, 156Wilson, Earl, 12Windows versusMac users,

118Winn-Dixie, 6Wired magazine, 191Wisdom of Crowds, The (Surowiecki), 197,

199WolframAlpha, 216WordPress, 112workplace injuries, predicting, 7iWright, Andy, 159Wright brothers, 32Wriston, Walter, 54

XX Prize, 187

YYahoo!, 119Yahoo! Labs, 19iYes! 50 Scientifically Proven Ways to Be

Persuasive (Cialdini et al.), 260yoga, mood and, 124YouTube, 112

ZZeng, Xiao-Jun, 107nZhou, Jay, 69

332 Index

Page 21: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:57 Page 333

Page 22: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:57 Page 334

Page 23: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:57 Page 335

Page 24: 3GBINDEX 12/08/2015 1:51:44 Page 313

3GBINDEX 12/08/2015 1:51:57 Page 336