10 Detailed Case Studies + Big Data & Analytics’ New Digital Divide + How to Think Like a Data Scientist
Book Review: Numbers Rule Your World/Kaiser Fung
[UPDATED] Step inside a data scientist’s mind, and learn why probability is the key to profit and to understanding and using big data for better decision making. This fascinating and useful book clearly shows how people misunderstand probability and misuse statistics—and therefore big data—and how that knowledge gap leads to faulty models, thinking and decisions. New winners and losers are emerging in the digital social and big-data age. A new digital divide separates people who think like data scientists and use probability to support decision making from everyone else. The data science group will outperform, and Fung shows how creative, fun and useful data science is.
This book is a perfect twin to Duncan Watts’ Everything Is Obvious* Once You Know the Answer, which exposes how common sense pervades management decisions and failure. I shall refer to several specific connections between the two reviews. You can appreciate both reviews without reading the books, although I highly recommend buying both. Where Watts does an enthralling job of describing the limitations of the common-sense, hyperlocal human brain, Fung shows his readers new ways of thinking that take advantage of large data sets.
Best of all, although Numbers Rule Your World (hereafter “Numbers”) doesn’t skimp on details, it is not a dry book because Fung is a talented storyteller who revels in thinking about information in creative ways. He’s curious and smart. I covered his Chicago talk, and he’s like that in person, too, one of those people who thrives on what he’s doing. He’s a total geek, but he’s also an excellent interpreter. I have never studied statistics, although I am strong in logical and abstract thinking, and I enjoyed the book immensely.
My analysis and conclusions follow the outline of each chapter.
Numbers’ subtitle is, “The Hidden Influence of Probability and Statistics on Everything You Do,” and it captures a large part of the book’s value proposition, but I shall attempt to take you deeper. My intense work with social business has had me swimming in social data for the past eight years, and founding the Chief Digital Office in early 2013 has led me to read numerous books and thought leadership on data science, big data and analytics.
I am beginning to appreciate the profound change that pervasive data has in store for decision makers, and Numbers will give you a strong taste for it; it will also show you how to begin to rewire your brain, to understand and use intense data and to think like a data scientist. As Fung’s detailed discussion of Numbers’ ten case studies suggests, humans who continue to make decisions by “gut feel” will end up on the windshield of the 21st century.
Imagine a farmer licking his finger, holding it to the wind and predicting tomorrow’s weather. Contrast that to the meteorologist with sophisticated data models and intense data feeds. Leaders of commercial, nonprofit and government enterprises are the farmers until their organizations rewire their thinking.
Human brains were not designed to work with intense data. Our brains evolved during humans’ 250,000-year history as Homo sapiens. Throughout our evolution, we have lived hyperlocal lives, and our brains have solved simple problems with relatively little data. However, we are sorely lacking when we try to solve complex problems because they always involve interdependent systems and often hundreds or thousands of variables. Duncan Watts describes this disastrous shortcoming as “common sense,” which evolved to help us succeed in hyperlocal living in communities of 150 members at most (until the neolithic revolution, around 10,000 BCE).
I hope you will give Numbers your utmost attention because Fung shows how to start rewiring and how to avoid common sense thinking (although he doesn’t refer to common sense per se). Where Watts exposes our tendency to scale things down to a level where we can deal with them, Fung uses statistics as a tool to deal with complex problems. However, he doesn’t treat statistics as the Oracle: it is only effective when the logic, modeling and data are sound. Unfortunately, as several cases show, statisticians also fall prey to common sense when they construct models with faulty logic, and Fung points these out with relish.
One: Fast Passes / Slow Merges [how average distorts thinking]
One of Watts’ key concepts is that human brains, with their limited ability to store and manipulate large amounts of data, function by “reducing complexity” to a level they can manage. For our entire history as a species, this was adaptive because we have been living hyperlocal lives and encountering simple challenges and opportunities. However, our propensity for “slash and burn reduction” is maladaptive in today’s big data world. Simply put, we don’t know how to reduce because we don’t know what’s important, so we just throw tons of information overboard to create some space.
The “average” is a lethal construct because it collapses all of a data set’s specificity into a single number. Chapter One reveals the development of “average thinking” (sorry, couldn’t resist ;^).
Chapter One begins with a brief history of the average and its inventor, Belgian statistician Adolphe Quetelet, who invented the concept of l’homme moyen (the average man) in 1831. He applied statistical thinking to the social sciences for the first time.
- For example, even “experts” constantly reference “the average Joe,” the “average billionaire,” the “average bear,” etc. In so doing, they dismiss the differences among the items in the set being “averaged.” It’s like putting a nice dinner in a blender.
- Average constantly tempts us to confuse the imaginary with the real. The average doesn’t exist except as a reduction tool.
- Example: “The Average American Owes Average Chinese $4,000.” The headline implicitly spread the $1.4 trillion U.S. debt to China evenly across each country’s entire population. Bloggers spread the idea and layered faulty math on top of it. Averaging kills diversity, which is real.
- Statistics is the study of variability, how much things change from a defined reference point. Fung explores the statistics of variability. Again, I found his descriptions rich, not dry.
- Case Study A/Lines at Disney World: statisticians Dr. Edward Waller and Dr. Yvette Bendeck tackle the lines at Disney World.
- People universally hate waiting in lines, especially in amusement parks.
- Several quotes from customers and how they try to hack waiting time.
- How they apply statistics to queues in Disney World.
- Disney tackled perceptions head on, invented waiting areas that entertain, and “FastPass,” which doesn’t reduce waiting time (which is a function of park capacity, which doesn’t change), but the perception of waiting.
- Case Study B/Traffic Congestion: statisticians dive into the densest highway network in the U.S., in the Twin Cities (Minnesota) metropolitan area.
- Statistics on traffic delays and congestion. You have likely struggled with them without understanding them.
- Quotes from commuters about how they try to hack traffic are sure to resonate with many readers!
- The Minnesota Department of Transportation’s experiment with ramp metering. A citizen backlash resulted in the meters being shut off for six weeks, during which travel times and accidents increased, before the program was reinstated.
- Through the two case studies, Fung shows that waiting is perceived to be the enemy, but it hides the real culprit, variability. In most cases, waiting is quite tolerable because people can use the time productively; not knowing how long one will wait is the real frustration. Variability. Think about how profound and applicable this distinction is.
- Like ramp metering, Disney’s FastPass attacks variability, not waiting time itself, which is caused by the capacity of Disney’s rides and Minnesota’s highways.
- FastPass doesn’t reduce waiting time, but quotes from delighted customers reveal that their perception of waiting changed: they feel they wait less. Numbers is replete with “perception vs. reality” examples.
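Fung’s distinction between average wait and variability of wait can be sketched in a few lines of Python. The wait times below are invented purely for illustration: both queues share the same average, but one is far less predictable, and it is that spread, not the mean, that interventions like FastPass and ramp metering attack.

```python
import statistics

# Hypothetical wait times (minutes) for the same ride on two days.
# Both queues have the same average wait; only the spread differs.
steady_queue = [28, 30, 29, 31, 30, 32, 30]   # low variability: tolerable
erratic_queue = [5, 60, 10, 55, 30, 45, 5]    # high variability: maddening

assert statistics.mean(steady_queue) == statistics.mean(erratic_queue) == 30

print(statistics.pstdev(steady_queue))   # small: waits are predictable
print(statistics.pstdev(erratic_queue))  # large: every wait is a gamble
```

Both riders wait 30 minutes on average, yet only the second queue feels broken, which is exactly the perception-versus-reality point the chapter makes.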
Two: Bagged Spinach / Bad Score [imperfect but useful models]
In Chapter Two, Fung describes how data scientists and statistics modelers work, and this chapter is immensely readable because it focuses on how to use statistics and logic to think and solve problems. A fascinating subtopic is how politics affects modelers’ work: in disease outbreaks, the public is very supportive and enthusiastic; in credit scoring, much of the public is hostile and regards the models as “evil machines.”
- Who are statistics modelers, and how do they think? This chapter is immensely practical for understanding how to use data and logic to solve problems, and for appreciating big data.
- Case Study A/Tracking a Deadly E. coli Outbreak: this is the story of a collaborative effort between scientists (epidemiologists) and public officials (the FDA, state authorities), who tracked the source of a deadly E. coli outbreak within 18 days of the FDA’s massive spinach recall, which badly hurt spinach farmers and vendors.
- This reads like a thriller; it focuses on the thought processes the sleuths used.
- It calls out the importance of appreciating and using differences among data sets; also how modelers think and where they looked to find the data.
- A critical success factor during E. coli and other disease outbreaks is not getting distracted by interesting data patterns; sleuths need to focus limited resources quickly because time is short. It’s about triage.
- A key challenge is distinguishing outbreaks from “normal” E. coli infections. In the case of the former, scientists and government need to mobilize resources, which are very costly. Moreover, recalls (justified or not) hurt farmers and food companies. Infections and deaths are a function of the pathogen’s incubation time and distribution, so they are delayed, then often spread quickly. To be effective, intervention often has to be fast.
- In this case, the strain of E. coli was unusual, and this helped determine the outbreak; there are 3,520 strains of E. coli documented, and each strain has a DNA fingerprint.
- How state and federal officials work together. The role of databases for strains of diseases. After authorities connected the dots, they had to determine the source, and it was a surprise in this case, spinach. But then, what kind? Where? The case is a fast read, and Fung maintains the suspense.
- Researchers interviewed patients in several states who had been infected with that strain. Through a lucky break, they determined the source was bagged spinach.
- Fung also references another epidemiology case, cholera in London, in which physician John Snow showed that the source of cholera was not “foul air” as experts asserted but wells. He mapped deaths and infections by house number and pinpointed their distribution. Fung then applies the principles to the spinach case and shows how statistical thinking cracked the case and tracked Dole Baby Spinach to a California plant.
- Then Fung questions the value of the recall, which came too late to save lives yet hurt the whole industry; six months later, spinach sales had recovered to only half their former level, all because one plant was sloppy with its operations.
- Fung describes the statistical thinking in some detail, and this is much of the value of the chapter; it’s the logic of how to use data to get answers. He covers the case–control technique and its early use by Sir Austin Bradford Hill and associates to show, in the 1950s, that cigarette smoking caused lung cancer.
- Case Study B/Consumer Credit Scores: traces the history of the “credit score,” how the scores work and some of their impact on the economy.
- Various political groups attack credit scores as unfair. How their logic jibes with the statistics behind the scores.
- Fung describes the rather simple statistics behind credit scores and how their models work.
- He explains how the statistics enable credit extenders to manage risk and how the scores exploded credit and enabled “instant credit.” FICO invented credit scoring in the 1960s. In 1960, only 7% of U.S. households held a credit card, and 70% of loans were secured by collateral.
- Fung contrasts the analytics with human bankers’ arbitrary credit decision making, which was practiced as a craft. This is important because the general population trusts “the machine” less than “people,” yet people are even more arbitrary and base their decisions on far less data.
- Before FICO and credit scoring, bankers developed their own rulesets, for example:
- If the applicant is a painter, plumber or paperhanger, then reject
- If the applicant has ever been bankrupt, then reject
- If the applicant’s total debt payment exceeds 36% of income, then reject
- By contrast, the FICO modelers use 100 characteristics, grouped into these five categories (most important first):
- Has the applicant dealt responsibly with past and current loans?
- How much debt is currently held?
- How long is the credit history?
- How eagerly is the applicant seeking new loans?
- Does the applicant have credit cards, mortgages, department store cards, or other types of debt?
- Algorithmic decision making, unlike its human counterpart, constantly gets tweaked based on the business results of having extended credit to millions of people.
- Moreover, Fung posits that scores increased credit to all socioeconomic classes because scores are a better way to assess and manage risk. Credit card debt among the bottom 10% of the eligible U.S. population doubled between the 1980s and the early 2000s.
- Fung then presents the critics’ arguments about why credit scoring is unfair; they tend to insist that the statistics are “faulty” and discriminatory.
- A key learning from his presentation of the statistics is that modelers are not trying to determine causality; their focus is on correlation. He discusses the difference between correlation and causality. He asks, “Can correlation be useful without causation?” and answers with a resounding “yes.”
- This gem emerges at the end of the chapter, a famous quote from George Box: “All models are wrong, but some are useful.” All models fail to perfectly reflect the “real world,” but good ones can help organizations do things by using data. No one disputes that, for example, data analyzed by scoring algorithms is dirty and faulty. However, Fung shows that, despite its faults, credit scoring software assesses common consumer credit risk far better than individual people.
- Between the lines, this chapter reflects some of the social challenges that machine-based decisions face. People recognize that other people have discretion to “bend the rules,” where machines do not. Even if people, say bankers, do not bend the rules to give the prospective borrower a loan, the prospect recognizes that the banker could have. The machine just rejects and feels nothing. It is a fairer decision but less emotionally acceptable. For more on this, see my Everything Is Obvious review, Chapter 9, Fairness and Justice.
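The contrast between pre-FICO rule-sets and score-based decisions lends itself to a short sketch. Everything below is invented for illustration: the real FICO model, its roughly 100 characteristics and its weights are proprietary, so this toy scorecard only mimics the shape of the approach (many weak signals summed against one cutoff, following the five categories listed above), versus the hard reject rules bankers once used.

```python
def rule_based_decision(applicant):
    """Pre-FICO craft style: a handful of hard reject rules."""
    if applicant["occupation"] in {"painter", "plumber", "paperhanger"}:
        return "reject"
    if applicant["ever_bankrupt"]:
        return "reject"
    if applicant["debt_payment_pct"] > 36:
        return "reject"
    return "accept"

def scorecard_decision(applicant, cutoff=600):
    """Score style: weak signals added up, then one threshold.
    All weights below are invented for illustration only."""
    score = 500
    score += 120 if applicant["on_time_history"] else -80  # payment history
    score -= applicant["debt_payment_pct"] * 2             # amounts owed
    score += min(applicant["years_of_credit"], 20) * 5     # length of history
    score -= applicant["recent_applications"] * 15         # new credit seeking
    score += 25 if applicant["credit_mix"] else 0          # mix of debt types
    return "accept" if score >= cutoff else "reject"

# A hypothetical applicant with a spotless record but the "wrong" trade.
plumber = {"occupation": "plumber", "ever_bankrupt": False,
           "debt_payment_pct": 10, "on_time_history": True,
           "years_of_credit": 15, "recent_applications": 0,
           "credit_mix": True}

print(rule_based_decision(plumber))  # rejected for his occupation alone
print(scorecard_decision(plumber))   # judged on his actual behavior
```

The rule-set rejects the responsible plumber outright; the scorecard accepts him, because no single characteristic can doom an application when dozens of signals are weighed together.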
Three: Item Bank / Risk Pool [comparing like with like]
Chapter Three presents two sides of a dilemma, how and when to group data sets for analysis and decision making. In one case, a testing authority achieved a breakthrough in test “fairness,” while in the second case, errors in logic resulted in a failed property/casualty insurance market in Florida.
- How to group data for analysis carries valuable lessons for big data and analytics: charts look pretty, data is impressive, but faulty reasoning and logic can produce unwelcome results.
- Case Study A/Licensing Examinations for Insurance Agents and College Entrance: the Educational Testing Service is best known for using statistics to test and score prospective college students, but it extends its craft into professional exams for insurance agents as well. Fung takes us behind the curtain to reveal how statistics and modeling can increase fairness by programming the black box to judge cohorts among themselves.
- An insurance company sued the Educational Testing Service, charging that its insurance agent licensing examination unfairly discriminated against blacks; Golden Rule Insurance wanted to penetrate the Chicago market and needed more black agents to achieve that goal, but it was frustrated at blacks’ low passing rate.
- The ETS had long relied on testing experts to decide whether questions were “fair,” but the lawsuit caused them to create a new “fairness testing” process for insurance exams that statistically validated test fairness based on actual results.
- The ETS reused the technique in its main business, college screening (it administers the SAT).
- Note the similarity to the transformation of credit scoring, which had long relied on “craft” techniques grounded in expert judgment (but limited data).
- Fung shows that black high school students consistently underperformed white students due to myriad factors such as access to tutors, test-taking coaches and educational opportunities. ETS’s breakthrough was grouping and comparing “high ability” students with “high ability” students, and “low ability” with “low ability,” to measure relative performance. This worked better than racial groupings, which were skewed because whites as a whole scored higher than blacks as a whole, and high-ability blacks were a smaller portion of all black test takers. Test scoring algorithms then normalized scores.
- The ETS introduced an “experimental” section in 2005 in which it tests prospective questions whose answers don’t count towards testees’ scores, but testees cannot discern which section it is. Therefore, questions are pre-tested for bias and unfairness before they are used in “real” tests.
- It is impossible for humans to judge whether or not a question is impartial because people have very small data sets to analyze and on which to base their assessment. “Real world” testing is much more effective. Think about how revolutionary that is—and how applicable to business.
- Fung used numerous interesting examples of test questions to call out nuances.
- Along with Watts, Fung lays bare the limitations of the human brain to analyze complex issues that involve large amounts of data. Our brains are not designed for that use case.
- Case Study B/Property Insurance on Florida’s Hurricane Coast: tells the story of how Florida’s property insurance industry was built on faulty assumptions, logic and models—and how the market buckled in the wake of a series of exceptional hurricanes.
- This case is built around experienced insurance entrepreneur Bill Poe, who founded the Southern Family Insurance Company to insure Florida property in the 1990s. The company was wiped out by hurricanes Katrina and Wilma, but it had only itself to blame, since it wrote policies only in Florida and thereby concentrated its risk geographically.
- National insurers State Farm and Allstate had suffered extreme losses in 1992 at the hands of Andrew, so they were questioning their risk models as extraordinary storms lashed the Gulf coast during the 1990s.
- After Wilma, the State of Florida became the lender of last resort when Allstate and State Farm raised premiums so high that homeowners could no longer afford them. The market had become uninsurable. No carrier would offer coverage; carriers cancelled policies.
- Risk models had been preoccupied with the standard of the “one hundred year storm,” which required them to carry enough reserves to pay claims against a storm so severe that it would only occur every one hundred years. Fung exposes this concept as a major flaw.
- The statistics concern probability, not frequency, so the severity has a one percent probability of happening in any year.
- As years pass with no “100-year severity” storm, the annual probability does not increase; it stays at one percent every year, although the chance of seeing at least one such storm grows as the time horizon lengthens. Note that severe storms below the “100-year severity” threshold don’t count toward it.
- Insurers and regulators engendered a false sense of security by understanding the statistics in terms of frequency. This is a perfect example of humans erroneously simplifying the situation.
- The unsolved problem in property casualty insurance for natural disasters is that the risk pool is geographically concentrated. The state of Florida is too small a market; coast dwellers are too large a portion of the total.
- Fung contrasts property insurance with automobile insurance, whose claims are not geographically concentrated. That market functions well.
- The chapter ends on an ominous note: the state of Florida is not pricing insurance for coastal dwellers to drive them away from the coast; meanwhile, inland dwellers pay ruinous rates through Florida taxes: “trailer-park grandmas were subsidizing wealthy owners of estates by the sea.”
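The probability-versus-frequency distinction at the heart of this case is easy to verify with a few lines of Python: each year carries the same one percent chance regardless of how long the coast has been quiet, yet the chance of at least one “100-year storm” over a long horizon is substantial, which is why frequency thinking bred a false sense of security.

```python
# A "100-year storm" has a 1% chance of striking in ANY given year,
# independent of how many quiet years preceded it.
p_yearly = 0.01

def prob_at_least_one(years):
    """Chance of at least one 100-year storm over a horizon of N years."""
    return 1 - (1 - p_yearly) ** years

print(prob_at_least_one(30))   # ~26% over a 30-year mortgage
print(prob_at_least_one(100))  # ~63% over a century: likely, NOT certain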
Four: Timid Testers / Magic Lassos [asymmetry in lie detectors/polygraphs]
In Chapter Four, Fung discusses the false “science” of lie detectors (polygraph tests). He focuses right on their Achilles heel, the asymmetry of false positives vs. false negatives: social context leads organizations to treat false negatives very differently from false positives. For example, the U.S. Army uses polygraph tests to screen local employees and militia in Afghanistan, so it sets “passing standards” extremely high, rejecting thousands of potential candidates. It does not want false negatives, which would enable terrorists to pass the test, so it endures thousands of false positives. Conversely, polygraphs used to detect lying among professional baseball players are lenient because baseball does not want false positives (players who don’t dope but are accused anyway), so it accepts hundreds of false negatives, which enable players to cheat. This is asymmetry.
- Test designers cannot eliminate false positives and false negatives; they can only decide where to put the errors.
- Polygraph tests have rather large margins of error, so testers only have the choice of how to calibrate results, toward false positives or false negatives. Social context determines where they put errors.
- An interesting subtext is that polygraph tests are extremely ineffective at measuring whether or not testees are lying, yet they are used extensively. Fung exposes this irrationality from several angles. He doesn’t say it but implies that organizations want to believe in the tests, so they use them. They buy into the idea, not the result.
- Case Study A/Detecting Doping in Professional Sports: doping in professional sports is widespread, but one of the biggest scandals occurred in U.S. professional baseball in the early 2000s. Fung recounts that story to show how steroid testers set results and thresholds to push errors into false negatives, letting hundreds of dopers off the hook. “If [testing] is 99% accurate, there will be seven false positives in big league baseball.” Seven careers ruined.
- Fung builds the story around doping in baseball, but he also uses facts from cycling and other sports, and he uses extensive quotes from players and player unions.
- Notably, quotes show how accused players are encouraged by social context to deny doping even when they are detected, and how the public supports them. Several players later confessed, exposing misplaced trust.
- “Statistical analysis shows that in steroid testing, a negative finding has far less value than a positive” (because testing favors false negatives).
- A very inconvenient truth: “Some dismiss false negatives as victimless errors. Not true… Michael Johnson [of golden Nike Spikes fame]: ‘The athletes who finished behind [the winner who cheated] will never experience the glory or recoup the financial benefit they deserved for their hard work.’”
- Fung presents a top ten list of cheating tips, David Letterman style, based on quotes from players in various sports. It is chilling.
- Case Study B/Detecting Terrorists in the U.S. Army in Afghanistan: in this case, the social context drives testers to put errors into the false positive column because one insurgent that infiltrates the U.S. Army can potentially kill dozens of soldiers, so polygraphs reject thousands of bona fide candidates.
- Fung begins this case on lie detector tests with a humorous, cynical cultural reference to Wonder Woman, who has a “magic lasso” that forces people within it to tell the truth. Many people and organizations treat the lie detector as a magic lasso, but Fung exposes this as a lie. ;^)
- Lie detector tests are data gatherers, and test design and models determine “pass” and “fail.”
- He shows the critical role of examiners, who administer the tests (after hooking the testee to the machine’s instruments, they ask questions). It is a significant role because there are numerous false questions; the examiner only asks “real” questions when s/he feels that the testee is completely at ease. It is easy to imagine the impact the examiner has on results.
- An interesting aside is that many professional athletes tried to clear their names by volunteering for polygraphs.
- U.S. courts have not allowed polygraphs as evidence since the 1920s, although there are some exceptions. Polygraphs are too error-prone.
- Interestingly, polygraphs are worse when trying to measure hypothetical situations (i.e. screening prospective employees) than measuring something that actually happened (or not).
- U.S. Congress passed the Employee Polygraph Protection Act of 1988, which prohibits U.S. firms from using polygraphs for screening; notably, government and police are excepted.
- Even though polygraphs lack judicial or scientific credibility, the CIA, FBI and police forces use them routinely. Most galling is that police and agencies are not barred by courts from using false “test results” to elicit confessions! Fung shows how Jeffrey Deskovic spent sixteeen years in jail, largely based on his false “confession” for murdering Angela Correa in New York in 1989. He had agreed to take the polygraph to clear himself (innocent people often do this). Police lied and said he had failed the test to elicit a confession.
- Fung shows that polygraphs are statistically less reliable than police line-ups. In Afghanistan, for every spy polygraphs catch, 111 good employees are rejected. In line-ups, for every nine murderers caught, four innocent citizens are falsely accused. Socially, polygraphs are more acceptable, but their results are horrible.
- Fung warns that using polygraphs for screening is dangerous, asking, “How many innocent lives should we ruin in the name of national security?” This idea is widely applicable in terrorism screening of most types.
- Although the NSA scandals happened after Numbers was published, they illustrate his warning very well.
- Security expert Bruce Schneier has looked at data-mining systems the same way Fung evaluated steroid tests and polygraphs.
- “We’ll assume the [data-mining] system has a one in 100 false-positive rate… and a one in 1,000 false-negative rate. Assume one trillion possible indicators to sift through: that’s about 10 events—emails, phone calls, purchases, web destinations, whatever—per person in the United States per day. Also assume that 10 of them are actually terrorists plotting. This unrealistically accurate system will generate one billion false alarms for every real terrorist plot it uncovers. Every day of every year, the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Raise that false-positive accuracy to an absurd 99.9999 percent and you’re still chasing 2,750 false alarms per day but that will inevitably raise your false negatives, and you’re going to miss some of those 10 real plots.”
- “Expecting intelligence agencies to ‘connect the dots’ is a pipe dream.” My review of Everything Is Obvious, Chapter Six, shows that predicting events is especially perilous when the kind of event is a black swan (like 9/11) because we don’t know what to look for.
- [Big data and] Data mining systems use statistics, models and algorithms to detect patterns of behavior, but they have huge margins of error. In the context of screening for terrorists, the statistics show that millions of innocent citizens will be accused for every terrorist caught. This is why Numbers is so important; probability is widely misunderstood. As Watts explains, people want to predict the one and only series of events that will happen, but statistics can only give degrees of probability that some event might happen. Organizations that learn to unwire hyperlocal thinking and manage by using probability will outperform.
- In big data and analytics, statisticians are modeling the world, describing reality. They do not know reality, so they will have errors. When using these tools it is essential to be mindful about managing errors and maintaining awareness of what you are doing.
Five: Jet Crashes / Jackpots [belief in miracles]
I found Chapter Five to be very entertaining and interesting because it applied probability to flying (which I do) and lotteries (which I don’t). It starts with this quote, by an anonymous pilot: “The safest part of your journey is over. Now drive home safely.”
- People regularly believe in miracles, and Fung shows this through two lenses; 30% of people fear flying, yet they drive, which carries a far higher probability of death. The probability of dying in a plane crash is one in ten million, a miracle. The probability is roughly the same for winning a fortune in a state lottery, yet 50% of Americans play in state lotteries!
- Most people focus on unexpected patterns (Watts also shows this), but statisticians evaluate patterns against the norm, the background, i.e. fatal airline flights divided by safe flights.
- Case Study A/Airline Crashes: delves into “experts’,” journalists’ and the public’s perceptions of risk; how people do not understand probability at all.
- From 1996-1999, there were five plane crashes near Nantucket, and journalists talked about a new Bermuda Triangle. People rejected the concept of randomness; instead they looked for a “cause” that produced such a “high concentration” of crashes.
- Fung shows that, in statistics, what you don’t know is often more revealing than what you know. Experts and pundits talked and wrote incessantly about “Nantucket’s five crashes,” while statisticians analyzed the number of safe flights that had traversed the same geography during the same period. There were millions of safe flights, which revealed that there was no “trend” of plane crashes other than a very small portion of millions of flights.
- Even more interesting is the widespread belief that “developing-world” air carriers are far more dangerous than “developed-world” airlines. Developing-world airlines are just as safe. Analyzing MIT’s Arnold Barnett’s work, Fung shows how:
- Developed-world airlines’ risk of death was 1/700,000 in the 1960s, when data were first gathered. By the 1990s, it was one in ten million, a 14-fold improvement.
- Jet crashes occur randomly; national carriers have such a low probability of death that differences don’t matter.
- Developing-world airlines have more total fatal crashes, but when they are compared to developed-world airlines on the same international routes, they actually have a slightly better safety record. Most fatalities are on national flights for which there is no competition.
- This is another example of the importance of grouping data for analysis.
- To boil this down to everyday thinking, one set of data held the risk of death at one chance in 1 to 1.5 million during 2000-2005. A once-a-day flyer could expect to die in a jet crash once in 4,100 years!
- It is practically impossible to die in a plane crash.
- Case Study B/Exposing Insider Fraud in a Canadian Lottery: statisticians discovered insider fraud in the Ontario Lottery and Gaming Corporation, using probability.
- Jeffrey Rosenthal at the University of Toronto analyzed seven years of draws and winners of Ontario jackpots. He compared the portion of “retail insider” (store owners that sold lottery tickets) wins to total wins. Probability held that insiders should win 57 times. In actuality, they won 200 times, a virtual impossibility.
- Insiders defrauded an old man by telling him that he had won a free ticket, so they pocketed his ticket and claimed the prize money, CDN$250,000. He had bought tickets there for years and thought the clerks were his friends.
- Odds of winning a lottery are once in every 27,000 years.
- Many people believe in miracles!
Analysis and Conclusions
- I hope you have already ordered Numbers because it’s a great primer for starting to conceptualize how to use intense data to improve problem solving and decision making. The book has other great design features that increase its value: a detailed conclusion is useful for reading before you read the book proper; you get the lay of the land. Fung includes detailed notes, references, and an index, too.
- Humans’ tendency to use common sense is baked into our evolutionary history; it has served us very well, but its limitations are painfully obvious when our decisions involve complex systems and intense data, as they increasingly do.
- While exposing human frailties, Fung does not come across as “anti-human,” which makes the book even more engaging. Machines are “dumb,” they are only as “smart” as logic and design enable them to be. People using machines will be the killer app: machines can crunch intense data; the human brain cannot even conceptualize it. For example, when I read that my chance of dying in a jet crash is 1/1,500,000, I cannot imagine it. It is easier if I think of it in terms of dying in a crash once every 4,100 years.
- Fung doesn’t address the social acceptability of relying on machines, even though this issue lurks in some of the case studies. People often don’t feel comfortable when machines make decisions that affect them (i.e. credit scores); they prefer other people to make the decisions, even though people are usually more limited and flawed than machines. Organizations can remedy this by appreciating it and designing business processes to pair machines and humans appropriately.
- Organizations can use data science to manage risk and dramatically enhance decision making, but using big data, analytics and data science doesn’t have to take away from their humanity. Organizations need to use machines for the risk assessment and number crunching part of the process, but using and presenting the data should be done with humanity and empathy. Getting this right will require innovation.
- The Matthew Effect is another book that uses probability to show “hidden influence” of numbers.
- The Wisdom of Crowds is another popular book that brushes with probability, but it’s far less detailed and not written by a data scientist.