Hindsight and foresight in judgments under uncertainty

As I began reflecting on my earlier posts, You are intelligent: have you done something dumb? and Judgments, I went back to this classic article, so beautifully titled, Hindsight ≠ foresight. Written by B Fischhoff, way back in 1975, this paper provides two intuitive results (at least in hindsight): once the outcome of an event is known, people assign a higher probability to its occurrence; and people are unaware that they have been influenced by hindsight (the knowledge of what actually happened).

Case class in a business school – hampered by hindsight bias

Take a business school case class for instance. As a case teacher, I face this quite often. In a typical strategy case, the primary question to the class is “what should the company do?” And since most cases are set a few years in the past, a simple Internet search (by the students as part of their class preparation) would have informed them about what actually happened. Given that students come into class with this hindsight, they try very hard to fit their preparation and theoretical arguments to the actual outcome, however irrational or improbable it might have been. A good management teacher therefore ought to allow for this “hindsight bias” in students and ensure that a fair discussion happens in class on all possible outcomes.

Should I therefore, as a management teacher, provide my students only with cases for which the outcomes are not known? What, then, are my criteria for choosing cases for a class? Do I fight to eliminate this “hindsight bias”? Let me come back to this later.

An air-crash investigation

Let us take another example. A special team has been tasked with investigating the cause of an air crash. Any investigation of an accident inevitably entails putting together pieces of information to arrive at a causal relationship between the antecedent factors and the event, which is known to have happened. It is impossible to eliminate the source of bias here: the event itself. The investigation team has to be trained to create a counter-factual (good) outcome from the evidence at hand. They need to recreate the antecedents to the event so that they can evaluate whether anyone in the actors' place would have made the same decisions as the actors (the pilots and crew who made certain decisions) did. Experimentally, it is akin to creating a control group that knows all the facts leading up to the case, but not the actual outcome.

Investigating white-collar crime

In the case of white-collar crime, especially when it involves financial fraud, another significant factor interacts with hindsight bias: the size or impact of the fraud. Larger frauds are fraught with more pronounced biases. Media coverage of “select” white-collar crimes is testimony to such biases. Nicholas Bourtin (read the article here) adds that, armed with hindsight, financial crime investigators might ascribe malicious intent to even innocent mistakes or poor judgement.

As I was thinking about this issue, I just saw the breaking news of an earthquake of 7.2 magnitude hitting the Iraq-Iran border (Sunday, 12th November 2017). Hopefully, there isn’t much damage. And it triggered a thought.

Fighting hindsight bias – learning from geophysics

A great learning for fighting hindsight bias comes from geophysical studies. Imagine how geologists and geophysicists study earthquakes and volcanic eruptions. These are events that just “happen”, and then the scientists “reconstruct” the events through carefully collected data. Can we learn something from the way they fight hindsight bias? Sure.

Strategy #1: Conduct stability studies. Not just fault studies, but stability studies. Take the context of highly quake-prone areas, and go study why earthquakes aren’t happening! Such data would provide the ‘normal’ distribution of data, with the occurrence of earthquakes being the outliers.

Strategy #2: Broaden the search. Take all retrospective data for analysis. Study all the quakes that happened on a plate, or all the eruptions of a volcano. Such events may occur very infrequently, and may be randomly distributed over time. However infrequent they may be, it would be worthwhile to study the antecedent conditions every time. Maybe one can find a cause-effect relationship. Like concluding that most road accidents happen between 2.00am and 4.30am because there is a high likelihood of drivers falling asleep at the wheel (that is, if the vehicles are still being driven by human drivers!).

Strategy #3: Combine the two, and seek patterns. Conduct stability studies and say why events do not happen, and conduct (with big data) longitudinal studies to infer why events do happen. Combine the two and create patterns. Such patterns can be immensely helpful in studying antecedents of events, and effectively fighting hindsight bias.
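To see why combining the two kinds of study matters, here is a minimal sketch in Python. The counts are invented purely for illustration (they are not real seismic data), and the "precursor signal" is a hypothetical stand-in for any antecedent condition. Looking only at the events, the signal seems telling; adding the quiet periods shows that it carries no information at all.

```python
# Hypothetical counts, invented purely for illustration (not real seismic data).
# Each observation is a time window on a fault, with or without a precursor signal.
observations = {
    ("signal", "quake"): 8,
    ("no_signal", "quake"): 2,
    ("signal", "quiet"): 800,
    ("no_signal", "quiet"): 200,
}

def share_of_quakes_with_signal(obs):
    """Event-only view: among quakes, how often was the signal present?"""
    quakes = obs[("signal", "quake")] + obs[("no_signal", "quake")]
    return obs[("signal", "quake")] / quakes

def quake_rate(obs, condition):
    """Combined view: quake rate among all windows with the given condition."""
    total = obs[(condition, "quake")] + obs[(condition, "quiet")]
    return obs[(condition, "quake")] / total

event_only = share_of_quakes_with_signal(observations)
rate_with = quake_rate(observations, "signal")
rate_without = quake_rate(observations, "no_signal")

print(f"Signal present in {event_only:.0%} of quakes")
print(f"Quake rate with signal:    {rate_with:.2%}")
print(f"Quake rate without signal: {rate_without:.2%}")
```

In this made-up data the signal shows up in most quakes, yet the quake rate is identical with and without it, because the signal is just as common in quiet periods. Only the stability study reveals that.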

Fighting hindsight bias – applying it in managerial judgement

Straight away, let us try and apply the learning from geophysics to managerial judgement. First, consider the prior probabilities of an event happening appropriately. Imagine an angry boss (no, I would like to believe that not all bosses are always angry!). In trying to understand what angered her today, use prior probabilities appropriately. She may be angry because she is being held accountable for something beyond her control (like your productivity), or simply because she gets angry when she is frustrated about not being able to communicate with or convince others. Strategy #1: ask yourself, when is she ‘not angry’? She is not angry when you complete your work on time, when you present your work properly (as she likes it), and when your work is of good quality. Then why is she angry today? You have the answer.

Second, stop thinking about sample size and probability. Unless you have a really large sample of such events, stop thinking about probability. Imagine predictions in sport or financial services. I was taught in my first finance classes, “past performance is not an indication of future performance”. And my brief indulgence in sports tells me that the law of averages is that “sustained good performance does not last long”. Would you be confident in predicting the goals scored by a football team if their prior performances were [4-1, 3-0, 5-2, and 1-0] or [1-4, 4-2, 2-2, 1-0], with the second number in each pair representing the goals scored by the opposition? Most would be more confident of predicting the performance of the former scoring pattern than the latter. It might just happen that the next game is against the league leader (including someone with the initials CR7) and all these performances do not matter at all. You really need to collect loads of data on each team’s performance, including the historical performances of all the opposition teams, before you make any predictions.
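A rough way to see how little four games tell us is to put an interval around each team's average goals scored. The sketch below uses the two score lines from the example and assumes, only for illustration, that goals per game are independent and roughly normally distributed (they are not, strictly), so treat the intervals as indicative rather than exact.

```python
import math
import statistics

# Goals scored by each team in the four games quoted above.
team_a = [4, 3, 5, 1]
team_b = [1, 4, 2, 1]

# Two-sided 95% critical value of Student's t with 3 degrees of freedom.
T_CRIT_DF3 = 3.182

def mean_interval(goals, t_crit=T_CRIT_DF3):
    """Return (mean, low, high): a rough 95% interval for mean goals per game."""
    mean = statistics.mean(goals)
    std_err = statistics.stdev(goals) / math.sqrt(len(goals))
    return mean, mean - t_crit * std_err, mean + t_crit * std_err

mean_a, low_a, high_a = mean_interval(team_a)
mean_b, low_b, high_b = mean_interval(team_b)

print(f"Team A: mean {mean_a:.2f}, interval [{low_a:.2f}, {high_a:.2f}]")
print(f"Team B: mean {mean_b:.2f}, interval [{low_b:.2f}, {high_b:.2f}]")
print("Intervals overlap:", low_a <= high_b and low_b <= high_a)
```

With only four games each, the two intervals are wide and overlap heavily, so the apparent gap between the teams is well within what chance alone could produce.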

Third, stay away from causal relationships (no, I did not say casual relationships!), unless you have really “big data” on both the normal distribution of the event not happening and the outlier chance of the event happening. Remember the wonder batsman, Pranav Dhanawade, the 17-year-old kid who scored 1009* runs for his local cricket team. A few years on, his father has decided to return the scholarship he received, since he has not performed up to expectations (read it here). When an event of this nature (an extraordinary performance) occurs, it is important not just to reward the talent, but also to invest in nurturing it. Without an adequate support structure to hone his talent, the financial reward was insufficient to sustain even acceptable performance.

So why do some firms perform better than others?

The answer may not lie in analyzing why the high performers did better, but in understanding what the others do that makes them not perform as well; in longitudinal and cross-sectional (big) data on multiple firms’ performance; and in being very cautious about making causal assertions. Isn’t this the core of strategy research today?


(c) 2017. Srinivasan R


Reference class forecasting using pluralism: Fighting single parameter obsessions

Traveling around prestigious universities and business schools in the US this week on an institutional assignment (this post comes from Chapel Hill, NC), one thing struck me about this society: pluralism. I read with interest my friend Suresh Satyamurthy’s piece in yourstory.com (link here) that uses a hangman metaphor for an investor review in the start-up world. In Suresh’s start-up world, the investor is hung-up on a single parameter – scale (pun intended). It set me thinking – any evaluation of performance (more importantly, any assessment of future performance) needs to be grounded in as many parameters as possible. In this post, I will introduce Reference Class Forecasting (RCF) as a technique for fighting biases such as single parameter obsession. Drawing on research in behavioural economics, I attempt to provide guidelines for entrepreneurs and investors to make better forecasts of future performance.

Intent-outcome relationship

This is possibly the first and the most obvious starting point of any assessment. Start with what the intent was in the first place. If the stated intent of the platform was to transform the industry, please define what industry transformation is and measure that, and do not start harping on profitability. Not every business needs to show the same kind of performance on the same parameters. Take the example of the baby products company, firstcry.com. The founders’ motivation to start up arose from the difficulty in finding products for their own children – limited availability and variety, poor quality, and certain international products/ brands not being available in India (read their interview here). So, the best metric for assessing the performance of firstcry.com would be to see if they have been able to “make a wide variety of good quality international products and brands available to parents”. The performance metrics would therefore be (a) number of outlets – online and offline, (b) inventory size and variety, (c) number of brands, (d) number of products uniquely available at firstcry.com, at least in a specific geography, and (e) number of parents reached. Scale here would mean growth in the number of customers, brands, products, and channels. Not GMV, not anything else. Yes, profitability is important, but it is not the first parameter of success.

Constructs, variables, and measures

Hmm, I may sound like a research methods teacher, but I think this is important to understand. Everyone (at least those reading this blog post) understands that anything can be measured in a variety of ways. A construct is an attribute of a person/ entity that cannot be observed or measured directly, but can be inferred using a number of indicators, known as manifest variables. For instance, entrepreneurial success is a construct that is measured by a variety of variables ranging from firm performance, firm growth, market power, the firm’s influence in industry standard setting, and pioneering innovation, to even investor wealth creation (or exit valuation) at sell-out to a large corporation. Each of these variables could be measured using different measures; see, for instance, the number of measures we identified for firm growth in the context of firstcry.com in the last section. Can you see a decision-tree-like structure here?


So, when I think of multiple parameters, I am reminded of indices. Indices like the Human Development Index (HDI) as a measure of economic development, or the Consumer Price Index (CPI) as a measure of inflation. Every one of these indices is open to discussion and debate about what constitutes it and why, and in what proportions/ weights. Take for instance the HDI, which is a composite of life expectancy (personal well being), education (social well being), and income per capita (economic well being). Why only these? What about social and racial discrimination? What about ecological sustainability? Similar is the case with the CPI, which is calculated using the prices of a select basket of items, with price data collected weekly, monthly, or half-yearly for specific items. Again, why should tobacco product prices be included in CPI calculations? Or we could debate how the housing price index is calculated for inclusion in the CPI. Does the age composition of the household matter in calculating the CPI basket? For a relatively young family, would the basket of goods not be different from that of families with more elders than children?

So, to cut my long argument short, please refrain from creating indices that just simply represent a mish-mash of parameters to evaluate a start-up.
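A small sketch in Python shows why a mish-mash index is risky: the ranking it produces depends on weights that someone has to pick. The two start-ups, their scores, and both sets of weights below are entirely hypothetical, chosen only to make the point visible.

```python
# Hypothetical scores on a 0-1 scale for two made-up start-ups.
startups = {
    "Alpha": {"scale": 0.9, "profitability": 0.2, "customer_reach": 0.5},
    "Beta": {"scale": 0.4, "profitability": 0.7, "customer_reach": 0.6},
}

def composite_index(scores, weights):
    """Weighted average of parameter scores; the weights must sum to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return sum(weights[name] * scores[name] for name in weights)

def ranking(weights):
    """Start-up names sorted from best to worst under the given weights."""
    return sorted(
        startups,
        key=lambda name: composite_index(startups[name], weights),
        reverse=True,
    )

# Two equally arbitrary (assumed) weighting schemes.
scale_heavy = {"scale": 0.6, "profitability": 0.2, "customer_reach": 0.2}
profit_heavy = {"scale": 0.2, "profitability": 0.6, "customer_reach": 0.2}

print("Scale-heavy weights: ", ranking(scale_heavy))
print("Profit-heavy weights:", ranking(profit_heavy))
```

The same two firms swap places depending on the weights, which is exactly the kind of debate that surrounds HDI and CPI.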

My recommendation: Use reference class forecasting

Reference class forecasting (RCF), sometimes also referred to as comparison class forecasting, is a method recommended to overcome cognitive biases and misplaced incentives. My favourite article on this appeared in The McKinsey Quarterly (see here). Let me elaborate the theory first.

Nobel laureate Daniel Kahneman and Amos Tversky’s work on decision making under uncertainty is the starting point for understanding RCF. Using Prospect Theory, they described how people make seemingly irrational decisions when dealing with probabilities and forecasts (see an insightful class by Prof. Shiller, another Nobel laureate, on YouTube here). The summary relevant to us: people are more concerned by losses than by equivalent gains; and people round off probabilities of occurrence to either zero or one when they are close to either, and exaggerate those in between.
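The asymmetry between losses and gains can be sketched with the value function from Prospect Theory. The snippet below is a minimal illustration in Python; the parameter values are the estimates reported by Tversky and Kahneman in 1992 and should be treated as illustrative rather than universal, and the function name is my own.

```python
# Parameter estimates reported by Tversky and Kahneman (1992); treat them as
# illustrative rather than universal.
ALPHA = 0.88   # curvature of the value function
LAMBDA = 2.25  # loss-aversion coefficient

def subjective_value(outcome):
    """Prospect-theory value of a gain (positive) or a loss (negative)."""
    if outcome >= 0:
        return outcome ** ALPHA
    return -LAMBDA * ((-outcome) ** ALPHA)

gain = subjective_value(100)
loss = subjective_value(-100)

print(f"Felt value of gaining 100: {gain:.1f}")
print(f"Felt value of losing 100:  {loss:.1f}")
print(f"The loss weighs {abs(loss) / gain:.2f} times as much as the gain")
```

Under these assumed parameters, a loss hurts a little over twice as much as an equal gain pleases, which is why an investor who fears writing off money already invested can be persuaded to keep waiting.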

Let us understand how an entrepreneur could use this theory to manipulate her capital provider. She shows some initial success, likens her business model to an already successful model somewhere else, in some other context, and gets the investor to exaggerate the probability of her success. For example, a friend of mine wanted to build the Uber of toys in India. Why buy toys? Just rent them, let the child play for a week, and return them to the library the next week in exchange for a new set. Sounds exciting? Except that the economics did not account for the damage small children could do to the toys (like breaking one wheel of a car), which would render them useless for the next borrower. The entrepreneur kept the rentals high enough to cover such losses, and soon her customers realised that renting was working out far more expensive than buying new toys, notwithstanding the child refusing to part with his toys at the end of the week. The entrepreneur continued to convince her investors to keep investing, luring them to wait until economies of scale kicked in and she had enough bargaining power with toy manufacturers to import directly from north of the Himalayas. That never happened, and the investor exited the firm at its lowest valuation.

These biases manifest themselves in the form of delusional optimism, rather than a clear understanding and detailed evaluation of costs and benefits, even when hard data is available.

Steps in using RCF: A field guide

RCF helps forecasters and planners overcome these biases by situating the reference point outside of the subject being assessed. In order to forecast (or assess the future performance of) a business, investors need to identify a reference class of analogous businesses, estimate the distribution of the outcomes of those firms, and benchmark the enterprise at an appropriate point of the distribution. First, the investors should identify an appropriate reference class for the enterprise. These reference classes need to be identified using a variety of parameters that match the enterprise. The next step is to analyse the performance of the firms in the reference class and map them into a probability distribution. Clusters of firms may emerge during this distribution-mapping exercise; there may be instances where only extremes of firm performance are observed (say, in winner-takes-all markets); or there could be continuous distributions.

The next task is to use pluralism in the parameters to position the enterprise in the distribution. Here is where multiple parameters help in arriving at a reliable estimate of the position. For instance, Uber works because the marginal cost of renting out a car (wear and tear) is negligible compared to the fixed (sunk) cost of buying the car. In the toys market, by contrast, the marginal cost of a child playing with a toy is a significant proportion of the market price of the toy, and therefore an Uber for toys in India would not follow the same evolutionary direction as Uber. However, if the enterprise were repositioned as a toy library (as my friend ultimately did), it would work – look at how the cost structures of libraries and toys compare. The library benchmark led her to buy only those toys that were durable, held a child’s attention for only short periods of time, and were very expensive to buy. Typical examples were multi-player games, which no child wanted to own (given the small size of families today), but which families would rent for weekends/ birthday parties at a small proportion of the cost of the game.
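The three steps in this field guide (pick a reference class, map its outcomes into a distribution, position the enterprise) can be sketched in a few lines of Python. Everything numeric here is hypothetical: the reference-class outcomes, the entrepreneur's own forecast, and especially the 50/50 weight used to pull that forecast toward the reference-class median, which in practice would depend on how predictable outcomes in that class are.

```python
import statistics

# Hypothetical reference class: five-year revenue multiples achieved by
# comparable firms. These numbers are invented for illustration.
reference_outcomes = [0.0, 0.3, 0.5, 0.8, 1.0, 1.2, 1.5, 2.0, 3.5, 6.0]

def percentile_rank(value, outcomes):
    """Share of reference-class outcomes at or below the given value."""
    return sum(1 for outcome in outcomes if outcome <= value) / len(outcomes)

def adjusted_forecast(inside_view, outcomes, weight_on_reference=0.5):
    """Pull the entrepreneur's own forecast toward the reference-class median."""
    reference_median = statistics.median(outcomes)
    return (weight_on_reference * reference_median
            + (1 - weight_on_reference) * inside_view)

inside_view = 5.0  # the entrepreneur's own (hypothetical) forecast
median = statistics.median(reference_outcomes)
rank = percentile_rank(inside_view, reference_outcomes)
adjusted = adjusted_forecast(inside_view, reference_outcomes)

print(f"Reference-class median: {median:.2f}")
print(f"Inside-view forecast sits at the {rank:.0%} point of the distribution")
print(f"Adjusted forecast: {adjusted:.2f}")
```

A forecast that lands near the top of the reference-class distribution is not necessarily wrong, but it is a claim to be an outlier, and the investor should ask which parameters justify that position.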

So, here’s calling on entrepreneurs and investors to overcome such cognitive biases and forecast better.

Comments and feedback welcome.