Playing To Win
Overcoming the Pervasive Analytical Blunder of Strategists
There was so much interest in, commentary on, and even befuddlement about Strategists: Stop Obsessing about Averages that I decided to write a follow-up piece on a related misuse of data analytics. My 33rd Playing to Win/Practitioner Insights is on The Pervasive Analytical Blunder of Strategists. (Links for the rest of the PTW/PI series can be found here.)
Modern Business Training
Modern business education trains all students to believe that good decisions are based on rigorous analysis of data, and strategy courses are certainly no exception. Managers who don’t make decisions based on rigorous analysis of data are presented as willfully incompetent. They know better, yet through sloth, arrogance, and/or ignorance they insist on making bad decisions, which, by definition, are those made on some basis other than the rigorous analysis of data. If you think I am overstating this, just think back to your most recent formal business training and you will realize that I am not.
Whether students are taught statistics or analytical methodologies under names such as “decision analytics,” “decision making,” or “data analytics,” all those courses have standard statistical concepts underpinning them. Using varying levels of formality, these courses provide instruction on how students can infer from the analysis of data the data-based, and hence by definition correct, decision. And as I have argued earlier in this series, when graduates enter the strategy consulting industry, they will find more of the same focus on the central importance of data-based decisions.
A critical tenet of the inferential methodology is that if a conclusion is to be drawn about a given universe, the sample analyzed must be representative of that universe. For example, to infer ‘what customers want,’ we need to collect data from a representative sample of said customers. If we collect data only from older customers or big city customers or male customers or west coast customers, we are taught that we can’t make an inference to the general universe of all customers from this unrepresentative or biased sample. Or if we want to know our average product costs across our network of factories, we can only determine that by analyzing costs from a representative set of factories, not from, say, our five biggest or newest factories.
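The effect of sampling only west coast (or only male, or only big-city) customers can be made concrete with a small simulation. This is a hypothetical sketch: the two customer segments, their sizes, and their willingness-to-pay figures are all invented for illustration.

```python
import random

random.seed(42)

# Invented population: a small west-coast segment with a much higher
# average willingness-to-pay than everyone else.
west_coast = [random.gauss(80, 10) for _ in range(5_000)]
everyone_else = [random.gauss(50, 10) for _ in range(45_000)]
population = west_coast + everyone_else

true_mean = sum(population) / len(population)

# Representative sample: drawn uniformly from the whole population.
representative = random.sample(population, 1_000)
rep_mean = sum(representative) / len(representative)

# Biased sample: west-coast customers only.
biased = random.sample(west_coast, 1_000)
biased_mean = sum(biased) / len(biased)

print(f"population mean:     {true_mean:.1f}")
print(f"representative mean: {rep_mean:.1f}")
print(f"biased mean:         {biased_mean:.1f}")
```

The representative sample lands close to the true population average, while the west-coast-only sample misses it badly, which is exactly why the inferential doctrine insists on representativeness.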
So, a good decision is made on the basis of inference from the rigorous analysis of data drawn from a sample that is representative of the universe with respect to which we seek to make decisions. And in a bad one, either analysis isn’t done at all or it is based on a biased sample. That is the methodological doctrine of all modern business education. That means all fields of business from strategy to marketing to finance to operations to accounting — and even to human resources.
The Inherent Data Limitation
What is generally not taught explicitly is that the inherent data limitation is that 100% of the world’s data is from one era: the past. Standing at any point in time, all data is from the past. Even if we have just finished running an experiment or collecting a sample, at that point in time, the actions giving rise to our data have happened already. There is no data about the future — ever. Thus, for every single analysis that modern business education tells you that you must do to be a good (not bad) manager, you are restricted strictly to a pool of data that is in the past as of the time that you analyze and draw inferences from it. There is no way around this limitation: it is an inherent part of the thing we call life!
When This Data Limitation is Not a Problem
There is a class of circumstances in which this data limitation poses no challenge to assembling a representative sample. If we can say that the pool of data from the past is indistinguishable from the pool of data from the future, then we can be confident that the data from the past is completely representative not only of the past but of the future as well. For example, if I hold a ball in my hand, palm downward, and release it 100 times, it will fall to the ground every time. Since balls have been dropping since time immemorial and there seems to be little chance that the force of gravity is going to change any time soon, it is reasonable to assume that the 100 ball-drop sample is representative of the universe of ball-drops, including those that will happen in the future.
Aristotle, the 4th century B.C. Greek philosopher who invented the concept of data analytics, called this kind of ball-drop scenario “the part of the world where things cannot be other than they are.” Gravity will always operate the way it has and does. Hence balls will always drop the way they have done. Hence data from the past is representative of the future and fully usable to make inferences about the future on which to base decisions. If you prefer a more recent theorist, the Santa Fe Institute’s David Wolpert makes the same point, calling this the math part of the world, the part governed by immutable laws where we can make confident, data-based predictions about the future.
When This Data Limitation is a Giant Problem
Let’s switch from ball dropping to smartphone usage. If we did a survey of smartphone usage patterns in 1999, we would find that customers didn’t use smartphones at all. That was for a good reason: the first commercially successful smartphone was launched by BlackBerry a year later in 2000. If we broaden the definition to Personal Digital Assistants (PDA), the installed base in 1999 was well under 10 million. Analyzing that in 1999, we might conclude that the future usage of intelligent handheld devices like PDAs and smartphones would be modest. Perhaps we would extrapolate the growth in the data leading up to 1999 and come up with, say, a couple of hundred million by 2020. But no analysis of 1999 data could get the analyst to an installed base of 4.4 billion a mere 20 years later.
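The scale of the miss can be sketched in a back-of-envelope calculation. The 1999 installed base and the growth rate below are invented assumptions (chosen to be consistent with “well under 10 million” and a generous-seeming extrapolation); only the 4.4 billion 2020 figure comes from the text.

```python
# Hypothetical 1999-vintage extrapolation of the intelligent-handheld market.
base_1999 = 8_000_000        # assumed PDA installed base, "well under 10 million"
annual_growth = 0.20         # an assumed, generous-seeming 20% compound growth

# Compound the 1999 base forward 21 years to 2020.
extrapolated_2020 = base_1999 * (1 + annual_growth) ** (2020 - 1999)
actual_2020 = 4_400_000_000  # installed base cited in the text

print(f"extrapolated 2020 base: {extrapolated_2020 / 1e6:,.0f} million")
print(f"actual 2020 base:       {actual_2020 / 1e9:.1f} billion")
print(f"underestimate factor:   {actual_2020 / extrapolated_2020:.0f}x")
```

Even this aggressive compounding assumption lands in the hundreds of millions, an order of magnitude short of what actually happened: no tuning of the 1999 inputs closes that gap.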
Human use of smartphones is squarely in Aristotle’s other category: “the part of the world where things can be other than they are.” In the past 20 years, we humans have dramatically changed the way we live and operate based on the smartphone. In 2000, only a minuscule fraction of humanity knew what a smartphone was. Now half of humanity owns and intensively uses one. Wolpert calls this the world of physics — where messy, real things interact in unpredictable ways. To generalize further, it is the part of the world in which humans interact with one another and try to get things done. This part of the world forever changes.
In this part of the world, a sample drawn from the past is utterly unrepresentative of the universe, including the future. And it is particularly problematic in being unrepresentative in ways we cannot know until after the fact. It is not even that we can say: “Oh, we sampled men only and probably women are more likely to be users of the product/service in question than men, so let’s adjust our analytical estimate upward a bit.” In this case, we have no idea of the nature, direction, or magnitude of the flaw — until the future unfolds.
The Pervasive Analytical Error
Despite Aristotle’s very specific warning never to use analytical inference in the part of the world where things can be other than they are, and Wolpert’s modern restatement of that warning, the business world routinely violates the principles of statistics and of sense to do exactly that, and the learned institutions of business education teach the error-filled practice with an enthusiasm bordering on fanaticism. It flat-out disables otherwise intelligent men and women, deluding them into thinking that their choices are valid when they are, in fact, fallacious blunders by design.
And the problem is getting worse, not better. Data analytics is all the rage. Stunningly enough, “people analytics” is a big field. The implicit presumption behind people analytics is that people are part of the world in which things cannot be other than they are. That is, people cannot and will not change. That is an insane assumption that is inconsistent with thousands of years of, well, data.
As a strategist, before you make any decision based on data analysis of any sort, you must ask yourself one simple question: Am I willing to assume that the future will be identical to the past? If the answer is yes, then do what the analysis tells you to do. If the answer is no, then DO NOT USE THE ANALYSIS. You have no idea how the future will change the inferences you make based on the analysis. It doesn’t matter what you have been taught. In this situation — which is the vast majority of cases for a strategist — what you have been taught is dangerous to the health of you and your business. All it will do is make you confident about your analytical delusion. In reality, your data analysis has the same level of rigor as bloodletting or application of leeches.
In this part of the world, heed Aristotle’s advice. Imagine possibilities and choose the one for which the most compelling argument can be made. Analysis is logic to which data is applied. For the future, there is no data, so the degree to which your logic is more compelling than competing logic (or not) is the key question.
The best way to test your logic is by turning the future into the past. The data problem with the next six months is that there is no data yet about the next six months. The good thing about the next six months is that in six months, it will be a part of the past. That is why prototyping is the true friend of effective strategists. Since you have no data about the future, take the logic that you feel is most compelling and put it to a test — the most thorough test that is easily affordable — and collect data from that test. If you do this iteratively, you will get more and more insight into a proposition that is more and more compelling — though never ‘proven.’
That is a more rigorous way to do strategy than the analytical method you have been taught if you have had formal strategy training. If you haven’t, you are in luck! You don’t have to unlearn destructive and delusional practices.