The Surprising Power of Business Experiments

Interview with Stefan H. Thomke

I recently had the opportunity to interview fellow author Stefan H. Thomke, the William Barclay Harding Professor of Business Administration at Harvard Business School, about his new book, Experimentation Works: The Surprising Power of Business Experiments, and to explore the important role that experimentation plays in business and innovation.

1. Why is there a business experimentation imperative?

My book Experimentation Works is about how to continuously innovate through business experiments. Innovation is important because it drives profitable growth and creates shareholder value. But here is the dilemma: despite being awash in information coming from every direction, today’s managers operate in an uncertain world where they lack the right data to inform strategic and tactical decisions. Consequently, for better or worse, our actions tend to rely on experience, intuition, and beliefs. But this all too often doesn’t work. And all too often, we discover that ideas that are truly innovative go against our experience and assumptions, or the conventional wisdom. Whether it’s improving customer experiences, trying out new business models, or developing new products and services, even the most experienced managers are often wrong, whether they like it or not. The book introduces you to many of those people and their situations—and how business experiments raised their innovation game dramatically.

2. What makes a good business experiment, and what are some of the keys to successful experiment design?

In an ideal experiment, testers separate an independent variable (the presumed cause) from a dependent variable (the observed effect) while holding all other potential causes constant. They then manipulate the former to study changes in the latter. The manipulation, followed by careful observation and analysis, yields insight into the relationships between cause and effect, which ideally can be applied and tested in other settings. To obtain that kind of learning—and ensure that each experiment contains the right elements and yields better decisions—companies should ask themselves seven important questions: (1) Does the experiment have a testable hypothesis? (2) Have stakeholders made a commitment to abide by the results? (3) Is the experiment doable? (4) How can we ensure reliable results? (5) Do we understand cause and effect? (6) Have we gotten the most value out of the experiment? And finally, (7) Are experiments really driving our decisions? Although some of the questions seem obvious, many companies conduct tests without fully addressing them.

Here is a complete list of elements that you may find useful:

Hypothesis

Is the hypothesis rooted in observations, insights, or data?
Does the experiment focus on a testable management action under consideration?
Does it have measurable variables, and can it be shown to be false?
What do people hope to learn from the experiments?

Buy-in

What specific changes would be made on the basis of the results?
How will the organization ensure that the results aren’t ignored?
How does the experiment fit into the organization’s overall learning agenda and strategic priorities?

Feasibility

Does the experiment have a testable prediction?
What is the required sample size? Note: The sample size will depend on the expected effect (for example, a 5 percent increase in sales).
Can the organization feasibly conduct the experiment at the test locations for the required duration?
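To make the sample-size note above concrete, here is a minimal sketch of a standard power calculation for comparing two proportions. It is an illustration only, not a method taken from the book. The function name `sample_size_per_group`, the 10 percent baseline conversion rate, the 5 percent significance level, and the 80 percent power are all assumptions chosen for the example.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate customers needed in EACH group (control and test).

    Uses the textbook normal-approximation formula for a two-sided test
    of two proportions. All default values are illustrative assumptions.
    """
    p1 = baseline_rate                        # control conversion rate
    p2 = baseline_rate * (1 + relative_lift)  # rate if the expected effect is real
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Hypothetical scenario: 10% baseline conversion, hoping for a 5% relative increase.
small_effect = sample_size_per_group(0.10, 0.05)
large_effect = sample_size_per_group(0.10, 0.10)
print(small_effect, large_effect)
```

Under these assumed inputs, detecting the 5 percent lift takes tens of thousands of customers per group, while doubling the expected lift cuts the requirement to roughly a quarter. That is why the expected effect has to be stated before judging whether an experiment is feasible.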

Reliability

What measures will be used to account for systemic bias, whether it’s conscious or unconscious?
Do the characteristics of the control group match those of the test group?
Can the experiment be conducted in either “blind” or “double-blind” fashion?
Have any remaining biases been eliminated through statistical analyses or other techniques?
Would others conducting the same test obtain similar results?

Causality

Did we capture all variables that might influence our metrics?
Can we link specific interventions to the observed effect?
What is the strength of the evidence? Correlations are merely suggestive of causality.
Are we comfortable taking action without evidence of causality?

Value

Has the organization considered a targeted rollout (that is, one that takes into account a proposed initiative’s effect on different customers, markets, and segments) to concentrate investments in areas where the potential payback is the highest?
Has the organization implemented only the components of an initiative with the highest return on investment?
Does the organization have a better understanding of what variables are causing what effects?

Decisions

Do we acknowledge that not every business decision can or should be resolved by experiments? But everything that can be tested should be tested.
Are we using experimental evidence to add transparency to our decision-making process?

3. Is there anything special about running online experiments?

In an A/B test, the experimenter sets up two experiences: the control (“A”) is usually the current system—considered the champion—and the treatment (“B”) is some modification that attempts to improve something—the challenger. Users are randomly assigned to the experiences, and key metrics are computed and compared. (A/B/C or A/B/n tests and multivariate tests, in contrast, assess more than one treatment or modifications of different variables at the same time.) Online, the modification could be a new feature, a change to the user interface (such as a new layout), a back-end change (such as an improvement to an algorithm that, say, recommends books at Amazon), or a different business model (such as an offer of free shipping). Whatever aspect of customer experiences companies care most about—be it sales, repeat usage, click-through rates, or time users spend on a site—they can use online A/B tests to learn how to optimize it. Any company that has at least a few thousand daily active users can conduct these tests. The ability to access large customer samples, to automatically collect huge amounts of data about user interactions on websites and apps, and to run concurrent experiments gives companies an unprecedented opportunity to evaluate many ideas quickly, with great precision, and at a negligible cost per additional experiment. Organizations can iterate rapidly, win fast, or fail fast and pivot. Indeed, product development itself is being transformed: all aspects of software—including user interfaces, security applications, and back-end changes—can now be subjected to A/B tests (technically, this is referred to as full stack experimentation).
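The mechanics described above (assign users to A or B, then compute and compare a key metric) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how any particular company's platform works. Hash-based bucketing stands in for random assignment, a two-proportion z-test stands in for the metric comparison, and the names `assign_variant`, `two_proportion_z_test`, and `free_shipping_test` are hypothetical.

```python
import hashlib
from math import sqrt
from statistics import NormalDist


def assign_variant(user_id, experiment="free_shipping_test"):
    """Assign a user to control ("A") or treatment ("B").

    Assumption: a hash of experiment name plus user ID gives a stable,
    effectively random 50/50 split, so a returning user always sees
    the same experience.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "B" if int(digest, 16) % 2 else "A"


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of A and B; return (lift, z, p_value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # rate if A and B were identical
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
    return p_b - p_a, z, p_value


# Made-up numbers: 10,000 users per arm, 1,000 vs. 1,100 conversions.
lift, z, p_value = two_proportion_z_test(1000, 10_000, 1100, 10_000)
print(assign_variant("user-42"), round(lift, 3), p_value < 0.05)
```

Real experimentation platforms layer much more on top of this, such as guardrail metrics, corrections for running many tests at once, and checks that the split between A and B is balanced. The core loop of assign, measure, and compare stays the same.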

4. What are some of the keys to building a culture of large-scale experimentation?

Shared behaviors, beliefs, and values (aka culture) are often an obstacle to running more experiments in companies. For every online experiment that succeeds, nearly 10 don’t, and in the eyes of many organizations that emphasize efficiency, predictability, and “winning,” those failures are wasteful. To successfully innovate, companies need to make experimentation an integral part of everyday life, even when budgets are tight. That means creating an environment in which employees’ curiosity is nurtured, data trumps opinion, anyone (not just people in R&D) can conduct or commission a test, all experiments are done ethically, and managers embrace a new model of leadership. More specifically, companies have addressed some of these obstacles in the following ways:

They Cultivate Curiosity

Everyone in the organization, from the leadership on down, needs to value surprises, despite the difficulty of assigning a dollar figure to them and the impossibility of predicting when and how often they’ll occur. When firms adopt this mindset, curiosity will prevail and people will see failures not as costly mistakes but as opportunities for learning. Many organizations are also too conservative about the nature and amount of experimentation. Overemphasizing the importance of successful experiments may inadvertently encourage employees to focus on familiar solutions or those that they already know will work and avoid testing ideas that they fear might fail.

They Insist That Data Trump Opinions

The empirical results of experiments must prevail when they clash with strong opinions, no matter whose opinions they are. But this is rare among most firms for an understandable reason: human nature. We tend to happily accept “good” results that confirm our biases but challenge and thoroughly investigate “bad” results that go against our assumptions. The remedy is to implement the changes experiments validate with few exceptions. Getting executives in the top ranks to abide by this rule is especially difficult. But it’s vital that they do: Nothing stalls innovation faster than a so-called HiPPO—highest-paid person’s opinion. Note that I’m not saying that all management decisions can or should be based on experiments. Some things are very difficult, if not impossible, to conduct tests on—for example, strategic calls on whether to acquire a company. But if everything that can be tested online is tested, experiments can become instrumental to management decisions and fuel healthy debates.

They Embrace a Different Leadership Model

If most decisions are made through experiments, what’s left for managers to do, beyond developing the company’s strategic direction and tackling big decisions such as which acquisitions to make? There are at least three things:

Set a grand challenge that can be broken into testable hypotheses and key performance metrics. Employees need to see how their experiments support an overall strategic goal.

Put in place systems, resources, and organizational designs that allow for large-scale experimentation. Scientifically testing nearly every idea requires infrastructure: instrumentation, data pipelines, and data scientists. Several third-party tools and services make it easy to try experiments, but to scale things up, senior leaders must tightly integrate the testing capability into company processes.

Be a role model. Leaders have to live by the same rules as everyone else and subject their own ideas to tests. Bosses ought to display intellectual humility and be unafraid to admit, “I don’t know…” They should heed the advice of Francis Bacon, the forefather of the scientific method: “If a man will begin with certainties, he shall end in doubts; but if he will be content to begin with doubts, he shall end in certainties.”

Continue reading the article on InnovationManagement.se