Measuring experiment success is a hard problem.

Establishing causality is easy with experiments. That’s why they’re great! But deciding which specific variable to look at to judge whether an experiment’s outcome is “good” is surprisingly difficult.

If we choose a metric that definitely indicates a win because it is an ultimate company metric (revenue, profit, etc.), the experiment will have to run for a very long time, because everything that affects our business moves that metric. Separating the signal of our experiment from all that noise takes more data, and therefore more time. Speed matters.
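To make the cost of noise concrete, here is a rough sketch using the standard two-sample sample-size approximation for a difference in means. The function name `samples_per_arm` and every number in the example are illustrative assumptions, not figures from a real experiment.

```python
import math
from statistics import NormalDist


def samples_per_arm(sigma, delta, alpha=0.05, power=0.8):
    """Approximate users needed per arm to detect a mean shift of `delta`.

    Uses the textbook normal approximation for a two-sided, two-sample test:
        n = 2 * (z_{1 - alpha/2} + z_{power})^2 * sigma^2 / delta^2
    `sigma` is the metric's standard deviation, assumed equal in both arms.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 * sigma ** 2 / delta ** 2)


# Hypothetical numbers, chosen only to show the contrast.
# A bottom-line metric: revenue per user, very noisy relative to the effect.
revenue_n = samples_per_arm(sigma=50.0, delta=0.5)

# A metric closer to the experiment: a click rate near 10%,
# so sigma is sqrt(0.1 * 0.9) = 0.3, with a one-point absolute lift.
click_n = samples_per_arm(sigma=0.3, delta=0.01)

print(f"revenue per user: {revenue_n} users per arm")
print(f"click rate:       {click_n} users per arm")
```

The required sample size grows with the square of `sigma / delta`, so a metric that is twice as noisy relative to the effect needs roughly four times as many users, which at a fixed traffic level means roughly four times as long.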

So we want to choose a metric that is closer to the experiment. But ultimately, it still needs to move our bottom-line goal metrics. This balancing act makes choosing how to measure experiment success critical and difficult.

Basic rules:

  1. Decide on the metric that defines success prior to the experiment.
  2. Limit the number of metrics that define success so that decision-making is clear and well-specified prior to the experiment.
  3. Narrow the candidate metrics to those with a demonstrated relationship to an ultimate goal of the business (revenue, engagement, profit, etc.).
  4. Choose the metric most directly impacted by the experimental intervention from among this set.
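
Rules 3 and 4 can be sketched as a filter followed by a ranking. This is only one possible reading: the `Metric` class, its fields, and the candidate values are hypothetical, and "most directly impacted" is approximated here by the expected effect divided by the metric's standard deviation, which is an assumption rather than a prescribed definition.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    name: str
    linked_to_goal: bool     # rule 3: demonstrated relationship with a business goal
    expected_effect: float   # assumed shift from the intervention, in metric units
    sigma: float             # assumed standard deviation of the metric


def choose_primary_metric(candidates):
    """Pick one success metric before the experiment starts (rules 1 and 2)."""
    # Rule 3: keep only metrics with a demonstrated link to a business goal.
    eligible = [m for m in candidates if m.linked_to_goal]
    if not eligible:
        raise ValueError("no candidate has a demonstrated link to a business goal")
    # Rule 4 (as approximated here): largest standardized expected effect.
    return max(eligible, key=lambda m: abs(m.expected_effect) / m.sigma)


# Hypothetical candidates for a checkout-page experiment.
candidates = [
    Metric("revenue_per_user", linked_to_goal=True, expected_effect=0.5, sigma=50.0),
    Metric("checkout_click_rate", linked_to_goal=True, expected_effect=0.01, sigma=0.3),
    Metric("page_scroll_depth", linked_to_goal=False, expected_effect=5.0, sigma=10.0),
]

primary = choose_primary_metric(candidates)
print(primary.name)
```

In this made-up example, `page_scroll_depth` would move the most but is excluded because it has no demonstrated link to a business goal, which is the point of applying rule 3 before rule 4.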