Design an autonomous experiment loop for product optimization
You know something in your product could be better — onboarding copy, pricing page layout, notification timing — but running A/B tests manually is slow and you never get through enough variants. Apply Karpathy's autoresearch pattern (https://github.com/karpathy/autoresearch) to set up a structured experiment loop where each iteration builds on the last.
The Experiment Loop Is the Product Manager's Scientific Method
Product management borrows heavily from science without always admitting it. We form hypotheses about customer behavior, design interventions to test them, and use data to update our beliefs. But unlike scientists, most product teams lack a rigorous, repeatable process for running experiments.
The Problem
The typical product experiment goes like this: someone has an idea, the team debates it, they build it, they check the metrics a week later, and they either celebrate or move on. There is no formal hypothesis. There are no pre-registered success criteria. There is no systematic learning captured for future experiments.
This ad hoc approach produces two failure modes. First, teams run experiments that cannot actually disprove their hypothesis, because they never specified what "failure" looks like. Second, teams fail to compound their learnings, running each experiment in isolation rather than building on what previous experiments revealed.
A 2023 Eppo analysis of 10,000 product experiments found that teams with documented hypotheses and pre-registered success criteria were 3.2 times more likely to make correct ship/no-ship decisions than teams that evaluated results post hoc. The discipline of writing down what you expect to happen before you see the data is the single most impactful practice in experimentation.
How This Prompt Works
This prompt creates a self-reinforcing experiment loop that mirrors the scientific method. For each experiment, it generates:
- Hypothesis: A specific, falsifiable prediction about what will happen
- Method: The experiment design, including control conditions, audience segmentation, and duration
- Success criteria: Pre-registered thresholds that define whether the hypothesis is supported or refuted
- Learning capture: A structured template for documenting what was learned, regardless of outcome
- Next experiment: Based on the results, what the next logical experiment should test
The "autonomous" aspect means the loop is self-perpetuating. Each experiment's results inform the next experiment's hypothesis, creating a compounding knowledge base rather than a scattered collection of one-off tests.
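The loop's five components and its compounding behavior can be sketched in a few lines of Python. This is an illustrative data model, not part of the prompt itself; the field names and the placeholder driver are assumptions for the sketch:

```python
from dataclasses import dataclass

@dataclass
class Experiment:
    """One iteration of the loop: hypothesis through learning."""
    hypothesis: str            # specific, falsifiable prediction
    method: str                # design: control, segmentation, duration
    success_criteria: str      # pre-registered threshold, written before data
    learning: str = ""         # captured after the experiment, win or lose
    next_hypothesis: str = ""  # seeds the following iteration

def run_loop(first_hypothesis: str, iterations: int) -> list:
    """Drive the loop: each experiment's learning seeds the next hypothesis."""
    log, hypothesis = [], first_hypothesis
    for i in range(iterations):
        exp = Experiment(
            hypothesis=hypothesis,
            method=f"A/B test, 50/50 split, 2 weeks (iteration {i + 1})",
            success_criteria="pre-registered before looking at the data",
        )
        # ... run the experiment here, then capture what was learned ...
        exp.learning = f"learning from iteration {i + 1}"
        exp.next_hypothesis = f"refined hypothesis from iteration {i + 1}"
        log.append(exp)
        hypothesis = exp.next_hypothesis  # compounding: output feeds input
    return log
```

The key design point is the last line of the loop body: the next iteration's input is the previous iteration's output, which is what turns isolated tests into a compounding knowledge base.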
According to Microsoft's ExP Platform team, which runs over 20,000 controlled experiments per year, the most valuable output of experimentation is not any individual result but the organizational capability to learn quickly. Teams that run more experiments per unit of time make better product decisions, even when most individual experiments fail.
When to Use It
- When launching a new feature to set up a learning system rather than a one-time measurement
- During growth optimization where small improvements compound over time
- When entering a new market to rapidly test assumptions about unfamiliar customer behavior
- As a team practice to build experimentation muscle across the organization
Common Pitfalls
- Not running experiments long enough. According to a 2023 Statsig report, 44% of product experiments are concluded before reaching statistical significance, leading to a false positive rate of up to 30%. Patience is a methodology, not a personality trait.
- Ignoring guardrail metrics. An experiment that improves conversion by 5% but increases support tickets by 50% is not a win. Define guardrail metrics that must not degrade before you launch.
- Optimizing for local maxima. Small iterative experiments can trap you on a local peak. Occasionally run big, bold experiments that explore fundamentally different approaches.
- Experimentation theater. Running experiments without the willingness to act on surprising results is worse than not running them. If the team will ship the feature regardless of results, do not waste time pretending to experiment.
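The first two pitfalls can be guarded against mechanically by encoding the decision rule before the data arrives. The sketch below uses a standard two-proportion z-test (computed with the standard library's `math.erfc`); the minimum sample size, alpha, and guardrail threshold are illustrative values you would pre-register for your own experiment:

```python
import math

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability

def decide(conv_a, n_a, conv_b, n_b, ticket_rate_a, ticket_rate_b,
           alpha=0.05, min_n=1000, max_guardrail_degradation=0.10):
    """Apply the rule exactly as pre-registered -- no post hoc adjustment."""
    if min(n_a, n_b) < min_n:
        return "keep running"  # pitfall 1: don't conclude early
    if ticket_rate_b > ticket_rate_a * (1 + max_guardrail_degradation):
        return "no ship"       # pitfall 2: guardrail metric degraded
    p = two_proportion_p_value(conv_a, n_a, conv_b, n_b)
    if p < alpha and conv_b / n_b > conv_a / n_a:
        return "ship"
    return "no ship"
```

Because the thresholds are arguments fixed up front, checking significance repeatedly and stopping at the first "winner" (the peeking problem behind the 30% false positive rate) is structurally discouraged: the rule returns "keep running" until the pre-registered sample size is reached.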
Further Reading
- Kohavi, R., Tang, D., & Xu, Y. (2020). *Trustworthy Online Controlled Experiments*. Cambridge University Press. https://www.cambridge.org/highereducation/books/trustworthy-online-controlled-experiments/D97B26382EB0EB2DC2019A7A7B6B0B43
- Eppo. (2023). State of Experimentation Report. https://www.geteppo.com/blog
- Statsig. (2023). Experimentation Best Practices. https://statsig.com/blog