Common questions defined
Dynamic Optimisation allows marketers to test up to 10 variants in an experiment, but this flexibility often raises questions.
This article is for those who've found themselves wondering:
Should I always test 10 variants?
How many variants can I safely test, statistically speaking?
Does it differ when optimising on clicks versus opens?
The information below answers these questions about the minimum testing requirements for our Dynamic Optimisation technology.
Optimisation events
The answers to most questions regarding Dynamic Optimisation testing criteria hinge on what we refer to as optimisation events. These are simply occurrences of your chosen optimisation metric (i.e. the number of opens or clicks).
For Dynamic Optimisation to work properly, there must be a certain number of optimisation events that occur within a certain time period. These differ for trigger and broadcast experiments.
Triggered experiment minimums
Let's start with triggers. For triggers to optimise properly, Jacquard suggests a minimum of 200 optimisation events per variant per day on average.
Looking at the table below, you can see how many optimisation events are suggested per day to optimise triggers in a meaningful, statistically significant way:
| Events per day | Number of variants to test |
| --- | --- |
| More than 1,000 events | 5 variants |
| More than 1,200 events | 6 variants |
| More than 1,400 events | 7 variants |
| More than 1,600 events | 8 variants |
| More than 1,800 events | 9 variants |
| More than 2,000 events | 10 variants |
If you don't have enough optimisation events on average to allow for at least five variants, Jacquard does not recommend testing on that particular campaign.
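If it helps to see this guideline expressed programmatically, here is a minimal Python sketch of the trigger table above. The function name and the exact threshold handling are our own illustration, not part of the Jacquard product.

```python
def recommended_trigger_variants(avg_daily_events: int) -> int:
    """Suggested maximum number of variants for a triggered experiment,
    based on average optimisation events (opens or clicks) per day.

    Uses the guideline of roughly 200 optimisation events per variant per
    day, capped at 10 variants. Returns 0 when the campaign falls below
    the five-variant minimum (fewer than ~1,000 events per day).
    """
    EVENTS_PER_VARIANT_PER_DAY = 200   # suggested minimum from the table above
    MIN_VARIANTS, MAX_VARIANTS = 5, 10

    supported = avg_daily_events // EVENTS_PER_VARIANT_PER_DAY
    if supported < MIN_VARIANTS:
        return 0   # below the recommended threshold: skip Dynamic Optimisation
    return min(supported, MAX_VARIANTS)


# Example: roughly 1,450 optimisation events per day supports up to 7 variants.
print(recommended_trigger_variants(1450))   # -> 7
```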
Broadcast experiment minimums
Open optimisation
For broadcast experiments, things are a bit different. Jacquard suggests a minimum of 200,000 delivered recipients for five variants. Jacquard also recommends a minimum of four hours of optimisation time for a broadcast experiment optimising on opens.
So, when optimising on opens over four hours:
| Delivered audience size | Number of variants to test |
| --- | --- |
| More than 200,000 recipients | 5 variants |
| More than 220,000 recipients | 6 variants |
| More than 240,000 recipients | 7 variants |
| More than 260,000 recipients | 8 variants |
| More than 280,000 recipients | 9 variants |
| More than 300,000 recipients | 10 variants |
Click optimisation
For broadcast experiments where you wish to optimise on clicks, Jacquard suggests a minimum of 2,000,000 delivered recipients for five variants. Jacquard also recommends a minimum of six hours of optimisation time for a broadcast experiment optimising on clicks.
So, when optimising on clicks over six hours:
| Delivered audience size | Number of variants to test |
| --- | --- |
| More than 2,000,000 recipients | 5 variants |
| More than 2,200,000 recipients | 6 variants |
| More than 2,400,000 recipients | 7 variants |
| More than 2,600,000 recipients | 8 variants |
| More than 2,800,000 recipients | 9 variants |
| More than 3,000,000 recipients | 10 variants |
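As with triggers, the two broadcast tables above can be expressed as a simple lookup. The sketch below is purely illustrative; the function and data structure names are our own and not part of the Jacquard product.

```python
# Thresholds taken directly from the two tables above: delivered audience
# size required for each variant count, by optimisation metric.
BROADCAST_THRESHOLDS = {
    "opens":  [(300_000, 10), (280_000, 9), (260_000, 8),
               (240_000, 7), (220_000, 6), (200_000, 5)],
    "clicks": [(3_000_000, 10), (2_800_000, 9), (2_600_000, 8),
               (2_400_000, 7), (2_200_000, 6), (2_000_000, 5)],
}


def recommended_broadcast_variants(delivered_recipients: int, metric: str) -> int:
    """Suggested maximum number of variants for a broadcast experiment,
    or 0 if the audience is below the five-variant minimum."""
    for threshold, variants in BROADCAST_THRESHOLDS[metric]:
        if delivered_recipients > threshold:
            return variants
    return 0   # below the recommended minimum: skip Dynamic Optimisation


# Examples: a 250,000-recipient send optimising on opens supports 7 variants;
# the same audience optimising on clicks falls below the recommended minimum.
print(recommended_broadcast_variants(250_000, "opens"))    # -> 7
print(recommended_broadcast_variants(250_000, "clicks"))   # -> 0
```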
Minimum tested variants
Unless otherwise specified, we do not generally recommend testing fewer than five variants with Dynamic Optimisation.
If you don't have a large enough list to allow for at least five variants, Jacquard does not recommend testing with Dynamic Optimisation on that particular campaign type.
Experimenting outside of recommended guidelines
We certainly won't prevent you from testing as many variants as you'd like in a particular experiment, or from choosing clicks instead of opens.
However, you do risk the following outcomes if you choose to experiment outside these minimum guidelines.
False winner identification and subsequent reversal
You may see a particular variant chosen as the winner that, given more data or time to mature, would not actually have been the winner overall.
Generally, we don't see a full reversal, where the declared winner ends up at the bottom of the pack. But it still amounts to increased opportunity cost, leaving a lot of potential uplift on the table.
Statistical guessing
When Dynamic Optimisation does not have enough data to reach a proper statistically significant conclusion, it will still attempt to optimise based on the data it has received.
However, with so little data returned for each variant, or not enough time to make a mature decision, the outcome is essentially a statistical guess: the equivalent of picking one variant at random to send to more of your audience.
You may still see uplift reported when this happens. But that uplift was essentially happenstance. More often, you'll see negative uplift, which means the opportunity cost of conducting the test outweighed the benefit.
Last reviewed: 18 June 2024