Multi-Armed Bandit (MAB) & Auto-Allocation in Convert

🎰 Boost Conversions Faster with Convert's Multi-Armed Bandit (MAB) Auto-Allocation

MAB (Multi-Armed Bandit) is a traffic allocation method that dynamically assigns more traffic to better-performing variations during the experiment, rather than waiting until the end. This allows faster learning, better use of traffic, and improved conversions while the experiment is still running.

This feature is now available under Auto-Allocation in Convert for both Frequentist and Bayesian statistical models.

What is Auto-Allocation (MAB)?

Auto-Allocation uses MAB strategies to shift more traffic to winning variations as the test progresses. This is different from traditional (manual) traffic allocation, where traffic follows a fixed split (typically an even one) regardless of performance.

Once enabled, MAB continuously analyzes results and allocates traffic automatically using one of the selected strategies:

  • Thompson Sampling (Default)

  • Epsilon-Greedy

  • UCB (Upper Confidence Bound)

In practice, these strategies all try to solve the same problem: how to balance exploration and exploitation.

  • Exploration means continuing to send some traffic to other variations so the system can keep learning.
  • Exploitation means sending more traffic to the variation that currently looks strongest based on the selected primary metric.

This balance is important because a variation that looks best very early may not remain the best once more data comes in. MAB helps avoid committing too early while still increasing exposure to promising variations.

What the MAB Strategies Mean

Thompson Sampling (Default)

Thompson Sampling balances exploration and exploitation using probability distributions. Instead of always selecting the current leader, it estimates how likely each variation is to be the best and allocates traffic accordingly.

Why use it?
It adapts quickly to early signals without over-committing too soon, making it the best default choice for most experiments.
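
To make the idea concrete, here is a minimal sketch of Thompson Sampling for a conversion-rate metric. It is not Convert's implementation: it assumes a Beta-Bernoulli model with a Beta(1, 1) prior per variation, and the function name `thompson_allocation`, the variation names, and the counts are all hypothetical.

```python
import random

def thompson_allocation(stats, draws=10_000, seed=0):
    """Estimate each variation's probability of being best.

    stats maps a variation name to (conversions, visitors).
    Assumes a Beta(1, 1) prior per variation (an illustrative choice).
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in stats}
    for _ in range(draws):
        # Draw one plausible conversion rate per variation from its posterior.
        samples = {
            name: rng.betavariate(1 + conv, 1 + visitors - conv)
            for name, (conv, visitors) in stats.items()
        }
        # Credit the variation whose sampled rate is highest in this draw.
        wins[max(samples, key=samples.get)] += 1
    # The share of draws won approximates the probability of being best.
    return {name: count / draws for name, count in wins.items()}

# Hypothetical data: 40/1000 conversions vs. 60/1000 conversions.
allocation = thompson_allocation({"original": (40, 1000), "variation_1": (60, 1000)})
```

In this sketch the returned shares could be used directly as traffic weights: the stronger variation receives most of the traffic, while the weaker one keeps a small share for as long as it still has some chance of being best.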

Epsilon-Greedy

Epsilon-Greedy routes a fixed percentage of traffic to random exploration, while the rest goes to the current best-performing variation.

Why use it?
It is predictable and simple, and is a good choice when you want explicit control over the exploration rate.
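
The rule can be sketched in a few lines. This is an illustration only: `epsilon_greedy_choice` and the sample rates are hypothetical, and treating `epsilon` as the share of traffic reserved for exploration is the textbook form of the algorithm, not a confirmed detail of Convert's implementation.

```python
import random

def epsilon_greedy_choice(rates, epsilon, rng):
    """Pick a variation for one visitor.

    rates maps a variation name to its observed primary-metric value.
    epsilon is the share of traffic reserved for random exploration.
    """
    if rng.random() < epsilon:
        # Explore: pick any variation uniformly at random.
        return rng.choice(list(rates))
    # Exploit: pick the variation with the best observed value so far.
    return max(rates, key=rates.get)

# Simulate 10,000 visitors with fixed (hypothetical) observed rates.
rng = random.Random(42)
rates = {"original": 0.04, "variation_1": 0.06}
counts = {name: 0 for name in rates}
for _ in range(10_000):
    counts[epsilon_greedy_choice(rates, epsilon=0.3, rng=rng)] += 1
```

With two variations and `epsilon=0.3`, the current leader ends up with roughly 85% of visitors: 70% from exploitation plus half of the 30% exploration traffic.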

UCB (Upper Confidence Bound)

UCB scores each variation by its observed performance plus a bonus for uncertainty, so variations with little data receive more traffic earlier in the experiment. As the system gains confidence, traffic shifts more strongly toward the better-performing option.

Why use it?
It can converge faster when one variation is clearly superior, but it is generally less suitable for short experiments because it may spend more time exploring uncertainty early on.
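
As a rough illustration, the classic UCB1 rule adds an uncertainty bonus to each variation's observed rate and sends the next visitor to the highest score. The sketch below assumes that textbook formula. The function names, the scaling constant `c`, and the sample counts are hypothetical, and Convert's actual scoring may differ.

```python
import math

def ucb_scores(stats, c=1.0):
    """Score each variation as observed rate plus an uncertainty bonus.

    stats maps a variation name to (conversions, visitors).
    c scales the bonus; larger values favour exploration (hypothetical knob).
    """
    total = sum(visitors for _, visitors in stats.values())
    scores = {}
    for name, (conv, visitors) in stats.items():
        if visitors == 0:
            # An unseen variation is maximally uncertain, so try it first.
            scores[name] = math.inf
            continue
        # The bonus shrinks as a variation accumulates visitors.
        bonus = c * math.sqrt(2 * math.log(total) / visitors)
        scores[name] = conv / visitors + bonus
    return scores

def ucb_choice(stats, c=1.0):
    """Send the next visitor to the variation with the highest score."""
    scores = ucb_scores(stats, c)
    return max(scores, key=scores.get)

# variation_1 has a lower observed rate but far less data, so its bonus is larger.
next_variation = ucb_choice({"original": (50, 1000), "variation_1": (2, 50)})
```

Here `variation_1` is chosen even though its observed rate (4%) is below the original's (5%), because its small sample leaves much more room for uncertainty. This is the early-exploration behavior described above.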

How to Enable MAB (Auto-Allocation)

Auto-Allocation (MAB) can be enabled from:

  • Summary Page

  • Stats & Settings Dialog

When Auto-Allocation is enabled, you cannot manually edit the traffic distribution.


Stats & Settings Configuration

When enabling Auto-Allocation, you’ll be able to configure the following under Stats & Settings:

  1. MAB Strategy

    • Dropdown with options: Thompson Sampling (default), Epsilon Greedy, UCB

  2. Exploration Factor

    • Default: 0.3

    • The Exploration Factor controls how aggressively the selected strategy continues exploring alternatives instead of concentrating traffic on the current best-performing variation.

      A higher Exploration Factor means more learning and more willingness to keep testing other variations: allocation is more conservative, with more traffic reserved for learning.
      A lower Exploration Factor means faster concentration on the current leader: less exploration and more aggressive exploitation.

      Examples:

      • Example 1: Exploration Factor = 0.1
        The algorithm will explore less and move traffic more quickly toward the current leading variation. This can be useful when you want stronger optimization pressure, but it also increases the risk of favoring an early leader too soon.
      • Example 2: Exploration Factor = 0.3
        This is the default balanced setting. It allows the system to keep learning while still shifting meaningful traffic toward stronger-performing variations. For most experiments, this is a good starting point.
      • Example 3: Exploration Factor = 0.5
        The algorithm will explore more aggressively and keep testing alternatives longer. This can be helpful when you expect performance to fluctuate, when differences between variations are small, or when you want to reduce the chance of premature commitment.

       

⚠️ Important:
The exact effect of the Exploration Factor depends on the selected MAB strategy, but in all cases, it influences the balance between continued learning and traffic concentration.

  3. Primary Metric

    • New setting available at both Project and Experience level

      • Experience Level - Under Summary tab > Three dots > Stats & Settings
      • Project Level - Under Projects > Configuration > Stats Settings
    • Metrics available:

      • Conversion Rate (CR) – Default

      • Revenue Per Visitor (RPV)

      • Average Products Per Visitor (APPV)

    • Used to evaluate the performance of the Primary Goal and to drive MAB traffic allocation

    • Cannot be changed after MAB is enabled on an active experience
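
For reference, the three metrics can be sketched as simple per-visitor ratios. The formulas below are the conventional definitions and are assumed here, not taken from Convert's documentation. The function `primary_metric` and its parameters are hypothetical.

```python
def primary_metric(name, visitors, conversions=0, revenue=0.0, products=0):
    """Compute a per-visitor metric for one variation.

    The formulas are the conventional definitions and are assumed here.
    """
    if visitors == 0:
        # No traffic yet, so report zero rather than dividing by zero.
        return 0.0
    if name == "CR":
        return conversions / visitors   # Conversion Rate
    if name == "RPV":
        return revenue / visitors       # Revenue Per Visitor
    if name == "APPV":
        return products / visitors      # Average Products Per Visitor
    raise ValueError(f"unknown metric: {name}")

# Hypothetical totals for one variation.
cr = primary_metric("CR", visitors=1000, conversions=50)
rpv = primary_metric("RPV", visitors=1000, revenue=2500.0)
appv = primary_metric("APPV", visitors=1000, products=120)
```

Whichever metric is selected is the value the bandit compares across variations, so two experiences with the same data can allocate traffic differently depending on this setting.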

Primary Goal and Goal Optimization

When switching from Manual to Auto-Allocation:

  • You’ll be prompted to confirm the Primary Goal, which is the only goal MAB optimizes

  • If multiple goals exist, only the Primary Goal is used for optimization

  • You cannot change the Primary Goal once the experience is active with MAB enabled

    • You’ll be advised to clone the experience to modify the Primary Goal

When MAB Can Be Used

  • Available for A/B and Multivariate tests

  • Works for Web and Full Stack experiences

  • Requires the Sequential test type when using the Frequentist model

    • If Sequential is not enabled, Auto-Allocation will not be available and a tooltip will explain the restriction

  • Not available for:

    • A/A Tests

    • Deploy Tests
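
The availability rules above can be summarized as a small check. This is only a sketch of the logic as described in this article: the function name and the string values (`"ab"`, `"multivariate"`, `"frequentist"`, `"sequential"`, and so on) are hypothetical and do not correspond to Convert API identifiers. Plan and permission checks are left out.

```python
def auto_allocation_available(experience_type, stats_model, test_type=None):
    """Return True when the rules listed above allow Auto-Allocation.

    All string values are illustrative labels, not real API constants.
    """
    if experience_type not in {"ab", "multivariate"}:
        # A/A and Deploy experiences are excluded.
        return False
    if stats_model == "frequentist" and test_type != "sequential":
        # The Frequentist model requires the Sequential test type.
        return False
    return True

# A Bayesian A/B test qualifies; a non-sequential Frequentist one does not.
bayesian_ab = auto_allocation_available("ab", "bayesian")
fixed_frequentist = auto_allocation_available("ab", "frequentist", "fixed")
```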

Reporting with MAB Enabled

  • Report tables will include an additional column for Traffic Allocation

  • A graph will be displayed showing Traffic Allocation Over Time to visualize how MAB adapts

Keep in mind that MAB behavior may not always look like “send everything to the current winner.” Because the system continues exploring, some traffic may still go to other variations even when one variation appears to be ahead. This is expected behavior, not a product issue.

Access and Lock Permissions

  • MAB is only available to users with the MAB feature included in their Convert plan

  • If MAB is not available under your plan, a plan lock tooltip will show

  • If your account lacks permission to change stats settings, an access lock tooltip will show

  • Plan lock takes precedence over access lock

Notes and Best Practices

  • MAB is not enabled by default on new experiences

  • For users without the MAB feature in their plan:

    • MAB-related settings will not be accessible

    • Attempts to enable MAB from the Summary page will be blocked

  • You can save Stats Settings while an experience is still in Draft mode

  • Best practices for choosing a strategy:

    • Use Thompson Sampling if you want the best default choice for most experiments
    • Use Epsilon-Greedy if you want a simpler and more predictable exploration model
    • Use UCB if you are comfortable with stronger early exploration to reduce uncertainty

  • Best practices for choosing an Exploration Factor:

    • Use a lower value when you want the system to concentrate traffic faster
    • Use the default value when you want a balanced setup
    • Use a higher value when you want to keep learning longer before strongly favoring a leader
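
To get a feel for the trade-off, the sketch below computes the expected traffic share of the current leader for a few Exploration Factor values. It assumes the Epsilon-Greedy strategy with the factor used directly as the exploration rate. That mapping is an assumption for illustration, since the exact effect depends on the selected strategy. The function `leader_share` is hypothetical.

```python
def leader_share(exploration_factor, n_variations):
    """Expected traffic share of the current leader under epsilon-greedy.

    Assumes the factor is used directly as epsilon, which is an
    illustrative assumption rather than documented behaviour.
    """
    # The leader gets all exploitation traffic plus its slice of exploration.
    return (1 - exploration_factor) + exploration_factor / n_variations

for factor in (0.1, 0.3, 0.5):
    print(f"factor={factor}: leader gets {leader_share(factor, 2):.0%} of traffic")
```

Under these assumptions, with two variations the leader receives about 95%, 85%, and 75% of traffic for factors of 0.1, 0.3, and 0.5 respectively, which matches the guidance above: lower values concentrate traffic faster, and higher values keep more traffic available for learning.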