Experimenting with Aampe opens the door to a more personalized, thoughtful way of connecting with your customers. Because Aampe adapts to each individual rather than relying on one-size-fits-all campaigns, experimenting within this new model can feel a little different at first. This guide is here to help you understand what to expect, why the process works the way it does, and how to get the most out of your testing experience.

A Fundamental Difference

Traditional CRM systems operate through campaigns: fixed messages sent to defined audiences at specific times. Each campaign typically has its own random holdout group, and attribution is straightforward because all treated users receive the same message simultaneously. Agentic AI systems like Aampe operate differently. Instead of campaigns, they create personalized message sequences for each user. This level of personalization is great for customers, but it creates difficulties for traditional measurement techniques.

Path Dependency and Its Implications

In true 1:1 personalized systems like Aampe, actions become path-dependent: each message influences the messages that follow, depending on how it performs. This creates several measurement challenges for an individual message:
  • Non-Random Assignment: Recipients and non-recipients of a given message are chosen strategically, not randomly, which biases naive comparisons between the two groups (see the sketch after this list).
  • Cumulative Impact: Traditional A/B tests measure the immediate response to a message. Personalized sequences build impact over time as each message learns and improves from prior messages.
  • Treatment Timing: Measuring impact is difficult when some users receive a message on a Tuesday morning, while other users receive the same message on a Saturday evening.
  • Learning Phases: Aampe agents learn in real time from how users respond to different messages. It can take several customer interactions to discover strong patterns, so messages at the start of a sequence are typically less impactful than messages later on.
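
To see why the assignment point matters, here is a minimal, self-contained Python sketch. It is not Aampe's implementation; the propensities, lift value, and selection rule are illustrative assumptions. It shows how strategic (non-random) assignment inflates a naive sent-vs-not-sent comparison relative to the true per-message lift.

```python
# Minimal sketch: strategic assignment vs. a naive sent-vs-not-sent comparison.
# All numbers (baseline, lift, selection threshold) are made-up assumptions.
import random

random.seed(42)
N = 100_000

# Each user has a latent engagement propensity the agent partly observes.
users = [random.random() for _ in range(N)]

def converts(propensity, messaged):
    # Assume the message adds a small, fixed lift on top of baseline propensity.
    baseline = 0.05 + 0.10 * propensity
    lift = 0.02 if messaged else 0.0
    return random.random() < baseline + lift

# Strategic assignment: the agent messages the users it predicts will engage.
messaged = [p > 0.5 for p in users]
outcomes = [converts(p, m) for p, m in zip(users, messaged)]

def rate(group_flags):
    hits = [o for o, g in zip(outcomes, group_flags) if g]
    return sum(hits) / max(len(hits), 1)

naive_lift = rate(messaged) - rate([not m for m in messaged])
print(f"Naive sent-vs-not-sent lift: {naive_lift:.3f}")  # well above 0.02
print("True per-message lift used in the simulation: 0.020")
```

The naive comparison mixes the message's real effect with the agent's targeting, which is exactly why a per-message holdout readout breaks down here.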

Implications for Testing

Standard campaign-level holdouts lose meaning when users receive different messages at different times. But 1:1 personalization is exactly that: different messages at different times. So rather than evaluating individual messages, we need to step back and evaluate the message-generation process. Instead of asking, “does message A beat message B?”, we start asking, “does an adaptive, learning system outperform static rules?”

This requires longer test periods, different group structures, and metrics that capture cumulative rather than immediate effects.
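
As a rough illustration of that shift, the sketch below assumes users are split once into an agent group and a user-level holdout for the full test window, and that conversion events are available as a simple export. The field names, group sizes, and dates are hypothetical, not an Aampe API. The readout is a cumulative conversion rate per group over the whole window rather than a per-message click rate.

```python
# Minimal sketch, under assumed data shapes, of a system-level holdout readout.
from collections import defaultdict
from datetime import date

events = [
    # (user_id, group, converted_at) -- toy rows standing in for an export
    ("u1", "agent",   date(2024, 5, 3)),
    ("u2", "agent",   date(2024, 6, 11)),
    ("u3", "holdout", date(2024, 6, 20)),
]
group_sizes = {"agent": 9_000, "holdout": 1_000}  # fixed at assignment time

cumulative = defaultdict(int)
for _, group, _ in events:
    cumulative[group] += 1

# Compare cumulative conversion over the whole window, not message-level CTR.
for group, size in group_sizes.items():
    print(f"{group}: {cumulative[group] / size:.4%} cumulative conversion rate")
```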
What does this look like in practice? Learn how to set up your first holdout test in Aampe.

Experiment FAQs