E-Commerce: A/B Testing

A/B Testing

Data-driven decisions instead of gut feeling: A/B testing shows you with statistical confidence which variant of your shop converts better. We derive hypotheses from real user data, set up tests with methodological rigour, wait for sufficient sample size, and interpret results so you understand what you've learned — not just which number is larger. Every test becomes a building block of your optimisation knowledge.

Challenges you'll recognise

  • Design changes to the shop are made based on opinions and gut feeling, and no one knows afterwards whether they improved or worsened conversion.
  • Tests have been stopped too early in the past as soon as one variant was ahead — and implemented changes turned out to be ineffective or even harmful.
  • There is no systematic documentation of past tests, so the same hypotheses keep coming up and no cumulative knowledge about users is built.

Hypothesis Development

An A/B test without a hypothesis is guesswork. We derive test hypotheses from heatmaps, session recordings, funnel analyses, and user interviews. Each hypothesis names an observed problem, a proposed solution, and a measurable expectation. Every test has a clear objective — and you know what you learn, regardless of how it turns out.

Test Setup & Tooling

We set up A/B tests with established testing tools, configure audience segmentation correctly, and ensure that visitors are split evenly across variants and always see the same variant. We actively check for sample ratio mismatch and other common implementation errors before launch and as soon as traffic comes in.
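
As an illustration of consistent assignment and a sample ratio mismatch check, here is a minimal Python sketch. It is not tied to any particular testing tool. The function names, the hash-based bucketing, and the intended 50/50 split are assumptions made for this example.

```python
import hashlib
import math


def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically map a user to a variant, so repeat visits see the same one."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def srm_p_value(visitors_a: int, visitors_b: int) -> float:
    """Chi-square goodness-of-fit test against an intended 50/50 split (1 degree of freedom)."""
    expected = (visitors_a + visitors_b) / 2
    chi2 = ((visitors_a - expected) ** 2 + (visitors_b - expected) ** 2) / expected
    # For 1 degree of freedom, the chi-square tail probability equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    print(assign_variant("user-123", "checkout-button"))
    print(srm_p_value(5000, 5400))  # a very small p-value hints at a broken split
```

A very small p-value (many teams use a strict threshold such as 0.001, which is a convention rather than a rule) suggests that the traffic split is not what was configured and that the setup should be investigated before the results are trusted.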

Statistical Analysis

A test is not finished when one variant leads, but when it has collected the sample size required for a statistically reliable result. We calculate the required sample size before the test starts, monitor the test continuously, and stop only once that sample size has been reached and the results are reliable. Stopping early because an interim result looks significant inflates the rate of false positive findings.
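
To make the up-front calculation concrete, the following sketch uses the standard normal-approximation formula for comparing two conversion rates. The function name, the default significance level of 0.05, and the power of 0.8 are assumptions for illustration. Testing tools may use slightly different formulas.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline_rate: float, relative_lift: float,
                            alpha: float = 0.05, power: float = 0.8) -> int:
    """Visitors needed per variant to detect the given relative lift (two-sided test)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # critical value for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    # Example: 3% baseline conversion, hoping to detect a 10% relative improvement.
    print(sample_size_per_variant(0.03, 0.10))
```

With a 3% baseline and a 10% relative lift, this comes to roughly 53,000 visitors per variant, which shows why small expected effects need a lot of traffic.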

Learning Documentation

Test results are documented in a central test log: hypothesis, result, statistical significance, and derived action. This knowledge base makes every test result the foundation for future hypotheses. Over months, an institutional understanding of what works for your users develops.
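
One possible shape for such a log entry is sketched below in Python. The class name, the field names, and the JSON Lines format are hypothetical choices for this example. The fields mirror the items named above: hypothesis, result, statistical significance, and derived action.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ExperimentRecord:
    """One entry in a central test log (illustrative structure)."""
    observed_problem: str    # what the data showed
    proposed_solution: str   # the change that was tested
    expected_outcome: str    # the measurable expectation
    result: str              # e.g. "won", "lost", "inconclusive"
    p_value: float           # statistical significance of the result
    derived_action: str      # what was decided afterwards

    def to_json_line(self) -> str:
        """Serialise the entry as one JSON line, e.g. for an append-only log file."""
        return json.dumps(asdict(self), ensure_ascii=False)


if __name__ == "__main__":
    record = ExperimentRecord(
        observed_problem="Many users leave at the shipping step",
        proposed_solution="Show shipping costs on the product page",
        expected_outcome="Higher checkout completion rate",
        result="inconclusive",
        p_value=0.31,
        derived_action="Retest with a larger sample",
    )
    print(record.to_json_line())
```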

Good to know

  • Early stopping leads to wrong conclusions

    Stopping a test as soon as one variant leads is the classic peeking problem. Without a pre-calculated sample size and sufficient runtime, results are not statistically reliable, and decisions based on them can actually worsen conversion.

  • Hypothesis matters more than tool

    The A/B testing tool is interchangeable. The quality of the hypothesis determines the learning value of the test. A hypothesis cleanly derived from data with clear expectations yields insights even when the tested variant loses.

  • Test log is institutional capital

    Every test result — won or lost — is knowledge about your users. Organisations that don't document results lose this knowledge with the next personnel change and retest the same hypotheses. A maintained test log is one of the most valuable resources in the CRO process.
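
The early-stopping point above can be illustrated with a small simulation. The sketch below runs A/A tests, in which both variants share the same true conversion rate, and compares how often a difference looks significant at any interim check with how often it does at the planned end. All parameters (number of runs, visitors, checks, conversion rate) are arbitrary assumptions for this example.

```python
import math
import random


def two_sided_p(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions (or only conversions) so far: nothing to compare
    z = (conv_b / n_b - conv_a / n_a) / se
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rates(runs=400, visitors=2000, looks=10, rate=0.05, alpha=0.05, seed=1):
    """Return (rate with peeking at every look, rate when judging only the final look)."""
    rng = random.Random(seed)
    step = visitors // looks
    peeking_hits = final_hits = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        significant_at_any_look = False
        p = 1.0
        for n in range(1, visitors + 1):
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            if n % step == 0:  # an interim check; the last one is the planned end
                p = two_sided_p(conv_a, n, conv_b, n)
                significant_at_any_look = significant_at_any_look or p < alpha
        peeking_hits += significant_at_any_look
        final_hits += p < alpha
    return peeking_hits / runs, final_hits / runs


if __name__ == "__main__":
    peeking, final = false_positive_rates()
    print(f"false positives with peeking: {peeking:.2%}, at planned end only: {final:.2%}")
```

Because there is no real difference between the variants, every "significant" result here is a false positive. Checking repeatedly and stopping at the first significant look can only raise that rate compared with evaluating once at the planned end.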

Frequently asked questions

How much traffic do I need for meaningful A/B tests?
That depends on the element being tested, the expected effect size, and the chosen significance level and statistical power. As a rough guide, several hundred conversions per month enable tests of clearly measurable effects. For lower traffic, we recommend qualitative methods and best-practice optimisations that deliver results even without large samples.
What do you typically test first?
We prioritise by potential, effort, and confidence. Checkout elements typically have the highest potential because cart abandonment is most costly there. Product page elements with high traffic follow. We never start with small cosmetic changes when larger structural levers haven't been tested yet.
What happens if a test shows no statistically significant results?
A null result is still a result: it shows that the tested element has no effect on conversion that is measurable with the available sample size. We document this and assess whether the hypothesis was wrong or whether the effect is simply too small to detect with that sample. Both are valuable for prioritising the next tests.
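
To illustrate how a null result can be read, the sketch below computes a simple Wald confidence interval for the difference in conversion rates. The function name and the example numbers are made up for illustration. The interval shows which effect sizes remain compatible with the data.

```python
from math import sqrt
from statistics import NormalDist


def lift_confidence_interval(conv_a: int, n_a: int, conv_b: int, n_b: int,
                             confidence: float = 0.95) -> tuple[float, float]:
    """Wald confidence interval for the absolute difference in conversion rate (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = p_b - p_a
    return diff - z * se, diff + z * se


if __name__ == "__main__":
    # Hypothetical numbers: 300 of 10,000 visitors convert on A, 315 of 10,000 on B.
    low, high = lift_confidence_interval(300, 10_000, 315, 10_000)
    print(f"difference in conversion rate: {low:+.4f} to {high:+.4f}")
```

If the interval includes zero but is wide, the test could not rule out a meaningful effect and a larger sample may be worthwhile. If it is narrow around zero, the element probably has little influence on conversion.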

Why nextlevels

Success you can measure

With us you're always at the cutting edge of technology and benefit directly from our developer expertise. Together we analyse your shop, identify key areas and develop tailor-made solutions. Your goals and expectations are at the centre of our work.

  1. Developers, not resellers

    Your shop is built by developers who understand the code — nothing gets passed to subcontractors.

  2. Shopware down to the detail

    Expertise in architecture, API integration and performance, drawn from hundreds of project hours.

  3. One team, every discipline

    Development, design and marketing from a single source — no friction at the handoffs.

  4. Built for growth

    We build for conversion, load time and revenue — not for gut feeling.

  5. Partner, not vendor

    We stay on after launch and keep developing your shop continuously.

Ready for your successful online shop?

Whether it's an improvement or a fresh start — a no-obligation conversation never hurt anyone.

Paul Kalisch
Executive Partner