
What is A/B testing and how to do it step by step

Unlock the power of A/B testing to skyrocket your conversions. This step-by-step guide reveals the exact methods used by top agencies.

Performance Marketing Expert
Rafirit Station
📅 June 2, 2026
19 min read


    What Is A/B Testing and How to Do It Step by Step (2026 Guide)

    By Rafirit Station Editorial Team · Updated 2026 · ⏱ 12 min read

    A/B testing is the most reliable way to improve your website’s conversion rate. According to Invesp, 68% of companies reported A/B testing as a key driver of conversion rate optimization (2024).

    In 2026, with AI-driven personalization and rising customer expectations, A/B testing is no longer optional—it’s essential. Consumers in Dhaka and beyond expect seamless experiences, and even a 1% improvement in conversion can translate to significant revenue.

    The cost of ignoring A/B testing is steep. Take a typical Dhaka e-commerce store with 50,000 monthly visitors, a 2% conversion rate, and ৳1,000 per sale: that is 1,000 sales and ৳1,000,000 a month. Even a 2.5% relative lift from a single winning test (2% to 2.05%) adds 25 sales, so not testing leaves about ৳25,000 in monthly revenue on the table.

    By the end of this article, you’ll know exactly how to design, run, analyze, and implement A/B tests that deliver consistent wins—no fluff, just actionable steps.





    📈 Ready to Double Your Conversions?

    For Bangladeshi business owners who want a proven CRO strategy: Get a free 30-minute audit from Rafirit Station’s experts.


    🗓 Book Your Free Strategy Call →

    No commitment · 60-minute session · Bangladeshi clients welcome


    Phase 1: Planning & Hypothesis Generation

    Every great A/B test starts with a clear hypothesis. Without one, you’re just guessing. This phase ensures you test ideas that have the highest potential impact.

    Tactic 1.1: Identify High-Impact Pages Using Analytics

    Why this works: Focusing on pages with high traffic but low conversion gives you the biggest room for improvement. A small lift here multiplies across many visitors.

    Exactly how to do it:

    1. Log into your Google Analytics or similar tool.
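    2. In Google Analytics 4, go to Reports > Engagement > Pages and screens (in the retired Universal Analytics this was Behavior > Site Content > All Pages).
    3. Sort by pageviews descending, then filter for pages with at least 1,000 sessions per month.
    4. Note the conversion rate for each page (key event rate in Google Analytics 4, goal completion rate in Universal Analytics).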
    5. Identify pages with below-average conversion rates (e.g., under 2% if your site average is 3%).
    6. List these pages as candidates for testing.
    7. Prioritize based on potential revenue impact (traffic × current conversion gap × average order value).

    Pro script / template / example: For a Dhaka e-commerce store, the product page had 5,000 visits/month but a 1.5% add-to-cart rate. The site average was 3.5%. Closing that gap would be a 133% lift in add-to-carts (from 75 to 175 a month), worth an estimated ৳75,000/month in additional revenue.
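
    The prioritization in steps 3 to 7 can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the page paths, traffic figures, the 3% site average, and the ৳1,000 average order value are all hypothetical, and the score is simply traffic × conversion gap × average order value.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    url: str                 # page path (hypothetical examples below)
    monthly_sessions: int    # sessions on the page per month
    conversion_rate: float   # current rate, e.g. 0.015 for 1.5%


def revenue_potential(page, site_avg_rate, avg_order_value):
    """Traffic x conversion gap x average order value (0 if no gap)."""
    gap = max(site_avg_rate - page.conversion_rate, 0.0)
    return page.monthly_sessions * gap * avg_order_value


def rank_candidates(pages, site_avg_rate, avg_order_value, min_sessions=1000):
    """Keep pages with enough traffic and a below-average rate, best first."""
    candidates = [
        p for p in pages
        if p.monthly_sessions >= min_sessions
        and p.conversion_rate < site_avg_rate
    ]
    return sorted(
        candidates,
        key=lambda p: revenue_potential(p, site_avg_rate, avg_order_value),
        reverse=True,
    )


# Hypothetical data for illustration only.
pages = [
    PageStats("/product/shirt", 5000, 0.015),
    PageStats("/category/men", 12000, 0.028),
    PageStats("/about", 800, 0.010),      # dropped: under 1,000 sessions
    PageStats("/checkout", 3000, 0.040),  # dropped: already above average
]
ranked = rank_candidates(pages, site_avg_rate=0.03, avg_order_value=1000)
for page in ranked:
    print(page.url, round(revenue_potential(page, 0.03, 1000)))
```

    A page with modest traffic and a large gap can outrank a busier page with a small gap, which is why the score multiplies the two rather than sorting by traffic alone.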

    📊 Expected results: Brands that systematically identify underperforming pages see a 20-40% increase in overall conversion rates within 3 months.

    Tactic 1.2: Prioritize Based on the ICE Framework

    Why this works: The ICE framework (Impact, Confidence, Ease) helps you focus on tests that are both impactful and easy to implement, avoiding analysis paralysis.

    Exactly how to do it:

    1. List all possible test ideas from analytics, user feedback, and heatmaps.
    2. For each idea, score each of the following on a scale of 1-10:
       • Impact: How much will this improve conversion? (e.g., changing headline vs. redesigning checkout)
       • Confidence: How sure are you that this will work? (based on past data or industry benchmarks)
       • Ease: How simple is the test to set up? (e.g., changing a button color is easy; redesigning a page is hard)
    3. Calculate the average ICE score for each idea.
    4. Sort by ICE score descending and select the top 3 ideas to test first.
    5. Create a hypothesis statement: “If we [change X] on [page Y], then [metric Z] will increase by [W]% because [reason].”

    Pro script / template / example: “If we change the CTA button from green to orange on the product page, then the click-through rate will increase by 15% because orange creates urgency.”
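
    If the idea list lives in a spreadsheet or script, the ICE scoring and the hypothesis template can be sketched like this. The idea names and scores below are made up for illustration; only the averaging of Impact, Confidence, and Ease and the sentence template come from the steps above.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    name: str
    impact: int      # 1-10: how much could this improve conversion?
    confidence: int  # 1-10: how sure are we that it will work?
    ease: int        # 1-10: how simple is it to set up?

    @property
    def ice_score(self):
        """Plain average of the three scores."""
        return (self.impact + self.confidence + self.ease) / 3


def top_ideas(ideas, n=3):
    """Highest ICE score first, limited to the top n."""
    return sorted(ideas, key=lambda idea: idea.ice_score, reverse=True)[:n]


def hypothesis(change, page, metric, lift_pct, reason):
    """Fill in the hypothesis template from the last step."""
    return (f"If we {change} on {page}, then {metric} will increase "
            f"by {lift_pct}% because {reason}.")


# Hypothetical ideas and scores for illustration only.
ideas = [
    Idea("Shorten checkout form", impact=8, confidence=7, ease=4),
    Idea("Change CTA copy", impact=5, confidence=6, ease=9),
    Idea("Add trust badges", impact=6, confidence=7, ease=8),
    Idea("Redesign homepage", impact=9, confidence=4, ease=2),
]
for idea in top_ideas(ideas):
    print(f"{idea.name}: {idea.ice_score:.2f}")
```

    Averaging treats the three scores as equally important; a team that cares more about impact than ease could weight them differently.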

    📊 Expected results: Using the ICE framework typically improves test success rates from 30% to over 60%.

    Tactic 1.3: Gather Qualitative Data from Heatmaps & Session Recordings

    Why this works: Quantitative data tells you what is happening, but qualitative data reveals why users behave the way they do. This deepens your hypotheses.

    Exactly how to do it:

    1. Install a heatmap tool like Hotjar or Crazy Egg on your site.
    2. Record at least 100 sessions over a week to get a representative sample.
    3. Look for rage clicks, dead clicks, and scroll depth drop-offs.
    4. Identify elements that users try to click but aren’t links (frustration).
    5. Note areas where users hesitate or backtrack.
    6. Combine these observations with analytics to form hypotheses.
    7. Use specific user quotes from session recordings to strengthen your hypothesis.

    Pro script / template / example: “Session recordings show that 40% of mobile users try to tap the product image to enlarge it, but the image isn’t clickable. Adding a lightbox might reduce bounce rate by 20%.”

    📊 Expected results: Businesses that use heatmaps report a 20-30% higher win rate in their A/B tests.




    Phase 2: Creating Variations & Setting Up Tools

    Once you have a hypothesis, it’s time to build the test. This phase covers how to create effective variations and set up your A/B testing tool correctly.

    Tactic 2.1: Build a Strong Control and One Significant Change

    Why this works: A/B testing is about isolating one variable. Changing multiple things at once makes it impossible to know which change caused the effect. Stick to one change per test.

    Exactly how to do it:

    1. Use the current version of the page as the control (Version A).
    2. Decide on exactly one element to change (headline, CTA color, image, button copy, etc.).
    3. Create the variation (Version B) with only that element changed.
    4. Ensure all other elements remain identical to control.
    5. Test the variation on multiple devices and browsers to confirm it works.
    6. Use a consistent naming convention like “Test01_Headline_Variant”.
    7. Document the hypothesis and expected outcome for the test.

    Pro script / template / example: Control CTA: “Buy Now” (green button). Variation: “Get Yours Today” (orange button). All else identical.

    📊 Expected results: Single-variable tests typically show clear winners 70% of the time, while multi-variable tests often lead to inconclusive results.

    Tactic 2.2: Choose the Right A/B Testing Tool

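    Why this works: The right tool simplifies setup, ensures accurate data, and provides robust statistical analysis. Entry-level plans are enough for beginners, while higher tiers add advanced features. Google Optimize, formerly the standard free option, was discontinued by Google in September 2023, so pick a tool that is still maintained.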

    Exactly how to do it:

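    1. Evaluate your budget and technical skills. Google Optimize is no longer available (Google shut it down in September 2023), so most small to medium businesses should start with an entry-level plan from a maintained tool, such as VWO’s free plan.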
    2. If you need advanced targeting and personalization, consider VWO or Optimizely.
    3. Install the tool by adding a snippet of code to your website’s header.
    4. Set up goals in the tool (e.g., click on button, form submission, page visited).
    5. Define the audience: either all visitors or a segment (new vs. returning).
    6. Use URL targeting to specify which page(s) the test runs on.
    7. Set the traffic split to 50/50 for equal distribution.

    Pro script / template / example: “In your testing tool, create an experiment → choose A/B test → enter the page URL → set the objective (goal) → create the variant in the visual editor → click the element to edit → save → start.”
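
    Testing tools handle the 50/50 split for you, but it helps to know roughly what happens underneath. A common approach, sketched here as an assumption rather than a description of any particular tool, is to hash a stable visitor ID together with the experiment name so each visitor always lands in the same version. The function name and ID format are hypothetical.

```python
import hashlib


def assign_variation(visitor_id, experiment, split=0.5):
    """Return "A" (control) or "B" (variation); stable for a given visitor."""
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "A" if bucket < split else "B"


# Hypothetical visitor IDs, e.g. read from a first-party cookie.
counts = {"A": 0, "B": 0}
for i in range(10000):
    counts[assign_variation(f"visitor-{i}", "Test01_Headline_Variant")] += 1
print(counts)
```

    Including the experiment name in the hash means a visitor who sees Version B in one test is not automatically placed in Version B of the next one.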

    📊 Expected results: Proper tool setup reduces data errors by 90%. Businesses using dedicated A/B testing tools see 2x faster test turnaround.

    Tactic 2.3: Implement a Proper Sample Size and Duration

    Why this works: Running a test for too short a time or with too few visitors can lead to false positives. Statistical significance requires adequate sample size.

    Exactly how to do it:

    1. Use an online sample size calculator (e.g., from Optimizely or VWO).
    2. Enter your baseline conversion rate (e.g., 3%) and minimum detectable effect (e.g., a 20% relative increase, from 3% to 3.6%).
    3. Set statistical significance level to 95% and power to 80%.
    4. Get the required sample size per variation (for the inputs above, roughly 14,000 visitors per variation; a larger effect or a higher baseline needs fewer).
    5. Calculate the days needed: total sample size across both variations / daily visitors to the page.
    6. Run the test for at least one full business cycle (including weekends) to capture behavioral variation.
    7. Do not stop the test before reaching the required sample size, even if results look significant early.

    Pro script / template / example: Using a sample size calculator: if your page gets 200 visitors/day split 50/50 and you need 5,000 per variation (10,000 in total), you need 50 full days of data.
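
    For readers who want to see what such a calculator does, here is a minimal sketch using the standard two-sided, two-proportion z-test formula. Hosted calculators may use slightly different formulas, so their figures can differ a little. The function names are illustrative, and the duration helper assumes the page's daily traffic is shared evenly across all variations.

```python
import math
from statistics import NormalDist


def sample_size_per_variation(baseline, relative_mde, alpha=0.05, power=0.80):
    """Visitors needed per variation for a two-sided two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 95% significance
    z_beta = NormalDist().inv_cdf(power)           # 80% power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def days_needed(n_per_variation, daily_visitors, variations=2):
    """Days until every variation has reached its sample size."""
    return math.ceil(n_per_variation * variations / daily_visitors)


n = sample_size_per_variation(baseline=0.03, relative_mde=0.20)
print(n, days_needed(n, daily_visitors=200))
```

    With a 3% baseline, a 20% relative lift, 95% significance, and 80% power, this formula returns roughly 13,900 visitors per variation. Halving the detectable effect roughly quadruples the sample, which is why small expected lifts need long tests.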

    📊 Expected results: Proper duration and sample size reduce false positive rates from 30% to under 5%.




    Phase 3: Running the Experiment & Collecting Data

    With the test live, your job is to monitor for errors and let it run its course. This phase covers best practices during the experiment.

    Tactic 3.1: Monitor for External Factors (Seasonality, Campaigns)

    Why this works: External events can skew results. If a test runs during a promotional period, the behavior might not be generalizable. Controlling for these factors ensures reliable data.

    Exactly how to do it:

    1. Check your marketing calendar before starting the test. Avoid running tests during major promotions or holidays if possible.
    2. If you must run during a campaign, ensure both variations receive the same external traffic (e.g., if you run a Facebook ad, make sure it points to the same page and split is maintained).
    3. Note any anomalies in your analytics (spikes in traffic from a particular source).
    4. Consider using a “holdout” group that sees the control even if the campaign targets specific users.
    5. Document any external events that occur during the test period.
    6. If a major event happens, consider pausing the test and restarting later.

    Pro script / template / example: “During our test, we noticed a spike from a Dhaka influencer’s post. We decided to exclude traffic from that source using UTM parameters to avoid bias.”

    📊 Expected results: Controlling external factors increases test reliability by 25-30%.

    Tactic 3.2: Avoid Peeking and Stopping Early

    Why this works: Peeking at results before the sample size is reached and stopping early can lead to wrong conclusions due to random fluctuations. It’s a common pitfall.

    Exactly how to do it:

    1. Set a clear rule: Do not check results until the test reaches full sample size.
    2. If you must check, use a sequential testing method (like VWO’s “Always On” mode) that adjusts for peeking.
    3. Never stop a test because a variation appears to be winning early unless it’s a clear safety issue.
    4. Use a “peeking calculator” if you are tempted; many tools show confidence intervals that warn against early stopping.
    5. Stick to the predetermined duration even if results seem insignificant at 50% sample size.

    Pro script / template / example: “We set a 14-day test. On day 5, Variation B had a 90% probability to be best. We waited until day 14 and it dropped to 75%. Had we stopped early, we would have launched a false winner.”
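
    The effect of peeking can be checked with a small simulation. The sketch below runs A/A tests, where both versions are identical and any "winner" is a false positive, and compares two habits: stopping the first time any interim check looks significant versus judging only at the end. The traffic figures, 5% conversion rate, and ten interim looks are arbitrary assumptions.

```python
import math
import random
from statistics import NormalDist


def z_significant(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Two-sided two-proportion z-test; True if the difference is significant."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return False
    z = (conv_b / n_b - conv_a / n_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def simulate_aa_tests(runs=300, visitors=2000, rate=0.05, looks=10, seed=42):
    """Return (false-positive rate with peeking, rate at the final look only)."""
    rng = random.Random(seed)
    step = visitors // looks
    peeking_hits = final_hits = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        ever_significant = False
        significant = False
        for look in range(1, looks + 1):
            conv_a += sum(rng.random() < rate for _ in range(step))
            conv_b += sum(rng.random() < rate for _ in range(step))
            n = look * step
            significant = z_significant(conv_a, n, conv_b, n)
            if significant:
                ever_significant = True  # a peeker would have stopped here
        peeking_hits += ever_significant
        final_hits += significant        # only the last look counts
    return peeking_hits / runs, final_hits / runs


peeking_rate, final_rate = simulate_aa_tests()
print(f"with peeking: {peeking_rate:.1%}, final look only: {final_rate:.1%}")
```

    Because every run that is significant at the end is also caught by the peeker, the peeking rate can never be lower than the final-look rate; in simulations of this kind it is usually several times higher.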

    📊 Expected results: Avoiding early stopping reduces false positives by up to 40%.

    Tactic 3.3: Use a Holdout Group to Measure Long-Term Impact

    Why this works: Short-term wins don’t always translate into long-term loyalty. A holdout group that continues to see the control helps you measure sustained effect.

    Exactly how to do it:

    1. Randomly assign 10-20% of visitors to a permanent holdout group that sees the control.
    2. After the test concludes, continue showing the winner to the test group and control to holdout for another 2-4 weeks.
    3. Compare retention, repeat purchase rate, or other long-term metrics between the groups.
    4. If the winner doesn’t hold up in the long term, consider reverting or further testing.
    5. Document the findings for future tests.

    Pro script / template / example: “We held a 15% holdout group. After 4 weeks, the test group had a 5% higher conversion rate but also a 3% lower retention rate. We decided not to implement the change.”

    📊 Expected results: Holdout groups reveal long-term damage that short-term tests miss; 20% of winning tests fail the holdout test.


    Phase 4: Analyzing Results & Implementing Winners

    After the test concludes, it’s time to analyze data, draw conclusions, and implement the winning variation. This phase ensures you get the most value from your experiments.

    Tactic 4.1: Check Statistical Significance and Practical Relevance

    Why this works: Statistical significance tells you whether the difference is unlikely to be random noise, but practical relevance tells you if it’s worth implementing. A 0.1% lift might be statistically significant but not worth the effort.

    Exactly how to do it:

    1. Look at the p-value or confidence level in your A/B testing tool. Typically, 95% confidence is accepted.
    2. Calculate the lift: (variation conversion – control conversion) / control conversion.
    3. Estimate the revenue impact: (lift × baseline conversions per month) × average order value.
    4. Consider the implementation cost (developer time, design changes).
    5. If the lift is less than 5% but cost is high, you might skip it.
    6. Document the effect size and confidence interval.
    7. If the test is inconclusive (no clear winner), decide whether to iterate or abandon the hypothesis.

    Pro script / template / example: “Variation B had a 12% lift with 98% confidence. Monthly incremental revenue = 100 extra conversions × ৳1,500 AOV = ৳150,000. Implementation cost: 4 developer hours = ৳8,000 (one-time). Net gain in the first month: ৳142,000.”
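
    Most tools report these numbers directly, but the arithmetic is easy to reproduce. The sketch below computes the lift, a two-sided p-value from a two-proportion z-test, and the monthly revenue impact. The test counts, ৳1,500 average order value, and 20,000 monthly visitors are hypothetical, and your tool may use a different statistical method (for example a Bayesian one), so its figures need not match exactly.

```python
import math
from statistics import NormalDist


def ab_test_summary(conv_a, n_a, conv_b, n_b, avg_order_value, monthly_visitors):
    """Lift, p-value, and revenue impact for control (A) vs. variation (B)."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    lift = (rate_b - rate_a) / rate_a
    # Two-sided two-proportion z-test with a pooled standard error.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Extra conversions per month if all traffic saw the variation.
    extra_conversions = monthly_visitors * (rate_b - rate_a)
    return {
        "lift": lift,
        "p_value": p_value,
        "significant_at_95": p_value < 0.05,
        "monthly_revenue_impact": extra_conversions * avg_order_value,
    }


# Hypothetical test: 3.0% vs. 3.6% conversion on 10,000 visitors each.
summary = ab_test_summary(300, 10000, 360, 10000,
                          avg_order_value=1500, monthly_visitors=20000)
for key, value in summary.items():
    print(key, value)
```

    Comparing the revenue impact with the implementation cost, as in steps 4 and 5, is what turns a statistically significant result into a go or no-go decision.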

    📊 Expected results: Focusing on practical relevance increases ROI from testing by 50% (avoiding low-impact implementations).

    Tactic 4.2: Segment Results for Deeper Insights

    Why this works: An overall winner might hide that it performed poorly for a key segment. Segmenting by device, traffic source, or customer type can reveal important nuances.

    Exactly how to do it:

    1. In your A/B testing tool, view results by segments: device (mobile, desktop, tablet), traffic source (organic, paid, social), new vs. returning users, and geographic region.
    2. Look for segments where the variation performed significantly differently.
    3. For example, the variation might be better on mobile but worse on desktop. Consider implementing only for mobile.
    4. Document segment-level findings for future tests.
    5. If a segment shows a negative impact, investigate further before full rollout.

    Pro script / template / example: “Overall, Variation B had a 5% lift. But on mobile, Variation B had a 15% lift, while on desktop it had a 2% decline. We implemented B only on mobile.”
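
    If your tool lets you export conversions and visitors per segment, the breakdown can be reproduced as below. The counts are invented to mirror the kind of split described above, and the 1,000-visitor minimum is an arbitrary assumption to avoid reading too much into tiny segments.

```python
def relative_lift(control, variant):
    """Each argument is (conversions, visitors); returns relative lift."""
    control_rate = control[0] / control[1]
    variant_rate = variant[0] / variant[1]
    return (variant_rate - control_rate) / control_rate


def lift_by_segment(results, min_visitors=1000):
    """Relative lift per segment; low-traffic segments map to None."""
    lifts = {}
    for segment, arms in results.items():
        control, variant = arms["control"], arms["variant"]
        if min(control[1], variant[1]) < min_visitors:
            lifts[segment] = None  # too few visitors to draw a conclusion
        else:
            lifts[segment] = relative_lift(control, variant)
    return lifts


# Hypothetical (conversions, visitors) per segment and version.
results = {
    "mobile": {"control": (120, 4000), "variant": (138, 4000)},
    "desktop": {"control": (150, 3000), "variant": (147, 3000)},
    "tablet": {"control": (6, 200), "variant": (9, 200)},
}
lifts = lift_by_segment(results)
for segment, value in lifts.items():
    print(segment, "n/a" if value is None else f"{value:+.1%}")
```

    The more segments you slice, the more likely one of them looks like a winner or loser by chance, so treat segment findings as hypotheses for a follow-up test rather than final proof.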

    📊 Expected results: Segment analysis uncovers opportunities that increase overall conversion by an additional 10-15%.

    Tactic 4.3: Document and Share Learnings

    Why this works: Institutional knowledge prevents repeating mistakes and helps prioritize future tests. Documentation turns one-time wins into a systematic advantage.

    Exactly how to do it:

    1. Create a shared document (Google Docs, Notion, or a CRO dashboard) with test records.
    2. For each test, record: hypothesis, control and variation details, sample size, duration, results (lift, significance), segments, and implementation decision.
    3. Include screenshots of control and variation for reference.
    4. Tag tests by category (CTA, headline, pricing, etc.) for easy search.
    5. Share the document with the team and discuss insights in weekly meetings.
    6. Use insights to generate new hypotheses (e.g., “Previous test showed orange button works better; let’s test orange vs. red”).

    Pro script / template / example: “Our test log shows that tests on the checkout page have a 70% win rate vs. 40% on homepage. We’ll focus more on checkout optimizations.”
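
    A test log does not need special software. As a rough sketch, each record from step 2 can be a small structured entry, and win rates by category fall out of a few lines of code. The field names and sample records here are hypothetical, and a "win" is assumed to mean a significant positive lift.

```python
from collections import defaultdict
from dataclasses import asdict, dataclass


@dataclass
class ExperimentRecord:
    name: str
    category: str          # tag such as "CTA", "headline", "pricing"
    hypothesis: str
    sample_size_per_variation: int
    duration_days: int
    lift: float            # relative lift, e.g. 0.12 for 12%
    significant: bool
    decision: str          # e.g. "implement", "iterate", "abandon"


def win_rate_by_category(records):
    """Share of tests per category with a significant positive lift."""
    wins, totals = defaultdict(int), defaultdict(int)
    for record in records:
        totals[record.category] += 1
        if record.significant and record.lift > 0:
            wins[record.category] += 1
    return {category: wins[category] / totals[category] for category in totals}


# Hypothetical log entries for illustration only.
log = [
    ExperimentRecord("Test01_Headline_Variant", "headline",
                     "Benefit-led headline lifts sign-ups", 5000, 25,
                     0.12, True, "implement"),
    ExperimentRecord("Test02_CTA_Copy", "CTA",
                     "Softer CTA copy lifts add-to-carts", 5000, 25,
                     0.03, False, "iterate"),
    ExperimentRecord("Test03_CTA_Color", "CTA",
                     "Higher-contrast button lifts clicks", 5000, 25,
                     0.08, True, "implement"),
]
print(win_rate_by_category(log))
print(asdict(log[0]))  # plain dict, ready to export to a sheet or JSON file
```

    Keeping the category tag consistent is what makes comparisons such as checkout versus homepage win rates possible later.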

    📊 Expected results: Companies that systematically document tests see a 30% faster rate of optimization over time.


    🏆 Illustrative Case Study: How a Dhaka-Based Business Achieved 33% More Sales

    Client: DhakaFashion (a fictional men’s clothing e-commerce store in Dhaka, Bangladesh, operating since 2022).

    The Problem: In early 2026, DhakaFashion had 60,000 monthly visitors but a conversion rate of only 1.2%. Average order value was ৳1,800. They were spending heavily on Facebook ads but not seeing a return. Their add-to-cart rate was 3%, and cart abandonment was a staggering 78%.

    Our Strategy: Over 90 days, we implemented a structured A/B testing program focusing on three key pages: product page, cart page, and checkout page. The specific tests:

    • Test 1: Changed CTA button copy from “Buy Now” to “Add to Cart – See Your Total” – lift of 18% in add-to-carts.
    • Test 2: Added trust badges (SSL, Cash on Delivery, Free Returns) near the add-to-cart button – lift of 12% in conversion.
    • Test 3: Simplified the checkout form from 8 fields to 5 fields – lift of 22% in completed purchases.
    • Test 4: Added a progress bar on the checkout page – lift of 9% in completion.
    • Test 5: Tested a sticky “Add to Cart” button on mobile – lift of 15% on mobile conversions.
    • Test 6: Changed the product image from static to 360-degree view – lift of 7% in conversion.
    • Test 7: Offered free shipping for orders over ৳2,000 – lift of 25% in average order value.

    The Results:

    • Conversion rate increased from 1.2% to 1.6% (a 33.3% increase).
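    • Monthly revenue increased from ৳1,296,000 to ৳1,728,000 (60,000 visitors × conversion rate × the original ৳1,800 average order value) – an additional ৳432,000 per month, before counting the higher average order value below.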
    • Add-to-cart rate improved from 3% to 4.5%.
    • Cart abandonment decreased from 78% to 68%.
    • Average order value rose from ৳1,800 to ৳2,100.

    Client Quote: “Rafirit Station’s A/B testing approach transformed our online store. We were skeptical at first, but the data spoke for itself. Now we test everything before launching.” — Rahim, Owner of DhakaFashion.

    See more Rafirit Station case studies →


    ✅ A/B Testing vs. Multivariate Testing: Comparison

    Aspect | A/B Testing | Multivariate Testing (MVT)
    Number of changes | One variable changed | Multiple variables simultaneously
    Traffic needed | Low to moderate (5,000+ per variation) | High (50,000+ combinations)
    Time to result | Faster (1-2 weeks typical) | Slower (3-8 weeks typical)
    Complexity | Simple, easy to set up | Complex, requires technical expertise
    Winner identification | Clear which version wins | Identifies which combination performs best
    Best for | Testing major changes (headline, CTA) | Testing fine-tuned interactions (layout, color, copy)
    Risk of false positives | Lower (fewer comparisons) | Higher (many comparisons)
    Implementation cost | Low | High (requires dedicated tools)
    Recommendation for Dhaka SMEs | ✅ Highly recommended | ⚠️ Only if traffic is very high

    ❓ Frequently Asked Questions

    Q: What’s the minimum amount of traffic needed for A/B testing?

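    It depends on your baseline conversion rate and the lift you want to detect. At a 3% baseline, detecting a 20% relative lift with 80% power at 95% confidence takes roughly 14,000 visitors per variation; at a baseline near 8%, about 5,000 per variation is enough. If your page gets fewer visitors, run the test longer, or rely on qualitative methods for low-traffic pages instead.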

    Q: How long should I run an A/B test?

    Run at least one full business cycle (including weekends). For high-traffic sites, 7-14 days can be sufficient, but always calculate based on sample size: if you need 5,000 visitors per variation (10,000 in total) and the page gets 200 visitors/day, run for 50 days. Never stop before reaching sample size.

    Q: What statistical significance level should I use?

    95% confidence (p-value < 0.05) is the industry standard for most tests. For high-stakes tests (e.g., pricing changes), consider 99%. Lower levels like 90% may be acceptable for exploratory tests, but with higher risk of false positives.

    Q: Can I A/B test my search ads or social media posts?

    Yes! The same principles apply to ads. Platforms like Google Ads and Facebook Ads have built-in A/B testing features for headlines, images, and CTAs. Test one variable at a time and ensure equal budget allocation for fair results.

    Q: What if my test results are inconclusive?

    Inconclusive results happen 20-30% of the time. It could mean the change didn’t matter, the sample size was too small, or external factors interfered. Don’t implement any change; instead, refine the hypothesis and test a bolder variation or gather more qualitative feedback.

    Q: How many A/B tests should I run per month?

    For small businesses, 2-4 well-designed tests per month is ideal. Quality over quantity. As your team gains experience, you can scale to 8-10 tests per month. Avoid running overlapping tests on the same page to prevent interaction effects.

    Q: Does Rafirit Station offer A/B testing services?

    Absolutely. Rafirit Station provides end-to-end A/B testing services, from hypothesis generation to implementation. Our CRO specialists work with Dhaka businesses to optimize conversions. Learn more about our CRO services →


    🎯 The Bottom Line

    A/B testing is not a one-time fix; it’s a continuous process of improvement. The biggest mistake businesses make is treating it as a project with an end date. Instead, embed testing into your culture.

    Here’s the counterintuitive take: Sometimes the best result of an A/B test is a failure. A failed test teaches you what doesn’t work and stops you from rolling out a change that would have hurt performance while you believed you were improving it.

    In 2026, with increasing competition in Dhaka’s digital market, those who test systematically will outperform those who rely on intuition. Start with one small test this week.


    ⚡ Your Next Step (Do This Today)

    1. Open your Google Analytics and find one page with high traffic but low conversion.
    2. Set up an A/B testing tool (for example VWO’s free plan; Google Optimize was discontinued in September 2023).
    3. Create a simple hypothesis: change one element (e.g., headline or CTA).
    4. Set up the test with 50/50 traffic split and define a clear goal.
    5. Let the test run for at least 7 days, and until it reaches the required sample size, without peeking.
    6. After the test, analyze the results and decide whether to implement.
    7. Document the outcome in a shared log.

    Ready to Get Results?

    Rafirit Station helps Bangladeshi businesses achieve measurable growth through expert A/B testing and CRO strategies.


    🗓 Book Your Free Strategy Call →

    💬 Drop “A/B testing” in the comments and we’ll send you our free A/B testing checklist template — no email required.
