Analytics & Digital Marketing Tips

The Best Revenue Significance Calculator for A/B Testing

May 11, 2017           Conversion Testing

If you’re conducting A/B tests on your ecommerce website and are not tracking revenue, then you are missing out on a crucial component for successful testing: having the right KPI.

Tracking revenue allows your team to make effective business decisions, because you’re measuring performance in a way that actually impacts the bottom line.

So which revenue metrics should you choose?

Some common revenue metrics don’t tell the whole story, which is why we recommend using revenue per visitor (RPV). RPV measures the amount of revenue generated each time a user visits your site:

RPV = Total Revenue / Total Users

We’re about to explain:

  • why revenue per visitor is such a crucial (composite) metric
  • the need to rewrite the RPV’s formula to include transaction rate and AOV
  • the right way to measure its statistical significance
  • how to use our free online revenue significance calculator
  • how to hack your way around sampled data to get the most accurate results

Why Use Revenue Per Visitor in A/B Testing?

If your team tracks only transaction rate (the percentage of visitors that purchased) or average order value (AOV) as your primary metric for testing, your results are at risk of having blind spots.

graphic showing revenue metrics in a/b testing

Some people assume that AOV is relatively constant and they only need to focus their efforts on increasing transaction rate in order to increase revenue. However, this logic doesn’t always apply.

In some circumstances, increasing conversion rate can negatively affect your overall revenue.

For example, if you have a test variation that increases the conversion rate, but users choose to purchase the lower-priced product instead of the more expensive product, this can decrease AOV and overall revenue.

chart: a/b test variation with increased conversion rate

chart: negative impact on revenue a/b testing results

Alternatively, people may focus only on improving AOV to increase revenue, which can lead to a decrease in transaction rate and ultimately hurt revenue.

For example, consider an ecommerce website test where the variation increases the spend threshold to qualify for free shipping. This can lead to a higher AOV, but can also decrease transaction rate because there may be visitors who want free shipping but don’t want to spend the extra money to qualify. As a result, they may choose not to purchase.

table: a/b test results with lower transaction rate

chart: a/b testing results that decreased revenue
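To make the two scenarios above concrete, here is a minimal Python sketch. All of the visitor, transaction, and revenue figures are hypothetical numbers chosen for illustration; they are not taken from the charts in this post. The `summarize` helper is likewise just an illustrative name.

```python
def summarize(users, transactions, revenue):
    """Return transaction rate, AOV, and RPV for one test arm."""
    return {
        "transaction_rate": transactions / users,
        "aov": revenue / transactions,
        "rpv": revenue / users,
    }

# Hypothetical control arm: 3% transaction rate, $100 AOV.
control = summarize(users=10_000, transactions=300, revenue=30_000)

# Hypothetical scenario 1: more visitors convert, but they buy the cheaper product.
cheaper_product = summarize(users=10_000, transactions=360, revenue=27_000)

# Hypothetical scenario 2: a higher free-shipping threshold lifts AOV,
# but fewer visitors complete a purchase.
higher_threshold = summarize(users=10_000, transactions=240, revenue=27_600)

for name, arm in [
    ("control", control),
    ("cheaper product", cheaper_product),
    ("higher threshold", higher_threshold),
]:
    print(
        f"{name}: rate={arm['transaction_rate']:.1%}, "
        f"AOV=${arm['aov']:.2f}, RPV=${arm['rpv']:.2f}"
    )
```

In both hypothetical variations one metric improves while RPV falls below the control, which is the blind spot a single-metric KPI would miss.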

The examples above illustrate the need to have a solid conversion strategy for revenue that incorporates both metrics. Revenue per visitor is that composite metric, which accounts for both transaction rate and AOV.

In fact, we can rewrite the RPV’s formula to include these two elements:

Total Revenue = AOV x Transactions
Transaction Rate = Transactions/Total Users
RPV= AOV x Transaction Rate

So if your business had 1,000 transactions for every 15,000 users with an AOV of $50, the RPV would be:

Total Revenue = $50 * 1,000 = $50,000
Transaction Rate = 1,000/15,000 ≈ 0.067
RPV = $50 * 0.067 ≈ $3.35 (or $3.33 if the transaction rate is not rounded first)

Monitoring trends in RPV can help your team analyze sales performance. It’s useful for evaluating your new visitor acquisition and paid user acquisition efforts.

Generally, a positive trend in RPV shows that your company’s sales efforts are working well.

However, if your revenue per visitor is trending downward, this could be the result of an increase in unqualified users to the site or potential site problems (e.g. broken shopping cart), which negatively affects your transaction rate.

Or your visitors may be converting at the same rate but are spending money on lower value items (e.g. higher priced product is out of stock), which negatively impacts your AOV.

Taking the example above, let’s say the number of users increased to 20,000 due to a social campaign that recently launched. Assuming the AOV stayed the same, your team would find that RPV is trending negatively:

Transaction Rate = 1,000/20,000 = 0.05
RPV = $50 * 0.05 = $2.50

Now let’s assume that the traffic stayed the same but your most expensive product was out of stock, causing the AOV to decrease to $37.30:

Transaction Rate = 1,000/15,000 ≈ 0.067
RPV = $37.30 * 0.067 ≈ $2.50
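The three calculations above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the `rpv` function name is our own. Because it does not round the transaction rate to 0.067 first, the results can differ from the hand calculations by a cent or two.

```python
def rpv(aov, transactions, users):
    """Revenue per visitor, computed as AOV x transaction rate."""
    transaction_rate = transactions / users
    return aov * transaction_rate

scenarios = {
    "baseline": rpv(aov=50.00, transactions=1_000, users=15_000),
    "more users, same AOV": rpv(aov=50.00, transactions=1_000, users=20_000),
    "same users, lower AOV": rpv(aov=37.30, transactions=1_000, users=15_000),
}

for name, value in scenarios.items():
    print(f"{name}: ${value:.2f}")
```

Both changes pull RPV down to roughly the same level, even though one comes from transaction rate and the other from AOV.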

RPV does not replace the need to keep an eye on other metrics like AOV and transaction rate. It removes potential blind spots that can occur if you choose to track only those metrics. In essence, it gives your team a better sense of the bigger picture.

How NOT to Calculate Statistical Significance

If your team is already using revenue per visitor as the main KPI for your tests, you may have figured out why you shouldn’t use the standard online revenue significance calculators to determine whether your test variation is having an actual impact on RPV. These standard “tools” perform calculations using a T-test, which operates on one critical assumption: that the metric you’re tracking follows a normal distribution.

example: normal distribution in a/b testing statistical analysis

Source: Statistics Cheat Sheet

Revenue per visitor doesn’t follow a normal distribution and therefore violates this assumption. The majority of visitors to your site will not make a purchase, so RPV’s distribution contains a heavy concentration of $0 values. There is also no limit on how much a visitor can spend, so your RPV data may contain some extreme values.

a/b test data distribution chart for revenue per visitor

For these reasons, RPV’s distribution tends to be right-skewed, making the standard T-test less reliable for measuring statistical significance.

right skewed data distribution for revenue per visitor metric
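If you want to see this shape for yourself, the sketch below simulates per-visitor revenue and measures its skew. Everything in it is an assumption made for illustration: the 5% purchase rate, the log-normal order values, and the function names are not drawn from real data or from our calculator.

```python
import random
import statistics

random.seed(42)  # fixed seed so the simulation is repeatable

def simulate_visitor_revenue(n_visitors, purchase_rate=0.05):
    """Simulated revenue per visitor: mostly $0, with a few purchases."""
    revenue = []
    for _ in range(n_visitors):
        if random.random() < purchase_rate:
            # Hypothetical order values with a long right tail.
            revenue.append(round(random.lognormvariate(3.9, 0.6), 2))
        else:
            revenue.append(0.0)
    return revenue

def skewness(values):
    """Moment-based sample skewness; positive means right-skewed."""
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / len(values)
    m3 = sum((v - mean) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5

revenue = simulate_visitor_revenue(10_000)
share_zero = revenue.count(0.0) / len(revenue)

print(f"share of $0 visitors: {share_zero:.1%}")
print(f"mean RPV: ${statistics.fmean(revenue):.2f}")
print(f"median revenue: ${statistics.median(revenue):.2f}")
print(f"skewness: {skewness(revenue):.2f}")
```

A median of $0 sitting well below the mean, together with positive skewness, is the pattern that makes a normality assumption questionable for this kind of data.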

The Right RPV Confidence Calculator for the Job

To solve this problem, we launched a free online Revenue Per Visitor confidence calculator designed specifically for calculating RPV’s statistical significance. Our RPV calculator utilizes the Wilcoxon Rank Sum Test, which is not based on the assumption that the data follows a normal distribution.

In fact, the Wilcoxon Rank Sum Test employs a non-parametric technique — a technique that does not rely on any specific distributional assumption — in order to test whether there is a difference.

This calculation is far more reliable in determining whether there is an actual impact on RPV. It includes a two-tailed calculation, so you can use it to determine whether the variation had a positive impact or a negative impact when compared to the control.
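For readers curious about what a rank sum test does under the hood, here is a minimal pure-Python sketch. It is not the code behind our calculator. It assumes a two-tailed test using the normal approximation with a tie correction (important here because of all the $0 values) and no continuity correction, so library implementations may give slightly different p-values. The function name and the sample revenue lists are hypothetical.

```python
import math

def rank_sum_test(control, variation):
    """Two-tailed Wilcoxon rank sum (Mann-Whitney U) test.

    Returns (U statistic for control, approximate p-value).
    """
    control, variation = list(control), list(variation)
    n1, n2 = len(control), len(variation)
    n = n1 + n2
    pooled = sorted(control + variation)

    # Give tied values the average of the ranks they occupy.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        ties = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += ties ** 3 - ties
        i = j

    rank_sum_control = sum(rank_of[x] for x in control)
    u_control = rank_sum_control - n1 * (n1 + 1) / 2

    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every value is identical, so there is nothing to compare
        return u_control, 1.0

    z = (u_control - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-tailed
    return u_control, p_value

# Hypothetical user-level revenue for two small groups.
control_revenue = [0, 0, 0, 0, 0, 0, 0, 0, 20, 35]
variation_revenue = [0, 0, 0, 0, 0, 0, 25, 40, 60, 80]

u, p = rank_sum_test(control_revenue, variation_revenue)
print(f"U = {u}, p = {p:.3f}")
```

Note that the two lists do not need to be the same length, and that the test works on ranks rather than on the raw dollar amounts, which is why extreme order values do not dominate the result.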

How to Use the RPV Calculator

If you take a sneak peek at our testing confidence calculator, you’ll notice it looks different from the standard statistical significance calculators.

Standard Online Calculators

example of standard revenue significance calculator

Blast’s Revenue Per Visitor (RPV) Calculator

screenshot of revenue significance calculator by blast

As mentioned above, you cannot simply enter total visitors and total revenue per variation to determine statistical significance.

To accurately measure whether there is an impact on RPV, you need to have user-level data.

Most businesses choose to integrate their A/B tests with their preferred analytics platform and analyze test performance there. This allows teams to make an apples-to-apples comparison when looking at performance across different channels, such as testing and marketing efforts.

The problem is that while you can see overall revenue for test variations within analytics, it is much more difficult to get access to user-level data.

Unsampled Google Analytics Data Hack

The Blast team has a solution for obtaining user-level data so you can make use of the revenue significance calculator.

It may take a little leg work in the beginning, but your team will reap the benefits for the long term. To get user-level data within Google Analytics, follow the steps below and you’ll be on your way to A/B testing success.

1. Create a Custom Dimension for Client ID

Google Analytics (GA) has recently started offering a new User Explorer report. The best part of this report is that it has a Client ID dimension that tracks user-level behavior, which is specific to browser and device.

screenshot client id dimension in google analytics user explorer report

Now the downside!

In its current state, you can’t access this dimension outside of this report, so your team can’t pull this data into a custom report.

To get around this problem, your team will need to create a custom dimension for the Client ID. This step should take roughly 1-2 hours for your analytics team to create, QA, and implement. Once this is implemented, you’ll be able to use the Client ID dimension for your test reports as well as other Google Analytics reports.

Google Analytics screenshot: where to create custom dimension for revenue significance calculator

You may think this step isn’t worth the effort and that you can just export the data from the User Explorer report, but that will only work if you have minimal traffic to the site. The User Explorer report caps the data to 10,001 rows.

If your site receives more than 10,000 visitors within the time frame you select, then you won’t be able to see all user-level data and instead will get a sampling of the data. By creating the Client ID custom dimension, you can create a custom report for your test, containing the Client ID, where you’ll be able to capture all the rows of data.

User Explorer Report: Limits Client ID and accompanying revenue data to 10,001 rows.

screenshot: limited rows in google analytics user explorer report

Custom Report: Provides Client ID (via a custom dimension) and accompanying revenue data beyond 10,001 rows.

screenshot: google analytics custom report

2. Utilize unSampler to Export All Data

popup showing number of rows google analytics allows for export

As your team uses the Client ID custom dimension within other Google Analytics reports, there is another challenge that lies ahead. Google Analytics caps the number of rows you can export at one time to 5,000 rows.

If you really have the time, you can attempt to export your report data 5,000 rows at a time, but for most people this is completely inefficient. Previous hacks like altering the number in the URL to show more rows no longer work.

If your business has Google Analytics 360, then your team has the feature to export all data by utilizing the Unsampled report.

Resolving the sampling issues in the standard version of Google Analytics is as simple as creating an unSampler account and linking your Google Analytics account to it. Doing so will enable your team to easily create a test report (where you will have access to your custom dimensions) and export all of your data to CSV.

screenshot of workaround for sampled data in google analytics

image showing where to export unsampled google analytics data to csv file

3. Format & Upload CSV 

Once you’ve exported data from your unSampler Report, you’ll need to take a few quick steps to format it so it will be ready to use with the revenue significance calculator. First, you’ll need to filter your data for the control:

screenshot showing how to filter data before using revenue significance calculator

Then copy the revenue data and paste it in a new tab (optional: you can rename the header to Control Revenue).

Repeat this step with your test variation. After doing so, in the new tab you should have two columns for revenue (Control Revenue and Variation Revenue). Please note, if you have more than one test variation, you’ll need to create separate tabs for each one (e.g. Control vs Variation 1, Control vs Variation 2, Control vs Variation 3).

screenshot showing how to filter control data before using revenue significance calculator

Save this new tab as a CSV file (or multiple CSV files if you have more than one test variation) and then it’s ready for the RPV Calculator.
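If you would rather script the filtering and copy/paste steps above than do them in a spreadsheet, a short Python sketch like the one below could work. It makes several assumptions you would need to adapt: the export column names (`Client ID`, `Variation`, `Revenue`), the variation labels, and the function name are all hypothetical, and we are assuming that blank cells are acceptable when the two groups have different numbers of users, which is what a spreadsheet would produce when saved as CSV.

```python
import csv
import io
from itertools import zip_longest

def to_calculator_csv(export_text, control="Control", variation="Variation 1"):
    """Turn a long-format export into two side-by-side revenue columns."""
    reader = csv.DictReader(io.StringIO(export_text))
    revenue = {control: [], variation: []}
    for row in reader:
        group = row["Variation"]  # hypothetical column name
        if group in revenue:
            revenue[group].append(row["Revenue"])  # hypothetical column name

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Control Revenue", "Variation Revenue"])
    # Pad the shorter column with blanks when group sizes differ.
    writer.writerows(
        zip_longest(revenue[control], revenue[variation], fillvalue="")
    )
    return out.getvalue()

# Hypothetical export rows, one per Client ID.
sample = """Client ID,Variation,Revenue
111.1,Control,0
222.2,Variation 1,49.99
333.3,Control,25.00
444.4,Variation 1,0
555.5,Control,0
"""

print(to_calculator_csv(sample))
```

For a test with several variations, you could call the function once per variation label and save each result to its own CSV file, matching the one-tab-per-comparison approach described above.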

Before uploading your file to the calculator, you can adjust the threshold for determining statistical significance — the default is set at 95%. The last step is simply uploading your file.

where to upload your file to use the revenue significance calculator

The results you get are fast, reliable and easy to understand.

example of revenue significance calculator results

A/B Test Results You Can Trust

While it takes a little bit of effort in the beginning to properly measure revenue per visitor, once it’s set up you can easily analyze this KPI for future tests. Further, by using the free online revenue significance calculator, you can trust that the correct method of analysis was applied.

Your team can rely on test performance results to make those important business decisions.

Please share your comments or let us know if you have questions regarding this process or the calculator.

 

  • Martin

    Good stuff!
    How would you handle a situation where there are not the same number of users in control and variation?

  • Roopa Carpenter

    Hi Martin! The RPV Calculator utilizes the Wilcoxon (Mann-Whitney U) test to compare the means, and so it does not require that your control and test sample sizes be exactly the same. However, generally speaking, it’s a good practice to have similar sample sizes for both groups. Thanks!

  • Nice we just got a prototype for a calculator up by our selves for internal need, but you have more features and polish already available. 🙂
    I was searching for a mann whitney calculator, but since your page is SEO optimized for less specific (and probably more accessible) search terms I missed it in my initial search

    Whats your take on the difficulty of reaching significance on the revenue per user metric compared to for example transaction CR (I fully realize different implications of the different metrics, but testers tend to gravitate to test that have a higher chance of “winning”)

  • Silvia Bordogna

    Very interesting solution to test revenues!

    Just a question:
    you said that distribution of RPV should be left skewed and with modal value equal 0. Why in csv provided as sample dataset you don’t have any 0 value?

    I’ve tried to implement this test and my RPV (daily) is never 0. Maybe I should calculate RPV group by different tense unit? What did you mean in your article?

    Thanks!!

  • Roopa Carpenter

    Hi Patrik! I feel that it’s important to have goals or metrics that affect the bottom-line for a company. Our philosophy is that testing for the win is not as important as making the right business decision. Often times the right business decision is tied to revenue and that’s why we recommend having it as a primary KPI. However, other metrics, such as transaction CR, should be considered as well. Thank you!

  • Roopa Carpenter

    Hi Silvia! You are correct that in real RPV data the modal value will be zero and thus left skewed. The sample data in the csv, however, is only meant for demonstration purposes regarding visualization and calculations and is not intended to mirror actual RPV data that you will encounter. I hope this helps!

  • Yann

    That’s really great, thanks so much for doing this.
    I just wish the calculator wasn’t limited to 5mb files as my files are generally between 10-20mb, so I can’t use it.

  • paulmkoch

    Hi Roopa, thanks for the tool!

    I tried to quickly estimate how much longer we’d need to run a test, assuming our RPV numbers stay the same. I copied and pasted the identical columns of our actual RPV data to double the sample size, then re-uploaded to see how much the confidence level improved. But, the P value doesn’t change. I can add multiple copies of our live data to the columns, and the P value always stays the same. I’d have expected the higher sample size to increase the confidence level, even if each version’s total RPV stayed the same. Do you know why that’s not happening?

    Thanks!

  • Ioana

    Hi! Thanks for this article, sounds great. Can you please clarify the structure of the file that we upload to the calculator? Assuming we have just one variation, would it be a csv file with a column of all revenue values each user has spent in the control, and then another similar column for the variation? Of course, that will include values of 0 for users who haven’t spent anything.

    Thanks,
    Ioana

  • Roopa Carpenter

    Hi Ioana! The calculator is meant to compare revenue data for two or more variations and so if you only had one variation, you would need a second column (with a value of 0) for users who haven’t spent anything. Thank you!

  • Roopa Carpenter

    Hi Paul! You are right that the higher sample size should increase the confidence level even if each variation’s total RPV stays the same. We re-tested the calculator using the same method you outlined to double the sample size, and the confidence level (P value) does change with the higher sample size. I would encourage you to try it again and please let us know if you continue to see this happening! Thank you!

  • Ioana

    Hi Roopa, thank you for your reply. So I have a control group and a variation group. The file should have just 2 columns (one for control and one for variation) with the revenue associated with each client ID in each of the two groups, right?

    [I was confused by the headers in the sample dataset which read “control revenue per user” and “learn more revenue per user”, but clearly the metric we need to use in the columns isn’t “Revenue per user” but rather “Client ID Revenue”, right?]

    Thanks a lot

  • Roopa Carpenter

    Hi Ioana! Yes, the columns would contain revenue associated with each client ID. Thank you!

 
