Ecommerce Use Case: How Machine Learning Can Increase Profits
Want to increase your sales? Here’s a detailed case study on how ecommerce companies can increase profits with machine learning algorithms.
Ecommerce is becoming a very crowded space. Competing businesses can sell their products all over the planet, and getting a good piece of the marketplace is harder and harder to accomplish.
Most eCommerce entrepreneurs have mastered content marketing. They understand the concepts of building relationships with customers, of keeping each content marketing platform engaging and up-to-date. They build their email lists and retain users with creative campaigns. They are even moving into geo-location and personalization with their content outreach. And still, they are not able to increase sales performance for all of their efforts.
The Answer Lies in eCommerce Analytics and Data Science
When e-commerce companies are asked about how they are using analytics and data, there seems to be a disconnect. More than half of Fortune 500 companies are actually using big data to analyze their websites for traffic, user experience, and behavior, in order to gather the information needed to influence user behavior. But small and mid-sized e-commerce businesses have not taken full advantage of the data science and big data analytics that are available. There are two reasons for this:
- They may use Google data analytics and generate plenty of reports that show areas of weakness, but they are not sure how to effectively correct those weaknesses.
- They may believe that they have to hire a data scientist like the “big boys” do, someone who can not only collect and analyze data but also collaborate with marketing staff to develop complex and expensive strategies. Such strategies address site-level challenges (traffic patterns, bounce points and rates, etc.) and also analyze specific customer behaviors and how those behaviors can be targeted to increase sales in the future. It’s pretty amazing stuff, actually, and much of it is accomplished through machine learning, which lets machines use algorithms and math to solve specific problems better than humans can.
The truth is this: An eCommerce business of any size can take advantage of data science and use it to ramp up its customer base (and thus profits). For small and mid-sized businesses, this does not mean adding expensive big data science experts. It means contracting with a service whose data scientists can collect the data, organize and analyze it, develop models, and then collaborate with others on their teams to make recommendations to the eCommerce business, including recommendations on improving the conversion rate.
Why Data Science?
If you have purchased anything on Amazon recently, you will see some interesting things pop up, as you search for products and ultimately make a purchase. One of the most prominent features you will see is the statement: “Other customers who purchased this product also purchased these.” And then additional products will be displayed for your viewing.
Data science for e-commerce analytics has been used to group you with customers who may be of the same age range, the same sex, and with the same interests that you have. Data science is tracking your behavior and offering other potential purchases to you, based upon all of these factors. Chances are you will look at those other products, may purchase one or two, or at least be aware that they exist so that you may return and purchase them. Big data eCommerce analysis allowed Amazon to customize its website in real-time, just for you. And it can do much more.
Data science techniques, indeed, are powerful tools, and all eCommerce businesses should be using them. Let me show you how:
What Data Science Can Fix For Your Business Ecommerce Analytics
The problems that eCommerce businesses face are pretty typical: low conversion rates, high bounce rates, cart abandonment, lack of customer loyalty, etc. Their own analytics will show this in the reports they generate. But those reports lack the deeper insight that data science can provide, the kind of insight from which individual solutions can be developed and implemented.
Romexsoft has the team and the tools for this kind of in-depth analysis, which can drive what a business does to increase its revenue, user by user, customer by customer.
Case Study: Boosting Customer Loyalty and The Average Check With Big Data
Recently an online retailer contacted us with the following problem(s). He has a large line of casual and sports clothing and shoes for people of all ages, for both genders, and for a range of style preferences.
What he was discovering was this: he could get a customer “in the door,” and often get a purchase. But most customers were not “coming back for more” and/or purchasing other products that would suit them.
What he wanted from Romexsoft was a full analysis of what he could do to change his customers’ behaviors and move them to purchase more.
Our process involved several steps, and in the end, we were able to make recommendations that, when implemented, increased his sales almost immediately. Here was the process:
Ecommerce Analytics Analysis of the Site Structure Itself
When our team examined the website, we were able to make a few suggestions after detailed research. Using basic analytics, we were able to locate the least popular pages, the pages with the highest bounce rates, and the most and least popular products, based upon the correlation between views and actual purchases.
For example, there were several shoe products that the retailer was considering discarding. While there were many views, the proportion of purchases was quite low. What we discovered through our analytics, was that the problem was not the product – the problem was the pricing.
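The views-versus-purchases check described here can be sketched in a few lines of Python. This is an illustration only, not the tooling used in the project: the `ProductStats` class, the `flag_for_price_review` helper, the thresholds, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProductStats:
    product_id: int
    views: int
    purchases: int

    @property
    def conversion(self) -> float:
        # Share of product-page views that ended in a purchase.
        return self.purchases / self.views if self.views else 0.0


def flag_for_price_review(stats, min_views=1000, max_conversion=0.01):
    """Return ids of products that attract attention but rarely sell."""
    return [
        s.product_id
        for s in stats
        if s.views >= min_views and s.conversion <= max_conversion
    ]


# Hypothetical numbers, for illustration only.
catalog = [
    ProductStats(product_id=42, views=5000, purchases=20),   # many views, few sales
    ProductStats(product_id=45, views=4000, purchases=320),  # healthy conversion
    ProductStats(product_id=48, views=150, purchases=1),     # too little traffic to judge
]
flagged = flag_for_price_review(catalog)
print(flagged)
```

A product that lands on this list is a candidate for a pricing review rather than for removal, since the view count suggests the interest is there.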
Our e-Commerce Software Developers were able to remodel the structure of the site, revise groupings of products, and recommend the correct price points for “low sale” products.
But the real work to solve the problem was just beginning. The job ahead of us was ultimately to analyze the behavior of each individual customer and determine how to change that behavior so that it translated into more purchases. This information would be valuable for existing customers but also for new customers who visited.
Generating The Test Data
To prepare for deep analysis, we first had to organize products by type (e.g., shirt, shoes), sex, age group, purpose (casual or sport), and brand/pricing, along with a full history of the number of views of each product page and the information that was provided on that page. We generated more than 150,000 records of data to test.
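For readers who want to experiment, here is a rough Python sketch of how test records with these attributes could be generated. It is not the generator used in the project: the field names, the value lists, the user and product counts, and the 1 to 5 range for views are all assumptions made for illustration.

```python
import random

CATEGORIES = ["shoes", "jacket", "shorts", "cap"]
AGE_GROUPS = ["children", "adult"]
GENDERS = ["male", "female"]
PURPOSES = ["casual", "sport"]
BRANDS = [f"Brand {letter}" for letter in "ABCDEFGHIJ"]


def generate_records(n, n_users=40_000, n_products=2_000, seed=0):
    """Generate n synthetic (user, product, attributes, views) records."""
    rng = random.Random(seed)  # fixed seed keeps the test data reproducible
    # Fix each product's attributes once so they stay consistent across records.
    products = {
        pid: {
            "brand": rng.choice(BRANDS),
            "category": rng.choice(CATEGORIES),
            "age_group": rng.choice(AGE_GROUPS),
            "gender": rng.choice(GENDERS),
            "purpose": rng.choice(PURPOSES),
        }
        for pid in range(1, n_products + 1)
    }
    records = []
    for _ in range(n):
        pid = rng.randint(1, n_products)
        records.append({
            "user_id": rng.randint(1, n_users),
            "product_id": pid,
            **products[pid],
            "views": rng.randint(1, 5),
        })
    return records


records = generate_records(1_000)
print(records[0])
```

Calling `generate_records(150_000)` would produce a set of roughly the size mentioned above. Uniform random choices are the simplest option; real browsing data is far less evenly spread.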
Statistical E-Commerce Analytics Analysis and Machine Learning
Using Java and Apache Spark, we applied the item-to-item collaborative filtering approach popularized by Amazon. What this means is as follows:
- Each product was described by its type, sex, age, brand, and purpose.
- We filtered by three variants – the item code, the product code, and the “rate” which we defined as click-throughs to that product.
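To make the idea of item-to-item filtering concrete, here is a minimal sketch in plain Python. The project itself used Java and Apache Spark, and the source does not say which similarity measure was applied; cosine similarity is used here as one common choice. The triples and the helper names `cosine` and `most_similar` are hypothetical.

```python
from collections import defaultdict
from math import sqrt

# (user_id, product_id, rate) triples; values are made up for illustration.
ratings = [
    (1, 42, 3), (1, 45, 2), (1, 717, 1),
    (2, 42, 2), (2, 45, 3),
    (3, 45, 1), (3, 717, 4),
]

# Build one sparse vector per product: {user_id: rate}.
item_vectors = defaultdict(dict)
for user_id, product_id, rate in ratings:
    item_vectors[product_id][user_id] = rate


def cosine(a, b):
    """Cosine similarity between two sparse {user_id: rate} vectors."""
    shared = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(product_id, top_n=2):
    """Rank the other products by similarity to the given one."""
    target = item_vectors[product_id]
    scores = [
        (other, cosine(target, vec))
        for other, vec in item_vectors.items()
        if other != product_id
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


similar_to_42 = most_similar(42)
print(similar_to_42)
```

In this toy data, product 45 comes out as the closest match to product 42 because the same two users clicked through to both, while product 717 shares only one user with it.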
We were then able to generate data on actual customer tastes. Here is a sampling of that data:
User id | Brand | Product id | Category of product | Product type by age | Product type by gender | Product for sports or casual wear? |
1 | Brand A | 42 | shoes | children | male | casual |
1 | Brand A | 45 | shoes | children | male | casual |
1 | Brand A | 48 | shoes | children | male | casual |
1 | Brand B | 717 | jacket | children | male | sport |
… | … | … | … | … | … | … |
19761 | Brand H | 123 | shoes | children | female | casual |
19761 | Brand B | 1186 | shorts | children | male | sport |
19761 | Brand C | 1190 | shorts | children | male | sport |
… | … | … | … | … | … | … |
38335 | Brand H | 95 | shoes | adult | female | casual |
38335 | Brand C | 1596 | cap | children | male | sport |
38335 | Brand C | 1597 | cap | children | male | sport |
… | … | … | … | … | … | … |
39999 | Brand J | 41 | shoes | adult | male | casual |
39999 | Brand E | 59 | shoes | children | male | casual |
39999 | Brand E | 60 | shoes | children | male | casual |
39999 | Brand E | 61 | shoes | children | male | casual |
39999 | Brand E | 62 | shoes | children | male | casual |
39999 | Brand E | 64 | shoes | children | male | casual |
Establishing Predictions for Customer Rates Based Upon Actual Rates
Next, we wanted to generate data that would tell us the predicted rate (click-throughs) of customers who looked at more than one product, if they were shown similar products. This is a sampling of that data:
This first chart shows a customer looking at a specific product and the actual product rate (the number of times the customer actually clicked through).
Users id | Products id | Products rate (in fact) |
0004 | 0940 | 3 |
0005 | 1047 | 1 |
0007 | 1492 | 3 |
0010 | 0123 | 2 |
0011 | 0648 | 2 |
0012 | 0306 | 3 |
0014 | 0023 | 2 |
0017 | 0060 | 1 |
0019 | 0308 | 2 |
0020 | 0091 | 2 |
0021 | 0035 | 4 |
0025 | 0452 | 3 |
This next chart shows the same customer and the predicted product rate if shown similar items:
Users id | Products id | Products rate (in fact) | Products rate (predicted) |
0004 | 0940 | 3 | 3.199 |
0005 | 1047 | 1 | 1.722 |
0007 | 1492 | 3 | 2.615 |
0010 | 0123 | 2 | 2.724 |
0011 | 0648 | 2 | 1.830 |
0012 | 0306 | 3 | 2.708 |
0014 | 0023 | 2 | 2.105 |
0017 | 0060 | 1 | 1.196 |
0019 | 0308 | 2 | 2.403 |
0020 | 0091 | 2 | 2.468 |
0021 | 0035 | 4 | 3.255 |
0025 | 0452 | 3 | 2.119 |
The actual and predicted rates are close in most rows, and the predictions come from proven predictor models. What this machine learning output tells the business owner is that he should be showing individual customers similar products, which they might not otherwise hear about but which will suit them the most. And this is the value of using data science in retail: informing the retailer of the potential for customers to click through to other products when presented with them. And because the data puts customers into groups, customers with similar behavior and interests can be shown the same similar products.
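How close is "close"? One way to put a number on it is to compute the mean absolute error (MAE) and root mean squared error (RMSE) over the twelve rows of the table above. The sketch below does this in Python; these two metrics are standard, but the source does not say which measure, if any, was used in the project.

```python
from math import sqrt

# (actual rate, predicted rate) pairs copied from the table above.
pairs = [
    (3, 3.199), (1, 1.722), (3, 2.615), (2, 2.724),
    (2, 1.830), (3, 2.708), (2, 2.105), (1, 1.196),
    (2, 2.403), (2, 2.468), (4, 3.255), (3, 2.119),
]

errors = [predicted - actual for actual, predicted in pairs]
mae = sum(abs(e) for e in errors) / len(errors)
rmse = sqrt(sum(e * e for e in errors) / len(errors))

print(f"MAE:  {mae:.3f}")
print(f"RMSE: {rmse:.3f}")
```

For this sample, the predictions are off by a little under half a click-through on average, on a scale that runs from 1 to 4.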
Predictions of Product Presentations/Ratings Based Upon Customer Groups
Now that the retailer knows he will be presenting similar products to his customers, the next data science challenge is to determine the products to present. Again, machine learning takes over based upon customer groups and past product rates of those groups and then generates a listing of the similar products to which customers should be exposed.
The following chart is an example of what this data report will show, based upon six additional products that should be shown to each customer, along with predicted ratings.
Users id | Product id | Rating | Product id | Rating | Product id | Rating | Product id | Rating | Product id | Rating | Product id | Rating |
14 | 1027 | 3.919 | 1316 | 3.774 | 507 | 3.745 | 861 | 3.645 | 1154 | 3.63 | 1686 | 3.608 |
11 | 1316 | 3.042 | 1430 | 2.958 | 890 | 2.836 | 958 | 2.809 | 1551 | 2.807 | 1825 | 2.804 |
17 | 188 | 4.517 | 890 | 4.475 | 895 | 4.372 | 177 | 4.354 | 899 | 4.284 | 209 | 4.27 |
4 | 1825 | 4.276 | 497 | 4.195 | 720 | 4.137 | 786 | 4.125 | 1796 | 4.093 | 942 | 4.01 |
39 | 219 | 3.794 | 709 | 3.762 | 188 | 3.762 | 1316 | 3.728 | 890 | 3.706 | 284 | 3.698 |
42 | 196 | 3.168 | 891 | 3.14 | 1238 | 3.139 | 801 | 3.072 | 371 | 3.072 | 266 | 3.059 |
12 | 890 | 4.72 | 507 | 4.628 | 1554 | 4.579 | 786 | 4.552 | 1856 | 4.519 | 127 | 4.511 |
33 | 1547 | 4.249 | 1270 | 4.176 | 801 | 4.136 | 1649 | 4.082 | 1152 | 4.009 | 1480 | 4.005 |
7 | 482 | 5.294 | 890 | 5.129 | 1370 | 5.055 | 1620 | 5.01 | 149 | 4.979 | 1647 | 4.923 |
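Once predicted ratings exist for a customer, producing a row like the ones above is a matter of picking the highest-scoring products. A minimal Python sketch, assuming the predictions are already available as a dictionary; the helper `top_n_recommendations` and the `already_seen` filter are illustrative assumptions, not details taken from the project.

```python
import heapq


def top_n_recommendations(predicted, already_seen, n=6):
    """Pick the n highest-scoring products the customer has not seen yet.

    predicted:    {product_id: predicted rate} for one customer
    already_seen: set of product ids the customer has already viewed
    """
    candidates = (
        (score, product_id)
        for product_id, score in predicted.items()
        if product_id not in already_seen
    )
    best = heapq.nlargest(n, candidates)
    return [(product_id, score) for score, product_id in best]


# Hypothetical scores for one customer.
predicted_for_user = {1027: 3.919, 1316: 3.774, 507: 3.745, 23: 4.2, 861: 3.645}
recs = top_n_recommendations(predicted_for_user, already_seen={23}, n=3)
print(recs)
```

Product 23 has the highest score here but is skipped because the customer has already viewed it.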
Based on the existing data, we can also identify potential buyers for a certain group of products or a certain brand, even if they have not expressed any prior interest in that brand. Our model compares them with people who have similar shopping preferences and have previously purchased the brand in question. As a result, we can narrow down the segment of potential buyers who are likely to be interested in a certain group of products:
Product id | User id | Rating | User id | Rating | User id | Rating | User id | Rating | User id | Rating | User id | Rating | User id |
23 | 6444 | 4.574 | 5032 | 4.269 | 3161 | 4.211 | 2534 | 4.211 | 9964 | 4.21 | 1430 | 4.2 | 6645 |
648 | 6229 | 4.727 | 4077 | 4.564 | 4724 | 4.399 | 4171 | 4.28 | 9443 | 4.229 | 1368 | 4.185 | 2462 |
60 | 8784 | 4.281 | 4019 | 4.092 | 4165 | 4.063 | 3912 | 4.063 | 6893 | 4.063 | 3935 | 4.002 | 5063 |
940 | 2814 | 4.955 | 9849 | 4.893 | 6893 | 4.832 | 3912 | 4.832 | 4165 | 4.832 | 1329 | 4.821 | 4411 |
298 | 1605 | 4.169 | 3149 | 3.987 | 6133 | 3.936 | 3227 | 3.919 | 1767 | 3.885 | 9366 | 3.881 | 3125 |
374 | 7147 | 4.496 | 4623 | 3.973 | 2242 | 3.903 | 2786 | 3.82 | 5416 | 3.781 | 7043 | 3.732 | 861 |
306 | 557 | 4.626 | 6105 | 4.494 | 4003 | 4.322 | 3689 | 4.311 | 8077 | 4.181 | 4567 | 4.137 | 9104 |
1642 | 2209 | 4.564 | 5941 | 4.431 | 5846 | 4.403 | 6772 | 4.4 | 8862 | 4.172 | 4991 | 4.045 | 23 |
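The idea of finding likely buyers among customers who have never bought a brand can be illustrated with a simple stand-in for the model: Jaccard similarity between purchase histories. This is not the model used in the project, only a sketch of the principle; the `jaccard` and `likely_buyers` helpers and the purchase data are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of purchased product ids (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def likely_buyers(purchases, brand_products, top_n=3):
    """Rank customers who have not bought the brand by how closely
    their purchase history resembles that of customers who have."""
    buyers = {u for u, items in purchases.items() if items & brand_products}
    scores = []
    for user, items in purchases.items():
        if user in buyers:
            continue
        # Compare on non-brand products only, so the score reflects shared taste.
        best = max(
            (jaccard(items, purchases[b] - brand_products) for b in buyers),
            default=0.0,
        )
        scores.append((user, best))
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Hypothetical purchase histories: {user_id: set of product ids}.
purchases = {
    6444: {41, 59, 900},  # has bought brand product 900
    5032: {41, 59, 60},   # similar taste, has not bought the brand
    3161: {1596, 1597},   # different taste
}
ranked = likely_buyers(purchases, brand_products={900})
print(ranked)
```

Customer 5032 ranks first because two of their three purchases overlap with those of an existing buyer of the brand.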
The concept is simple: when customers have completed specific purchases in the past, and those purchases are similar to those of a group of other customers, their future purchases can be predicted. Using real data on these purchases, and applying machine learning, the business owner can customize and personalize (and direct) each customer’s experience and journey on his site.
The Benefits of the Ecommerce Analytics Model
For our client, the benefits were obvious. He will increase the potential for purchases and, as a result, increase sales by displaying a larger assortment of similar products to each customer: products the customer didn’t even realize were on the site and products that will suit the customer’s needs the most.
Another value of this model is that sales can be predicted more accurately. The business owner can then better manage his inventory, something that will certainly help to grow business profits. As outlined above, you can make more accurate predictions about the kinds of goods likely to be purchased. A prediction can be as specific as stating that your company will sell 100-120 Nike Air Max Model shoes in the next week with 90% probability.
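A statement of the form "between X and Y units with 90% probability" is a prediction interval. As a rough illustration, the Python sketch below derives one from past weekly sales using a normal approximation. The sales figures and the `sales_interval` helper are hypothetical, and this simple approach ignores trend and seasonality, which a production forecast would need to handle.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical weekly unit sales for one product.
weekly_sales = [104, 112, 109, 118, 101, 115, 107, 110, 113, 106]


def sales_interval(history, coverage=0.90):
    """Normal-approximation interval for next week's unit sales."""
    mu, sigma = mean(history), stdev(history)
    # z such that the central `coverage` share of a normal lies within mu +/- z*sigma.
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return mu - z * sigma, mu + z * sigma


low, high = sales_interval(weekly_sales)
print(f"90% interval: {low:.0f} to {high:.0f} units")
```

The width of the interval is what matters for inventory planning: a narrow interval means stock can be ordered with less safety margin.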
What is more, our model makes it possible to determine which factors do (or do not) affect sales volumes. For instance, in most cases, the frequency of visits to your website has no direct impact on sales: users may spend a lot of time browsing and comparing goods without committing to a purchase. Factors like age, seasonality, and past purchase history, by contrast, have a significant impact on the probability of a purchase.
So What are Your Problems?
You may have the insight to know that you are not growing as you should. Knowing why is another matter. And that is where business eCommerce analytics comes in. It is a complex matter, but data science case studies continue to show that big data and machine learning can provide the answers. Romexsoft is ready to build a model for you, based upon your unique circumstances.