Understanding Black Crow’s LTV predictions
Black Crow's LTV predictions have quickly become critical infrastructure for our users. Why? Because our users rely on these predictions to:
- Focus their paid ads spend on high-value customers,
- Assess how successfully they're acquiring high-value customers,
- And better understand those high-value customers.
Because these predictions can transform their paid ads strategy, users understandably want to know (1) how they're generated and (2) how reliable they are. In this article, we'll answer both questions. Let's get into it.
Everything starts with data
There are two key sources of data that go into Black Crow’s predictions:
- Shopify orders from brand shops
- Browsing data from brand sites
From there we can generate a lot of signals for our model, such as:
- Total amount spent
- Exact combination of products ordered
- Any discount codes used
- Customer's location
- Number of sessions logged prior to ordering
- Products browsed prior to ordering
And so on. There are hundreds of possible signals like these for every order. Some are more important than others, but all contribute to the accuracy of the model.
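For the technically curious, here's a rough Python sketch of how one order, plus the customer's prior browsing, might be flattened into a feature row. The field names and structure are hypothetical illustrations, not Black Crow's actual pipeline:

```python
from collections import Counter
from datetime import datetime

def build_features(order: dict, sessions: list[dict]) -> dict:
    """Flatten one order plus the customer's prior browsing into a single feature row."""
    pre_order = [s for s in sessions if s["started_at"] < order["created_at"]]
    browsed = Counter(p for s in pre_order for p in s["products_viewed"])
    return {
        "total_spent": order["total_price"],                    # total amount spent
        "product_combo": tuple(sorted(order["product_ids"])),   # exact combination of products ordered
        "used_discount": bool(order.get("discount_codes")),     # any discount codes used
        "customer_region": order["shipping_region"],            # customer's location
        "sessions_before_order": len(pre_order),                # sessions logged prior to ordering
        "distinct_products_browsed": len(browsed),              # products browsed prior to ordering
    }

# Tiny example with made-up data
order = {
    "created_at": datetime(2024, 1, 5),
    "total_price": 50.0,
    "product_ids": [101, 202],
    "discount_codes": ["WELCOME10"],
    "shipping_region": "NY",
}
sessions = [{"started_at": datetime(2024, 1, 3), "products_viewed": [101, 303]}]
print(build_features(order, sessions))
```

In practice, hundreds of features like these are computed for each order before any model training begins.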
One important callout: when building a model for a brand, we use that brand's data and only that brand's data. We don't share data across shops. Every model we build is bespoke to that brand.
Next, we train a unique model
With data access in place, Black Crow trains a predictive LTV model built specifically for that brand. The model looks at orders from roughly the past two years: not so far back that the data is no longer relevant, but far enough back to give a good, representative sample.
The LTV prediction is for the six months following a customer's order. In other words, our predicted LTV is the total amount a given customer is expected to spend in six months, including their initial order. Customers may continue to be valuable after that six-month window, of course, but our experience shows that six months is enough time to identify our brands’ highest- and lowest-value customers.
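As a concrete illustration, here's a simplified Python sketch, with hypothetical column names, of how that six-month target could be computed from a brand's order history:

```python
from datetime import timedelta
import pandas as pd

def six_month_ltv(orders: pd.DataFrame, customer_id, first_order_at) -> float:
    """Total spend in the six months following (and including) the first order."""
    window_end = first_order_at + timedelta(days=182)   # roughly six months
    in_window = (
        (orders["customer_id"] == customer_id)
        & (orders["created_at"] >= first_order_at)
        & (orders["created_at"] < window_end)
    )
    return float(orders.loc[in_window, "total_price"].sum())

# Example with made-up orders: the September order falls outside the window
orders = pd.DataFrame({
    "customer_id": ["c1", "c1", "c1", "c2"],
    "created_at": pd.to_datetime(["2024-01-05", "2024-03-20", "2024-09-01", "2024-02-10"]),
    "total_price": [50.0, 67.0, 30.0, 50.0],
})
print(six_month_ltv(orders, "c1", pd.Timestamp("2024-01-05")))   # 117.0
```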
We use a combination of classification and regression models to generate our predictions. What's critical is the output: a single predicted value for each customer. A user might see that:
- Customer 1: Spends $50 on their initial order, Black Crow predicts $67 in future revenue, total predicted LTV is $117
- Customer 2: Spends $50 on their initial order, Black Crow predicts $121 in future revenue, total predicted LTV is $171
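For readers who want a peek under the hood: one common way to combine a classifier and a regressor is to estimate the probability that a customer orders again, estimate how much they'd spend if they do, and multiply the two. The sketch below uses synthetic data and off-the-shelf scikit-learn models purely for illustration; it conveys the spirit of the approach, not Black Crow's exact architecture.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                         # stand-in feature matrix
repeat = (rng.random(1000) < 0.3).astype(int)          # did the customer ever order again?
future_spend = np.where(repeat == 1, rng.gamma(2.0, 40.0, 1000), 0.0)
initial_order = rng.gamma(2.0, 25.0, 1000)             # value of the first order

# Stage 1 (classification): how likely is a repeat purchase?
clf = GradientBoostingClassifier().fit(X, repeat)
# Stage 2 (regression): if they do come back, how much will they spend?
reg = GradientBoostingRegressor().fit(X[repeat == 1], future_spend[repeat == 1])

p_repeat = clf.predict_proba(X)[:, 1]                  # probability of any future order
spend_if_repeat = reg.predict(X)                       # expected spend, given a repeat
predicted_ltv = initial_order + p_repeat * spend_if_repeat   # one number per customer
```

The key point is the last line: every customer ends up with a single predicted LTV, their initial order plus expected future revenue.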
It sure would be great to know who was customer 1 and who was customer 2 up front, right? This is exactly what Black Crow’s model tells our users. Of course, this assumes that the model works, which brings us to our next step ...
Then, we make sure the model is on point
Here's the crucial part. None of the above matters unless our model makes predictions that match what the customer actually does in the future. I won't bury the lede – it does – but let's get into how we know.
The first thing we do when training a model is to test it against historical orders from those two years of past data mentioned earlier. Not against the same orders we trained on – that would be cheating! Instead, we hold out 25% of orders (or less, depending on shop size) from model training just for testing. That way, we're grading the model on orders it's never seen before.
So now, when running the model on those test orders from the past, we know:
- What the model predicted the customer would spend over the next six months, and
- What the customer actually spent over the next six months.
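In code terms, that holdout step looks roughly like the sketch below. The feature matrix, labels, and model are synthetic, off-the-shelf stand-ins rather than our production setup:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 5))                                                   # stand-in feature matrix
y = np.maximum(0, 60 + X @ rng.normal(size=5) * 20 + rng.normal(0, 15, 4000))    # stand-in six-month LTVs

# Hold out 25% of orders purely for grading; the model never trains on them.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

model = GradientBoostingRegressor().fit(X_train, y_train)
predicted = model.predict(X_test)

# For every held-out order we now have both numbers described above:
#   predicted -> what the model says the customer will spend over the next six months
#   y_test    -> what the customer actually spent over the next six months
```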
Let's see how this looks for a model trained recently for a new brand, using the lovely charts from our fancy modeling software:
Let's break down what Black Crow's model is predicting here:
- The model was applied to past customers that the model has never seen.
- Customers were sorted into 10 groups (deciles) by their predicted LTV, low to high:
- Customers with the lowest predicted LTV are counted in decile 1
- Customers with the highest predicted LTV are counted in decile 10.
- The green bars show the average LTV predicted for each group; of course, this increases from left to right.
The differences are fairly dramatic: we predict 8x more LTV from the highest-LTV customers than from the lowest.
But, how much LTV did these customers actually deliver? That’s represented in the fuchsia bars. If the model were really bad, those bars would be all over the place, and bear little relation to the prediction data in green. Here we see that the actual LTV matches the predicted LTV very well: for every customer group, the bars are about the same height. We systematically quantify this difference between predicted and actual LTV, ensuring that every model we generate for a brand meets a high standard of accuracy.
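If you'd like to see what this style of decile analysis looks like in code, here's a minimal Python sketch; the predicted and actual values are synthetic stand-ins, not real brand data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
predicted = rng.gamma(2.0, 60.0, 1000)              # stand-in predicted LTVs
actual = predicted * rng.normal(1.0, 0.25, 1000)    # stand-in actuals that roughly track predictions

results = pd.DataFrame({"predicted": predicted, "actual": actual})
# Decile 1 = lowest predicted LTV, decile 10 = highest
results["decile"] = pd.qcut(results["predicted"], 10, labels=list(range(1, 11)))

by_decile = results.groupby("decile", observed=True).agg(
    avg_predicted=("predicted", "mean"),   # the green bars
    avg_actual=("actual", "mean"),         # the fuchsia bars
)
# How far apart are the bars in each decile?
by_decile["pct_gap"] = ((by_decile["avg_actual"] - by_decile["avg_predicted"]).abs()
                        / by_decile["avg_predicted"])
print(by_decile)
```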
A similar process is applied to the predictions the model makes today. Six months from now, we'll be able to check today's predictions against real-world performance, just as we do with historical orders; when we run that same analysis on predictions that have already matured, we see that the model is similarly accurate. We also run one-month and three-month check-ins to make sure the model stays on track.
Finally, we deploy
With a good model in hand, it’s time for the fun stuff: we push Black Crow’s predictions to Meta (and our portal), and our users can start acquiring more of the high-LTV customers that really drive revenue. Models are retrained every 30 days to ensure they’re up-to-date, and we're always testing new signals or ways of processing the data.
If you think that sounds cool – and honestly? it is kind of cool – then have a chat with your Black Crow CSM about our predictive LTV capabilities.