Dear Advisor: Why is my ML in Production doing WORSE than locally?

Post was originally published in January 2021, updated for relevancy in February 2021 and December 2023

First, congratulations on getting your model to production. That was no easy task! 

Remember that models in production will typically perform (at least) slightly worse than offline models, even if you used the same metrics to evaluate them, because the model is seeing data it hasn't seen before. But what could explain a large difference?

Part I: Possible Reasons

If you're seeing a large difference between the model performance you saw when you ran it locally and what's now happening in production, here are some possible reasons:

Part II: Diving Deeper into Leakage

What is leakage? Data leakage happens when the training data contains information about the value you’re trying to predict that will not be available when the model makes predictions in production [ref].
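To make the definition concrete, here is a minimal sketch of one quick screening check: flag any feature that correlates almost perfectly with the target, since that often means the feature was recorded after (or derived from) the outcome. The toy data, the feature names `monthly_spend` and `account_closed_flag`, and the 0.95 threshold are all assumptions made for illustration; a high correlation is a prompt to investigate, not proof of leakage.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either column is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def suspicious_features(columns, target, threshold=0.95):
    """Return names of features whose |correlation| with the target exceeds threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) > threshold
    ]


# Toy data; every name here is hypothetical.
rng = random.Random(0)
churned = [1 if i % 4 == 0 else 0 for i in range(200)]  # the target
columns = {
    # Known before the outcome: a legitimate feature.
    "monthly_spend": [rng.uniform(10, 100) for _ in churned],
    # Only filled in after the customer has churned: leaks the target.
    "account_closed_flag": list(churned),
}

flagged = suspicious_features(columns, churned)
print(flagged)
```

A model trained with a feature like `account_closed_flag` would look excellent offline and then degrade in production, where that field is still empty at prediction time.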

Examples of Leakage:

Part III: Recommended Next Steps

Depending on the reason for the performance gap, you may want to consider turning off the live model and exploring locally, on both an old and a fresher data extract, the potential reasons (above) for such different model performance live vs. offline.
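One way to compare an old extract with a fresher one is to check whether a feature's distribution has shifted between them. The sketch below computes a Population Stability Index (PSI) for a single numeric feature. The variable names `old_extract` and `fresh_extract`, the synthetic values, and the bin count are assumptions for illustration, and the 0.25 cut-off is only a commonly quoted rule of thumb, not a hard rule.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index of `actual` relative to `expected`.

    Bin edges are taken from the range of `expected`; values of `actual`
    that fall outside that range are counted in the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant column: everything lands in one bin

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clip to valid bin range
            counts[idx] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    exp_frac = fractions(expected)
    act_frac = fractions(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(exp_frac, act_frac))


# Synthetic stand-ins for one feature in each extract (hypothetical data).
old_extract = [i / 100 for i in range(1000)]      # values from 0.00 to 9.99
fresh_extract = [v + 5 for v in old_extract]      # same shape, shifted by 5

psi_same = psi(old_extract, old_extract)
psi_shifted = psi(old_extract, fresh_extract)
print(f"PSI vs itself:  {psi_same:.3f}")
print(f"PSI vs shifted: {psi_shifted:.3f}")
```

Running a check like this per feature, with the training extract as `expected` and recent production data as `actual`, can help narrow down which inputs changed before you dig into why.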

Do you need an expert to help you figure out what happened? Please reach out.

Keywords: AI/ML in production, data products, customer understanding
