What Are Some of the Biggest ML/AI Mistakes I Continually See Start-ups Make?

I hosted a "making data-driven decisions" office hour on the 805 Startups Discord channel on June 15th. One of the questions co-founder Gary Livingston asked was: What are some of the biggest mistakes you continually see made by startups?

Without throwing anyone under the bus, here are the recurring themes I've seen in the last 5+ years.

Missing Information

  1. Not tracking everything about the business.
  2. (On the flip side) Tracking everything about the business, but with many different data providers and vendors that don't link to one another.
    • Assessment: Do you know how many new active users there are on your platform today?
  3. Treating an ML vendor as a silver bullet and buying a multi-year vendor license without doing a proof-of-concept (POC) to see if the software actually solves the problem you bought it for.
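
As a rough illustration of the assessment under item 2, here is a minimal Python sketch that counts a day's new active users from a single event log. The `events` records, their field names, and the definition of a "new active user" (a user whose first recorded event falls on the given day) are all assumptions for illustration; your own tracking data and definitions will differ.

```python
from datetime import date

# Hypothetical event log: one record per user action, already linked
# across vendors into a single list keyed by one user_id.
events = [
    {"user_id": "u1", "day": date(2024, 1, 14)},
    {"user_id": "u1", "day": date(2024, 1, 15)},
    {"user_id": "u2", "day": date(2024, 1, 15)},
    {"user_id": "u3", "day": date(2024, 1, 15)},
    {"user_id": "u3", "day": date(2024, 1, 15)},
]


def new_active_users(events, today):
    """Return the set of users whose first recorded event is on `today`."""
    first_seen = {}
    for event in events:
        uid, day = event["user_id"], event["day"]
        # Keep the earliest day seen so far for each user.
        if uid not in first_seen or day < first_seen[uid]:
            first_seen[uid] = day
    return {uid for uid, day in first_seen.items() if day == today}


print(len(new_active_users(events, date(2024, 1, 15))))
```

If producing the single `events` list is the hard part because each vendor uses its own identifiers, that is the linking problem item 2 describes.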

Too early for ML -- all of the above and:

  1. Starting with ML/AI (e.g. predictive analytics) before understanding what's happening with customers and product, historically and now (e.g. descriptive analytics).
    • Assessment: Do you currently know who cancelled your service within the last week, and how you acquired that customer in the first place?
  2. Doing ML for the sake of ML, or treating ML as a silver bullet, without understanding how the model will solve the business/customer's pain point, how it will be used by the stakeholder to solve that pain point, and what data is/isn't available.
    • Assessment: Are you trying to do ML for a process that you don't understand?
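
The descriptive question in item 1 (who cancelled in the last week, and how were they acquired?) can be sketched in a few lines of Python. The `signups` and `cancellations` structures, their field names, and the channel labels are hypothetical; the point is only that the answer should come from a simple lookup before any predictive model is attempted.

```python
from datetime import date, timedelta

# Hypothetical records; field names are assumptions for illustration.
signups = {
    "c1": "paid_search",
    "c2": "referral",
    "c3": "organic",
}
cancellations = [
    {"customer_id": "c1", "cancelled_on": date(2024, 1, 14)},
    {"customer_id": "c2", "cancelled_on": date(2024, 1, 2)},
    {"customer_id": "c4", "cancelled_on": date(2024, 1, 12)},
]


def recent_cancellations(cancellations, signups, today, days=7):
    """List (customer_id, acquisition_channel) for cancellations in the last `days` days."""
    cutoff = today - timedelta(days=days)
    rows = []
    for c in cancellations:
        if cutoff <= c["cancelled_on"] <= today:
            # "unknown" exposes customers whose acquisition source was never tracked.
            channel = signups.get(c["customer_id"], "unknown")
            rows.append((c["customer_id"], channel))
    return rows


for customer_id, channel in recent_cancellations(cancellations, signups, date(2024, 1, 15)):
    print(customer_id, channel)
```

Rows that come back as `unknown` point to the missing-information problems described earlier rather than to a modeling problem.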

Executing on ML products as if they're software engineering tasks -- all of the above and:

  1. Model bias because of biased data or 4 other reasons.
  2. Not treating ML as data products that you scope down and iterate over, from proof-of-concept (POC) to v1, v2, etc.
    • Assessment: Do you have an ad-hoc/simple model that answers your business question and that you can compare and evaluate the next iteration of the ML model against?
  3. Inadvertently scoping the POC to be better than the state of the art in ML.
    • Example: Many pitch decks... :(
  4. Asking for a guarantee that the ML model's performance in production (based on ML metrics, KPIs, etc.) will be just as good as, or better than, the offline model's. This is impossible to guarantee, because customer behavior or the product offering may have changed, or the model may have (inadvertently) overfit the offline data, etc.
  5. Difficulty hiring DS/ML/AI candidates, most likely because job descriptions list software requirements instead of a 30/60/90-day plan of what the expectations and deliverables look like.
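
To make the assessment under item 2 concrete, here is a minimal Python sketch of comparing a candidate model against a simple baseline. The labels, the candidate predictions, and the choice of a majority-class baseline scored by accuracy are all assumptions for illustration; in practice the baseline and the metric should match your own business question.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_train, n):
    """Predict the most common training label for each of n examples."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return [most_common_label] * n


# Hypothetical labels (1 = churned, 0 = stayed) and candidate-model predictions.
y_train = [0, 0, 0, 1, 0, 1, 0, 0]
y_test = [0, 1, 0, 0, 1, 0]
candidate_pred = [0, 1, 0, 0, 0, 0]

baseline_pred = majority_baseline(y_train, len(y_test))
baseline_acc = accuracy(y_test, baseline_pred)
candidate_acc = accuracy(y_test, candidate_pred)

print(f"baseline accuracy:  {baseline_acc:.2f}")
print(f"candidate accuracy: {candidate_acc:.2f}")
print("candidate beats baseline:", candidate_acc > baseline_acc)
```

Keeping the baseline's score on record gives every later iteration (v1, v2, etc.) a fixed reference point, so an improvement is measured rather than assumed.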

Nobody is perfect. Now that you know what to focus on, start small and iterate. If you need more support on what that process looks like for you -- or how to execute it -- let's talk.

Keywords: AI, ML, start-ups, data strategy, data products, customer understanding
