AI Algorithm Illusions: 8 Ways It Looks Superior but Falls Short
What to Watch out for in AI Startups: Advice for AI Due Diligence (Part 2)
March 2024
Also in this series:
(Part 1) Dear Advisor: How do I Prepare for AI Technical Due Diligence?
Over the past two weeks, I’ve reviewed over 65 AI pitches and pitch decks (!), with more to come. There’s AI for everything now, but that’s not surprising. :) What’s still puzzling is how to tell what’s hype and what’s not. Here’s advice to help you tell the difference while evaluating a start-up.
At this stage in the diligence process, I'll assume that we already know that:
The solution is not a thin wrapper around ChatGPT, and
There is some (even vague) go-to-market strategy focused on solving specific customer needs beyond selling “AI,” i.e., the startup passed the AI litmus test.
Here’s what else to consider.
Potential Risks when “AI” is part of the MVP
Due diligence focus 1: As I’ve shared in the past, this is a potential orange flag. How serious it is depends on many things, including whether the “AI” is core to the product, the product vertical, the availability of customer data to learn from, and how much the “AI” solution needs to be state-of-the-art (hopefully not!) to deliver on the start-up’s claims.
How much value is AI bringing to customers at this stage in the product? Is it core to the experience – or a nice-to-have?
Due diligence focus 2: Demos always look great! Let’s evaluate whether the demo differs from the go-to-market product.
Based on the pitch, I’ll suggest an end-to-end walk-through of how the product solves a common challenge your customers face.
Due diligence focus 3: Is there a possibility of bias to invalidate the product/service/solution claims?
Most commonly, that bias can result from data leakage or incorrect application of an algorithm:
Data leakage happens when the training data contains information about what we’re trying to predict [ref], such as information from the future leaking into the past [ref]; examples below.
Incorrect use of an algorithm occurs when we don’t understand the assumptions behind it, such as assuming that events happen independently of each other when they don’t; more examples are below.
Please note that this can be true for “AI” developed at later product stages; we’ll need to evaluate it then as well.
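To make leakage concrete, here’s a minimal toy sketch (all feature names and numbers are invented) of a feature that is only recorded after the outcome is known, which is exactly the kind of “future leaking into the past” described above:

```python
# Toy sketch of label leakage: "follow_up_visits" is only recorded AFTER
# the readmission outcome is known, so it already encodes the label.
# All names and numbers are hypothetical.

records = [
    # (follow_up_visits, readmitted)
    (3, True), (4, True), (0, False), (1, False), (5, True), (0, False),
]

def leaky_predict(follow_up_visits):
    # A "model" that just thresholds the leaky feature.
    return follow_up_visits >= 2

accuracy = sum(
    leaky_predict(visits) == label for visits, label in records
) / len(records)
print(accuracy)  # 1.0 -- suspiciously perfect, a classic leakage smell
```

A score this clean on a hard clinical problem is itself a red flag worth probing in diligence.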
Potential Risks for AI at any Product Stage: Data Leakage Coming from Clinical Workflows
Epic (an electronic healthcare records system) had to remove its sepsis-predicting algorithm after it was found to use antibiotic prescriptions as a predictor of infection: by the time a physician orders an antibiotic, sepsis has usually already been identified.
Due diligence focus 4: How does/doesn’t “AI” build on and integrate with clinical workflows to improve decision support and/or efficiency?
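One practical defense against workflow leakage like the Epic example is a timestamp guard: only allow features recorded before the moment the model has to make its prediction. The sketch below is hypothetical (feature names and times are invented), not Epic’s actual pipeline:

```python
# Hypothetical timestamp guard: drop any feature recorded at/after the
# time the prediction must be made. Names and times are invented.

from datetime import datetime

prediction_time = datetime(2024, 1, 10, 8, 0)

features = {
    "heart_rate":         (98, datetime(2024, 1, 10, 7, 30)),
    "white_cell_count":   (14, datetime(2024, 1, 10, 7, 45)),
    # The antibiotic order happens AFTER the clinician already suspects
    # sepsis -- using it to "predict" sepsis is leakage.
    "antibiotic_ordered": (1,  datetime(2024, 1, 10, 9, 15)),
}

usable = {
    name: value
    for name, (value, recorded_at) in features.items()
    if recorded_at < prediction_time
}
print(sorted(usable))  # ['heart_rate', 'white_cell_count']
```

Asking a team whether they apply this kind of guard, and how they verify feature timestamps, is a quick way to probe focus 4.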
Potential Risks for AI at any Product Stage: Data Leakage Coming from the Collection Process
Start-ups developing HealthTech solutions for diagnoses (such as heart or cancer-related) from images (such as MRI, CT, ultrasound, and echo) may step onto this landmine.
Algorithms designed to predict a health condition from these images will seem to perform well because, to get those images in the first place, a patient must be referred to a specialist and then seen by a radiologist (or similar technician) who captures any “weird-looking” images. As a result, a naive model that predicts everyone has the condition will do very well (!).
This is the most common mistake HealthTech start-ups make!
Due diligence focus 5: How was the data collected to power the “AI” algorithm? How does it account for diversity in the conditions being predicted and in the patients who have them?
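The referral landmine above is easy to see with made-up prevalence numbers: the same trivial “everyone has it” model scores very differently on a referral-filtered population versus the general population it would actually serve:

```python
# Hypothetical prevalence numbers illustrating referral bias.
# In a referred (already-screened) population, most scanned patients
# really do have findings; in the general population, few do.

referred_labels = [1] * 90 + [0] * 10   # 90% prevalence after referral
general_labels  = [1] * 5  + [0] * 95   # 5% prevalence in screening

def predict_all_positive_accuracy(labels):
    # Accuracy of the naive model that flags every patient as positive.
    return sum(label == 1 for label in labels) / len(labels)

print(predict_all_positive_accuracy(referred_labels))  # 0.9 -- impressive
print(predict_all_positive_accuracy(general_labels))   # 0.05 -- useless
```

This is why diligence should ask for the label prevalence in the training/test data versus the intended deployment population, not just a headline accuracy.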
Potential Risks for AI at any Product Stage: Data Leakage Coming from the Algorithm’s Testing Process
Unfortunately, where patients get their care also plays a role, from their clinicians’ expertise to the hospital’s national ranking to geographic location to the volume of patients the care center sees in a given specialty. An algorithm’s performance at a top-ranked hospital, for example, relative to a baseline of a rural one, will typically look artificially better due to that hospital’s patient ecosystem and the volume of patients it cares for.
Due diligence focus 6: How was the algorithm’s performance evaluated? Do the metrics we see reflect the business use case and the composition of the test data set?
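One common mitigation (not necessarily what any particular start-up does) is a grouped split: hold out entire care sites for testing so the model is judged on hospitals it has never seen. A minimal sketch with invented site names:

```python
# Hypothetical grouped train/test split: hold out a whole hospital so
# site-specific quirks (equipment, protocols, patient mix) can't leak
# into the test score. Site names and counts are invented.

samples = (
    [{"hospital": "top_ranked_A", "patient_id": i} for i in range(6)]
    + [{"hospital": "rural_B", "patient_id": i} for i in range(3)]
)

test_sites = {"rural_B"}  # evaluate on a site absent from training

train = [s for s in samples if s["hospital"] not in test_sites]
test  = [s for s in samples if s["hospital"] in test_sites]

# No hospital appears on both sides of the split.
assert not ({s["hospital"] for s in train} & {s["hospital"] for s in test})
print(len(train), len(test))  # 6 3
```

Asking whether test patients came from the same hospitals as training patients is a one-line question that frequently uncovers this issue.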
Potential Risks for AI at any Product Stage: Using the Wrong Algorithm
There are algorithms aimed at forecasting the efficacy of a treatment for better managing patient care. We know that each patient’s demographics and the characteristics of where and from whom they’re getting care may affect outcomes; by controlling for this variability, we may be able to explain why something happened the way it did. If we don’t remove the effects of this individualized experience of treatment, we won’t know the true treatment effect, if any remains. Many founders forget to do this last step!
Due diligence focus 7: How well does the team understand, even at a high level, what the algorithm is actually doing under the hood? Or is it treated as a black box?
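The “forgotten last step” above is essentially a confounding problem. Here’s a toy Simpson’s-paradox data set (all numbers invented) where severity drives both who gets treated and who recovers, so the naive comparison gets the sign of the effect wrong:

```python
# Toy Simpson's-paradox data: severity (a confounder) drives both
# treatment assignment and recovery. All numbers are invented.

# (severity, treated, recovered) flattened into individual records
patients = (
    [("mild", True, True)] * 2
    + [("mild", False, True)] * 7 + [("mild", False, False)] * 1
    + [("severe", True, True)] * 4 + [("severe", True, False)] * 4
    + [("severe", False, False)] * 2
)

def recovery_rate(rows):
    return sum(recovered for _, _, recovered in rows) / len(rows)

treated   = [p for p in patients if p[1]]
untreated = [p for p in patients if not p[1]]

# Naive comparison: treatment looks harmful (60% vs 70% recovery)...
print(recovery_rate(treated), recovery_rate(untreated))  # 0.6 0.7

# ...but stratifying by severity shows it helps in EVERY stratum.
for severity in ("mild", "severe"):
    t = [p for p in treated if p[0] == severity]
    u = [p for p in untreated if p[0] == severity]
    print(severity, recovery_rate(t), recovery_rate(u))
# mild   1.0 0.875
# severe 0.5 0.0
```

A team that can walk you through why the stratified numbers disagree with the naive ones has probably earned a pass on focus 7; a team that can’t has treated the algorithm as a black box.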
Potential Risks for AI at any Product Stage: Setting and Forgetting About It
AI demos look slick! But demos and real-time product usage differ!
Due diligence focus 8: How well does the team understand—and budget for in their financial projections—what it will take to host, support, maintain, debug, and ensure that the algorithm doesn’t hallucinate when the stakes are high?
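Even the simplest form of “not forgetting about it” is a production check that compares live performance against what was validated at launch. This is a bare-minimum sketch with invented thresholds, not a substitute for a real monitoring stack:

```python
# Hypothetical bare-minimum drift check: flag the model for human review
# if live performance falls too far below the launch-validated number.
# Thresholds and accuracies are invented.

LAUNCH_ACCURACY = 0.92
DRIFT_TOLERANCE = 0.05

def needs_review(live_accuracy):
    # Alert when live accuracy drifts more than the tolerance below
    # what was validated before launch.
    return live_accuracy < LAUNCH_ACCURACY - DRIFT_TOLERANCE

print(needs_review(0.91))  # False -- within tolerance
print(needs_review(0.84))  # True  -- time to investigate
```

In diligence, the useful question is whether anything like this exists at all, and whether its ongoing cost shows up in the financial projections.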
As you know, gauging how the founders approach and answer these questions will tell you a lot! :)
Next Steps for More Support
VC funds and P/E firms, please use this Calendly link to find a time to discuss the scope for the next AI due diligence you need support with.
Everyone else: Please complete the Google form.
VC Frequently Asked Question: How would we work together?
I'm here to help you evaluate risks, challenges, and opportunities. Here's how I typically support funds:
Tier 1: A quick review of the website/deck, letting you know (1) whether the pitch is hype and (2) a one-sentence summary of where the biggest issue may be.
Tier 2: If it's potentially not all hype, help you uncover potential risks and opportunities by:
Reviewing the website/deck/etc. and creating questions for you to ask the start-up; or
Joining pitch due diligence to ask questions and discuss with you afterward.
Tier 3:
Discuss the scope of a deeper dive into the Data Room/product/team, etc. (including any gaps you already see that you need more support with to help you evaluate);
Evaluate Data Room, product, etc. (keeping you in the loop with the potential to expand scope);
Prepare a summary of risks, challenges, and opportunities;
Discuss.
Tier 4: Retainer/custom to your needs and scope.
You may also like (advice for investors):
AI Due Diligence Questionnaire to support you in pitch diligence
Dear Advisor: Why is my ML model in production doing worse than locally?
Investing in the Age of Generative AI, by Kevin Zhang, via EVCA Newsletter
You may also like (advice for founders):
Evaluating Retention: Advice for (Early-Stage Start-up) Diligence
AI Due Diligence Questionnaire to support you in pitch diligence
(Part 1) Dear Advisor: How do I Prepare for AI Technical Due Diligence?
Dear Advisor: How do I avoid the biggest data/AI/ML mistakes others make?
Dear Advisor: What should (not) be your AI roadmap? (or) Why You Don't Need AI in Your SaaS MVP