Insights · 26 June 2026 · 5 min read

How to tell if your AI is actually working

Most businesses measure how much their AI gets used. Almost none measure whether it works. MIT found that around 95% of generative-AI pilots show little or no measurable bottom-line impact, and the fix is to measure outcomes, not activity.

Plenty of businesses can tell you how many people logged into their AI tools this month. Far fewer can tell you whether any of it left the business better off. That gap is the whole problem, and it is wider than most leaders think.

In 2025, MIT's NANDA initiative studied 300 AI deployments and found that around 95% of corporate generative-AI pilots delivered little or no measurable impact on the bottom line (MIT, The GenAI Divide: State of AI in Business 2025). The striking part wasn't that the tools were weak. It was that almost nobody could show, one way or the other, whether they had helped. The activity was real. The proof was missing.

Activity is not impact

The trap is measuring the easy thing. Logins, messages sent, prompts run: all simple to count, and all close to meaningless on their own. A tool can be used constantly and make nobody more effective, while a tool used quietly by a few people can reshape an entire process. Usage tells you something happened. It does not tell you it helped.

Impact lives one layer down, in outcomes: time saved, quality improved, work that ships faster, customers served better, cost taken out. Those are harder to see, which is exactly why so many businesses skip them and settle for a usage dashboard that makes everyone feel busy.

What to actually measure

You don't need a wall of metrics. You need the few that connect AI use to a business outcome, tracked steadily over time:

Time. How long the work took before AI, against how long it takes now, for the tasks AI genuinely touches.
Quality. Error rates, rework, and customer satisfaction on AI-assisted work, not just the volume of it.
Adoption that means something. Not "did they log in" but "has this become the way the work gets done".
Cost and waste. What you are spending across every AI subscription, and which seats nobody actually uses.
One number for the board. The return, put plainly: value created against money spent.
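As a rough illustration of how those measures turn into arithmetic, here is a minimal Python sketch. Everything in it is an assumption made for the example: the ToolSnapshot fields, the sample figures, the idea of valuing saved time at a single hourly rate, and the choice to express return as net value over spend. It is not how any particular product calculates these numbers, and quality measures such as error rates and rework would need their own tracking alongside it.

```python
from dataclasses import dataclass


@dataclass
class ToolSnapshot:
    """One measurement period for one AI tool (all fields are hypothetical)."""
    name: str
    seats_paid: int        # seats you are billed for
    seats_active: int      # seats where the tool is part of regular work
    cost_per_seat: float   # subscription cost per seat for the period
    tasks_completed: int   # AI-assisted tasks finished in the period
    minutes_before: float  # average minutes per task before AI
    minutes_after: float   # average minutes per task with AI
    hourly_rate: float     # assumed loaded cost of one hour of staff time


def hours_saved(s: ToolSnapshot) -> float:
    """Time: before-versus-after difference, only for tasks AI touches."""
    return s.tasks_completed * (s.minutes_before - s.minutes_after) / 60


def adoption_rate(s: ToolSnapshot) -> float:
    """Adoption: share of paid seats where the tool is really in use."""
    return s.seats_active / s.seats_paid if s.seats_paid else 0.0


def wasted_spend(s: ToolSnapshot) -> float:
    """Cost and waste: money going to seats nobody actually uses."""
    return max(s.seats_paid - s.seats_active, 0) * s.cost_per_seat


def roi(s: ToolSnapshot) -> float:
    """One number for the board: (value created - spend) / spend."""
    spend = s.seats_paid * s.cost_per_seat
    if spend == 0:
        raise ValueError("ROI is undefined when nothing was spent")
    value = hours_saved(s) * s.hourly_rate
    return (value - spend) / spend


# Made-up figures for a single tool over one month.
snapshot = ToolSnapshot(
    name="drafting-assistant",
    seats_paid=40,
    seats_active=25,
    cost_per_seat=30.0,
    tasks_completed=600,
    minutes_before=20.0,
    minutes_after=12.0,
    hourly_rate=50.0,
)

print(f"Hours saved:  {hours_saved(snapshot):.1f}")
print(f"Adoption:     {adoption_rate(snapshot):.1%}")
print(f"Wasted spend: {wasted_spend(snapshot):.2f}")
print(f"ROI:          {roi(snapshot):.2f}")
```

Running the same calculation every period, with the same definitions, is what makes the numbers comparable over time. A single snapshot says little; the trend is the part worth showing a board.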

Measure the outcome, not the tool

There is a subtler mistake buried in here. Every AI vendor will happily show you their own usage stats, but each vendor marks its own homework, and none of them measure whether your work improved. Stitch a dozen vendor dashboards together and you still can't answer the only question that matters.

This is the gap we built The GAiGE to close. It runs short, quiet pulses across all your AI tools and turns real usage into the things that count: adoption, time saved, satisfaction, an ROI figure, and the wasted spend on seats nobody touches. One comparable picture, independent of the vendors, week after week. The same MIT research found that bought-in expertise succeeds about three times as often as going it alone (see "The gap nobody's measuring"), and measurement is a large part of why.

Why it's worth the effort

When you can see what's working, three things change. You back the tools earning their place and stop paying for the ones that aren't. You show a board a real return instead of a hopeful anecdote. And you stop making the most expensive mistake in AI: quietly shelving something that was about to pay off, because you couldn't see the progress (see "Three quiet ways pilots stall").

The businesses in MIT's successful 5% weren't lucky. They measured, learned and adjusted, while everyone else declared victory at the pilot and moved on.

If you can't yet answer "is our AI actually paying off?" with evidence rather than a hunch, that is the place to start. The free AI Maturity Assessment gives you a read on where you stand, measurement included.
