I came in one morning to find that our AI had filed 1,004 improvement tickets about our small business automation while we slept.
That number stopped me cold. A thousand suggestions – about its own performance, its own gaps, things it thought should work better. Most of them were probably fine ideas. None of them were going to get built automatically. Deciding what gets built is entirely up to us.
That’s the setup for this post: not “AI improves itself,” but “AI proposes, humans decide.” The distinction matters more than it sounds if you are using AI tools around real customers, schedules, email, or job records.
What our AI review system actually does
The tool is something we call the Program Improvements Manager, or PIM. Every night, it runs a review pass across the AI systems we’ve built: voice agents, email tools, the CRM integration, the scheduling app. It looks for patterns: things that failed more than once, places where a response was slow, steps that a human had to correct, areas where the output was inconsistent.

Then it writes up suggestions. Each one gets a rough estimate of effort and a rough estimate of impact. They get sorted into categories: reliability improvements, new capability requests, content updates, configuration changes.
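To make the shape of a suggestion concrete, here is a minimal sketch of how one could be represented and sorted. This is not the PIM's actual code. The field names, the 1-to-5 scales, and the short category labels are assumptions for illustration.

```python
from dataclasses import dataclass

# Hypothetical short labels for the four categories described above.
CATEGORIES = {"reliability", "capability", "content", "configuration"}


@dataclass
class Suggestion:
    title: str
    category: str
    effort: int  # assumed scale: 1 (small) to 5 (large)
    impact: int  # assumed scale: 1 (minor) to 5 (major)

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")


def group_by_category(suggestions):
    """Group suggestions by category, highest impact and lowest effort first."""
    grouped = {name: [] for name in sorted(CATEGORIES)}
    for s in suggestions:
        grouped[s.category].append(s)
    for items in grouped.values():
        items.sort(key=lambda s: (-s.impact, s.effort))
    return grouped
```

Nothing in this sketch acts on a suggestion. It only organizes the list so a person can read it.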
By morning, we didn’t have 1,004 raw suggestions. We had 129. That was better, but still too many. Nobody can sift through 129 active improvement suggestions every day and make good decisions about all of them.
That became the real lesson: the system had to be tuned for usability, not just output. A review tool is only useful if the queue is small enough that a person will actually read it. You do not want AI doing a million things if all but a few get stuck in a human review backlog.
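One simple way to keep a queue readable is to cap it. The sketch below is an assumption about how that could be done, not a description of our tuning: it ranks suggestions by impact per unit of effort and shows only the top few. The dictionary keys and the default limit of 20 are made up for the example.

```python
def reviewable_queue(suggestions, limit=20):
    """Return (top suggestions, number held back).

    Each suggestion is a dict with positive numeric "impact" and "effort"
    values (hypothetical keys). Ranking is by impact divided by effort.
    """
    ranked = sorted(
        suggestions,
        key=lambda s: s["impact"] / s["effort"],
        reverse=True,
    )
    held_back = max(0, len(ranked) - limit)
    return ranked[:limit], held_back
```

Returning the held-back count matters. The reviewer should know how much was left out, even if they never read it.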
That human is me, usually. I review the queue, decide what is worth building, and tune the system when the list starts turning into noise. Some suggestions go into our actual development backlog. Most get noted and watched. A few get dismissed. Nothing happens automatically.
The best use of AI here is not as a magic box for every shiny new idea. It is a tool to reduce work we were already doing. If it overwhelms us with a pile of crap we will never get a chance to look at, it has failed, even if every individual suggestion sounds clever.
Why human approval is the whole point
An AI that can modify its own behavior without oversight is a liability. I’m not being dramatic about that. It’s just true. The value of an AI self-review system like this is not that it fixes things on its own. The value is that it sees things I would miss, at a speed and scale I can’t match, and hands them to me in a form I can actually act on. The judgment call stays with me.

The analogy I keep coming back to is a good estimator who reviews all your past jobs and writes up a report: here’s what came in under budget, here’s what ran over, here’s a pattern I’m noticing. That’s genuinely useful. But the decisions about how to bid the next job, which crew to put on it, when to take a pass – those stay with the person who signs the contracts.
Same idea here.
Nightly backups make the automation safer
The same night the PIM runs, a set of backup jobs runs too. Every customer recording, every voice-agent transcript, every video gets quietly copied to cloud storage. Not because anything bad happened, but because “automated backups run every night” is the kind of boring sentence that prevents years of work from disappearing.

I don’t think about it most days. That’s the point.
A few weeks back I was looking at the backup logs and noticed they’d been running without a single gap for over a month. Thousands of files, dozens of sessions, completely automated, no one watching. That’s the kind of automation I actually trust. Not because it’s clever, but because it’s predictable.
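Checking for gaps does not need anything fancy. As a rough sketch, assuming you can pull the dates of successful runs out of your backup logs (the log parsing is not shown, and the function name is made up), a gap check could look like this:

```python
from datetime import date, timedelta


def missing_backup_days(successful_days, start, end):
    """List every date from start to end (inclusive) with no successful backup."""
    done = set(successful_days)
    missing = []
    day = start
    while day <= end:
        if day not in done:
            missing.append(day)
        day += timedelta(days=1)
    return missing
```

An empty list means no gaps in that window. Anything else is a date worth looking into.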
Why AI process improvement is practical now
One reason is that the AI tools have gotten cheap enough for normal business experiments. In the videos I’ve made about AI, I keep coming back to the same point: a lot of this is not five-figure enterprise software anymore. Sometimes it is a $20-a-month tool, or a small metered cost, doing a job that would have taken a person hours.
That does not mean everything should be automated. It means the math is finally good enough that a small company can try things, keep what works, and shut off what does not.
This only works if someone reviews the queue
An AI that writes its own improvement tickets is only as good as the person who reads them. If the queue piles up unreviewed for weeks, or if the person reviewing it doesn’t understand the systems well enough to evaluate the suggestions, you get noise. Or worse, a false sense that the system is improving itself when really it’s just generating paperwork.
This approach works for us because I’m close enough to the code and the business logic to make calls quickly. For a shop owner who doesn’t build their own tools, the right version of this might just be a monthly review session with whoever manages their software vendors. Periodic review by a person who can act on it matters more than the specific implementation.
One place to start
If you’re building any kind of automation, build the review step first. Not as an afterthought. Decide ahead of time: how will I know if this is working? How will I know if it’s quietly doing the wrong thing? The answer doesn’t have to be sophisticated. It can be a weekly email summary, a log you glance at on Mondays, a number that should stay below a threshold. Just something a human actually reads.
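Here is what "a number that should stay below a threshold" could look like in practice. This is a sketch under assumptions: the number tracked is the share of automated actions a human had to correct, and the 5% threshold is a placeholder, not a recommendation.

```python
def weekly_summary(corrections, total, max_rate=0.05):
    """Build a one-line summary a person can read in a weekly email.

    corrections: automated actions a human had to fix this week
    total: all automated actions this week
    max_rate: assumed threshold the correction rate should stay below
    """
    rate = corrections / total if total else 0.0
    status = "OK" if rate < max_rate else "REVIEW NEEDED"
    return (
        f"{status}: {corrections} of {total} automated actions "
        f"needed a human fix ({rate:.1%})"
    )
```

For example, `weekly_summary(2, 100)` reports OK at 2.0%, while `weekly_summary(10, 100)` flags the week for review. The point is not the formula. It is that the line lands in front of someone every week.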

