Here's something I tell almost every client before we start building anything:

“You don't have to design the perfect system from day one. The most important thing is the centralization of data and the cleanup of data. Once you figure that out, everything else is relatively easy.”

Most people who want to implement AI in their business assume the hard work is on the AI side — picking the right model, writing the right prompts, choosing the right platform. And those things matter. But they're not where projects fail.

Projects fail because the data underneath is a mess.

The Real Bottleneck

Picture a team that wants to automate their meeting debriefs. They want AI to listen to a call, update the CRM, draft the follow-up email, and add tasks to their project management tool.

Technically, all of that is buildable in a day. The platforms exist. The integrations are available.

But here's what actually happens when they try to build it: the CRM has fields that don't match. Some contacts have been entered three times under different names. The “meeting notes” field has been used for completely different things by different team members. Some records have never been updated.

Now you automate on top of that. The AI updates the wrong record. Drafts a follow-up to someone who left the company six months ago. Creates duplicate tasks in a project that doesn't exist.

The AI didn't fail. The data failed. The AI just moved faster on bad inputs.

AI Doesn't Fix Messy Data — It Amplifies It

I learned this the hard way in my own work. I was building a multi-step agent workflow, and no matter how well I tuned the prompts, the outputs kept coming out wrong. A mentor finally told me: clean and structure the data before feeding it to AI. Remove redundancies. Strip unnecessary information. Make sure your fields mean what you think they mean.

Once I did that, the same prompts that had been producing garbage started producing exactly what I wanted.

Same AI. Better data. Completely different results.

This is why one of the core principles I use when advising clients is what I call Data Centralization First — the idea that centralized data, consistent field definitions, and a clear data structure create the actual unlock for automation. Not the AI layer. The data layer underneath it.

If the numbers aren't trusted by the people who see them, the automation won't be trusted either.

What “Data First” Actually Looks Like

This doesn't have to be a massive project. Here's the practical version:

Step 1: Find where your data lives. Most teams have it scattered — a CRM, a spreadsheet, notes in someone's inbox, data in a project management tool that nobody updates. List all the sources.

Step 2: Pick one source to clean first. You don't need to fix everything at once. Pick the data source that would unlock the most value if it were reliable. For most businesses, that's the CRM or a customer database.

Step 3: Standardize the fields. Make sure the same thing is always entered in the same place, in the same format. Names, dates, company names: consistency here matters a lot, because an automation can only match records reliably when the values line up.

Step 4: Remove duplicates and fill gaps. One afternoon of cleanup work here is worth weeks of prompt engineering later.

Step 5: Now build the first automation. Start simple. One voice note that auto-updates the CRM after a meeting is a huge win on its own. That single workflow improves data accuracy across the team. Then you layer the next thing.
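
To make Steps 3 and 4 concrete, here is a minimal Python sketch of what standardizing fields and removing duplicates can look like on a small CRM export. Everything specific in it is an assumption for illustration: the field names (name, email, last_meeting), the sample records, the two accepted date formats, and the choice to treat the email address as the key that identifies a person. A real CRM will need its own rules.

```python
from datetime import datetime

# Hypothetical CRM export. Field names and values are made up for illustration.
raw_contacts = [
    {"name": "  jane DOE ", "email": "Jane.Doe@Example.com", "last_meeting": "03/14/2025"},
    {"name": "Jane Doe", "email": "jane.doe@example.com ", "last_meeting": "2025-03-14"},
    {"name": "Sam Lee", "email": "sam@example.com", "last_meeting": ""},
]

# Assumed input formats; extend this tuple to match what your team actually typed.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def normalize_date(value):
    """Return the date as YYYY-MM-DD, or None if it is empty or unrecognized."""
    value = value.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_contact(record):
    """Step 3: put every field into one consistent format."""
    return {
        # Collapse stray whitespace and fix casing. Note that title() is a
        # rough heuristic and will mangle names such as "McDonald".
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        "last_meeting": normalize_date(record["last_meeting"]),
    }


def deduplicate(records):
    """Step 4: merge records that share an email, filling gaps where possible."""
    merged = {}
    for record in map(normalize_contact, records):
        existing = merged.setdefault(record["email"], record)
        for field, value in record.items():
            if existing[field] is None and value is not None:
                existing[field] = value
    return list(merged.values())


clean_contacts = deduplicate(raw_contacts)

# Records that still have gaps go on a list for a human to fill in.
gaps = [c["email"] for c in clean_contacts if c["last_meeting"] is None]
```

In this sketch the two Jane Doe entries collapse into one record, and Sam Lee's missing meeting date is flagged instead of guessed. Matching on email is only one possible rule; contacts entered under different addresses would still need a manual pass.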

The Compounding Effect

Here's what good data hygiene does for your automation over time.

Every workflow you build on top of clean, centralized data works well. Each one compounds. You add a briefing automation — works. You add a follow-up email draft — works. You add a network matching agent that surfaces relationship opportunities — it has the data it needs to find good matches.

The hard part is done once. Everything after that is layering.

Teams that try to design the perfect AI system before sorting out the data almost always get stuck. They spend weeks troubleshooting prompts, when the actual problem is two levels deeper. The teams that start by cleaning and centralizing one data source end up with something that actually works — and scales.

There's an analogy I keep coming back to from a different context. When we were designing the camera systems for Padel Society, the temptation was to focus on the cameras themselves — which models, what specs, how many. But the real decision that determined everything else was where to run the wires. Get the infrastructure right first. Then the surface-level decisions are easy.

Data is the infrastructure. Get it right first.


If you want to see what's possible once the foundation is in place, the 4-Day AI Sprint covers the practical workflows — what to build, in what order, and how to layer them.


Last Updated: June 25, 2026

ABOUT THE AUTHOR

Thanh Pham

Founder of Asian Efficiency, where we help people become more productive at work and in life. I've been featured in Forbes, Fast Company, and The Globe & Mail as a productivity thought leader. At AE I'm responsible for leading teams and executing our vision of helping people all over the world live the best life possible.

