I have spent the last year building AI agents at Gatekeeper.
They review contracts, onboard suppliers, monitor compliance, assess due diligence, and manage renewals.
Some worked immediately.
Others took rounds of testing and refinement.
All of them taught me something about how AI agents actually work in production.
In my last post, I explained what AI agents are. Read that here.
In this one, I am showing you exactly how they work across real procurement workflows, with actual outputs and the lessons I learned building them.
First, the things nobody tells you.
Four Things I Learned Building Agents
1. Agents do not get bored.
This sounds obvious until you see it in action.
A supplier submits their due diligence response. What happens next used to be predictable.
It sits in a queue.
Gets batched with 20 others.
Someone carves out half a day.
They get through five before context switching.
The rest wait until next week.
Each response takes a minimum of four hours to review properly. You are checking the same boxes, reading the same questions, validating the same compliance requirements.
You get tired.
You procrastinate.
You check Slack.
You get a coffee.
You tell yourself you will finish tomorrow.
An agent takes minutes per response. It does not care if it is the first or the fiftieth. It just follows your instructions and keeps working.
That is the breakthrough. Not speed alone, but the elimination of human fatigue on repetitive work.

2. Agents are more thorough than humans at this work.
Their diligence is superior.
They examine every detail without the mental drain of voluminous, repetitive data work.
Take a supplier we are onboarding. An agent's due diligence review will check every nook and cranny, including the things human reviewers usually miss because, one, we are fed up with this soul-draining work, and two, we are swamped by the sheer volume of it.
It is okay to admit that an AI agent is superior at this kind of work. It should be. It is designed to take huge amounts of data in, work with that data, and give an output. We are not designed that way.
3. Setup is straightforward, but process maturity matters.
Building agents that work is not complicated.
But it does require you to have a good understanding of your processes in the first place.
Going from manual straight to agentic ways of working is absolutely doable.
But we would traditionally consider this a staircase.

You go from manual to an automated process. You understand everything that works. Then you add agents on top of your digital processes.
If you have the right information ready, the setup work could be done within minutes or hours.
For people who do not have the right information, it will take longer.
That is why I am writing this content: to get you prepared and thinking about this so you can rapidly deploy agents in the future.
If you find this useful to your work, make sure you pick up a subscription to access the archives and community-only posts. You can expense this publication with your employer.
4. Context beyond the foundation model is everything.
I referenced this in my last article.
The LLM is the brain, but it needs context about how your organisation thinks about the world.
For example, how you think about business, risk, buying, and contracting.
This context might be those policy documents you have pushed out to the business that no one reads. LOL. Seriously, no one reads these.
Your procurement policies, vendor onboarding policies, security policies, contracting standards.
Well, guess what?
An agent will read them. It will use them. And it will run with them and do things in wonderful ways.
The Real Advantage: Concurrency
Here is what 50 contracts over a month actually looks like.
Human review: Four hours per contract, minimum. That is 200 hours total, or five full 40-hour weeks of doing nothing but contract review in a single month. Except you do not have five full weeks. So contracts sit in a queue. They get batched. Someone carves out time. They get through a few before context switching. The rest wait.
Agent review: Minutes per contract. And here is the bit that really matters: those contracts do not arrive at once. They trickle in throughout the month. Different suppliers, different times, different days.
With an agent, each one gets reviewed the moment it arrives. No queue. No waiting. If ten contracts arrive Tuesday morning, the agent reviews all ten concurrently. A human would take weeks when you factor in the queuing, the batching, the context switching, the other priorities that keep pushing it back.
The compliance work still gets done right. Every time. But now you are actually working with your suppliers and partners faster than ever before.
Agents can work concurrently without degradation. Humans experience fatigue. That difference changes everything about how fast your organisation can move.
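To make the numbers and the concurrency point concrete, here is a minimal Python sketch. The 40-hour week is my assumption for the arithmetic, and review_contract is a hypothetical stand-in that sleeps instead of calling a real model. It is not how any particular product implements this, just an illustration of starting every review at once.

```python
import asyncio

HOURS_PER_CONTRACT = 4      # minimum review time quoted above
CONTRACTS_PER_MONTH = 50
HOURS_PER_WORK_WEEK = 40    # assumption: a standard full-time week


def human_review_weeks(contracts: int) -> float:
    """Working weeks of uninterrupted review for one person."""
    return contracts * HOURS_PER_CONTRACT / HOURS_PER_WORK_WEEK


async def review_contract(contract_id: str) -> str:
    """Hypothetical agent review; sleeps briefly instead of calling a model."""
    await asyncio.sleep(0.01)
    return f"{contract_id}: reviewed"


async def review_batch(contract_ids: list[str]) -> list[str]:
    """Start every review at once; results come back in input order."""
    return await asyncio.gather(*(review_contract(c) for c in contract_ids))


if __name__ == "__main__":
    print(human_review_weeks(CONTRACTS_PER_MONTH))  # 5.0
    tuesday = [f"contract-{n}" for n in range(1, 11)]
    print(asyncio.run(review_batch(tuesday)))
```

The ten Tuesday contracts all start together, so the batch takes roughly as long as one review rather than ten in a row.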
How We Deploy Agents
The breakthrough for us was not the AI itself. It was deploying agents inside workflows where people already work.
We did not change how people operate. We did not introduce new tools. We just added intelligence to existing workflows.
Here is what that looks like: A contract gets submitted through the same intake form people have always used. But now, before it hits a human reviewer, an agent does the first pass. Extracts key terms. Checks against the playbook. Flags issues.
If it is straightforward and matches standards, it routes forward. If something is non-standard, it routes to the right person with a summary of what needs review.
The human experience: I only see work that needs my attention, and it comes with context. Instead of reviewing 50 items manually, I review the flagged ones with summaries already provided.
Agent does the boring bits. Human does the judgment calls. That is the model.
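The first pass described above can be sketched roughly like this. Everything here is illustrative: the PLAYBOOK terms and values, the FirstPass record, and the first_pass function are hypothetical names I made up for the example, and the term extraction itself (the part the model does) is assumed to have already happened.

```python
from dataclasses import dataclass, field

# Hypothetical playbook: each term and the value your standards accept.
PLAYBOOK = {
    "payment_terms_days": 30,
    "governing_law": "England and Wales",
    "auto_renewal": False,
}


@dataclass
class FirstPass:
    route: str                                  # "forward" or "human_review"
    issues: list[str] = field(default_factory=list)
    summary: str = ""


def first_pass(extracted_terms: dict) -> FirstPass:
    """Compare extracted terms with the playbook and decide where to route."""
    issues = []
    for term, expected in PLAYBOOK.items():
        actual = extracted_terms.get(term)
        if actual is None:
            issues.append(f"{term}: missing from contract")
        elif actual != expected:
            issues.append(f"{term}: found {actual!r}, playbook expects {expected!r}")
    if not issues:
        return FirstPass(route="forward", summary="Matches playbook; no review needed.")
    return FirstPass(
        route="human_review",
        issues=issues,
        summary=f"{len(issues)} non-standard term(s) need review.",
    )
```

The design choice that matters is the summary: the reviewer receives the list of issues with the contract, not just the contract.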
I pulled together some Agentic Videos/Demos for you all to check out. Let me know if you have any questions about these below.

Real Agent Outputs
Let me show you what this looks like in practice. These are actual outputs from agents I have built.
Intake Review Agent
This agent reviews incoming requests to determine if they have enough information to proceed. It makes a decision: approve or reject.
When it approves, it explains why. It confirms what is being procured, from whom, and why it is needed, then validates that all three align with the business driver stated.

When it rejects, it does not just say no. It explains what it saw, what is missing, how to resolve it, and even gives an example of what good looks like. This is better feedback than most humans give when drowning in volume.

The requester gets coaching, not just a rejection. They know exactly what to fix and can resubmit with confidence.
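One way to picture the shape of that decision is a small record with the same parts: what was seen, what is missing, how to resolve it, and an example of good. This is a simplified sketch, not the agent itself. The field names, REQUIRED_FIELDS, and GOOD_EXAMPLE are hypothetical, and it only checks that the three details are present; judging whether they align with the stated business driver is the part the model does.

```python
from dataclasses import dataclass, field

# Hypothetical intake fields: what, from whom, and why.
REQUIRED_FIELDS = {
    "what": "what is being procured",
    "supplier": "who it is being procured from",
    "why": "why it is needed",
}

# Hypothetical example of "what good looks like" for a rejected request.
GOOD_EXAMPLE = (
    "We need 25 CRM licences from Acme Ltd to replace the legacy "
    "system before its support contract ends."
)


@dataclass
class IntakeDecision:
    approved: bool
    explanation: str
    missing: list[str] = field(default_factory=list)
    how_to_resolve: str = ""
    example: str = ""


def review_intake(request: dict) -> IntakeDecision:
    """Approve only when all three required details are present."""
    missing = [
        label for key, label in REQUIRED_FIELDS.items()
        if not str(request.get(key) or "").strip()
    ]
    if not missing:
        return IntakeDecision(
            approved=True,
            explanation=(
                f"Procuring {request['what']} from {request['supplier']} "
                f"because: {request['why']}."
            ),
        )
    return IntakeDecision(
        approved=False,
        explanation="The request cannot proceed yet.",
        missing=missing,
        how_to_resolve="Add: " + "; ".join(missing) + ".",
        example=GOOD_EXAMPLE,
    )
```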
Risk Register Creator Agent
This agent reviews questionnaire responses and surfaces risks as structured, actionable records.
For each risk it identifies, it provides: the risk type, the source, the specific question reference, probability and impact ratings, a recommended owner, a suggested due date, and detailed comments explaining the risk in context.
It does not just flag that something might be wrong. It creates a complete risk record ready to be managed. Question O.15 answered No? That becomes a documented ethical sourcing gap with a recommended owner and six-month timeline.
A human reviewer doing this work would spend hours. The agent surfaces four specific risks in seconds, each with full context and next steps.
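As a rough sketch of what a structured risk record could look like, the fields below mirror the list above. The rule table, the owner, the "Medium" ratings, and the rating scale are placeholders I invented for illustration; a real agent derives them from the questionnaire and your policies rather than from a lookup.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RiskRecord:
    risk_type: str
    source: str
    question_ref: str
    probability: str        # assumed scale: "Low" / "Medium" / "High"
    impact: str
    recommended_owner: str
    due_date: date
    comments: str


# Hypothetical rule table: which "No" answers become which risks.
RISK_RULES = {
    "O.15": {"risk_type": "Ethical sourcing", "owner": "Procurement lead", "months": 6},
}


def months_later(start: date, months: int) -> date:
    """Shift a date forward by whole months, capping the day at 28 so it stays valid."""
    index = start.month - 1 + months
    return date(start.year + index // 12, index % 12 + 1, min(start.day, 28))


def risk_from_answer(question_ref: str, answer: str, reviewed_on: date) -> Optional[RiskRecord]:
    """Turn a 'No' answer on a known question into a risk record, else None."""
    rule = RISK_RULES.get(question_ref)
    if rule is None or answer.strip().lower() != "no":
        return None
    return RiskRecord(
        risk_type=rule["risk_type"],
        source="Supplier questionnaire",
        question_ref=question_ref,
        probability="Medium",   # placeholder rating
        impact="Medium",        # placeholder rating
        recommended_owner=rule["owner"],
        due_date=months_later(reviewed_on, rule["months"]),
        comments=f"Question {question_ref} answered No.",
    )
```

Because every risk comes out in the same shape, it can go straight into a register without someone retyping it.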
InfoSec/Due Diligence Reviewer Agent
This agent performs comprehensive security assessments against questionnaire responses.
The output includes: an executive summary, the procurement context, the review tier applied, a risk profile with strengths and areas of note, residual risks accepted, a domain-by-domain assessment table covering everything from enterprise risk management to cloud services, recommended contract provisions, and next review triggers.
This is the level of thoroughness you would expect from a senior security reviewer working for hours. The agent produces it in minutes, examining every domain without fatigue.
Security Policy Review Agent
This agent reviews supplier security policies against your requirements.
It extracts the policy details, version, owner, and framework alignment. It produces a coverage assessment table showing which domains are addressed. It identifies specific findings, both strengths and gaps, with references to the actual policy content.
The recommendation comes with clear rationale: here is what is covered, here is what meets or exceeds requirements, here is the recommendation.
What This Means For Your Team
These agents are not replacing expertise. They are removing the drudge work so humans can focus on the interesting bits.
The compliance work still gets done correctly every time. But now your organisation can accelerate business operations. Suppliers get onboarded faster. Contracts get reviewed sooner. Risks get surfaced immediately.
You are not outsourcing judgment. You are taking full ownership. You create a team of agents that report directly to you. You are responsible for their outputs. You stay accountable for making sure they perform.
That is fundamentally the model: manage them like you would manage employees, because that is what they are. Digital employees doing the work you do not have time for.
Getting Started
If this is new to you, start by thinking about where in your processes you are reviewing vast amounts of data. Intake forms, due diligence questionnaires, contracts, compliance documents. Those are your deployment opportunities.
Then decide what you want the agent to do.
Review and provide insight?
Or review and make a decision?
Then give it context.
Your policies, your standards, your way of thinking about risk. The agent will actually read them.
In my next article, I will walk through how to build an AI agent from scratch.
The process, the decisions, the iteration.
Because understanding what agents can do is one thing. Building one yourself is where it gets real.
