
By MKovacs | Other | 26/05/2026

Weekly AI business briefing

AI Agents Are Becoming Workflow Tools. Check the Guardrails First.

The useful question this week is not, "Which AI agent should we buy?" It is, "Which repeatable business task is clear enough for an agent to help with, and where does a person still need to approve the work?"

Several AI updates from the last week point in the same direction. OpenAI is adding more context and longer-running work modes to Codex. Anthropic announced a large Claude rollout with KPMG, where the emphasis is not chat for chat's sake but tax, legal, cybersecurity, and platform-based work. Google used I/O 2026 to push agent-first development, managed agents, and background information agents. GitHub Copilot added automatic model selection inside VS Code so the tool can route different tasks to different models.

For a business owner, the message is practical: AI agents are moving closer to normal work. That can help. It can also create noise if the process underneath is vague.

I would not start by asking whether your company needs OpenAI, Claude, Gemini, or Copilot. I would start with the work your team repeats every week: quote follow-ups, internal reporting, invoice checks, customer replies, sales handover notes, product feedback summaries, or internal knowledge search.

If that work has clear inputs, clear rules, and a clear approval point, it may be ready for an AI-assisted workflow. If it depends on undocumented judgment from one experienced person, map that first.

What changed this week

OpenAI Codex is becoming more persistent and context-aware

OpenAI's May 21 ChatGPT Business release notes describe several Codex updates that matter for real work: app window context through Appshots; goal mode across the app, IDE extension, and CLI; improved in-app browser annotations; locked computer use; shared plugins for Business workspaces; and Codex analytics for admins.

The business point is simple. Agents are being designed to keep working toward an outcome, not just answer one prompt. That makes documentation, success criteria, and review steps more important. If you ask an agent to "improve the report," you may get a lot of movement and not much business value. If you ask it to "check this weekly sales report for missing fields, compare it with the CRM export, list exceptions, and wait for approval before changing anything," you have a workflow.
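To make the second instruction concrete, here is a minimal Python sketch of that kind of check. Everything specific in it is an assumption for illustration: the field names (`deal_id`, `amount`, `owner`), the helper names `find_exceptions` and `apply_changes`, and the idea that both the report and the CRM export arrive as lists of rows. It is not how any particular agent product works. The point is the shape: the check only lists exceptions, and nothing moves forward without a named approver.

```python
from dataclasses import dataclass

# Hypothetical required fields; a real report would define its own.
REQUIRED_FIELDS = ("deal_id", "amount", "owner")


@dataclass
class ReportException:
    deal_id: str
    reason: str


def find_exceptions(report_rows, crm_rows):
    """Return a list of exceptions. Neither input is modified."""
    crm_by_id = {row["deal_id"]: row for row in crm_rows}
    exceptions = []
    for row in report_rows:
        deal_id = row.get("deal_id") or "(unknown)"
        # A field counts as missing only if it is absent or an empty string.
        missing = [f for f in REQUIRED_FIELDS if row.get(f) in (None, "")]
        if missing:
            reason = "missing fields: " + ", ".join(missing)
            exceptions.append(ReportException(deal_id, reason))
            continue
        crm_row = crm_by_id.get(deal_id)
        if crm_row is None:
            exceptions.append(ReportException(deal_id, "not found in CRM export"))
        elif crm_row.get("amount") != row["amount"]:
            exceptions.append(ReportException(deal_id, "amount differs from CRM export"))
    return exceptions


def apply_changes(exceptions, approved_by=None):
    """Approval gate: refuse to act until a named person has approved."""
    if not approved_by:
        return "waiting for approval"
    return f"{len(exceptions)} exception(s) approved by {approved_by}"
```

Calling `apply_changes(exceptions)` without an approver always returns "waiting for approval". That one line is the difference between an agent that prepares work and one that changes things on its own.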

Anthropic and KPMG show what enterprise adoption looks like

Anthropic announced on May 19 that KPMG is embedding Claude into its Digital Gateway platform and giving more than 276,000 employees access to Claude. The most useful detail is where the agents sit: inside the platform where tax expertise, proprietary tools, and client data already live.

That is an important lesson for smaller companies too. The agent should not float outside the business as another tab people forget to use. The value comes when it is connected to the work surface: the CRM, the inbox, the document folder, the helpdesk, the reporting spreadsheet, or the project system.

The announcement also put real weight on the human role: employees shape workflows, evaluate outputs, and make decisions with AI. That is the part many companies skip. "Human in the loop" is not a checkbox. It should say who checks what, when, and against which rule.

Google is pushing agents into development, search, and daily work

Google's I/O 2026 developer announcements focused heavily on agent-first development. Google described Antigravity 2.0, an Antigravity CLI, managed agents in the Gemini API, persistent isolated environments, and integrations with Workspace, Android, Firebase, and Cloud Run. In its wider I/O roundup, Google also described Search information agents that can monitor topics in the background and Gemini Spark as a personal AI agent that works under user direction.

The practical pattern is background work. Agents will increasingly monitor, draft, check, compare, summarize, and prepare. That is useful for business tasks that already have a rhythm: weekly competitor checks, supplier updates, open quote reviews, customer support themes, or overdue invoice reminders.

But background work needs boundaries. What should the agent monitor? Which sources are trusted? What should trigger a notification? What should never be sent, edited, ordered, or approved without a person?
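Those four questions can be written down as a small set of rules before any tool is chosen. The Python sketch below is one hedged way to do that. The class `MonitoringBoundaries`, its field names, and the two helper functions are hypothetical names invented for this example, and the keyword match is deliberately crude. It does not describe any vendor's agent configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoringBoundaries:
    topics: frozenset            # what the agent should monitor
    trusted_sources: frozenset   # which sources are trusted
    notify_keywords: frozenset   # what should trigger a notification
    # Actions that always need a person, taken from the last question above.
    human_only_actions: frozenset = frozenset({"send", "edit", "order", "approve"})


def should_notify(boundaries, topic, source, text):
    """Notify only for a monitored topic, from a trusted source, on a trigger word."""
    if topic not in boundaries.topics:
        return False
    if source not in boundaries.trusted_sources:
        return False
    lowered = text.lower()
    return any(keyword in lowered for keyword in boundaries.notify_keywords)


def may_act_alone(boundaries, action):
    """True only for actions that are not reserved for a person."""
    return action not in boundaries.human_only_actions
```

With boundaries like these, a supplier note about a price increase from a trusted portal raises a notification, the same text from an unlisted source does not, and "send" is refused no matter what the source says.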

GitHub Copilot is reducing model choice friction

GitHub's May 20 Copilot update added automatic model selection in VS Code. GitHub says Auto considers real-time availability, reliability, reasoning needs, code complexity, bug diagnosis difficulty, and tool orchestration needs. Users can still see which model was used and switch manually.

This is a small but telling shift. The user should not always need to understand which model is best for which task. The tool can make that routing decision. In business workflows, I expect the same pattern: the owner will not choose a model for every invoice check or customer reply draft. The workflow will route the task, then the person will review the parts that carry risk.
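As a rough illustration of "route the task, then review the risky parts," here is a short Python sketch. The task types, the model labels (`fast-model`, `general-model`), and the function `route_task` are all placeholders I made up. This is not how GitHub's Auto mode selects models. It only shows the business-side pattern.

```python
# Hypothetical task types and model labels; replace with your own.
ROUTES = {
    "invoice_check": "fast-model",
    "customer_reply_draft": "general-model",
    "weekly_summary": "general-model",
}

# Task types whose output reaches a customer or touches money.
NEEDS_REVIEW = {"invoice_check", "customer_reply_draft"}


def route_task(task_type):
    """Pick a model label and decide whether a person must review the result."""
    model = ROUTES.get(task_type)
    if model is None:
        # Unknown work is never routed automatically; it goes to a person.
        return {"model": None, "needs_review": True}
    return {"model": model, "needs_review": task_type in NEEDS_REVIEW}
```

The owner maintains two short lists, not a model decision per task: which kinds of work exist, and which of them carry enough risk to need a reviewer.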

What this means for business owners

The agents are getting better, but the first win still comes from clear work. A messy process does not become reliable because an AI tool touches it.

Before you give an agent more responsibility, write down five things:

The repeated task: for example, "check new quote requests every morning and prepare follow-up drafts."
The allowed sources: CRM records, a pricing sheet, approved email templates, or a specific folder.
The output: a draft email, an exceptions list, a weekly summary, or a tagged task.
The approval point: who reviews it before anything is sent, changed, or escalated.
The stop rule: what the agent should do when information is missing or confidence is low.
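If it helps to see the five items as a checklist, they fit in one small record. The Python sketch below is only an illustration: `WorkflowSpec`, its field names, and `missing_parts` are hypothetical names chosen for this example, not part of any agent product. The check simply reports which of the five items is still blank.

```python
from dataclasses import dataclass


@dataclass
class WorkflowSpec:
    repeated_task: str
    allowed_sources: list
    output: str
    approval_point: str
    stop_rule: str

    def missing_parts(self):
        """Names of the items that are still empty, in the order listed above."""
        return [name for name, value in vars(self).items() if not value]

    def is_ready(self):
        """Ready for a pilot only when all five items are filled in."""
        return not self.missing_parts()


# Example: the approval point has not been decided yet.
spec = WorkflowSpec(
    repeated_task="check new quote requests every morning and prepare follow-up drafts",
    allowed_sources=["CRM records", "pricing sheet", "approved email templates"],
    output="draft email",
    approval_point="",
    stop_rule="if information is missing, stop and flag it for a person",
)
```

Here `spec.missing_parts()` returns `["approval_point"]`, which is a useful answer to get before the agent is switched on instead of after.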

This does not need to be a large transformation project. In many small and mid-sized businesses, a useful first agent workflow is boring in the best way: it checks a spreadsheet against a system, drafts replies from approved notes, summarizes open issues, or finds repeated customer questions.

A good starting test: pick one task that costs your team at least two hours every week and has a clear before-and-after. Do not automate the whole process first. Let AI prepare the work, then have a person approve it. If the team trusts the output after a few cycles, expand from there.

Where I would look first

If you run a business and want a practical AI starting point, look for time leaks that have enough structure:

Sales quote follow-ups that are late or inconsistent.
Weekly reports that require copy-pasting from multiple tools.
Invoice or order checks where the same fields are reviewed every time.
Customer replies that start from the same approved information.
Internal questions that one person answers again and again.

Those tasks are not glamorous. That is why they are good. They are close to the business, easy to measure, and usually painful enough that the team will notice if the workflow improves.

The decision for this week

The latest AI news is not really about one vendor winning a feature race. It is about agents becoming part of the operating layer of work: coding, checking, monitoring, drafting, routing, and preparing decisions.

That means the business owner's job changes slightly. You do not need to understand every model release. You do need to know which workflows deserve help, which data the agent can use, and where human judgment must stay in control.

Clarity before tools still wins.

Need a practical AI starting point?

If your team is curious about AI but unsure where to start, take the free AI Business Assessment. It helps identify which workflows may be worth improving first, before you spend time on another tool nobody uses.
