Custom AI agents for audit and advisory: what they are and how they work

Written by Amanda Waldmann | Jun 4, 2026 8:12:39 PM

Key Features:

An AI agent is structurally different from a copilot: it pursues a goal across multiple steps, while a copilot responds one prompt at a time.
Custom AI agents reason inside the firm's methodology, not on top of it. That's where reliable audit output comes from.
Domain context is what makes AI output defensible in an audit file. Risk indicators are industry-specific. Control objectives belong to the framework. Strip that out and the output reads well and tests badly.

For partners managing concurrent engagements with lean teams, the same things squeeze every year: the evidence chasing, the review bottleneck at the end of fieldwork, the uneven execution from staff to staff. AI agents are getting attention because they go after exactly that work. They execute across steps instead of responding to prompts one at a time, which is why senior time can finally move from production to review. This article covers what separates an agent from a copilot, how Field Agents execute across the engagement lifecycle, and where regulators currently stand.

What makes an AI agent different from a copilot or chatbot

In 2026, every AI tool on the market gets stamped as an agent. Vendors are calling rule-based chatbots, document-summary copilots, and single-prompt automations "agentic," and most of what gets sold under the label still falls short of the actual bar.

An agent is AI that pursues a goal across multiple steps, evaluates the results along the way, and adjusts when conditions change. That's a different kind of tool from a chatbot or a copilot, and the difference is easiest to see side by side:

Chatbot: Answers a question. The work starts and ends with the response.
Copilot: Drafts an output when prompted, like a memo, a summary, or a snippet of code. The work starts and ends with the draft.
Agent: Runs a workflow. It picks the next step, executes it, evaluates the result, and keeps going until the work is done or a human steps in.

Only the third one changes how the work gets done. The first two speed up individual outputs. The third runs the workflow that produces them.

The benefits in practice

The benefit of an agent isn't a faster memo. It's getting back the three weeks of evidence chasing, the rework on every roll-forward, and the review bottleneck at the end of fieldwork. Execution moves to the agent. The practitioner reviews the work, applies judgment to the calls that matter, and signs off.

For a partner managing a portfolio of SOC 2 examinations and financial audits, that shift is concrete. A copilot drafts a memo when asked, but the senior still spends the afternoon chasing the supporting evidence, reconciling it to the request list, and writing it up. An agent works at a different level. On the Fieldguide platform, the Request Agent analyzes client-submitted evidence as it arrives, flags gaps and inconsistencies against the test requirements, and drafts targeted follow-ups for the team to review and send. The Testing Agent picks up from there, matching evidence to samples, checking data against framework requirements, identifying potential exceptions, and drafting workpaper documentation, with up to 70% of testing on the advisory side covered by agent execution. Field Reviewer then surfaces exceptions, judgment calls, and elevated-risk areas before the partner looks.

For the firm, that changes the operating economics of an engagement. Senior time moves from production to review. Realization stops bleeding in the last two weeks. Managers stop carrying the rework on their nights and weekends. And the engagement file that lands in front of the partner already has the routine work done, the exceptions surfaced, and the trail documented.

The perceive-reason-act-learn cycle

The way agents work is a four-function loop:

Perceive: gather data from documents, statements, and systems.
Reason: process that data and set goals using LLM technology.
Act: execute tasks across multiple steps.
Learn: improve through feedback over time.

The loop is what lets an agent handle work that older automation can't. Robotic Process Automation (RPA), which firms have used for years to handle structured tasks like invoice intake or form filling, follows a fixed sequence and breaks the moment the sequence changes. An agent runs the loop instead: it perceives the new condition, reasons about what to do next, acts on it, and learns from the outcome for the next run.
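
To make the loop concrete, here is a minimal sketch in Python. It is a toy, not Fieldguide's implementation: the class ToyAgent, its method names, the document types, and the "needs_human" status are all hypothetical, and the "reasoning" is a plain set comparison standing in for an LLM. What it shows is the shape of the cycle: the agent re-reads the current state on every pass instead of following a fixed script, and it hands off to a person when it cannot make progress.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    """Hypothetical agent; every name here is illustrative, not a product API."""
    goal: set                                   # document types the engagement needs
    notes: list = field(default_factory=list)   # outcomes remembered for later runs

    def perceive(self, inbox):
        # Perceive: look at which evidence has arrived so far.
        return {doc["type"] for doc in inbox}

    def reason(self, received):
        # Reason: compare the current state to the goal and pick the next step.
        # (A set difference stands in for the LLM here.)
        missing = sorted(self.goal - received)
        return ("request", missing[0]) if missing else ("done", None)

    def act(self, step, inbox, client_files):
        # Act: carry out the chosen step, here by pulling one document.
        doc_type = step[1]
        if doc_type in client_files:
            inbox.append({"type": doc_type})
            return True
        return False

    def learn(self, step, succeeded):
        # Learn: record the outcome so later runs can take it into account.
        self.notes.append((step[1], succeeded))

    def run(self, inbox, client_files, max_steps=10):
        for _ in range(max_steps):
            step = self.reason(self.perceive(inbox))
            if step[0] == "done":
                return "complete"
            succeeded = self.act(step, inbox, client_files)
            self.learn(step, succeeded)
            if not succeeded:
                return "needs_human"   # stop and hand off instead of guessing
        return "needs_human"


agent = ToyAgent(goal={"bank_statement", "access_review"})
inbox = [{"type": "bank_statement"}]
print(agent.run(inbox, client_files={"access_review"}))  # complete
```

A fixed RPA-style script would request the same documents in the same order every time. Here the next step is recomputed from whatever is in the inbox, which is the property the loop is meant to provide.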

Why audit and advisory firms need custom AI agents, not generic ones

Audit AI has to know audit. The framework being tested, the firm's methodology, the risk thresholds set during planning, the prior-year work: all of it has to be in the agent's context, or the output won't survive a PCAOB inspection or peer review. A generic agent doesn't have that context. A custom one does.

This is the design point behind Fieldguide's Field Agents. Agent Knowledge holds the firm's methodology, prior-year work, and the standards the agents draw on. Agent Configuration tunes Field Agents to the firm's frameworks, sample sizes, and approach. The work the agent does sits inside the firm's methodology by default, not next to it.

Gartner predicts that by 2027, organizations will use small, task-specific AI models at least three times more than general-purpose LLMs. General-purpose models are strong at language and weak at context, and audit is almost entirely context. Risk assessment depends on industry-specific risk indicators. Control testing turns on the framework's control objectives. Financial statement analysis depends on GAAP or IFRS applied to specific transactions. Strip that context out and the output reads well and tests badly.

What "custom" actually means

Frameworks are the work. SOC 2 Trust Services Criteria, PCI DSS requirements, HITRUST CSF controls, and COSO's five components are the structured knowledge the agent has to reason inside. A custom agent tests evidence against those requirements, flags exceptions against the firm's risk threshold, and documents results in the firm's workpaper format. It also knows which engagement it's running, because a financial audit, a SOC examination, and an internal audit advisory engagement each carry different professional standards.
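
One way to picture "custom" is as configuration that travels with the test. The sketch below is an assumption-heavy illustration, not Fieldguide's Agent Configuration: FirmConfig, evaluate_samples, the days_to_revoke attribute, and both threshold values are hypothetical. It shows the same samples producing a different conclusion once the firm's own requirement and tolerance are applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirmConfig:
    """Hypothetical firm settings; the fields and values are illustrative only."""
    framework: str
    max_days_to_revoke: int      # assumed requirement being tested
    tolerable_exceptions: int    # assumed firm threshold before escalation


def evaluate_samples(samples, config):
    # Test each sample against the configured requirement.
    exceptions = [s["id"] for s in samples
                  if s["days_to_revoke"] > config.max_days_to_revoke]
    # Compare the exception count to the firm's own threshold.
    escalate = len(exceptions) > config.tolerable_exceptions
    return {
        "framework": config.framework,
        "tested": len(samples),
        "exceptions": exceptions,
        "conclusion": "escalate to reviewer" if escalate else "within threshold",
    }


samples = [
    {"id": "S1", "days_to_revoke": 1},
    {"id": "S2", "days_to_revoke": 5},
]
strict = FirmConfig(framework="SOC 2", max_days_to_revoke=2, tolerable_exceptions=0)
lenient = FirmConfig(framework="SOC 2", max_days_to_revoke=7, tolerable_exceptions=0)

print(evaluate_samples(samples, strict)["conclusion"])   # escalate to reviewer
print(evaluate_samples(samples, lenient)["conclusion"])  # within threshold
```

The evidence is identical in both calls. Only the configuration differs, which is why output produced without the firm's context can read well and still test badly.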

What changes when Field Agents are running the engagement

On the Fieldguide platform, the work is divided across a roster of Field Agents, each purpose-built for a phase of the engagement.

Planning lands faster

Field Planner handles the staff-auditor groundwork that used to take days of senior time: drafting scoping documentation, walkthrough write-ups, control design proposals, and risk assessment inputs. It draws on Agent Knowledge, which holds the firm's methodology, prior-year work, and the standards encoded into the platform, so the output sits inside the firm's approach by default. The manager reviews and finalizes a planning file instead of producing one from scratch.

Evidence collection runs without senior follow-up

The three weeks senior staff usually spend chasing PBC requests turn into hours. The Request Agent analyzes client-submitted evidence as it arrives, flags apparent gaps and inconsistencies against the test requirements, and drafts targeted follow-ups for the team to review and send. The Testing Agent picks up from there, matching evidence to samples, checking data against framework requirements, highlighting potential exceptions, and drafting workpaper documentation with direct source references. The practitioner reviews and approves every output before it moves on.

Agent Triggers run the next agent automatically when a document uploads or a request is marked received. Multiple agents can work across controls simultaneously, and the kanban-style Field Board shows the team what's ready for testing, waiting on evidence, in progress, flagged for review, or complete.

Reporting is review-ready, not reconstructed

The end-of-engagement bottleneck goes away. AI-assisted reporting drafts the work, with practitioners providing final review and approval. That includes drafting memos, summarizing exceptions, pulling disclosure language, and navigating the financial statements. Because the agents documented results with direct source references as the work was executed, the trail is already there when the partner sits down to review.

Where regulators stand on AI in audit today

There is no PCAOB auditing standard written specifically for AI, and there doesn't need to be. The existing standards on supervision, documentation, and quality control already cover it. Inspectors don't grade the technology. They grade how the engagement team supervised it and documented what it did.

The PCAOB's QC 1000 standard takes effect December 15, 2026, and names technology as a quality control risk factor. That builds on what the PCAOB has already been signaling: a 2024 GenAI Spotlight found firms using GenAI mostly for administrative and research work, with human supervision central in every case, and the 2025 inspection priorities flagged technology as a focus area, particularly where it supports procedures responding to identified risks of material misstatement.

Capacity gains from AI only count if the work survives inspection. The platforms that hold up are the ones that produce a documented trace, tie outputs back to source evidence, build human review into the workflow, and let the agent reason inside the firm's methodology.
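
As a rough illustration of what "a documented trace tied to source evidence" can mean in data terms, the sketch below stores each conclusion alongside a hash of the evidence it cites and the name of the reviewer. It is an assumption, not a description of any platform or a regulatory requirement: TraceEntry, record, sign_off, and is_final are hypothetical names, and SHA-256 fingerprinting is just one common way to show that a cited file has not changed.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraceEntry:
    """Hypothetical workpaper trace record; field names are illustrative."""
    conclusion: str
    source_file: str
    source_sha256: str                 # fingerprint of the evidence as tested
    reviewed_by: Optional[str] = None  # stays empty until a human signs off


def record(conclusion, source_file, content):
    # Tie the drafted conclusion to the exact bytes of the evidence it cites.
    digest = hashlib.sha256(content).hexdigest()
    return TraceEntry(conclusion, source_file, digest)


def sign_off(entry, reviewer):
    # Human review is a separate, explicit step.
    entry.reviewed_by = reviewer
    return entry


def is_final(entry, content):
    # Final only if a person reviewed it and the cited evidence is unchanged.
    unchanged = hashlib.sha256(content).hexdigest() == entry.source_sha256
    return entry.reviewed_by is not None and unchanged


evidence = b"user,terminated,revoked\njdoe,2026-01-10,2026-01-11\n"
entry = record("Access revoked within policy window", "terminations.csv", evidence)

print(is_final(entry, evidence))               # False
sign_off(entry, "engagement manager")
print(is_final(entry, evidence))               # True
print(is_final(entry, evidence + b"edited"))   # False
```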

How Fieldguide approaches custom AI agents for audit and advisory

Fieldguide is the industry's only end-to-end AI-native platform purpose-built for audit and advisory, covering the engagement lifecycle from planning through reporting across financial audit, SOC, internal audit, and compliance. The platform brings AI Assist and Agent Workforce together in one place, so the Field Agents execute the engagement work inside the firm's methodology while practitioners review, judge, and sign off. To see how it runs on a live engagement, request a demo.
