Internal Audit · Data Analytics · P2P

Modernising Internal Audit, One Module at a Time - Procure to Pay

A practitioner’s account of automating 30+ P2P controls across a group of 10+ entities and 20 million+ records, and what it actually takes to make it work.

There’s a version of audit automation that looks impressive in a presentation deck. A script runs. Exceptions get flagged. An auditor reviews the list. The CAE nods appreciatively. And then the next cycle begins, and someone runs the same script again, manually, the same way they did it last time. That’s not transformation. It’s just a faster version of the same incomplete process.

Beyond the Sample — automating 30+ P2P controls across 20 million+ records
Moving from sample-based testing to full-population coverage across every P2P control and every entity.
The Problem: Why Sampling Falls Short

The Problem With Sampling in P2P

Procure-to-Pay is one of the most control-dense, risk-exposed processes in any large organisation. The exposures are familiar: duplicate payments, split purchase orders designed to circumvent approval thresholds, unapproved or dormant vendors reactivated without scrutiny, three-way match failures that quietly age into write-offs, approval sequencing anomalies, and policy threshold breaches.

The traditional response to all of this? Sample 60–100 transactions per entity. Test them manually. Document the results. Issue a report.

You’re not auditing the P2P process. You’re testing a sliver of annual procurement, and then writing a report that implies broader coverage than you actually have.

The problem isn’t that sampling is wrong. It’s that sampling across 10+ entities, with 20 million+ records in a full audit period, creates a structural blind spot that no one talks about honestly. Even at 100 samples for each of 10 entities, that is roughly 1,000 transactions tested. The exposure that sits in the other 19,999,000+ records? It goes unexamined, every year.
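To put rough numbers on that blind spot, here is a back-of-the-envelope sketch. It uses only the round figures quoted above (10 entities, 60 to 100 samples each, 20 million records) as illustrative assumptions; real counts will differ.

```python
# Back-of-the-envelope sample coverage, using the round figures from the text.
# All inputs are illustrative assumptions, not measured values.

def sample_coverage(entities: int, samples_per_entity: int, population: int) -> float:
    """Return the share of the population touched by sampling, as a percentage."""
    tested = entities * samples_per_entity
    return 100.0 * tested / population


POPULATION = 20_000_000  # "20 million+ records"
ENTITIES = 10            # "10+ entities"

low = sample_coverage(ENTITIES, 60, POPULATION)
high = sample_coverage(ENTITIES, 100, POPULATION)
unexamined = POPULATION - ENTITIES * 100  # records never looked at, best case

print(f"Coverage: {low:.4f}% to {high:.4f}% of records")
print(f"Unexamined at the high end of sampling: {unexamined:,}")
```

On these assumptions, coverage sits in the thousandths of a percent.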

Common P2P Risks That Sampling Routinely Misses

Duplicate payments on low-value invoices · Split POs just below approval thresholds · Reactivated dormant vendors · Approval sequencing anomalies · Systemic policy threshold breaches across entities

The Build: Full-Population Coverage

What We Built

The objective was clear: move from sample-based testing to full-population coverage, across every entity, for every material P2P control.

We mapped 30+ controls to specific risks across the end-to-end P2P process. Each control had defined data sources (tables, views or custom reports) and exception logic that had to be precise enough to flag genuine anomalies without drowning the team in false positives.

Some controls were structurally straightforward: duplicate invoice detection, PO-GRN-IV matching, vendor master completeness checks. Others required more nuanced logic: behavioural pattern analysis on approver / release strategy sequences, layered recursive checks on relationships that cut across both the vendor master and the employee master.
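As an illustration of a "structurally straightforward" control, here is a minimal sketch of duplicate invoice detection over a full population. The field names (`doc_id`, `vendor_id`, `invoice_no`, `amount`) and the matching rule (same vendor, same normalised invoice number, same amount) are assumptions made for the example, not the rules used in the actual build.

```python
from collections import defaultdict
from decimal import Decimal


def normalise_invoice_no(raw: str) -> str:
    """Upper-case and drop spaces and punctuation, so 'INV-0042' matches 'inv 0042'."""
    return "".join(ch for ch in raw.upper() if ch.isalnum())


def find_duplicate_invoices(invoices):
    """Group invoices by vendor, normalised invoice number and amount.

    Returns only the groups that contain more than one document.
    """
    groups = defaultdict(list)
    for inv in invoices:
        key = (
            inv["vendor_id"],
            normalise_invoice_no(inv["invoice_no"]),
            Decimal(inv["amount"]),  # Decimal avoids float rounding on money
        )
        groups[key].append(inv["doc_id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical sample rows; field names are assumptions, not an ERP schema.
invoices = [
    {"doc_id": "D1", "vendor_id": "V100", "invoice_no": "INV-0042", "amount": "1250.00"},
    {"doc_id": "D2", "vendor_id": "V100", "invoice_no": "inv 0042", "amount": "1250.0"},
    {"doc_id": "D3", "vendor_id": "V200", "invoice_no": "INV-0042", "amount": "1250.00"},
]

duplicates = find_duplicate_invoices(invoices)
for (vendor, invoice_no, amount), doc_ids in duplicates.items():
    print(vendor, invoice_no, amount, doc_ids)
```

Normalising the invoice number before grouping is the design choice that matters here: an exact-match rule would miss re-keyed invoices, while a looser rule would inflate the false-positive list.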

| Control Category | Example Controls | Logic Complexity |
| --- | --- | --- |
| Invoice Integrity | Duplicate invoice detection, three-way match | Structurally straightforward |
| Vendor Master | Completeness checks, dormant vendor reactivation | Structurally straightforward |
| Approval Sequencing | Behavioural pattern analysis on approver sequences | Moderate nuance required |
| Timing Anomalies | Requisition creation to goods receipt timing | Moderate nuance required |
| Vendor–Employee Relationships | Cross-master relationship checks | High complexity, layered logic |
| Threshold Circumvention | Split PO detection, approval limit breaches | High complexity, layered logic |
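To give a feel for the higher-complexity end of that range, here is a hedged sketch of split PO detection. It assumes a single approval threshold, a rolling window in days, and hypothetical field names (`po_no`, `vendor_id`, `requester`, `created`, `amount`). The rule it applies, same vendor and requester with each PO under the threshold but a combined value at or above it within the window, is one plausible definition and not necessarily the one used in the build.

```python
from datetime import date, timedelta


def find_split_pos(pos, threshold, window_days=7):
    """Flag runs of below-threshold POs for the same vendor and requester
    whose combined value within the window reaches the threshold."""
    window = timedelta(days=window_days)

    # Only POs that individually escaped the approval threshold are candidates.
    by_pair = {}
    for po in pos:
        if po["amount"] < threshold:
            by_pair.setdefault((po["vendor_id"], po["requester"]), []).append(po)

    flagged = []
    for pair, group in by_pair.items():
        group.sort(key=lambda po: po["created"])
        start = 0
        running = 0
        for end, po in enumerate(group):
            running += po["amount"]
            # Slide the window forward until it spans at most `window_days`.
            while po["created"] - group[start]["created"] > window:
                running -= group[start]["amount"]
                start += 1
            if end > start and running >= threshold:
                flagged.append((pair, [p["po_no"] for p in group[start:end + 1]]))
    return flagged


# Hypothetical data; amounts are whole currency units for simplicity.
pos = [
    {"po_no": "PO1", "vendor_id": "V1", "requester": "alice", "created": date(2024, 3, 1), "amount": 4800},
    {"po_no": "PO2", "vendor_id": "V1", "requester": "alice", "created": date(2024, 3, 3), "amount": 4900},
    {"po_no": "PO3", "vendor_id": "V1", "requester": "alice", "created": date(2024, 4, 20), "amount": 4700},
    {"po_no": "PO4", "vendor_id": "V2", "requester": "bob", "created": date(2024, 3, 1), "amount": 3000},
]

split_candidates = find_split_pos(pos, threshold=5000, window_days=7)
print(split_candidates)
```

Overlapping windows can report the same PO more than once, so a real implementation would likely merge adjacent hits before they reach an auditor. The window length and threshold are exactly the kind of parameters that need process knowledge to set.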

Each control fed into a consolidated dashboard. One view, covering all entities. Colour-coded by status: Working as designed, Showing strain, or Broken.

The shift was immediate. The CAE was no longer reviewing a report that arrived 6 to 12 weeks after the review period, but a live view of what’s functioning and what isn’t, updated with each data refresh.
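How a control ends up in one of the three dashboard statuses is not spelled out in this account. One simple possibility, sketched below, is to band each control by its exception rate at every refresh. The cut-off rates and the function name `control_status` are assumptions chosen purely for illustration.

```python
def control_status(
    exceptions: int,
    population: int,
    strain_rate: float = 0.001,  # assumed cut-off: 0.1% of records in exception
    broken_rate: float = 0.01,   # assumed cut-off: 1% of records in exception
) -> str:
    """Map an exception count to one of the three dashboard statuses."""
    if population <= 0:
        raise ValueError("population must be positive")
    rate = exceptions / population
    if rate >= broken_rate:
        return "Broken"
    if rate >= strain_rate:
        return "Showing strain"
    return "Working as designed"


# Hypothetical refresh results: (entity, control) -> (exceptions, records tested)
refresh = {
    ("Entity A", "Duplicate invoices"): (0, 1000),
    ("Entity A", "Split POs"): (5, 1000),
    ("Entity B", "Duplicate invoices"): (20, 1000),
}

dashboard = {key: control_status(*counts) for key, counts in refresh.items()}
for (entity, control), status in dashboard.items():
    print(f"{entity:10} {control:20} {status}")
```

In practice the bands would probably differ by control, since a handful of vendor and employee overlaps can matter more than hundreds of minor timing exceptions.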

The Impact: A Different Kind of Audit Relationship

What Changed in Practice

Before this infrastructure was brought to life, a P2P audit across the group ran for months and engaged multiple auditors. Findings were retrospective by design. By the time a duplicate payment pattern surfaced in a report, the exposure had already aged, and in some cases the vendor relationship had already moved on.

After this engine was made operational, the dynamic shifted in a way that’s hard to overstate.

In one instance, a sudden spike in non-PO procurement was detected right after finance closed the month.

When the audit team walked into the opening meeting, they already knew where to look and what questions to ask.

The conversation with management had shifted from “here is the overall scope of review” to “here's a change we’ve observed in your process, tell us more about it.” That’s just a fundamentally different kind of audit relationship.
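The account does not say how the spike in non-PO procurement was detected. One common approach, shown here as a sketch under that caveat, is to compare the latest month's non-PO share of spend against the mean and standard deviation of prior months. The function name, the z-score threshold of 3, and the sample figures are all assumptions.

```python
from statistics import mean, stdev


def is_spike(history, latest, z_threshold=3.0):
    """Return True if `latest` sits at least `z_threshold` standard
    deviations above the mean of `history` (prior monthly values)."""
    if len(history) < 2:
        raise ValueError("need at least two prior months to estimate spread")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat history: any increase at all counts as a change.
        return latest > mu
    return (latest - mu) / sigma >= z_threshold


# Hypothetical monthly non-PO share of total spend (fractions, not percentages).
prior_months = [0.08, 0.09, 0.07, 0.08, 0.09, 0.08]

print(is_spike(prior_months, 0.19))  # a jump to 19% of spend
print(is_spike(prior_months, 0.09))  # within the usual range
```

A check like this only says that something changed. Whether the change is a control failure or a legitimate shift in purchasing is the question that goes to management.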

The coverage shift was equally significant. Moving from sample to population meant we were no longer making probabilistic statements about control effectiveness. When 20 million+ records are in scope, a finding goes from being an “anomaly in a sample” to a “pattern across the full universe” of transactions.

The Reality: What the Presentations Leave Out

The Half That Doesn’t Get Talked About

Now, the part of this story that tends to get left out when people discuss audit automation.

Building the pipeline is the easier half of the work.

The harder work, the part that determined whether this was actually useful, was the audit craft.

Deciding which controls genuinely mattered. Understanding the business process and logic behind each control deeply enough to write exception rules that reflected operational reality. Knowing what a genuine timing anomaly looks like in this industry versus a quirk of how the AP team processes weekend approvals. Understanding why a vendor master exception in one entity might be systemic, while the same pattern in another is a data migration issue.

A data engineer alone could build the pipeline. But it took an audit expert to make it meaningful.

This is the distinction that gets collapsed in most conversations about “audit automation.” The automation surfaces the signals. Whether a signal is meaningful or just another entry in a list of false positives depends on judgment, built from years of understanding how these processes actually operate, how people behave inside them, and what risk looks like in context.

A Question Worth Sitting With

This is not just about P2P. If your audits currently cover millions of records across multiple entities and rely on manual sample-based testing, what’s your honest answer to this question:

What percentage of your total transaction population did you actually examine last cycle?

Most teams, if they’re honest, would say less than 2%.

And this is the structural reality of how audit has historically operated under resource constraints.

But in 2026, with the tools and data infrastructure available to most organisations, it’s a gap that’s increasingly difficult to justify to an Audit Committee that’s asking harder questions about coverage, risk, and assurance quality.

The floor is rising. The question is whether your methodology is rising with it.
