If you are commissioning a review of your algorithmic system, you ordinarily won’t be concerned with precisely how the review will be conducted.
But a high-level understanding is important.
It can help you determine how robust the review will be and what level of comfort the result will provide.
It also helps work out, upfront, what will be needed from your team for the review.
One aspect to understand is whether the review will include controls testing, substantive testing, or both.
Each of these offers distinct benefits and limitations.
Sometimes you need just one of them. Often you need, or want, both.
If you want to identify errors or unfairness in your algorithmic system, you need substantive testing, which gives a direct conclusion about accuracy or fairness. This article details a (largely) substantive testing method for accuracy reviews.
If you want to assess the effectiveness of your governance processes, you need controls testing. This is often needed to fulfil compliance obligations, meet contractual expectations, or demonstrate adherence to standards.
Controls testing only:
This distinction is important.
Note: controls testing can be extended to provide a broader conclusion, but this is atypical. It needs the right types of controls, and the right depth of testing for those controls.
Substantive testing only:
If you want both (a conclusion about integrity, and a conclusion about the controls in place to ensure integrity), you need a combined testing approach: a review that tests the controls and also tests the models/algorithms and their outputs.
The descriptions and examples here are high-level.
Substantive testing provides direct evidence by examining the algorithmic system’s outputs or results.
For example, in a credit decisioning algorithm review, substantive testing might reveal that 5% of declined applications for a specific demographic group were incorrectly assessed.
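To make this concrete, here is a minimal sketch of how a reviewer might summarise the result of such a test. It assumes a sample of declined applications has already been independently re-assessed. The field names (`group`, `system_decision`, `expected_decision`) and the sample records are hypothetical, made up purely for illustration; a real review would define its own data and re-assessment method.

```python
from collections import defaultdict

# Hypothetical, illustrative sample of declined applications.
# "expected_decision" is what an independent re-assessment says the
# outcome should have been; it is an assumption of this sketch.
declined_sample = [
    {"group": "A", "system_decision": "decline", "expected_decision": "decline"},
    {"group": "A", "system_decision": "decline", "expected_decision": "approve"},
    {"group": "A", "system_decision": "decline", "expected_decision": "decline"},
    {"group": "A", "system_decision": "decline", "expected_decision": "decline"},
    {"group": "B", "system_decision": "decline", "expected_decision": "decline"},
    {"group": "B", "system_decision": "decline", "expected_decision": "decline"},
    {"group": "B", "system_decision": "decline", "expected_decision": "decline"},
    {"group": "B", "system_decision": "decline", "expected_decision": "decline"},
]


def incorrect_rate_by_group(records):
    """Return, per group, the share of records where the system's
    decision differs from the independently re-assessed decision."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for record in records:
        totals[record["group"]] += 1
        if record["system_decision"] != record["expected_decision"]:
            errors[record["group"]] += 1
    return {group: errors[group] / totals[group] for group in totals}


rates = incorrect_rate_by_group(declined_sample)
for group, rate in sorted(rates.items()):
    print(f"Group {group}: {rate:.0%} of sampled declines were incorrect")
```

The point of the sketch is that the evidence comes from the outputs themselves: each sampled decision is compared with what it should have been, and the differences are counted.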
Controls testing evaluates the governance mechanisms surrounding the algorithmic system.
In the same credit decisioning context, controls testing might uncover that credit scoring models were inconsistently updated throughout the year, potentially leading to incorrect assessments.
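A controls test of this kind can be sketched in a similar way. The example below assumes a hypothetical control requiring that the scoring model is updated at least every 100 days, and a hypothetical log of update dates. The threshold, the dates and the function name `find_update_gaps` are illustrative assumptions, not taken from any particular standard or system.

```python
from datetime import date


def find_update_gaps(update_dates, max_gap_days):
    """Return (earlier, later, gap_in_days) for each pair of consecutive
    updates that are further apart than the control allows."""
    ordered = sorted(update_dates)
    exceptions = []
    for earlier, later in zip(ordered, ordered[1:]):
        gap = (later - earlier).days
        if gap > max_gap_days:
            exceptions.append((earlier, later, gap))
    return exceptions


# Hypothetical model update log for one year (illustrative dates only).
model_update_log = [date(2023, 1, 10), date(2023, 4, 5), date(2023, 11, 20)]

# Assumed control: no more than 100 days between model updates.
control_exceptions = find_update_gaps(model_update_log, max_gap_days=100)
for earlier, later, gap in control_exceptions:
    print(f"No update between {earlier} and {later} ({gap} days)")
```

Note what this does and does not show. It flags a period where the control did not operate as intended, but it says nothing about whether any individual credit decision in that period was actually wrong.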
In the substantive testing example, the result showed what actually happened. The controls testing example showed the potential for something to go wrong, and to go undetected when it does.
Substantive testing is a lookback. It does not consider controls, so it usually can’t be used to determine what might happen in future.
Controls testing can’t reliably reveal errors. But it can help determine how sustainably a process may operate in future (if controls continue to function in the same way).
To address these limitations, the two approaches can be combined.
Combining controls testing and substantive testing provides a more robust view of algorithm integrity. It covers both the operational framework (controls) and actual performance (outputs).
Consider an insurance company reviewing its claims processing algorithm:
By combining these approaches, the review offers assurance on both the effectiveness of governance mechanisms and the validity of individual claim decisions.
You don’t need to know the exact details – technical knowledge about reviews is not necessary.
However, understanding the different approaches can help you anticipate what the review will cover and ensure it aligns with your objectives.
Disclaimer: The information in this article does not constitute legal advice. It may not be relevant to your circumstances. It was written for specific algorithmic contexts within banks and insurance companies, may not apply to other contexts, and may not be relevant to other types of organisations.