13 Feb 2025

Algorithmic System Integrity: Testing

TL;DR
• Testing is a core step for algorithmic integrity.
• Testing involves various stages, from developer self-checks to UAT. Where these happen will depend on whether the system is built in-house or bought.
• Testing needs to cover several integrity aspects, including accuracy, fairness, security, privacy, and performance.
• AI systems need continuous testing, unlike traditional systems, because their behaviour can change without any change to code.

In reviewing algorithmic systems for integrity, we often consider how the systems are tested and what the quality assurance process is.

Testing and quality assurance go hand in hand, with testing generally considered a subset of overall QA.

Testing is a core component of change control. It helps ensure that algorithms function as intended.

Why Testing Matters

If you’re reading this, you probably already know why you need to test.

In brief, testing:

• helps identify errors before systems are put into production
• saves time and resources, as fixing issues in live systems is often costlier and more complex
• increases confidence in system flows and outputs
• is needed to meet regulatory obligations, which may be explicit (e.g., CPS 234 in Australia) or implicit (e.g., GDPR, HIPAA and many others).

Testing Stages

Before we explore the key testing stages, there are three important points to note:

• Some testing may be manual; some may be automated.
• Continuous testing may be necessary, especially for AI systems. This differs from traditional systems, where testing may only be needed for new apps or changes. AI models can change without any change to code: for example, shifts in input data can lead to model drift.
• If you’re buying software rather than building it, the vendor will perform the earlier stages, so check whether and how this is done. UAT, at a minimum, will fall to your team.
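As a rough illustration of the continuous-testing point above, the sketch below compares a recent sample of one model input against a baseline sample using the Population Stability Index (PSI), one common drift measure. The function name, the equal-width binning, and the 0.2 alert threshold are assumptions made for illustration; the threshold is a widely quoted rule of thumb, not a standard.

```python
import math
from typing import Sequence


def population_stability_index(
    baseline: Sequence[float], current: Sequence[float], bins: int = 10
) -> float:
    """PSI between two samples, using equal-width bins over the baseline range."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def proportions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    p = proportions(baseline)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Assumed rule-of-thumb threshold; tune it for your own context.
DRIFT_ALERT_THRESHOLD = 0.2


def has_drifted(baseline: Sequence[float], current: Sequence[float]) -> bool:
    """True when the PSI exceeds the assumed alert threshold."""
    return population_stability_index(baseline, current) > DRIFT_ALERT_THRESHOLD
```

A check like this could run on a schedule against each key input and the model's output scores, with alerts feeding into a retest of the stages described in this article.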

This is not a comprehensive list, but it covers most core stages:

Unit Testing / Developer Self-Check

Developers perform initial code/flow reviews and test individual components or functions of the algorithmic system. This helps catch basic errors early in the development process.

Peer Review

Colleagues review the code and test results, providing a fresh perspective and potentially finding issues the developer might have missed.

Technical Review

A more experienced team member evaluates the technical aspects, ensuring the system meets architectural standards and performance requirements. Other teams may also be involved to conduct non-functional testing (e.g., certain security-related testing, if not covered elsewhere).

Functional Testing

A non-technical team member ensures the system meets the intended functional requirements.

Business Testing - User Acceptance Testing (UAT)

End-users or business stakeholders test the algorithm in a real-world context to confirm that it meets their requirements and functions as expected. Ideally, this uses real data across a variety of scenarios.

Testing Across Integrity Areas

Effective testing should address multiple aspects of algorithm integrity.

Examples of core aspects include accuracy, fairness, security, privacy and performance.

Accuracy

Verify that the algorithmic system produces correct (complete, valid, accurate) outputs/results.

For example, test the system using known inputs and expected outputs - across various scenarios.
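A minimal sketch of this kind of test: a table of known inputs and expected outputs is run against the calculation, and any mismatches are reported. The `monthly_repayment` function is a hypothetical stand-in for the system under test (a standard amortised loan formula), and the cases and tolerance are illustrative assumptions.

```python
def monthly_repayment(principal: float, annual_rate: float, months: int) -> float:
    """Hypothetical system under test: amortised loan repayment, rounded to cents."""
    if annual_rate == 0:
        return round(principal / months, 2)
    r = annual_rate / 12  # monthly interest rate
    return round(principal * r / (1 - (1 + r) ** -months), 2)


# Each case: (inputs, expected output). Values are illustrative.
KNOWN_CASES = [
    ((1200.0, 0.0, 12), 100.00),   # zero-interest edge case
    ((1000.0, 0.12, 1), 1010.00),  # single payment: principal plus one month of interest
    ((1000.0, 0.12, 12), 88.85),   # typical scenario
]


def run_cases(fn, cases, tolerance: float = 0.005):
    """Return a list of (inputs, expected, actual) for every case that fails."""
    failures = []
    for inputs, expected in cases:
        actual = fn(*inputs)
        if abs(actual - expected) > tolerance:
            failures.append((inputs, expected, actual))
    return failures
```

An empty list from `run_cases(monthly_repayment, KNOWN_CASES)` means every scenario matched. In practice the case table would be agreed with the business and extended to cover boundary and invalid inputs.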

Fairness

Evaluate the system for bias against protected attributes, looking for unintended discriminatory outcomes.

For example, if the system now uses free text fields in addition to structured data, protected attributes may be inferred from that text even when they are not explicitly captured.
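One simple way to look for unintended discriminatory outcomes is to compare outcome rates across groups. The sketch below is an assumption-laden illustration: the record layout (a group label and a boolean approval) and the function names are hypothetical, and the ratio of lowest to highest approval rate is only one of many fairness measures.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def approval_rates(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Approval rate per group from (group, approved) records."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        approved[group] += int(outcome)
    return {group: approved[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates: Dict[str, float]) -> float:
    """Lowest group rate divided by highest group rate (1.0 means equal rates)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody approved in any group, so rates are equal
    return min(rates.values()) / highest
```

A ratio well below 1.0 does not prove bias on its own, but it flags where to investigate further. Whether a threshold such as the commonly cited "four-fifths" rule of thumb is appropriate depends on your jurisdiction and context.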

Security

Assess resilience against vulnerabilities, making sure that existing security controls are not inadvertently bypassed.

For example, check that security controls for authentication still work, especially if there are workarounds in place.

Privacy

Check that confidentiality is maintained, and sensitive data is protected.

For example, verify that the algorithm doesn't reveal sensitive data through its outputs.
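As a sketch of one such check, system outputs can be scanned for patterns that look like sensitive data before they reach a user. The two patterns below (email addresses and card-like digit runs) are illustrative assumptions only; a real check would need patterns suited to your data and would complement, not replace, other privacy controls.

```python
import re
from typing import List

# Illustrative patterns only; extend or replace for your own data types.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "card_like_number": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def find_sensitive(text: str) -> List[str]:
    """Return the sorted names of all patterns found in the given output text."""
    return sorted(
        name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)
    )
```

Running `find_sensitive` over a sample of generated outputs during testing gives a quick signal of leakage; any non-empty result warrants a closer look.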

Performance

Measure efficiency and scalability.

For example, simulate high-volume scenarios to ensure the algorithm works under peak loads.
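A very small sketch of a latency measurement is shown below. It calls a function once per input, sequentially, and summarises the timings. It is not a substitute for a proper load test with concurrent users; the function name and the choice of median, 95th percentile, and maximum as summary statistics are assumptions for illustration.

```python
import math
import statistics
import time
from typing import Callable, Dict, Iterable


def measure_latency(fn: Callable, inputs: Iterable) -> Dict[str, float]:
    """Time fn on each input (sequentially) and summarise the durations in seconds."""
    durations = []
    for item in inputs:
        start = time.perf_counter()
        fn(item)
        durations.append(time.perf_counter() - start)
    if not durations:
        raise ValueError("inputs must contain at least one item")
    durations.sort()
    # Nearest-rank 95th percentile.
    p95_index = max(0, math.ceil(0.95 * len(durations)) - 1)
    return {
        "calls": len(durations),
        "median_s": statistics.median(durations),
        "p95_s": durations[p95_index],
        "max_s": durations[-1],
    }
```

The summary can then be compared against agreed response-time targets, using input volumes that reflect expected peak load.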

A basic but important set of steps toward algorithm integrity

A testing strategy that covers these aspects can significantly enhance algorithm integrity, helping ensure that systems are not only technically sound but also fair, secure, and aligned with business objectives.

Disclaimer: The information in this article does not constitute legal advice. It may not be relevant to your circumstances. It was written for specific algorithmic contexts within banks and insurance companies, may not apply to other contexts, and may not be relevant to other types of organisations.
