Bias can have a direct monetary impact
If your algorithmic system contains bias, it could be distracting your team and costing you money.
Removing bias is the right thing to do and we know this. But it can also help financially.
To illustrate this, let’s consider insurance claims fraud.
A typical insurance fraud workflow
A typical workflow usually runs from data collection, through rules and/or models, scoring, triage, investigation, and decision, with feedback to keep improving the system.
In many cases, triage will narrow the number of claims based on capacity. There’s only so much that we can focus on, so we need to target the claims that have the highest potential for fraud.
This, in turn, is influenced by scoring. A mature program weights the results of each rule or model and adds them up.
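As a rough illustration of that weighting step, here is a minimal sketch. The rule names, weights, and claim fields are entirely hypothetical, and a real program would have many more rules and models:

```python
# Minimal sketch of weighted scoring: each rule/model returns a result,
# the program multiplies it by a weight and adds everything up.
# Rule names, weights, and claim fields below are hypothetical.

def score_claim(claim, rules, weights):
    """Combine weighted rule/model outputs into a single fraud score."""
    total = 0.0
    for name, rule in rules.items():
        total += weights.get(name, 1.0) * rule(claim)
    return total

# Two toy rules with different weights, purely for illustration
rules = {
    "late_night_incident": lambda c: 1 if c.get("hour", 12) < 5 else 0,
    "new_policy": lambda c: 1 if c.get("policy_age_days", 999) < 30 else 0,
}
weights = {"late_night_incident": 20, "new_policy": 15}

claim = {"hour": 2, "policy_age_days": 10}
print(score_claim(claim, rules, weights))  # 35.0
```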
If the rules or models are biased, we could be scoring claims incorrectly. That means some lower-risk claims will be selected for investigation, while genuinely higher-risk claims are left out.
A scenario
We only have capacity to investigate half of the claims we receive.
In practice, we'd be working with hundreds or thousands of claims.
For simplicity, imagine we have just four, so we need to select two for investigation.
This is what we have after we've scored and triaged:
| Claim | Score | Triage result |
|---|---|---|
| Claim1 | 60 | Investigate |
| Claim2 | 50 | Investigate |
| Claim3 | 45 | Set aside |
| Claim4 | 10 | Set aside |
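To make the triage step concrete, here is a minimal sketch using the scores from the table above. The "top N by score" selection is an assumed simplification of how capacity-based triage might work:

```python
# Sketch of capacity-based triage: rank claims by score and investigate
# only as many as capacity allows. Scores are taken from the table above;
# the "top N by score" selection rule is an assumed simplification.

claims = {"Claim1": 60, "Claim2": 50, "Claim3": 45, "Claim4": 10}
capacity = 2  # we can only investigate half of the four claims

ranked = sorted(claims, key=claims.get, reverse=True)
investigate = set(ranked[:capacity])

for claim_id in ranked:
    decision = "Investigate" if claim_id in investigate else "Set aside"
    print(f"{claim_id}: score {claims[claim_id]:>2} -> {decision}")
# Claim1 and Claim2 are investigated; Claim3 and Claim4 are set aside.
```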
A problem
But there is a problem with our rules. If we account for it, the picture changes:
| Claim | Score | Triage result | Problem | Optimal score | Optimal result |
|---|---|---|---|---|---|
| Claim1 | 60 | Investigate | | 60 | Investigate |
| Claim2 | 50 | Investigate | Driver license type: 10 points | 40 | Set aside |
| Claim3 | 45 | Set aside | | 45 | Investigate |
| Claim4 | 10 | Set aside | | 10 | Set aside |
Claim1 will be included either way.
But Claim2 had 10 points too many added to its score, because it met the criteria for a problematic rule.
In this example, a 10-point rule applied to a claimant without a full/open/regular license. We added this because we identified increased risk of fraud among inexperienced drivers.
There are two potential issues here:
- Data/code: The rule applies to a specific class of license; e.g., in Australia, "Provisional License" holders. But (see the sketch after this list):
  - the rule was coded as exclude "Open License" holders (which then includes anyone else);
  - at the time of coding, there were only two categories, Open and Provisional, so excluding Open was the same as including only Provisional;
  - somewhere along the line, a third category, "Foreign License", was added. The rule no longer works properly, due to an upstream data change: it now inadvertently includes those foreign license holders.
- An assumption: Provisional License holders can include experienced drivers who have immigrated to Australia, with no recognition of prior driving experience. Translating "Provisional" to "inexperienced" is not quite accurate. We're treating license type as a stand-in for experience, but that doesn't always hold true.
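Here is a minimal sketch of how the coded rule diverges from the intended one. The `license_type` field name and the category labels are illustrative only:

```python
# The rule was intended to flag inexperienced (Provisional licence) drivers,
# but was coded as "anything that is not an Open licence". That was
# equivalent while only Open and Provisional existed; once "Foreign" was
# added upstream, foreign licence holders were swept in too.
# The field name and category labels are illustrative only.

def coded_rule(claim):
    """Rule as actually coded: add points unless the licence is Open."""
    return 10 if claim["license_type"] != "Open" else 0

def intended_rule(claim):
    """Rule as intended: add points only for Provisional licences."""
    return 10 if claim["license_type"] == "Provisional" else 0

for lt in ["Open", "Provisional", "Foreign"]:
    claim = {"license_type": lt}
    print(f"{lt:12s} coded: +{coded_rule(claim)}  intended: +{intended_rule(claim)}")
# "Foreign" gets +10 under the coded rule and +0 under the intended rule.
# That drift is exactly the 10 biased points added to Claim2's score.
```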
Deliberate or inadvertent?
Either of these could be intentional. But “Foreign License” may act as a proxy for race. In Australia, the Racial Discrimination Act 1975 makes it unlawful to discriminate against a person based on their race, colour, country of origin, ethnic origin or immigration status. We should get our legal team to weigh in, if this is the case.
It is more likely that the problem stems from a mistake.
How these things go wrong
Such a mistake could happen in several ways, for example:
- The business requirement was not expressed properly in the functional specification
- The technical spec did not match the functional spec
- The code was not written in alignment with the technical spec
- The data on experience was not available; we found a workaround, but this failed when the data changed.
The impact
Regardless of the reason, Claim2 should have had 10 fewer points. So, we should be investigating Claim3 instead. It has a higher real likelihood of being fraudulent.
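Re-running the toy triage from earlier with Claim2's 10 biased points removed makes the swap visible (again, the "top N by score" selection is an assumed simplification):

```python
# Same triage sketch as before, but with Claim2's 10 biased points removed.
corrected = {"Claim1": 60, "Claim2": 40, "Claim3": 45, "Claim4": 10}
capacity = 2

ranked = sorted(corrected, key=corrected.get, reverse=True)
print(ranked[:capacity])  # ['Claim1', 'Claim3']: Claim3 replaces Claim2
```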
We have not optimised our investigation efforts and may end up settling Claim3, which could be undetected fraud: it wasn't selected for investigation because another claim was incorrectly included.
If it also turns out that we settle Claim2 because the investigation clears it, we've lost money.
Coded bias has directly affected our bottom line.
Disclaimer: The information in this article does not constitute legal advice. It may not be relevant to your circumstances. It was written for specific algorithmic contexts within banks and insurance companies, may not apply to other contexts, and may not be relevant to other types of organisations.
