
Should we use demographic survey data to guide fraud detection?

TL;DR
• Demographic survey data can help inform public education and intervention, but isn’t reliable for fraud detection.
• Online survey risks include bots, sampling gaps and unrepresentative data.
• Avoid using demographic survey data for fraud detection.

 

In short, I don’t think we should. You may not agree with me, so allow me to explain.

I’ve come across a few survey reports recently, from both fraud-focused organisations and insurance associations.

Survey data is valuable for understanding broad attitudes and trends, but it’s not a reliable basis for fraud detection. There are real limitations that can’t be overlooked.

Some of the reports make a link between demographic characteristics and fraud. We could, perhaps, use the results for education/awareness and intervention purposes. But we should avoid using them for fraud detection.

 

How Survey Data Falls Short

Surveys can be affected by fraud themselves, especially when collected online. Bots, repeated entries, and untruthful respondents are common issues.

Demographics are often self-reported. So, even if the data is “representative”, we can’t be certain that the respondents were truthful in answering the demographic questions.

We also can’t be sure that the data is truly representative of attitudes or behaviours.

Some respondent groups may be difficult to reach. For example, gaps in mobile or internet access mean some populations are left out. Others may simply choose not to complete a survey at all.
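One rough way to see a sampling gap is to compare each group's share of respondents with its share of the population the survey is meant to describe. The sketch below is illustrative only: the function name `share_gaps`, the age bands, and all figures are made up for the example and are not taken from any of the reports.

```python
from collections import Counter


def share_gaps(respondent_groups, population_shares):
    """Return (sample share - population share) for each group."""
    total = len(respondent_groups)
    if total == 0:
        raise ValueError("No respondents to compare")
    counts = Counter(respondent_groups)
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in population_shares.items()
    }


# Hypothetical, made-up figures for illustration only.
respondents = ["18-34"] * 60 + ["35-54"] * 30 + ["55+"] * 10
population = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

gaps = share_gaps(respondents, population)
for group, gap in gaps.items():
    # Positive means over-represented in the sample, negative means under-represented.
    print(f"{group}: {gap:+.2f}")
```

A check like this only shows who is over- or under-represented on the self-reported labels. It cannot tell us whether those labels, or the answers that go with them, are truthful.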

 

Even if Survey Data is “accurate”

Assume that the survey data is perfect: it captures a representative cross-section, was equally accessible to everyone, and was completed truthfully.

There are other issues:

  • Not causal: Being young, educated, or part of a particular group does not necessarily cause insurance fraud. These studies point to attitudes and self-reported intentions. They don’t predict future actions with certainty.
  • Risk of misuse: If insurers use demographics in profiling or automated screening, they risk unfair treatment, discrimination, and the reinforcement of bias. For example, flagging claims for investigation purely because a policyholder fits a certain demographic category is unethical, legally risky and ineffective.
  • Subconscious bias: Even being aware of demographic studies can shape subconscious thinking. That makes it important to put safeguards in place, so that stereotypes don’t slip into triage systems or investigations.
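As one possible safeguard, a triage pipeline could remove demographic fields from a claim record before any scoring step, and fail loudly if one slips through. This is a minimal sketch under stated assumptions: the field names, the `strip_demographics` and `assert_no_demographics` helpers, and the sample claim are all hypothetical, and a real list of excluded fields would need legal and compliance input.

```python
# Hypothetical field names; a real list would come from legal and compliance review.
DEMOGRAPHIC_FIELDS = frozenset({"age", "gender", "education_level", "ethnicity"})


def strip_demographics(claim):
    """Return a copy of the claim record without demographic fields."""
    return {key: value for key, value in claim.items() if key not in DEMOGRAPHIC_FIELDS}


def assert_no_demographics(feature_names):
    """Raise if any demographic field is about to reach the triage step."""
    leaked = DEMOGRAPHIC_FIELDS.intersection(feature_names)
    if leaked:
        raise ValueError(f"Demographic fields in triage input: {sorted(leaked)}")


# Made-up claim record for illustration only.
claim = {
    "claim_id": "C-001",
    "claim_amount": 1200.0,
    "age": 24,
    "education_level": "degree",
}

triage_input = strip_demographics(claim)
assert_no_demographics(triage_input.keys())  # passes: only non-demographic fields remain
print(triage_input)
```

Removing the fields is only a first step. Other fields can act as proxies for demographics, and human reviewers can still carry the stereotypes described above, so a check like this complements review and training rather than replacing them.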

 

The bottom line

Demographic survey data can be useful for those involved in education and intervention.

It is not appropriate for fraud detection.



Disclaimer: The information in this article does not constitute legal advice. It may not be relevant to your circumstances. It was written for specific algorithmic contexts within banks and insurance companies, may not apply to other contexts, and may not be relevant to other types of organisations.