This is the fifth article in a series about finding protected attributes in data.
We need to manage four key discrimination categories to ensure that our algorithms are fair. We previously explored Race, Sex/Gender, and Disability.
Another key category is Age. Unlike the other categories, age is a single attribute.
This article explores where age might appear in our data and what we can do about it.
Legislation in many jurisdictions makes it unlawful to discriminate against a person because of their age, unless an exception applies (such as age-based eligibility for certain products or services).
Age is as simple as it sounds: it refers to how old a person is. Algorithmic systems can discriminate by age even if they never see the age field directly, for example by using proxies such as years since graduation.
So, even though age is a simple concept, it can show up in data in many ways:
Age is often collected as date of birth on forms, applications, identity documents, or driver’s licenses. This is straightforward, and most banking and insurance systems will hold this data. For both existing and prospective customers, we often need it for KYC purposes.
Age can also be estimated, or age buckets inferred, from other data. For example, systems can calculate age by subtracting year of birth from the current year. Again, this is straightforward and fairly common, but it can introduce inconsistencies, for instance where the year of birth is missing, and how we handle those anomalies can affect an algorithm's decisions.
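A minimal sketch of such a derivation, with the missing-data handling made explicit (function name and the plausibility bounds are my own assumptions, not from any particular system):

```python
from datetime import date
from typing import Optional

def derive_age(year_of_birth: Optional[int],
               today: Optional[date] = None) -> Optional[int]:
    """Derive an approximate age by subtracting year of birth from the
    current year.

    Returns None when the year of birth is missing or implausible, so that
    downstream logic must handle the gap explicitly rather than silently
    treating missing data as, say, age 0.
    """
    today = today or date.today()
    if year_of_birth is None:
        return None  # missing data: propagate the gap, don't guess
    age = today.year - year_of_birth
    if age < 0 or age > 130:
        return None  # implausible value: flag rather than use (assumed bounds)
    return age

print(derive_age(1990, date(2024, 6, 1)))  # 34
print(derive_age(None, date(2024, 6, 1)))  # None
```

The design choice that matters here is returning an explicit `None` instead of a default number: a silent default becomes an anomaly that quietly shifts decisions for the affected records.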
Age is often visible, but sometimes it’s inferred or hidden.
In practice, we can inspect our algorithmic systems for each of these patterns: explicit age or date-of-birth fields, derived ages, and proxy features that correlate with age.
Disclaimer: The information in this article does not constitute legal advice. It may not be relevant to your circumstances. It was written for specific algorithmic contexts within banks and insurance companies, may not apply to other contexts, and may not be relevant to other types of organisations.