There has been a raft of data breaches over the past few months.
Some of those were due to poor controls and/or significant effort by hackers.
But some recent breaches have been rather inadvertent: controls were in place, no significant effort was made by malicious actors, and the risk of breach was not easy to identify at the outset.
Consider the travel card data privacy breach in Victoria, Australia, reported in this article in August 2019.
Data was released for use in a public competition, a "datathon", with the data de-identified. On its own, the released data did not constitute a breach. The risk that materialised was "re-identification": combined with other information sources, the data could be used to identify individuals.
With the myki data, several such combinations with other sources were possible.
There are other examples, within and outside the public sector, like this.
This article in Science Daily says "Re-identifying anonymised data is how journalists exposed Donald Trump's 1985-94 tax returns in May 2019."
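Attacks like these rely on linking "anonymous" records with auxiliary knowledge an attacker already holds. The following is a minimal sketch of such a linkage, with entirely invented records, column names, and values (it does not reflect the actual myki data schema):

```python
# Hypothetical illustration of a linkage attack: a "de-identified"
# trip dataset is joined with auxiliary knowledge an attacker holds.
# All tokens, stops, and times below are invented for illustration.

# De-identified trips: card IDs replaced with random tokens,
# but touch-on times and stops retained.
trips = [
    {"token": "a91", "stop": "Flinders St", "touch_on": "2018-07-02 08:14"},
    {"token": "b23", "stop": "Parliament", "touch_on": "2018-07-02 08:14"},
    {"token": "a91", "stop": "Richmond", "touch_on": "2018-07-03 17:50"},
]

# Auxiliary knowledge: the attacker saw a colleague board at
# Flinders St around 08:14 on 2 July, and at Richmond on the 3rd.
known_sightings = [
    ("Flinders St", "2018-07-02 08:14"),
    ("Richmond", "2018-07-03 17:50"),
]

# Find tokens consistent with *every* sighting; if exactly one token
# remains, that "anonymous" travel history has been re-identified.
candidates = None
for stop, time in known_sightings:
    matches = {t["token"] for t in trips
               if t["stop"] == stop and t["touch_on"] == time}
    candidates = matches if candidates is None else candidates & matches

print(candidates)  # a single token means the full history is exposed
```

Note that no field in the released data is directly identifying; it is the combination of seemingly innocuous attributes that narrows the candidates down to one.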
Between 1 July 2018 and 30 June 2019, according to data derived from OAIC (Office of the Australian Information Commissioner) reports, 21 "unauthorised disclosure (failure to redact)" breaches were reported.
With typical breaches, the risks are almost immediately apparent. These incidents were a bit different.
That's not an excuse though. The investigation into the myki incident suggested that more could have been done.
In an ideal world, we'd eliminate the risk. But that could mean that we don't share the data at all.
This article in The Guardian says "anonymising data is practically impossible for any complex dataset".
So let's consider how to minimise the risk instead. This is not straightforward, but it is not a new area either.
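One established way to reason about this risk before release is k-anonymity: every combination of quasi-identifiers (attributes that could plausibly be linked to outside data) should be shared by at least k records. Here is a minimal sketch of that check, using invented records and column names:

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return the smallest group size across all quasi-identifier
    combinations; the dataset is k-anonymous for any k up to this value."""
    groups = Counter(
        tuple(r[q] for q in quasi_identifiers) for r in records
    )
    return min(groups.values())

# Invented example: postcode and birth year act as quasi-identifiers.
records = [
    {"postcode": "3000", "birth_year": 1985, "fare": 4.40},
    {"postcode": "3000", "birth_year": 1985, "fare": 2.20},
    {"postcode": "3121", "birth_year": 1990, "fare": 4.40},
]

k = k_anonymity(records, ["postcode", "birth_year"])
print(k)  # 1: the 3121/1990 record is unique, hence re-identifiable
```

A low k is a warning sign, not the whole story; as the frameworks below stress, judgement about context and other available data sources is still required.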
Here are three examples of guidelines and frameworks that can help - newest first:
1. OVIC wrote about the investigation in this blog article (amongst other resources that they provided).
They provide five lessons learnt in that article. All are relevant, and the article is easy to read.
I found the fourth item to be of particular interest: "PIAs, if done incorrectly, can create a false sense of security". We explore this further in this blog article.
2. The De-Identification Decision-Making Framework was developed through a collaboration between various Australian government agencies and departments.
It is a fairly extensive document, focused on practical, operational advice.
Interestingly, there is an acknowledgement that "De-identification is not an exact science and, even using the De-Identification Decision-Making Framework (DDF) at this level, you will not be able to avoid the need for complex judgement calls."
3. Microsoft's A Guide to Data Governance for Privacy, Confidentiality, and Compliance.
Almost a decade old, but still in use, and for good reason.
There are five parts, each self-contained.
Importantly, one of the introductory suggestions is that organisations should consider:
"Augmenting approaches that focus on mere compliance 'with the letter of the law' by implementing and enforcing" ... "measures that go beyond mere compliance with the letter of regulations and standards."
There are many others, but a combination of these would provide a good starting point.
The clear message is that we should consider the specifics of the situation and the associated risks.
How are you protecting your customers and citizens?