A is for aggregation; the ethical sensibilities of deanonymising personal data

Can AI be ethical?

In this series we are discussing the pros and cons of AI, and the ethical issues that arise out of its use.

Today’s subject is: A is for aggregation; the ethical sensibilities of deanonymising personal data.

Let’s say I sell life insurance and, being cautious, I decide not to rely simply on the questionnaire my potential client has completed regarding their health and lifestyle to make the final decision on whether I will insure them. Let’s imagine for a moment what is feasible (not necessarily ethical) here.

So what else could I do? If I were to look at the life my new client actually leads, rather than the life they've told me about, I'd look at every form of information I could find relating to them. An obvious start would be social media, or a search for online references to them, but what else? What if I could take anonymised data from an organisation - for example a health provider - supposedly released for statistical analysis, and deanonymise it, thereby revealing new information about them? This is certainly possible: anonymised data can be combined with readily available data on the internet and the information aggregated to find out more about a person. Now I have more information to make a better-informed decision; it eliminates some risk on my part and produces a more realistic, perhaps more appropriate, premium for the customer. However, there are several obvious questions here. First: should I be doing this? Second: what does this activity cost? And third: what value does the data really have? These are all valid questions, and they move us into the ethical aspect of the discussion.
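The aggregation step described above is, at its core, just a join on so-called quasi-identifiers: attributes like postcode, birth year and sex that survive anonymisation. A minimal sketch in Python - using entirely invented names, postcodes and conditions - shows how little machinery the attack takes:

```python
# Illustrative linkage attack: re-identifying "anonymised" records by
# joining them to public data on shared quasi-identifiers.
# All records below are invented for illustration only.

# "Anonymised" health data: names removed, quasi-identifiers retained.
anonymised_health = [
    {"postcode": "AB1 2CD", "birth_year": 1975, "sex": "F", "condition": "diabetes"},
    {"postcode": "XY9 8ZW", "birth_year": 1982, "sex": "M", "condition": "asthma"},
]

# Publicly available data (e.g. scraped social-media profiles) that pairs
# names with the same quasi-identifiers.
public_profiles = [
    {"name": "Alice Example", "postcode": "AB1 2CD", "birth_year": 1975, "sex": "F"},
    {"name": "Bob Example", "postcode": "XY9 8ZW", "birth_year": 1982, "sex": "M"},
]

def link_records(anonymised, public, keys=("postcode", "birth_year", "sex")):
    """Join the two datasets on the quasi-identifier columns.

    A unique match pins an 'anonymous' record to a named person."""
    index = {}
    for profile in public:
        index.setdefault(tuple(profile[k] for k in keys), []).append(profile)
    reidentified = []
    for record in anonymised:
        matches = index.get(tuple(record[k] for k in keys), [])
        if len(matches) == 1:  # exactly one candidate: re-identification succeeds
            reidentified.append({"name": matches[0]["name"], **record})
    return reidentified

linked = link_records(anonymised_health, public_profiles)
# Each "anonymous" health condition is now attached to a name.
```

Real attacks are noisier than this toy join, but the principle is the same, and it is why combinations such as postcode, date of birth and sex are treated as identifying in anonymisation guidance rather than as harmless statistics.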

'Should I be doing this?' is a problematic question, and it needs to be asked every time. From my perspective as the insurer, it makes perfect sense: I am providing a service, and to provide the best service I need as much information as possible. However, personal data is protected, and it has been anonymised for a reason, so at the very least I am not acting ethically. In fact, I am most probably acting illegally.

Perhaps more important to me, as a profit-making entity, is the cost of the activity. After all, if performing the action costs me a lot, I might spend more than I make. But if I use AI to lend a helping hand, for example by training an AI tool to deanonymise data from online sources, it could become an almost cost-free activity.

So, whilst it may be ethically dubious for me to deanonymise data, it might bring me, as the insurer, significant benefits. Additionally, after a reasonable initial outlay, I can complete the action cost-effectively.

And so we come to the last question: what value does the data really have? Referring to the concept of 'unintended consequences' from my previous blog, the deanonymisation approach might produce outcomes that are unreliable and of poor quality, meaning at the very least that my decision regarding the customer's life assurance might be wrong. In fact, any decision I make could even be unsafe: it might rest on incorrect assumptions, and it could affect future applications for life assurance and potentially other areas of the person's life. On top of all this, the decision will be non-transparent, unexplainable and unjustifiable, none of which is ethically acceptable.

Would everyone see it this way? Most probably not, and this is a significant, as yet underestimated, problem. As more and more data becomes available to us, the likelihood of organisations - both legitimate and criminal - using these methods to drive decisions becomes greater.

Does this mean 'big data' and AI have no place in the decision-making process? Most certainly not, but it is essential that we abide by existing legal frameworks when using data, and that we can both justify the use of that data and explain the outcomes it generates. The mantra, as always, should be: just because I can, it doesn't mean I should.

Next time: D is for denial