Speaking about the use of artificial intelligence (AI) in Digital Health during London Tech Week, Babylon Health Founder and CEO Ali Parsa said, “There (are) real risks with this technology and it’s critical that we’re very deliberate. We don’t want to extend the current inequalities and biases we see”.
The problem with machine learning (ML) algorithms, and the AI decisions built on top of them, is that the desire to scale quickly and realise long-term benefits in the short term can result in under-tested systems hitting the market.
We have seen first-hand just how quickly machine learning can reshape a business’s understanding of large data sets, but we are also cognisant of the risks involved in moving a minimum viable product into production without robust testing.
Let’s take a look at where the risks are and how digital health organisations can build strong ML and AI foundations.
Anyone tasked with gleaning insights from data will tell you why data quality is important: good-quality data in means useful insights out.
Making sure that the data is fit for its intended use and accurately represents the real-world scenario it describes is key. Training models on questionable or identifiably biased data can dramatically increase the prevalence of false positives, and can mean that your most fundamental building block is compromised.
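As an illustration, a lightweight audit before training can surface some of these issues early. The sketch below assumes a tabular dataset loaded with pandas; the ‘diagnosis’ label and demographic column names are hypothetical placeholders, not a prescription for any particular dataset.

```python
# A minimal pre-training data audit, assuming a tabular dataset with a
# hypothetical 'diagnosis' label and hypothetical demographic columns.
import pandas as pd

def audit_training_data(df: pd.DataFrame, label_col: str, demographic_cols: list) -> None:
    # Missing values undermine the "good-quality data in" principle.
    missing = df.isna().mean().sort_values(ascending=False)
    print("Fraction missing per column:\n", missing[missing > 0])

    # A heavily skewed label distribution inflates false positives for the minority class.
    print("\nLabel balance:\n", df[label_col].value_counts(normalize=True))

    # Compare label rates across demographic groups to surface identifiable bias early.
    for col in demographic_cols:
        print(f"\nLabel rate by {col}:\n",
              df.groupby(col)[label_col].value_counts(normalize=True))

# Example usage with hypothetical file and column names:
# df = pd.read_csv("screening_records.csv")
# audit_training_data(df, label_col="diagnosis", demographic_cols=["age_band", "sex"])
```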
Ensuring that the people tasked with delivering your application or system are open to working within a growth framework leads to better outcomes.
Establish a set of metrics that reflects the accuracy of your model, then allow each output to be scored both quantitatively and qualitatively. This means you can scale solutions that have been marked through a clear and auditable process, and fail work with poor classification accuracy or a high false positive rate.
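One way to make that gate concrete is to compute the agreed metrics for every candidate model and record whether it clears the thresholds. The sketch below assumes a binary classification task and uses scikit-learn; the pass marks are hypothetical placeholders for whatever your own review process agrees.

```python
# A minimal quantitative scoring gate, assuming binary labels and
# hypothetical accuracy / false-positive-rate thresholds.
from sklearn.metrics import accuracy_score, confusion_matrix

MIN_ACCURACY = 0.90              # hypothetical pass mark
MAX_FALSE_POSITIVE_RATE = 0.05   # hypothetical tolerance

def score_model(y_true, y_pred) -> dict:
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    accuracy = accuracy_score(y_true, y_pred)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    passed = accuracy >= MIN_ACCURACY and fpr <= MAX_FALSE_POSITIVE_RATE
    # The returned record can be stored alongside qualitative reviewer notes,
    # keeping the scoring process clear and auditable.
    return {"accuracy": accuracy, "false_positive_rate": fpr, "passed": passed}

# Example: score_model([0, 1, 1, 0], [0, 1, 0, 0])
```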
In the scientific tradition, peer review would mostly happen within a small group or panel of experts in a given field. The size, location and interests of the peer review group could see underlying biases persist and disincentivise rigorous scrutiny.
In an age of ethical hacking, software development teams know they need independent third parties to think outside the box and identify inherent issues in applications and systems. Finding a distributed, multi-disciplinary peer review group to analyse your machine learning algorithms, and the automated task completion delivered by your AI application, is simply best practice.
You don’t have to look far in the media to find examples of where imperfect data sets, flawed decisioning or implicit bias have impacted ML and AI applications, from YouTube switching from ‘IF THIS THEN THAT’ recommendations to a watch-time algorithm, through to Twitter’s facial recognition neural network. Evaluating these social and societal impacts against your own models can help ensure that any potential negative implications are addressed before production.
At Waracle, we help visionary organisations develop digital solutions that map to better outcomes for scientists, healthcare professionals and patients. We care about our work and want to collaborate to create innovative, boundary-pushing solutions. If you have a project that you’d like to speak to us about, our team are waiting to hear from you.