Johannes Machinya, Lecturer, Department of Sociology, University of the Witwatersrand
The South African healthcare landscape is undergoing a transformative shift, driven by rapid technological advancement and evidenced by the rise of the “digital nurse”, a term reflecting the growing integration of digital technologies into the nursing profession. The digital nurse is a nurse who develops expertise in digital technologies and incorporates them into practice to enhance patient care. The use of artificial intelligence (AI) in nursing, and by nurses, is said to be revolutionising healthcare by improving patient outcomes, enhancing clinical decision support, and optimising workflows. For instance, AI-powered predictive analytics enable nurses to anticipate patient deterioration before it occurs. The integration of these technologies into the country’s healthcare systems promises to change the way care is delivered, enhancing efficiency, accessibility, and patient outcomes.
However, this digital transformation is not without its pitfalls. In South Africa, where social and economic inequalities are deeply entrenched, often along racial lines, the deployment of these AI-driven technologies may risk perpetuating or even exacerbating existing disparities. Globally, these concerns are substantiated by empirical research demonstrating that AI systems and their underlying algorithms can perpetuate existing disparities when trained on biased data that reflects societal inequalities, thereby inadvertently reinforcing those biases and disproportionately disadvantaging marginalised communities.
Data bias in AI systems refers to the systematic errors or skewed representations within datasets that lead to inaccurate or unfair outcomes when used in algorithmic decision-making. In South Africa, given the racial underpinnings of existing inequalities, this suggests that while the country celebrated its triumph over its racist past in 1994, the rise of AI-driven technologies, which rely heavily on historical data, could lead to the emergence of a new form of discrimination in this digital age – techno-racism – perpetuated by non-human intelligent machines rather than human agents. This phenomenon encompasses the systemic racial discrimination experienced by Black individuals due to biases encoded in various digital technical systems, including AI systems that people interact with in their daily lives.
AI systems, for optimal functionality, depend heavily on access to vast and diverse data sets, which are essential for effective machine learning and algorithm development. However, if the data used in this process is biased, these biases are inevitably transferred to the resulting AI models, leading to skewed outcomes that can perpetuate existing disparities, mirroring historical injustices – a phenomenon Emsie Erastus refers to as “algorithmic apartheid.” This term highlights the systematic exclusion or misrepresentation of marginalised populations, particularly Black Africans, in the development of AI algorithms, leading to technologies that fail to serve these populations fairly. Supporting this observation, Lucila Ohno-Machado noted, “Many health care algorithms are data-driven, but if the data aren’t representative of the full population, it can create biases against those who are less represented.” This underscores the critical need to ensure that the data sets used to train AI systems are representative and unbiased, as any flaws in the data can directly impact the accuracy and fairness of the AI’s decision-making processes, particularly in automated decision-making scenarios.
Concerns about techno-racism or algorithmic apartheid, particularly in medical devices, are increasingly substantiated by evidence from multiracial contexts like the UK, suggesting that similar issues could arise in South Africa. AI-enhanced devices such as pulse oximeters have been shown to work “less well in [individuals] with darker skin,” making it more difficult to detect dangerous drops in oxygen levels in COVID-19 patients. This discrepancy arises because the algorithms are often trained on datasets predominantly drawn from populations with European ancestry and lighter skin tones, resulting in a lack of robustness when applied to more diverse groups. Consequently, the accuracy of these devices in detecting critical health metrics, such as oxygen saturation, can be significantly compromised for individuals with darker skin tones. This exacerbates existing health disparities and raises urgent ethical concerns about the fairness and inclusivity of AI in healthcare.
In South Africa, biased algorithms in healthcare present serious concerns about equity, challenging the principles of democracy and equality. This is particularly critical given the country’s long and brutal history of systemic inequalities and exclusion, which disproportionately impacted Black Africans.
When the data used to train AI systems is biased and fails to accurately represent the intended use cases, the AI may produce skewed results that lead to discrimination, misdiagnoses or less effective treatments for marginalised, predominantly Black communities.
This occurs because the data underrepresents these groups, reinforcing broader societal disparities, including health disparities, and limiting their access to quality care. For instance, if an AI-driven diagnostic tool is trained primarily on racially skewed data from the country’s past, it may fail to accurately identify diseases that manifest differently in Black African populations, leading to poorer health outcomes and perpetuating existing inequalities.
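To make this mechanism concrete, the following minimal sketch (written in Python with synthetic data, a hypothetical “biomarker” feature and a deliberately simplified model, not any real clinical dataset or deployed system) illustrates how a diagnostic model trained on data dominated by one population group can systematically misclassify patients from an under-represented group whose condition presents differently.

```python
# Illustrative sketch only: synthetic data and hypothetical variable names.
# It shows how under-representation in training data skews model performance;
# it is not a real diagnostic system.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_group(n, threshold):
    """Simulate a biomarker whose disease threshold differs between groups."""
    biomarker = rng.normal(loc=5.0, scale=2.0, size=n)
    disease = (biomarker > threshold).astype(int)
    return biomarker.reshape(-1, 1), disease

# Training data: 90% from group A, only 10% from group B,
# and the disease manifests at a different biomarker level in group B.
X_a, y_a = make_group(9000, threshold=6.0)   # well represented
X_b, y_b = make_group(1000, threshold=4.0)   # under-represented
X_train = np.vstack([X_a, X_b])
y_train = np.concatenate([y_a, y_b])

model = LogisticRegression().fit(X_train, y_train)

# Evaluate on fresh samples from each group separately.
X_a_test, y_a_test = make_group(2000, threshold=6.0)
X_b_test, y_b_test = make_group(2000, threshold=4.0)
print("Accuracy, well-represented group:",
      accuracy_score(y_a_test, model.predict(X_a_test)))
print("Accuracy, under-represented group:",
      accuracy_score(y_b_test, model.predict(X_b_test)))
# The model learns the majority group's disease threshold, so it
# systematically misses cases in the under-represented group.
```

The point of the sketch is simply that the model’s errors are not random: they fall overwhelmingly on the group that contributed least to the training data, which is precisely the pattern of missed diagnoses described above.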
During an online workshop hosted by the University of KwaZulu-Natal (UKZN) School of Law on AI in healthcare in South Africa in September 2021, data and algorithmic bias was identified as one of the five key issues in the deployment of AI in healthcare. The workshop emphasised the importance of using representative training data in machine learning projects to reflect real-world diversity. It warned that biased data sets can lead to skewed outcomes, resulting in injustice, discrimination, false diagnoses and even ineffective treatments, ultimately jeopardising patient safety. Furthermore, in Africa, and South Africa in particular, data bias resulting from the limited availability of high-quality electronic data – often due to non-uniform or incomplete data sets – could undermine the effectiveness of data-driven technologies and exacerbate existing biases. However, the UKZN workshop noted that, while using representative data is essential to reducing bias, even high-quality, accurate data cannot fully eliminate discrimination or bias if existing structural inequalities are encoded into the algorithms, leading to algorithmic bias. The issue of non-representative data is further compounded in South Africa because AI technologies are often developed outside the country, using data that may not be representative of the South African population.
Apartheid legacy and its influence on algorithmic bias
Historical apartheid in South Africa entrenched a system where Black people were systematically constructed as dehumanised objects with their very existence often reduced, as Tendayi Sithole argues, to a state of nothingness within the socio-political order.
Concerns about data and algorithmic inclusivity are especially urgent in the digital age, as populations of African ancestry, both globally and in Africa, continue to be negatively impacted by ongoing prejudice. This is reflected in the biased data used to train AI systems and algorithmic models, which are subsequently adopted in Africa.
The historical reduction of Black people into a “state of nothingness” during apartheid – a dehumanising process that systematically devalued their lives and experiences – has had far-reaching consequences that persist even today. In South Africa’s healthcare system, the legacy of apartheid persists, particularly in the adoption of medical devices and diagnostic tools designed and trained on data sets that inadequately represent Black populations. This can result in instances of medical racism, leading to serious harms such as misdiagnoses or inappropriate treatment.
Similarly, this legacy may underlie the algorithms allegedly used to racially profile Black African medical doctors, flagging them as suspects of fraud, waste, and abuse within medical aid schemes. Allegations of algorithmic biases have surfaced in automated decision-making systems employed by these schemes to detect such violations, with Black African medical practitioners being disproportionately flagged. This raises concerns about racial discrimination, as the algorithms may have been based on datasets that do not accurately represent the diversity of medical professionals. Consequently, Black African doctors may be subjected to heightened scrutiny, not due to any inherent risk, but because of the biased logic encoded into these algorithmic systems. An interim report by an independent panel highlighted these issues, raising concerns about the potential for racial discrimination against Black medical service providers.
These manifestations of algorithmic bias demonstrate that AI technology is not neutral or value-free; rather, it is shaped by the racial, ethnic, gender and class prejudices embedded in the data it learns from. As a result, such biases are transferred to the models trained on that data, exacerbating existing social inequalities within society.
In the context of medical aid schemes, this bias can lead to severe consequences for Black medical doctors, including unjust investigations, reputational damage, and financial harm, all of which stem from the very stereotypes and suspicions rooted in apartheid’s racial hierarchies.
This bias in data and algorithm design perpetuates healthcare disparities and discriminatory practices, as these tools are less effective for, and discriminate against, those who were historically marginalised, further entrenching inequality.
Way forward
These issues underscore the critical need for more inclusive data that reflects the diversity of real-world populations. It is essential to integrate diverse populations into the design and validation of AI-driven technologies to ensure they are effective and equitable for all. This also calls for deeper engagement with the ethical implications of technology in healthcare and a commitment to addressing the structural inequalities that continue to affect Black people, including through health technologies. Addressing these challenges requires a concerted effort to scrutinise and reform the design and implementation of these technologies, which includes diversifying the data used in training, incorporating checks for bias, and ensuring transparency and accountability in their use. Additionally, it is vital to recognise the historical context in which these technologies operate and actively work to dismantle the lingering legacies of apartheid in all areas of society, including the digital and technological realms.
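As one hedged illustration of what “incorporating checks for bias” might look like in practice, the short sketch below compares missed-diagnosis rates across population groups before a model is deployed and flags it when the gap exceeds a chosen tolerance. The group labels, the tolerance value and the function names are hypothetical; this is one possible audit, not a prescribed standard.

```python
# Minimal sketch of a pre-deployment bias check: compare false-negative
# (missed-diagnosis) rates across population groups. Hypothetical names
# and threshold; not a prescribed or regulatory standard.
import numpy as np

def false_negative_rate(y_true, y_pred):
    """Share of truly positive cases the model missed."""
    positives = y_true == 1
    return float(np.mean(y_pred[positives] == 0)) if positives.any() else 0.0

def audit_by_group(y_true, y_pred, groups, max_gap=0.05):
    """Flag the model if missed-diagnosis rates diverge too much between groups."""
    rates = {g: false_negative_rate(y_true[groups == g], y_pred[groups == g])
             for g in np.unique(groups)}
    gap = max(rates.values()) - min(rates.values())
    return rates, gap, gap <= max_gap

# Hypothetical usage with predictions from any diagnostic model:
# rates, gap, passed = audit_by_group(y_true, y_pred, groups)
# if not passed:
#     the model would be sent back for retraining on more representative
#     data, or recalibrated per group, before deployment.
```

A check of this kind does not remove bias from the underlying data, but it makes disparities visible before a system reaches patients, which is a precondition for the transparency and accountability called for above.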
The UKZN workshop proposed addressing the problem of data and algorithmic bias by establishing an institution similar to the UK’s Centre for Data Ethics and Innovation, now known as the Responsible Technology Adoption Unit (RTA), which deals with ethical issues related to AI, including the quality of input data for AI processes. The RTA’s bias review programme investigates algorithmic bias across various sectors through literature reviews, technical research, and public engagement workshops. The goal of this programme is to produce recommendations for the government on identifying and minimising potential harms associated with AI. In line with this UK model, the South African Presidential Commission on the Fourth Industrial Revolution (4IR Commission) has recommended the establishment of an AI Institute as part of the country’s technological development plans.
Part of the SLSA Blog Series, Exploring the Intersections of Technology, Health, and Law, guest edited by Prof. Sharifah Sekalala and Yureshya Perera. Written as part of the project There is No App for This! Regulating the Migration of Health Data in Africa, funded by the Wellcome Trust (grant number: 224856/Z/21/Z).