1 Introduction

The increasing scope of the field of AI ethics—and corresponding analyses of algorithmic fairness—reflect the ubiquitous nature of AI deployment across many spheres of daily life. A major consequence of this is an increasing need to operationalise standards of algorithmic fairness widely, in order to minimise the risks involved with the use of technology and ensure an equitable distribution of its benefits. To date, much attention has focused on issues of fairness and discrimination associated with the legally protected categories of race and gender. However, research has shown that these frameworks tend to be ill-suited for unobserved or less frequently recorded characteristics such as sexual orientation and gender identity (Tomasev et al. 2021). Building upon this insight, the present paper focuses on a prominent and widespread form of discrimination which has received comparatively little attention in ML fairness research—xenophobia, which is discrimination directed at the other, or at those who are perceived to be foreign.

Xenophobia shapes the political landscape of the modern world in a number of ways (Bowman 2021; Arredondo 2018; Kopyciok and Silver 2021; Yakushko 2018; Akinola and Klimowich 2018; Chenzi 2021)—as can be seen in the recent growth of authoritarian populist movements and anti-immigration political platforms (Jörke 2019; Chatterjee 2021). Moreover, the COVID-19 pandemic led to a sharp rise in xenophobia (Lee and Li 2020), and in particular to a rise in anti-Asian sentiment in the United States (Nguyen et al. 2020). There is, consequently, a moral imperative to develop a better understanding of the role AI technology plays in amplifying or mitigating xenophobia in society. Indeed, as David Haekwon Kim and Ronald R. Sundstrom observe, by ignoring xenophobic discrimination—as distinct from racial, sexual or gender discrimination—we risk neglecting specific patterns of disadvantage and harm associated with conceptions of “foreignness”. This omission creates “a normative loophole”, one that holds that “when thinking about social, national justice we need not care about foreign distant others.” (Kim and Sundstrom 2014)

From a technical standpoint, existing bias-mitigation strategies in the domain of ML fairness tend to focus on ensuring equitable outcomes for legally protected groups. However, efforts to combat xenophobia in AI need to go further, as many groups that are deemed to be foreign lack such legal protection. An additional challenge lies in the need to understand the decisions people make when distinguishing between us and them, between those who are familiar and those who are not. Ultimately, political institutions often help define and impose such categorizations, with the history of the US census providing an example of how changes in data collection may reflect and entrench changing notions of identity over time.

In the context of AI research, the detection of “hate speech” or “dangerous speech” has served as a focal point for efforts to address xenophobia, fuelling the development of content moderation systems to tackle the spread of hateful and dangerous speech on social media (Fortuna and Nunes 2018; Cao et al. 2020; Awal et al. 2021; Röttger et al. 2020). Dangerous speech is an element of a path of escalation that may lead to conflict (Bowman 2021) and even mass atrocities (Benesch and Leader 2015; Fink 2018), while early detection of dangerous speech can provide opportunities for timely deescalation. However, accurate hate speech detection is not without challenges, as early systems have been shown to incorporate racial bias (Xia et al. 2020) and to over-report queer speech as inappropriate (Gomes et al. 2019). We believe that the existing focus on social media platforms needs to be complemented by a deeper analysis of xenophobia, spanning the full range of AI use cases. Such analysis needs to account for, and be rooted in, recognition of the structural barriers facing foreign nationals and immigrants. These barriers are further exacerbated by wider issues of cultural assimilation, abandonment and erasure, and by a lack of recognition of stateless persons (Mansouri 2023).

To the best of our knowledge, this paper is the first comprehensive review of the interplay between xenophobia and AI. In total, we look at its impact across five domains: social media, healthcare, immigration, employment, and large pre-trained models. We proceed as follows. First, in Sect. 2, we define xenophobia, distinguishing it from other forms of discriminatory attitudes and practices such as racism and sexism, and specifying its conceptual relevance to AI systems. In Sect. 3, we then discuss how xenophobia may manifest across a number of AI application domains. In Sect. 4, we make a moral argument for the design of inclusive, xenophilic systems, coupled with a set of technical considerations and recommendations informed by the use cases reviewed in Sect. 3. We conclude by outlining key issues for future research.

2 On xenophobia

Xenophobia is commonly understood as a kind of hostility or prejudice directed towards foreigners, immigrants or, more broadly, those construed as “others”. It can manifest as fear, dislike or hatred of people who are perceived to be different, including their culture and customs, and has been associated with a range of social and psychological roots (Banton 1996; Sanchez-Mazas and Licata 2015; Wimmer 1997; Yakushko 2009), including misassociations, stereotyping and cognitive bias (Rydgren 2004), projections onto the other, as well as active processes of othering groups of people (Wimmer 1997). Moving beyond the cognitivist conception of xenophobia—which focuses primarily upon mental states and attitudes—it is important to recognize that xenophobia also has institutional and structural expressions. In these cases, discriminatory practices or outcomes are sustained by rules, procedures, and prior distributions of material and symbolic resources that target specific groups of people. Xenophobic discrimination may involve both phenomena, typically combining attitudinal and institutional effects (Kim and Sundstrom 2014).

This conceptual core, however, leaves two important questions unanswered: the first concerns the distinction between xenophobia and other forms of intergroup prejudice, notably racism (the Distinctiveness Problem); and the second centers upon defining the threshold of what counts as xenophobia, given the ubiquitous differences in how rights and responsibilities are allocated to citizens and non-citizens in contemporary nation-states (the Threshold Problem).

Much scholarship on discrimination and disadvantage ignores the first problem, by adopting, as Kim and Sundstrom observe, “the general assumption, alive in popular culture and some segments of the academy, that racism has subsumed nativism and xenophobia” (Kim and Sundstrom 2014). This, they point out, is problematic, since it “merges racism, xenophobia, and nativism into one hyper-concept of prejudice and exclusion” and ignores “important distinctions to be made here between prejudice against racial outsiders, civic outsiders, and the pursuit of chauvinistic ethics and racial group-interests based on claims of indigenousness.” (Kim and Sundstrom 2014) While the history of xenophobia cannot be decoupled from the artifacts of the colonial past and racism, xenophobia is conceptually distinct from racism, and the strategies needed to mitigate racism and xenophobia may differ (Bernasconi 2014).

We adopt a modified version of Kim and Sundstrom’s solution to this problem: namely, orientating our understanding of xenophobia around the notion of “civic ostracism”. Specifically, we distinguish xenophobia from other forms of prejudice and discrimination by defining it as a discriminatory orientation (whether of individuals or institutions) that penalises individuals on the basis of their perceived foreignness, understood here as the purported lack of full membership of the civic community. Such non-membership may correlate with actual legal status, but it need not do so: individuals who are in fact citizens but who are prejudicially deemed foreign (perhaps because they were originally born outside the civic community, or simply because of assumptions associated with accent, name, practices, beliefs or race) may still be subject to xenophobia in this conceptualisation.

A related set of points holds for the complicated relationship between xenophobic and racial discrimination. For while the category of “race” is socially constructed, contested, and multifaceted (Hanna et al. 2020; Benjamin 2019), it seems clear that a person can encounter racial discrimination without also necessarily having their membership of the civic community called into question. The reverse is also true: a person can be judged to be “foreign”, and encounter the range of obstacles detailed in this paper, without encountering co-extensive racial discrimination. By way of illustration, consider discrimination against Irish Americans—which spanned several centuries—or against the former residents of East Germany after national re-unification (Zehring and Domahidi 2022; von and West 2022). Discrimination aimed at immigrants or refugees who are ostensibly of the same race as the host community (Gellner 2015; Anderson 2020) also represents an example of this phenomenon, as in the case of discrimination against Syrian refugees in Lebanon (Janmyr 2016). Yet this picture is complicated in a number of ways. First, the term “racism” can, in ordinary language, be used in ways that muddy these waters, encompassing discrimination on the basis of nationality—a move that expands the scope of what might be appropriately deemed “racist”—and bringing the concepts of racism and xenophobia closer together. This folk conception of racism is quite different from the one most often operationalised in AI bias research, which has tended to investigate discrimination based on skin color (Buolamwini and Gebru 2018) and to draw heavily upon the “protected characteristics” framework used in anti-discrimination law (Barocas and Selbst 2016; Mehrabi et al. 2021). Second, racial discrimination and xenophobia, even when understood as conceptually distinct, still often occur together and compound one another in practice. Third, and as a contributing factor to this trend, racial discrimination is often buttressed—on the part of people who hold these views—by propaganda or conspiracy theories that stress the “foreignness” of the group that is the target of discrimination (Mamdani 2002). Equally, those who hold xenophobic attitudes often try to amplify these views by reference to spurious notions of “race” and “racial purity” (Jardina 2019). Hence, at both the conceptual and the causal level, xenophobia and racism often intertwine closely.

Kim and Sundstrom do not, however, address the second problem, which goes persistently unaddressed in most scholarship on xenophobia. By their very nature, nation-states distinguish between citizens and non-citizens, and in doing so allocate rights, responsibilities and resources differentially between these two categories. Adherents of “cosmopolitan” ethics may lament this institutional reality of contemporary politics (Caney 2008; Pogge 2002; Benhabib 2008), yet the ethical significance of civic national groups that differentiate between citizens and non-citizens has cogent defenders, and the abandonment of civic nations is extremely unlikely for the foreseeable future. A framework which deems any differential outcomes between citizens and non-citizens to be xenophobic is, as such, unlikely to be accepted as offering relevant ethical guidance for AI systems. At the same time, differential outcomes for non-citizens can indeed arise from xenophobia, and may extend to persecution, ethnic cleansing, and genocide (Mann 2005; Straus 2015; Brubaker and Laitin 1998). The recognition of such systemic and structural xenophobia must play a central role in understanding the potential xenophobic impact of AI systems.

Our proposed solution to these ambiguities, when assessing xenophobia in technical systems, is to distinguish between two different ways of operationalising the concept of xenophobia. In general, we think designers of AI systems will be interested, in the first instance, in what we term immanent assessments of xenophobia, by which we mean the determination of whether an AI system imposes disadvantages on “foreigners” that are discriminatory according to prevailing societal standards. Put another way, if an AI system generates disadvantages that go beyond the officially sanctioned differences in the rights, responsibilities and resources that separate citizens from non-citizens, the system is immanently xenophobic, violating the standards formally adhered to by its user. Yet those standards may themselves be subject to ethical critique as part of a more transcendent assessment of xenophobia, i.e. they may violate ethical principles which, even if the actor or organization does not formally recognise or accept them, ought to be upheld. While we think transcendent assessments of xenophobia are crucial to ethical debate in global affairs, much of the time AI systems may be rendered substantially fairer simply by ensuring that immanently understood xenophobia is avoided.

Whether we are assessing xenophobia immanently or transcendentally, we identify three main kinds of harm that may flow from a xenophobic AI system. First, a xenophobic system may result in discriminatory material disadvantages—allocating material resources in ways that systematically and unfairly penalise certain individuals on the basis of their actual or presumed foreignness. For example, an AI system may end up prioritising citizens over non-citizens, as well as over those with names that are often associated with foreign heritage, when allocating medical benefits. Second, a xenophobic system may deny individuals proper ethical recognition, by formally or informally engaging with them in ways that communicate effective civic ostracization. An example of this would be an AI system that generates descriptions or images of citizens with a certain heritage that support negative stereotypes about their socio-economic status or purported “willingness to work”. Third, a xenophobic system may restrict the effective exercise of individuals’ rights, for example by exposing them to prejudicial treatment by the police, unfairly allocating access to forms of legal action, or giving unjustified prominence to certain groups over others in platforms for expressing speech. Such restriction of rights could include exposing individuals to physical harm or violence if, for example, law enforcement officers rely on systems that contain xenophobic bias. However, what separates this category of harm from the first kind is the additional and necessary threat to a person’s rights—which limits their ability to seek fair recourse and can compound the feeling of civic precarity. Moreover, while each of these harms may overlap with discrimination based on other attributes, such as race and gender, they are not subsumed by these categories. Most obviously, for example, xenophobic discrimination may be directed at immigrant and refugee populations, even when such groups are not clearly perceived as ‘racially’ distinct.

3 Practical considerations

We now focus on several important AI application domains to establish different ways in which xenophobic discrimination may potentially manifest, and where countervailing measures may need to be adopted. Since our aim here is to draw attention to an aspect of fairness—xenophobia—which has thus far rarely been addressed by AI researchers, and for which very little testing is known to have been carried out, the evidence we draw upon is necessarily suggestive and somewhat indirect, drawing heavily upon analogies and examples from the wider sociotechnical literature. Our contention is simply that such evidence gives sufficient reason to think that xenophobia is likely to manifest in AI systems, and that there should be more urgency in developing frameworks for evaluating potential xenophobic bias and downstream harms. Tangible algorithmic harms have previously been observed across these and other domains (O’Neil 2016) for various marginalised groups, and we believe that these established mechanisms of how bias may translate into harm set a precedent for the discussion presented here: they imply that entrenched xenophobic biases would similarly lead to adverse consequences for the affected groups and individuals.

3.1 Social media

Social media may amplify xenophobia (Daniels 2018), especially in communities with a certain amount of pre-existing xenophobic sentiment (Bursztyn et al. 2019; Yamaguchi 2013). Low barriers to entry, reliance on user-generated content, difficulties in moderation, and the emergence of filter bubbles (Baldauf et al. 2019; KhosraviNik 2017; Bruns 2019; Zuiderveen et al. 2016) create conditions that may lead to the easier spread of fake news (Albright 2017; Schäfer and Schadauer 2018) and potentially dangerous and hateful speech (Benesch and Leader 2015) directed at those who are deemed foreign. At the same time, these platforms could also provide mechanisms to enable more inclusive discourse, expressions of support and positive views towards marginalised groups, serving as a medium for communicating information relevant to the effective exercise of individual rights. This speaks to the dual role social media platforms occupy in shaping culture: both as a source of greater cross-cultural understanding, and as a mechanism that can entrench social divisions, sometimes leading to abuse that reflects wider structural and cultural violence (Haunschild et al. 2022). In particular, an important link has been shown between social media and hate crimes, with the prevalence of anti-refugee sentiment on social media platforms being a strong predictor of crimes against refugees across municipalities (Müller and Schwarz 2021).

The dynamics of discourse and content sharing on social media are ultimately influenced both by the societal context and by the design of the platform, its implemented algorithmic solutions, and the enforcement of content moderation policies. Carefully designed platforms may present safe spaces for information sharing and help facilitate collective immigrant action (Zepeda-Millán 2016). In general, given the ever-increasing proportion of time spent on social media, its overall impact on the amount and intensity of xenophobia in society can be profound (Postill 2018).

3.1.1 Risks

All three types of xenophobic harm that we consider in this study may occur in the context of social media applications. To begin with, personalised social media feeds often incorporate harmful stereotypes and implicit assumptions about people that are problematic from the standpoint of representation. Indirectly, these cultural tropes may shape public opinion in ways that result in xenophobic discrimination offline, exacerbating the propensity to exclude individuals based on their perceived “foreignness”. Xenophobic harms pertaining to civic ostracism may follow, for example, from online public shaming (Aitchison and Meckled-Garcia 2021) and amplified hate speech aimed at marginalised communities. Finally, harm to the effective exercise of rights may result indirectly from the shaping of public opinion, as well as directly from the silencing of minority views (Oliva et al. 2021; Haimson et al. 2021) and the weaponising of content moderation against foreign individuals, resulting in their effective exclusion from public forums and an inability to influence future decisions. To address these concerns, greater transparency and contestability need to be incorporated into content moderation system design (Vaccaro et al. 2020), to ensure fair processes and help mitigate the consequences of potential algorithmic biases.

Machine learning plays an important role in emerging patterns of social media content consumption. Personalized news feeds may give rise to feedback loops (Jiang et al. 2019) and amplify pre-existing opinions and biases, resulting in autopropaganda (Joe et al. 2021). Moreover, repeated exposure to content that others out-groups and reinforces harmful stereotypes may lead to radicalisation (Alfano et al. 2018; O’Callaghan et al. 2015; Ribeiro et al. 2020), suggesting that a careful redesign of content consumption pathways is needed (Fabbri et al. 2022). Deepfakes (Westerlund 2019) present a recent manifestation of the risk of AI-enhanced spread of misinformation, either by malicious individuals and organisations, or as a part of state propaganda (Pavlíková et al. 2021). The emergence of deepfake detection technology (Güera and Delp 2018; Dolhansky et al. 2019; Zhao et al. 2021; Wang et al. 2021; Fagni et al. 2021) may help address this challenge, at least until deepfake technology becomes advanced enough to be indistinguishable from authentic images, video, and audio streams. Adversarial robustness of deepfake detection models plays a crucial role in this context (Neekhara et al. 2021), given the known vulnerabilities of neural computer vision systems to adversarial perturbations. Yet, technological mitigations for deepfake dissemination also need to be accompanied by an investment in media literacy (Hwang et al. 2021) and user awareness. Disinformation and deception have been shown to play a key role in forming xenophobic narratives around otherness, aiming to associate foreigners with acts of violence, over-reliance on financial aid, and illegal immigration (José et al. 2021). To date, such disinformation has mostly involved reinterpreting existing images with inaccurate associated textual descriptions, creating fictitious references and taking imagery out of context. Yet, there is a risk that the rise of deepfake technology could amplify these trends by making it easier to create such disinformation at scale.

3.1.2 Promise

Advances in natural language processing may help develop more reliable hate speech detection (Badjatiya et al. 2017; Aluru et al. 2020; Gambäck and Sikdar 2017; Rizos et al. 2019; Sutejo et al. 2018; Plaza-Del-Arco et al. 2020; Khan et al. 2020; Alatawi et al. 2021) and sentiment prediction systems (Jin et al. 2020; Seo et al. 2020), as well as improved automated topic analysis (Gui et al. 2019; Wang et al. 2019) of social media posts, across different modalities (Peng et al. 2019). This improved monitoring may also help facilitate targeted interventions (e.g. counter-speech (Sonntag 2019; Tekiroglu et al. 2022)) to mitigate xenophobic sentiment, safeguard marginalised communities, and minimise the risk of escalation towards potential violent conflict. Complementary efforts in the semi-automatic detection of fake news (Popat et al. 2018; Wang et al. 2020; Kaliyar et al. 2021), as well as bots and troll accounts (Luceri et al. 2020), are imperative for establishing safe online spaces. Human-in-the-loop (Strickland 2018; Demartini et al. 2020) filtering processes and explainability in AI-driven content moderation may be necessary to address biases in language systems.
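To make this kind of check concrete, the sketch below probes a hate speech classifier for disparate false-positive rates on benign template sentences that mention different nationality terms, loosely in the spirit of functional test suites for hate speech detection. The templates, nationality terms, and the toy_classifier stub are illustrative assumptions rather than part of any cited system; in practice the stub would be replaced by the model under audit.

```python
# A minimal sketch (not a production audit) of checking whether a hate-speech
# classifier over-flags benign sentences mentioning particular nationalities.
# `toy_classifier` is a stand-in for any model exposing predict(texts) -> 0/1.

NATIONALITY_TERMS = ["French", "Nigerian", "Syrian", "Chinese", "Mexican"]  # illustrative only
BENIGN_TEMPLATES = [
    "My neighbour is {} and we often cook together.",
    "The {} community organised a festival last weekend.",
    "I am learning a lot from my {} colleagues.",
]

def toy_classifier(texts):
    """Stand-in model: crudely flags any sentence containing one trigger term."""
    return [1 if "Syrian" in t else 0 for t in texts]  # deliberately biased stub

def benign_false_positive_rates(predict_fn):
    """Every template is benign, so any positive prediction is a false positive."""
    rates = {}
    for term in NATIONALITY_TERMS:
        sentences = [tpl.format(term) for tpl in BENIGN_TEMPLATES]
        preds = predict_fn(sentences)
        rates[term] = sum(preds) / len(preds)
    return rates

if __name__ == "__main__":
    fpr = benign_false_positive_rates(toy_classifier)
    print(fpr)
    print(f"Benign false-positive gap across terms: {max(fpr.values()) - min(fpr.values()):.2f}")
```

A large gap between the most- and least-flagged terms would suggest that moderation may disproportionately silence speech about, or by, particular national groups.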

Advances in the diversification of recommender systems (Kunaver and Požrl 2017; Helberger et al. 2018; Mansoury et al. 2020) and improvements in user fairness (Leonhardt et al. 2018; Nandy et al. 2022) may help avoid the formation of filter bubbles and echo chambers and empower communities to confront and challenge xenophobic narratives. Content diversity needs to be accompanied by additional counter-measures to be effective (Stray 2021), and more research is needed to identify the best contextual approach to use. In particular, neural machine translation (Bahdanau et al. 2014; Chen et al. 2018; Aharoni et al. 2019; Pan et al. 2021) approaches may prove to be an invaluable tool for breaking down barriers in communication across groups, increasing exposure to contrasting views, and helping to establish multicultural online spaces. To achieve this, it is important to improve the accessibility of natural language understanding, in particular by improving the performance of existing systems in low-resource languages (Gu et al. 2018b, a; Karakanta et al. 2018; Sennrich and Zhang 2019; Siddhant et al. 2020), through participatory action and community engagement. Finally, ML also offers opportunities for developing tools for monitoring large-scale shifts in social behaviour (Papakyriakopoulos 2020; Patti et al. 2017), which may help further the development of theories that aim to understand the root causes behind an increase in xenophobic sentiment. These theories may be driven by observations in retrospective traces of social media usage (Frías-Vázquez and Arcila 2019; Wahlström et al. 2021), or in simulation (Yao et al. 2021b; Lucherini et al. 2021), with controllable parameters so as to isolate individual effects.
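As a concrete illustration of the diversification idea, the following sketch shows a greedy re-ranking step that trades off a base recommender's relevance scores against viewpoint novelty, loosely in the spirit of maximal-marginal-relevance style diversification. The Item fields, the coarse viewpoint label, and the trade_off weight are assumptions made for the example, not the interface of any cited system.

```python
# A minimal sketch of greedy, diversity-aware re-ranking of a recommendation feed.
# It picks k items while penalising repeated viewpoints, so that a single
# perspective does not dominate; all fields and weights are illustrative.

from dataclasses import dataclass

@dataclass
class Item:
    item_id: str
    relevance: float   # score from the base recommender
    viewpoint: str     # coarse label, e.g. outlet or community of origin

def rerank_with_diversity(candidates, k, trade_off=0.6):
    """Greedily select k items, balancing relevance against viewpoint repetition."""
    selected, seen_viewpoints = [], set()
    pool = list(candidates)
    while pool and len(selected) < k:
        def score(item):
            novelty = 0.0 if item.viewpoint in seen_viewpoints else 1.0
            return trade_off * item.relevance + (1.0 - trade_off) * novelty
        best = max(pool, key=score)
        selected.append(best)
        seen_viewpoints.add(best.viewpoint)
        pool.remove(best)
    return selected

feed = [
    Item("a", 0.95, "outlet_1"), Item("b", 0.94, "outlet_1"),
    Item("c", 0.80, "outlet_2"), Item("d", 0.78, "outlet_3"),
]
print([i.item_id for i in rerank_with_diversity(feed, k=3)])  # e.g. ['a', 'c', 'd']
```

As noted above, such re-ranking is only one counter-measure among several, and its weighting would need to be tuned to the platform and societal context.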

3.2 Immigration

Perhaps the most obvious context in which AI systems may exacerbate or mitigate xenophobic harms is in the management of migration flows. The complexity of immigration policies reflects a tension between internal security and the liberal frame of humanitarianism (Lavenex 2001), generating a need for involvement of stakeholders outside of established political processes. Migration between culturally distinct countries and regions is often accompanied by a growing resentment of immigrants and refugees (Hadžić 2020). Meanwhile, denialism and complicity with populist narratives hamper the implementation of remedial measures—including the development of a coordinated strategy, spanning international, national and regional efforts, which is needed given the magnitude of the problem (Crush and Ramachandran 2010). Moreover, states have already been quick to employ new digital and AI technologies in an effort to more tightly and effectively control migration (Nalbandian 2022). As an official from the South African National Defence Force observes: “The European Union is trying to fight the issue (of immigration) with data… In the modern digital world, you cannot exist without a digital presence. Why is it not feasible to control all of this immigration by tracking the behavior of this data, within the database?” (Longo 2017). While digital technology may promise significant gains in efficiency in the management of human movement across borders, this opens a vast realm of potential bias and harm created by AI systems that restrict rights of movement based on assumptions of risk or threat associated with certain data profiles. The nature of problems faced by immigrants is necessarily intersectional, as immigrants from marginalised communities face additional barriers when attempting to integrate into society (Bhagat 2018), and may employ identity concealment tactics to render themselves invisible in the face of xenophobic attitudes (Makoni 2020; Tewolde 2021).

3.2.1 Risks

Numerous technological solutions incorporating AI have been proposed for border control and migrant assessment. The collection and utilisation of data from refugees, asylum-seekers and migrants—including by obtaining access to their private social media accounts (Andreassen 2021)—poses serious risks of misuse (Ahmad 2020). AI systems have also been proposed for the automatic assessment of immigration forms (Chun 2007), and the deployment of facial recognition technology is also a feature of this context (Carlos-Roca et al. 2018). This risks perpetuating historical national, ethnic, and racial stereotypes, and is especially concerning since these types of systems have already been demonstrated to be harmful due to disparate performance for racial minorities (Birhane 2022). Indeed, many governments have been quite explicit in their efforts to define migrants in terms of an algorithmic computation of “risks”—with migration processing dictated by the resulting assessment of risk levels. For example, officials in the Australian Department of Immigration and Citizenship have publicly defended the employment of an electronic travel card system on the grounds that it enables a “risk-tiered approach to identity … where we have a range of concentric circles around Australia and we are pushing (the border) further and further out, (and) are doing more and more checks before the person hits our shores” (Longo 2017). Biases and harms in AI systems aimed at risk profiling have also been fairly well documented in a different use case—recidivism prediction in the criminal justice system (Angwin et al. 2022; Angwin and Larson 2022).

While the management of migration risks (SeyedSoroosh and Kiana 2020) may be legitimate in principle, such an approach creates enormous potential for harmful and unfair outcomes for individuals based on associative assumptions about risk that are built into AI assessment systems. These harms can potentially manifest in different walks of life: for example, systems for assessing the likelihood of H-1B visa approval in the US immigration process (Swain et al. 2018) can bias potential employers against certain immigrant groups. There is also an important distinction to be made here between immanent and transcendent standards in assessing xenophobic harms in immigration, where some practices may be legally permissible, yet palpably unjust. This raises important questions about migrants’ human rights in the face of the use of AI technology in immigration enforcement (Giannakau 2021).

3.2.2 Promise

If xenophobic bias can be avoided, AI systems offer the potential for fairer, faster and more efficient management of migration flows with benefits not only for governments, in encouraging welcome forms of migration, but also for migrants in avoiding complex, lengthy and expensive migration processes. Indeed, AI systems have been proposed as a mechanism for encouraging states to more effectively and expansively house refugee populations, by speedily “matching” government and individual preferences and needs in the context of large refugee flows (Jones and Teytelboym 2017). Nonetheless, improved auditing of such systems is required to reduce the risk of unjustly excluding people from safe refuge. Large language models have recently been used to identify and analyse the inconsistencies in the immigration law applications for stays of removal at the Canadian Federal Court, showing that computational legal research may be able to help improve the established processes (Rehaag 2023). Advances in modelling and understanding trends in human migration (Robinson and Dilkina 2018) can also help with anticipatory measures and planning towards building sustainable and inclusive communities. Modelling refugee flows (Mead 2020) may prove helpful in expediting a humanitarian response.
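To make the matching idea concrete, one simple (and deliberately oversimplified) formulation treats placement as an assignment problem that maximises a compatibility score between families and capacity-limited localities. The families, localities, capacities, and scores below are invented for illustration, and this is not the mechanism proposed in the cited work.

```python
# A minimal sketch framing refugee/locality placement as an assignment problem.
# Compatibility scores and capacities are made-up placeholders; real systems would
# also need to respect preferences, legal constraints, and procedural safeguards.

import numpy as np
from scipy.optimize import linear_sum_assignment

families = ["family_1", "family_2", "family_3", "family_4"]
capacities = {"town_a": 2, "town_b": 1, "town_c": 1}  # available places per locality

# Expand each locality into one column per available place.
slots = [name for name, cap in capacities.items() for _ in range(cap)]

# compatibility[i][j]: how well family i's needs match slot j (higher is better).
compatibility = np.array([
    [0.9, 0.9, 0.2, 0.4],
    [0.3, 0.3, 0.8, 0.5],
    [0.6, 0.6, 0.7, 0.9],
    [0.5, 0.5, 0.4, 0.6],
])

# linear_sum_assignment minimises cost, so negate to maximise total compatibility.
rows, cols = linear_sum_assignment(-compatibility)
for i, j in zip(rows, cols):
    print(f"{families[i]} -> {slots[j]} (score {compatibility[i, j]:.1f})")
```

Even in such a simplified setting, auditing how the compatibility scores are derived is essential, since biased scores would reproduce exactly the exclusionary outcomes discussed above.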

3.3 Healthcare

Xenophobia has recently been identified as an important determinant of health (Suleman et al. 2018), with a multitude of adverse effects being reported for both individual and community-based health metrics. Moreover, the costs of policies that restrict the range of health services available to foreigners sometimes exceed the savings they purport to generate, creating a hostile environment for patients and migrant workers alike (Shahvisi 2019). Exposure to repeated xenophobic prejudice in a healthcare context may erode trust and adversely affect care-seeking patient behaviours (Earnshaw et al. 2019).

Rather than being an exclusive artefact of misguided public health policy, medical xenophobia can manifest via negative attitudes and practices by health workers, and has been shown to be deeply entrenched in numerous public health systems (Crush and Tawodzera 2014; Loganathan et al. 2019). Discriminatory practices are further amplified in the case of particularly vulnerable migrant populations, such as refugees (Zihindula et al. 2017; Munyaneza and Euphemia 2019) and undocumented migrants (Richter 2015). Moreover, even prior to engaging with health services, xenophobia and racism may impact health via early exposure to adverse childhood experiences (Nguyen-Truong et al. 2021). In Adja et al. (2020), the authors point out that the World Health Organization (WHO) defines health as “a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity” (World Health Organization 1946) and that social well-being is often overlooked in conversations around health. A holistic approach to mitigating the effects of medical xenophobia must therefore incorporate notions of social well-being and account for the impact of the social determinants of health and the disparities faced by foreigners outside the boundaries of direct medical care.

Medical xenophobia has also historically played an important role in shaping epidemiological perceptions related to the spread of infectious diseases. More recently, this could be seen in the overwhelmingly xenophobic response to the spread of COVID-19 (Reny and Barreto 2020; Le et al. 2020) and the resulting health disparities (Hooper et al. 2020), but it is by no means a recent phenomenon. As highlighted in (Sarbu et al. 2014; Galassi and Varotto 2020), the act of blaming foreigners for the spread of infectious disease could be readily seen in accounts of the devastating epidemic of venereal syphilis in Europe in the 15th and 16th centuries. The English, the Germans and the Italians blamed the French for the epidemic, referring to syphilis as morbus Gallicus (the “French disease”). In turn, the French accused the Germans, the Poles and the Neapolitans, the Poles blamed the Germans, the Russians blamed the Poles, and so on (Sarbu et al. 2014; Galassi and Varotto 2020). Another example of historical medical xenophobia can be seen in the anti-Asian bias during the epidemic of bubonic plague in San Francisco’s Chinatown in 1900 (Parfett et al. 2021). Paleopathology can help alert us to past instances of medical xenophobia, raise public awareness and reduce future xenophobic discrimination in times of public health emergency. As articulated by Muscat et al. (Natasha et al. 2017): “Public health is not a neutral scientific activity, but a political activity informed by science. Public health has a moral mandate that complements its scientific mandate.”

3.3.1 Risks

Xenophobic medical AI harms include all three types of harm that we consider in this study. Clinical AI systems developed from retrospective clinical data may encode and perpetuate harmful biases towards migrants and ethnic minorities, potentially leading to discriminatory material disadvantages—misdiagnosis, suboptimal treatment decisions and resource allocation, and worse health outcomes (Suleman et al. 2018). Such disparities have already been identified in relation to structural racism and algorithmic clinical decision-making (Norris et al. 2021; Yates et al. 2021), making their examination in the context of xenophobia imperative. Design choices in setting up critical health data infrastructure play a role here, through the ways in which they encode identities and represent demographics (Yi et al. 2022). AI systems that take shortcuts (DeGrave et al. 2021; Brown et al. 2023), for instance through proxies of sensitive attributes, may further ostracize communities seen as foreign and perpetuate narratives linking foreignness to disease (Yi et al. 2022). Exclusion of ethnic groups may also follow from non-inclusive system designs that are not suited for safe application beyond the majority demographics in the country of deployment, resulting in disproportionate rates of deferral to alternative pathways, increased waiting times, and a deep misunderstanding of patient needs. Finally, clinical AI systems may further alienate patients and erode trust, as well as potentially reinforce harmful xenophobic stereotypes in repeated interactions with systems designed to reason based on historical data.

3.3.2 Promise

There is potential for AI technologies in healthcare to play a transformative role in improving access to care and the quality of its delivery. Advances in deep learning have led to a number of promising early research prototypes for improved diagnostics in medical imaging (De Fauw et al. 2018; McKinney et al. 2020; Baltruschat et al. 2019), diagnostics from wearables (Ravi et al. 2016), early detection of adverse events from electronic health records (Tomašev et al. 2019, 2021; Futoma et al. 2017), and others. The maturity of this technology is reflected in ongoing prospective studies and deployments (Qian et al. 2021; Dembrower et al. 2023). By way of illustration, on-device screening apps may help with earlier detection of health problems in marginalised groups, including migrants and foreign workers with limited access to health insurance and health services. For example, AI-based diabetic retinopathy detection from smartphone-based fundus photography was shown to have very high sensitivity (Rajalakshmi et al. 2018) and could potentially be applied to mass screening of at-risk populations. Advances in AI-based dermatology screening applications (Göçeri 2020) could yield similar benefits in the future.

Yet, there are still practical obstacles to safely deploying these types of systems at scale (Seneviratne et al. 2020). Participatory approaches play a key role in medicine (Abma and Broerse 2010; Angel and Frederiksen 2015; Falk et al. 2019; Schinkel et al. 2018), where engaging with patient focus groups helps clinicians align on the desired outcomes and improve the overall patient experience. Participatory approaches to developing AI solutions have similarly been identified as a key ingredient in safe and inclusive AI system design (Bondi et al. 2021), though their implementation is not without challenges (Smith et al. 2021). The outcome of such practices hinges on the inclusion of all marginalised communities, especially those that may face difficulties accessing basic healthcare services or that face language barriers (Um and Kim 2018; Nickell et al. 2019; Hannah et al. 2019).

3.4 Employment

AI systems are increasingly used in decision-making around employment (Upadhyay and Khandelwal 2018; Sánchez-Monedero et al. 2020), in helping to identify, filter, assess and prioritise potential job candidates. This has led to the development of AI ethics frameworks for use in employment applications (Chan 2022; Evgeni et al. 2023). Given that differential outcomes in employment translate to direct material harms, it is crucial to understand how xenophobia manifests in this sector overall, and how it may impact the development of AI-driven decision-support systems (Raghavan et al. 2020).

Unequal access to employment opportunities and unfair compensation are among the key drivers of social inequality, as differences in socioeconomic status tend to have a knock-on effect on community health, mental health, access to housing and physical safety. Labour market attachment is also critical to the livelihood and identity formation of individuals and groups, especially when it comes to those that have been historically marginalised (Teelucksingh et al. 2007). There are a number of different ways in which employment discrimination manifests for foreign workers (Kosny et al. 2017) at different stages in the employment process, depending upon their citizenship and immigration status. Both citizens and non-citizens can suffer xenophobic harms in the employment process, although there are commonalities and differences in their respective experiences (Bell et al. 2010). Historically, the worst jobs, with the hardest working conditions and the least pay, have been reserved for immigrants, contributing to the subjugation of immigrant bodies and to population control, fueled by xenophobic sentiment—which persists to this day (Longhi 2012).

Immigrants play an important role in the labour market, owing to their diverse skill sets, life experience, educational profiles, and backgrounds. This diversity necessitates an intersectional analysis of employment outcomes, as it has been shown that different immigrant groups experience different levels of discrimination. For example, a study in Switzerland identified highly competitive immigrant groups from neighboring countries as being subjected to the most workplace incivility (Krings et al. 2014), presenting a skills paradox (Dietz et al. 2015). Such discrimination is not restricted to first-generation immigrants, but is rooted in assumptions about heritage, names (Midtbøen 2014; Obenauer 2023) or appearance (Fibbi et al. 2006). In France, both immigrants and their descendants face worse employment outcomes (Duguet et al. 2010; Meurs 2017). Meanwhile, in Sweden, replacing Swedish-sounding with Middle-Eastern-sounding names in CVs (Carlsson and Rooth 2007) was shown to reduce the rate of call-backs by a large margin, with the rate of discrimination being amplified by male recruiters in particular. While labour-market integration in Sweden remains straightforward for Western immigrants, immigrants from Africa, Asia and South America (Grand and Szulkin 2002) face far greater obstacles, showcasing the negative impact of xenophobia. Similar trends have been observed in the United Kingdom (Anderson et al. 2006), where employers confessed to ethnic stereotyping in their job candidate preferences. Female job-seekers (Jureidini 2005; Dlamini et al. 2012) may experience further, compounding obstacles.

Finally, efforts to assimilate successfully play a key role in modulating xenophobic employment discrimination (Kee 1994) and mitigating the resulting pay gap (Nielsen et al. 2004), though out-groups may instead experience further alienation due to xenophobic attitudes (Mayadas and Elliott 1992). Language barriers, cultural differences, and a lack of networks and social capital (Behtoui and Neergaard 2010) present challenges to securing employment (Hakak et al. 2010). This lack of integration also affects foreign entrepreneurs, who face obstacles in securing capital for their businesses (Teixeira et al. 2007). A lack of labour-supply elasticity among immigrant workers contributes to the wage gap between immigrant and native job seekers, and is reflected in monopsonistic discrimination against immigrants. Findings from a study in Germany suggest not only that employers hold more power over immigrant workers as a result of search frictions and lower flexibility, but also that employers profit from discriminating against immigrants (Hirsch and Jahn 2015).

Existing legal frameworks are often insufficient in protecting the rights of foreign employees and addressing the problem of xenophobia in the workplace (Handayani et al. 2021; Mubangizi 2021). This insufficiency underscores the need for a holistic approach to ensuring fair employment outcomes. Technology should aim to complement and assist numerous organisational initiatives, collective action, legislative and legal measures, administrative measures, political and educational action and international standards (Taran et al. 2004).

3.4.1 Risks

Predicting prospective employee performance from retrospective data (Mahmoud et al. 2019) carries the risk of perpetuating historical discrimination against out-groups (Kim 2018; Mujtaba et al. 2019) under the veneer of alleged objectivity. Such issues have already been identified in a number of deployed AI systems (Dastin 2018; Hsu 2020). Discriminatory outcomes may arise even in the absence of direct information on ethnic and national background, via proxy features (Prince and Schwarcz 2019). Additionally, there is a pressing need to establish firm guidelines and policies governing the fair usage of employment decision-support technology (Kim and Bodie 2021). Xenophobic discrimination poses unique challenges in this context, as native and immigrant workers are often not granted the same rights and protections. This is in part due to the specific requirements for work permits in terms of skills and training, corporate sponsorship, minimum compensation, and so on. These requirements may differ between immigrant workers of different national and ethnic backgrounds. Policies aimed at countering xenophobic discrimination in employment, and the fairness of AI systems used in employment, need to account for such legal distinctions. The overall impact of AI on the job market extends beyond its immediate use in employment decisions, and should be seen in relation to how technology is reshaping the future of work—including job quality—across industries and regions.
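One way to probe for such proxy effects, sketched below on synthetic data, is to check how well the supposedly neutral candidate features can recover the sensitive attribute itself: if immigration status is easily predictable from features such as postcode or career gaps, a model trained on those features can discriminate without ever seeing the attribute. The features, the data, and the 0.7 threshold are illustrative assumptions rather than an established standard.

```python
# A minimal sketch of a proxy-feature check on synthetic hiring data: measure how
# well "non-sensitive" features recover a hidden immigration-status attribute.
# High recoverability flags proxy-discrimination risk in any downstream model.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 2000
immigrant = rng.integers(0, 2, size=n)                   # hidden sensitive attribute
postcode_band = immigrant * 2 + rng.integers(0, 3, n)    # strongly correlated proxy
gap_years = rng.poisson(1 + immigrant, size=n)           # weakly correlated proxy
test_score = rng.normal(70, 10, size=n)                  # unrelated feature

X = np.column_stack([postcode_band, gap_years, test_score])
auc = cross_val_score(LogisticRegression(max_iter=1000), X, immigrant,
                      scoring="roc_auc", cv=5).mean()
print(f"Cross-validated AUC for recovering immigration status: {auc:.2f}")
if auc > 0.7:  # illustrative threshold, not a standard
    print("Warning: features act as strong proxies; audit downstream recommendations.")
```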

3.4.2 Promise

Assistive technologies have the potential to improve the overall candidate experience, as well as to help overcome human biases and improve diversity and inclusion (Daugherty et al. 2019). Preliminary studies suggest that the use of AI has in some cases already greatly increased the diversity of top selected job applicants (Avery et al. 2023). ML models may also be seen as a powerful lens into existing hiring practices (van den et al. 2020), helping to audit (Liem et al. 2018) and improve such processes. Recently developed frameworks for ML explainability (Bhatt et al. 2020) and fairness (Cabrera et al. 2019) may be utilised towards this end. Simulations may also prove helpful for policy and process design (Hu and Chen 2017; Bower et al. 2017; Hu and Chen 2018; Cohen et al. 2019; Schumann et al. 2020). Based on the prevalence of xenophobic discrimination in historical hiring practice, as well as in the language technologies that underpin employment AI systems, we believe that there is a pressing need for improved transparency and for explicit checks for fairness in decision recommendations for candidates of different national and ethnic backgrounds, as well as immigration statuses. If employment AI technology is ever to deliver on its promise, it needs to demonstrate a commitment to fighting deeply ingrained manifestations of xenophobia in the workplace.

3.5 Stereotypes in large pretrained models

Recent trends in AI development involve systems that are increasingly general in the scope of their application. Large language models (LLMs) (Li et al. 2021; Chowdhery et al. 2022), for example, can be fine-tuned (Wei et al. 2021) and adapted to a range of applications and use cases, becoming components of even larger and more complex AI systems. These systems are sometimes referred to as “foundation models” (Bommasani et al. 2021), or “base models”, in recognition of their emerging role as building blocks (Zeng et al. 2022) in such systems. The development of such models usually involves large-scale pre-training on ever larger datasets that are increasingly hard to curate. This poses unique safety and ethics concerns, owing to the difficulty of anticipating the totality of harmful biases and stereotypes, as well as the variety of use cases stemming from their wide applicability (Ganguli et al. 2022). Large language models may exhibit varying degrees of positive or negative sentiment across national (Rae et al. 2021; Venkit et al. 2023), societal (Khandelwal et al. 2023) and religious groups (Abid et al. 2021b; Abid et al. 2021), inheriting features of historical and contemporary discourse. Text-to-image models have recently been shown to amplify demographic stereotypes at scale (Bianchi et al. 2022; AI Image Stereotypes 2023). Multimodal systems (Jaegle et al. 2021) introduce another layer of complexity, with cross-modal stereotypes (Birhane et al. 2021) becoming detectable when considering modalities in relation to each other. Any biases in base models have the potential to emerge across their downstream applications, making it especially important to evaluate such models and their underlying training data prior to a wider release.
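One simple way to surface such sentiment skew, sketched below, is template-based probing: score a fixed set of otherwise neutral sentences that differ only in the nationality term they mention, and compare the per-group averages. The templates and group terms are illustrative, and sentiment_fn is a placeholder for whatever scoring interface the model under audit exposes; the stub here exists only so the example runs end to end.

```python
# A minimal sketch of template-based probing for nationality-linked sentiment skew.
# `sentiment_fn` is a placeholder for the model being audited (e.g. a classifier
# head or an external sentiment scorer); the stub below is deliberately skewed.

from statistics import mean

TEMPLATES = [
    "A {} person moved in next door.",
    "The new employee is {}.",
    "I met a group of {} tourists today.",
]
GROUPS = ["German", "Somali", "Filipino", "Ukrainian"]  # illustrative only

def sentiment_fn(text: str) -> float:
    """Stand-in scorer in [-1, 1]; replace with the system under audit."""
    return -0.4 if "Somali" in text else 0.1

def probe(score_fn):
    """Average sentiment per group, plus the spread between best and worst group."""
    per_group = {g: mean(score_fn(t.format(g)) for t in TEMPLATES) for g in GROUPS}
    spread = max(per_group.values()) - min(per_group.values())
    return per_group, spread

scores, spread = probe(sentiment_fn)
print(scores)
print(f"Sentiment spread across nationality terms: {spread:.2f}")
```

A persistent spread across many templates is a signal that the underlying model associates particular nationalities with more negative language, which warrants closer qualitative inspection.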

As an illustrative example of the types of xenophobic bias that may creep into large pretrained models, here we consider the case of media and the arts. Indeed, narratives shaping the very formation of national and group identities, the us and the them, are deeply embedded in media as societal cultural artifacts, either altering (Juan-José and Frutos Francisco 2017) and subverting or reinforcing and perpetuating pre-existing power dynamics within and across societies. Stereotyping and projection are central to the act of othering, as the other gets associated with unwanted, negative traits. Undertaking a critical re-examination of how xenophobia is expressed and disseminated in media and the arts is essential for understanding the risks of inadvertently incorporating xenophobic stereotypes into AI systems developed on broad multimedia data at scale. Recent advances in AI creativity and art co-creation (Ramesh et al. 2021; Saharia et al. 2022; Li et al. 2022) raise interesting questions around the role of AI-generated art and the cultural stereotypes potentially contained within it (Jalal et al. 2021a; Cho et al. 2022).

In the case of literature, xenophobia manifests not only in terms of harmful narratives leading to discrimination (Taras 2009) and violence (Minga 2015), but also via ethnolinguistic purity, centering only the works written by the favoured in-group as “true” national cultural heritage, whilst erasing contributions written in minority languages (Kamusella et al. 2021). Xenophobia may also manifest in a lack of availability and literary translation of foreign works (Dickens 2002), thereby sheltering local xenophobic narratives from foreign critique. Portrayals of out-groups in film provide ample evidence of deeply rooted xenophobia, involving racial stereotyping (Yuen 2016), unfavourable portrayals of Arabs (Shaheen 2003), antisocial portrayals of Latinx people (Berg 2002), and the stereotyping of queer people as villains (Long 2021; Brown 2021). These works reinforce notions of superiority for the majority, while simultaneously attaching unbefitting images and traits to out-groups (Inayat 2017; Benshoff and Griffin 2021). The pervasiveness of xenophobic stereotypes in cultural artifacts, and their role in nationalistic narratives (Smith 2002; Joep 2006; Gordy 2010; Khandy 2021), make it imperative to audit large AI models and datasets on those grounds.

3.5.1 Risks

Recent work on enumerating risks in large language models (Weidinger et al. 2021, 2022) categorized the potential downstream harms into six types: 1) discrimination, exclusion and toxicity; 2) information hazards; 3) misinformation harms; 4) malicious uses; 5) human-computer interaction harms; and 6) automation, access and environmental harms. These categories intersect with the material, representational and wider societal harms through which we have been discussing the potential impact of xenophobia in sociotechnical systems, adding another layer of analysis. LLMs may surface direct xenophobic discrimination; pose information hazards by revealing sensitive or private information about vulnerable individuals; spread misinformation that disparages out-groups; be incorporated into technologies weaponized by malicious xenophobic users; subtly reinforce discriminatory stereotypes via repeated human-computer interaction; and indirectly lead to disparate socioeconomic and environmental impacts. Prior studies have identified a concerning aptitude of such systems for accurately emulating extremist views when prompted (McGuffie and Newhouse 2020), as well as numerous ethnic biases (Li et al. 2020). Perhaps unsurprisingly, these problems were shown to arise in multi-modal systems as well (Srinivasan and Bisk 2021; Cho et al. 2022; Birhane et al. 2021). A recent study identified numerous stereotypes in visual representations from text-to-image models, across 135 nationality-based identity groups (Jha et al. 2024). Disparate performance in attribute representation for different nationalities was highlighted as an issue, especially for the Global South. As this type of content may potentially influence and radicalise people into far-right ideologies at an unprecedented scale and pace, more focused investments in data curation are needed to address the underlying risks (Bender et al. 2021).

3.5.2 Promise

AI systems need to be designed to transcend ethno-nationalistic, exclusionary cultural narratives and challenge entrenched xenophobic views. This would involve developing better ways of incorporating domain knowledge on the cross-cultural underpinnings of social friction, and breaking away from a stochastic re-enactment of historical patterns. This stands in stark contrast to the historical AI approach, centered on supervised learning and the reconstruction of pre-existing data. There is a pressing need to imbue general AI systems with capacities for understanding and reasoning about the world, and therefore the ability to integrate and reinterpret information, deriving less biased and better reasoned conclusions about social phenomena. It is yet to be determined whether such capabilities lie fully within the reach of current technology, or whether they require further innovation on the path towards more generally intelligent AI systems. While there is ongoing research on improving reasoning in large language models (Huang and Chen-Chuan 2022; Wei et al. 2022; Zhao et al. 2022; Wang et al. 2023; Long 2023; Weng et al. 2023; Lu et al. 2023; Taylor 2023; Zhang et al. 2023; Besta et al. 2023), there is still a degree of skepticism about these nascent capabilities (Xu et al. 2023; Huang et al. 2023) and additional challenges remain to be overcome (Ji et al. 2023; Shi et al. 2023). Importantly, a culture gap has been identified in the understanding capabilities of existing models (Liu et al. 2023).

Recent advances in understanding and mitigating social biases in language models (Zmigrod et al. 2019; Huang et al. 2019b; Li et al. 2020; Liang et al. 2021; Ousidhoum et al. 2021; Nozza et al. 2021), and computer vision systems (Joo et al. 2020; Wang et al. 2020a), coupled with a development of benchmarks (Nangia et al. 2020; Nadeem et al. 2020) for identification of malignant stereotypes in AI models, may pave the way towards quantifying the extent of xenophobic bias and harm in large pre-trained AI systems. Yet, for contemporary base models, the solution likely needs to be multi-faceted and case-specific.

Multi-lingual (Huang et al. 2019; Xu et al. 2021; Srinivasan et al. 2021) models may help incorporate a wider variety of viewpoints, especially if grounded in geospatially diverse imagery aimed at improving representation—though there are still open challenges when it comes to ensuring equal performance (Wang et al. 2021a; Monojit and Amit 2021; Ramesh et al. 2023). Self-supervised learning in computer vision has shown promise in terms of reducing the amount of bias (Goyal et al. 2022), as well as presenting new ways to incorporate fairness objectives into model training (Tsai et al. 2021). Data curation (Scao et al. 2022) plays a central role in mitigating the effects of historical bias. Mechanisms for incorporating comparative and natural-language human feedback to refine and improve model outputs (Ouyang et al. 2022; Scheurer et al. 2022) present another useful avenue, as do efforts to solicit human preferences on conversational rules and to highlight sources of information to improve the factuality of claims (Glaese et al. 2022). This myriad of technical mitigations needs to be incorporated into a more holistic participatory approach, to avoid the pitfalls of technosolutionism and empower marginalised communities in AI system design.

4 Towards xenophilic AI systems

4.1 A moral imperative

Crucially, in terms of scope and impact, many of the AI tools and services discussed in this paper have global reach (Kate 2021). They easily traverse national boundaries and—in the case of social media platforms—involve billions of users worldwide. Cumulatively, these technologies therefore have a powerful shaping effect on the contours of social relationships at a global level. Considered in this light, it is worth pausing to more fully assess the combined effect of the practices outlined in this paper.

Taken together, xenophobic discrimination in the domain of AI can be best understood as a compound or structural phenomenon and set of experiences: people are not only exposed to automated xenophobic bias via interaction with one or two services, but across the entire range of digital services they interact with. Crucially, in an age of growing displacement and movement of people, a person who is new to a country, a non-native speaker, or a carrier of identity traits that attract xenophobic sentiment may expect to receive, among other things: (1) higher rates of harassment on social media, (2) problematic treatment by immigration services, (3) increased chance of medical error, (4) prejudicial access to employment opportunities, and (5) algorithmically embedded stereotypes, in addition to whatever other forms of disadvantage they may encounter. They are therefore in a position of intersectional vulnerability (Crenshaw 2017): there is added precarity that comes with the notion of being “foreign” in a digital age. This is a uniquely dangerous position to occupy and one that we believe warrants special protection when it comes to the design and deployment of algorithmic systems.

Given this wider context, technologists face an important choice when it comes to xenophobia and AI. They can build systems that heighten tensions between different groups and compound existing axes of disadvantage, or they can promote a different, more inclusive ideal, helping to mitigate the impact of xenophobia by learning from those affected and by taking concrete measures to forestall these effects. This aspiration, to build better systems that are capable of promoting civic inclusion and deescalating othering dynamics, is what we term “xenophilic technology”. When successful, xenophilic technology helps to build relationships between people of different backgrounds and to enable the rich and productive sharing of cultural differences.

Ultimately, the risks outlined in this paper indicate that the problem of xenophobic bias in AI systems cannot go unaddressed. Instead, we believe that technologists should partner with domain experts to jointly take the lead in this area, drawing upon sources such as the Universal Declaration of Human Rights (UDHR), which is explicit in its opposition to discrimination on the basis of national origin or related grounds (Prabhakaran et al. 2022). On this point, Article 2 of the UDHR states that “everyone is entitled to all the rights and freedoms set forth in this Declaration without distinction of any kind, such as race, colour, sex, language, religion, political or other opinion, national or social origin, property, birth or other status.” At the same time, the global reach of technology brings with it significant promise. By addressing the potential for material harm, representational harm, and risks to people’s rights, responsibly designed technology may help bridge the divide between communities, minimize material and representational harms to foreigners and non-citizens, ensure that these people are secure in their standing as members of the relevant communities, and help to ensure that human rights are respected.

4.2 Measurement and mitigation

Evaluating and mitigating the potential xenophobic impact of AI systems is not yet common practice. In this section we highlight technical considerations from relevant areas of ML fairness research that may prove useful for addressing this unmet need and for moving away from more hegemonic ML fairness approaches (Weinberg 2022).

4.2.1 Measuring xenophobia

Most approaches to measuring fairness rely on an understanding of marginalised identities, coupled with the availability of the corresponding information in the underlying data. It is therefore especially concerning that there are deep issues with the overall quality of race and ethnicity data across observational databases (Polubriaginof et al. 2019), which often contain insufficiently fine-grained and static racial and ethnic categories (Strmic-Pawl et al. 2018; Hanna et al. 2020), involve discrepancies between recorded and self-reported identities (Moscou et al. 2003), and come with data collection (Varcoe et al. 2009; Ford and Kelly 2005; Routen et al. 2022; Andrus and Villeneuve 2022), privacy (Ringelheim 2011) and governance challenges. Despite recent integration efforts (Müller-Crepon et al. 2021), better data (Pushkarna et al. 2022) and model (Mitchell et al. 2019; Crisan et al. 2022) documentation are required to promote the understanding of ethnic representation and data provenance. Name-ethnicity classifiers could offer a potential solution, but they themselves tend to rely on machine learning, resulting in disparate performance across ethnicities (Hafner et al. 2023).

Fairness approaches rooted in comparing model performance and impact across groups need to be more cognizant of the normative challenges that arise when defining group identities (Leben 2020), and there also needs to be deeper engagement with domain experts to establish consistent and meaningful ways of identifying in-groups and out-groups within the context of each AI application. The lack of frameworks and benchmarks for assessing xenophobic impact makes it especially hard to evaluate systems with a potentially global scope, which is a serious concern. In the interim, practitioners should be encouraged to include ethnicity and nationality in routine ML fairness evaluations, in intersection with other relevant sensitive attributes (Wang et al. 2022), and to drive technical innovation towards methods that can transcend the confines of pre-defined categorizations (Jalal et al. 2021).
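
As a concrete illustration of what such routine, intersectional evaluation might look like, the sketch below disaggregates standard error rates by nationality crossed with a second sensitive attribute. The column names and data source are hypothetical placeholders for an application's own evaluation outputs.

```python
# Minimal sketch of a disaggregated, intersectional evaluation: model error
# rates broken down by nationality crossed with another sensitive attribute.
# Column names (`nationality`, `gender`, `label`, `prediction`) are
# hypothetical and would come from the application's own evaluation data.
import pandas as pd

def subgroup_rates(df: pd.DataFrame, attrs: list[str]) -> pd.DataFrame:
    """Compute true/false positive rates for every intersectional subgroup."""
    def rates(g: pd.DataFrame) -> pd.Series:
        positives = g[g.label == 1]
        negatives = g[g.label == 0]
        return pd.Series({
            "n": len(g),
            "tpr": (positives.prediction == 1).mean() if len(positives) else float("nan"),
            "fpr": (negatives.prediction == 1).mean() if len(negatives) else float("nan"),
        })
    return df.groupby(attrs).apply(rates)

# Hypothetical usage:
# eval_df = pd.read_csv("evaluation_outputs.csv")
# report = subgroup_rates(eval_df, ["nationality", "gender"])
# print(report.sort_values("tpr"))  # surface the worst-served subgroups
```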

Counterfactual (Kusner et al. 2017; Pfohl et al. 2019; Joo et al. 2020) and contrastive (Chakraborti et al. 2020) methodologies may prove useful for more concretely identifying the pathways of discrimination while leveraging expert knowledge. Most such methods still require a meaningful categorization of group identities, and they may not be applicable in contexts where certain social categories do not lend themselves to meaningful counterfactual manipulation, making it impossible to reliably assess the truthfulness of counterfactuals (Kasirzadeh and Smart 2021). Individual fairness (Dwork et al. 2012; Dwork and Ilvento 2018; Sharifi-Malvajerdi et al. 2019; Gupta and Kamble 2021) may prove applicable in the absence of reliable categorizations, and there is ongoing research on eliciting (Jung et al. 2019; Yahav et al. 2020) or learning (Ilvento 2019; Mukherjee et al. 2020) similarity measures for its practical application.
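
Where counterfactual manipulation is meaningful, a simple starting point is a substitution test that perturbs only the group-denoting term in an input and compares the model's scores. The sketch below is a minimal illustration under that assumption; the scoring function, template, and term list are hypothetical and would need to be defined together with domain experts.

```python
# Minimal sketch of a counterfactual substitution test: perturb only the
# group-denoting term in an input and compare a model's scores. The `score`
# callable and the term list are assumptions standing in for a concrete
# model and an expert-curated set of group descriptors.
from itertools import combinations
from typing import Callable

def counterfactual_gaps(template: str,
                        terms: list[str],
                        score: Callable[[str], float]) -> dict[tuple[str, str], float]:
    """Score every substitution of `terms` into `template` and report
    pairwise score differences between the resulting counterfactual inputs."""
    scores = {t: score(template.format(group=t)) for t in terms}
    return {(a, b): scores[a] - scores[b] for a, b in combinations(terms, 2)}

# Hypothetical usage with an application-specific scoring function:
# gaps = counterfactual_gaps(
#     "The application was submitted by a {group} applicant.",
#     ["Nigerian", "Polish", "locally born"],
#     score=my_model_score,  # hypothetical: maps text to a risk/decision score
# )
# Large gaps indicate that the group term alone moves the decision, subject
# to the caveat that not every such substitution is meaningful.
```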

Quantitative measures of impact should be complemented by qualitative assessments (Van der Veer et al. 2013; Krumpal 2012; Olonisakin and Adebayo 2021) and by comprehensive participatory approaches that enable deeper and more representative assessments of xenophobic impact. Wider community participation (Martin Jr et al. 2020; Bondi et al. 2021; Birhane et al. 2022) is key when creating benchmarks, soliciting feedback, and empowering the affected communities to inform policy and AI system design. As an ideological and social phenomenon, xenophobia may be targeted at an ever-evolving conception of what is foreign, shaped by local histories and by cultural and political context (Mamdani 2018, 2012). Understanding who is deemed foreign, by whom, on what grounds, and in light of which set of unfolding circumstances requires knowledge that is local, situated and contemporaneous.

Evaluation of xenophobic bias should also help re-center the question of discrimination on the perpetrators. AI systems may be used with malicious intent to spread hate and amplify discrimination against vulnerable populations, and human bias may also skew the outcomes of assistive AI technologies that keep a human in the loop. To address the root causes of xenophobia, it is important to understand the role of AI systems in in-group radicalization, as well as how AI systems shift power. In terms of better understanding outcomes, we suggest that practitioners consider, in line with what we propose in this paper, material harms, recognition harms and rights harms. This is merely a step towards a more holistic approach, as there is also a need to engage with the deeper questions of social justice (Birhane et al. 2022b; Schwöbel and Remmers 2022).

4.2.2 Mitigation strategies

The complexity and pervasiveness of xenophobia in society, and consequently in the data used to train AI systems, call for the development of both technical and non-technical mitigation strategies to prevent the adverse impact of such systems on the most vulnerable and marginalised groups in society, citizens and non-citizens alike. Such strategies are currently lacking, in part due to a systemic failure to recognize the importance of xenophobic harms, and in part due to the previously discussed deficiencies in recording and utilising the relevant sensitive information, compounded by insufficient investment in participatory approaches.

Until such comprehensive frameworks for identifying and quantifying xenophobic harms are fully developed, practitioners may want to consider alternative approaches, tailored for systems with missing or incomplete data, that are capable of providing partial safety and robustness guarantees without a full categorical specification. This may involve considering plausible groupings based on the available inputs, and improving worst-case or average-case performance across such plausible group structures (Kearns et al. 2018; Kim et al. 2018; Hashimoto et al. 2018). The use of proxy targets (Gupta et al. 2018; Grari et al. 2021) is an alternative, though it comes with risks of stereotyping. Practitioners need to ensure robust evaluation (Mandal et al. 2020) while accounting for uncertainty and incompleteness (Awasthi et al. 2021) in the data. Purely technical approaches rarely map onto socially acceptable mitigation outcomes (Słowik and Bottou 2021), making it imperative to avoid technosolutionism and to embrace interdisciplinary efforts.
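
One way to make the idea of robustness across plausible group structures concrete is to track the worst per-group loss over several candidate partitions inferred from available proxies. The sketch below is an illustrative evaluation loop in that spirit, not an implementation of the cited methods; the groupings and loss values are synthetic placeholders.

```python
# Minimal sketch, in the spirit of worst-case group robustness: given several
# plausible (possibly overlapping) candidate groupings inferred from available
# proxies, report the worst per-group loss so it can be tracked or optimised.
# Illustrative only; not the algorithms of the works cited in the text.
import numpy as np

def worst_group_loss(losses: np.ndarray,
                     groupings: list[np.ndarray],
                     min_size: int = 30) -> float:
    """`losses`: per-example losses; `groupings`: list of integer arrays,
    each assigning every example to a group under one plausible partition."""
    worst = -np.inf
    for assignment in groupings:
        for g in np.unique(assignment):
            mask = assignment == g
            if mask.sum() >= min_size:  # skip groups too small to estimate
                worst = max(worst, losses[mask].mean())
    return float(worst)

# Synthetic example: per-example losses and two hypothetical proxy-based
# partitions (e.g. inferred input language, coarse region associated with a name).
losses = np.random.rand(1000)
groupings = [np.random.randint(0, 4, size=1000), np.random.randint(0, 6, size=1000)]
print(worst_group_loss(losses, groupings))
```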

Transparency and model audits (Raji and Buolamwini 2019; Raji et al. 2020) need to complement such fairness assessments, especially given the difficulty of defining comprehensive quantitative measures that would consistently capture all xenophobic harms. Improvements in AI explainability would allow us to examine more thoroughly how AI systems reach their decisions, and to check whether these decisions are reached in ways that imply xenophobic bias. This is especially relevant given that the large-scale deep learning models that have come to permeate the field and its practical applications have a reputation for being black boxes, significantly harder to understand and interpret than simpler statistical models and expert systems rooted in human-defined heuristics. Yet recent advances in AI explainability research (Gilpin et al. 2018; Bhatt et al. 2020) make us optimistic that such analyses may yield valuable insights and help with the identification of discrimination and xenophobic harms.

For example, numerous explainability approaches are being developed for improved understanding of large language models (Zhao et al. 2023), ranging from analyses of the mechanistic computations in the transformer architecture (Elhage et al. 2021; Bricken et al. 2023) to structured inference-time approaches in which models themselves provide verbose step-by-step explanations (Wei et al. 2022; Zhao et al. 2022; Wang et al. 2023) and self-critique (Valmeekam et al. 2023) in the process of formulating their reply and making a recommendation. Detailed process supervision improves the reliability and reasoning capability of such models (Lightman et al. 2023), and compositional human explanations can assist with mitigating bias (Yao et al. 2021). Notable advances have also been made in improving the explainability of computer vision systems (Chefer et al. 2021), multimodal AI systems (Joshi et al. 2021), and concept discovery in super-human narrow AI systems (McGrath et al. 2022; Schut et al. 2023). All these advances showcase the promise of AI explainability and its potential usefulness for conducting audits (Abdul et al. 2018; Hall et al. 2019; Bibal et al. 2021) and enabling rapid identification of model harms (Sokol and Flach 2019; Begley et al. 2020; Verma et al. 2020). Explainability methods may also help in identifying vulnerable groups within each AI application context (Strümke and Slavkovik 2022), in detecting shortcuts through sensitive attributes in AI decision making (DeGrave et al. 2021; Brown et al. 2023), and in providing a fine-grained breakdown of model performance (Sharma et al. 2020). Yet challenges remain in ensuring that such model explanations are grounded and faithful (Turpin et al. 2023; Lyu et al. 2023). These efforts need to be coupled with appropriate escalation procedures and must enable recourse, which may require a greater focus on designing appropriate interventions (Karimi et al. 2021). Such escalation pathways are important: the techniques outlined above always involve some degree of approximation and cannot capture all model harms on their own, so they need to be complemented by appropriate qualitative assessments.
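
As a small illustration of the kind of check an audit might include, the sketch below estimates how strongly a model's predictions depend on a suspected nationality proxy by permuting that feature and measuring the shift in outputs. This is generic permutation analysis rather than any of the specific explainability methods cited above, and the prediction function and feature layout are assumptions.

```python
# Minimal sketch of one audit-style check: estimate how much a model relies
# on a suspected nationality proxy by permuting that feature column and
# measuring the change in predictions. Generic permutation analysis; the
# `model_predict` callable and feature layout are hypothetical.
import numpy as np

def proxy_reliance(model_predict, X: np.ndarray, proxy_col: int,
                   n_repeats: int = 10, seed: int = 0) -> float:
    """Mean absolute change in predictions when the proxy column is shuffled."""
    rng = np.random.default_rng(seed)
    base = model_predict(X)
    shifts = []
    for _ in range(n_repeats):
        X_perm = X.copy()
        rng.shuffle(X_perm[:, proxy_col])  # shuffles the column in place
        shifts.append(np.abs(model_predict(X_perm) - base).mean())
    return float(np.mean(shifts))

# A reliance score well above that of clearly task-relevant features suggests
# the model may be shortcutting through the proxy, warranting a closer,
# qualitative audit of the affected cases.
```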

5 Conclusions

Despite being one of the key drivers of discrimination and conflict worldwide, xenophobic bias is yet to be formally recognised and comprehensively assessed in AI system development. This blind spot introduces a considerable risk of amplifying harms to marginalised communities and fuelling fear and intolerance towards foreigners, which may lead to violence and conflict both within and between societies.

Our overview of several prominent technological use cases—social media, immigration, healthcare, employment and the development of foundation models—reveals how historical and contemporary xenophobic attitudes may manifest in data and AI systems. Moreover, we illustrate the effect of these tendencies by drawing upon three categories of potential xenophobic harms: those that cause direct material disadvantage to people perceived as foreign, representational harms that undermine their social standing or status, and wider societal harms, which include barriers to the successful exercise of civic and human rights. Significantly, given the scope and depth of AI services, people who experience xenophobic bias in one domain of automation can also expect to experience unequal treatment across a range of other domains, further compounding their situation.

In this context, we suggest that the development of xenophilic technology is crucial for the ethical design and deployment of AI systems, where xenophilic technology is understood as technology that treats people equitably at the societal level. AI should help promote civic inclusion, oppose the malignant “othering” of marginalised groups, and be centered on cultivating the kind of rich and productive cross-cultural discourse that is appropriate for a world like our own. Coordinated interdisciplinary action, participatory frameworks, and an unwavering commitment from technologists are required to fashion such a future, along with technical innovation in addressing the existing technological bases of xenophobia in ML systems. Xenophilic technology should be understood as merely one part of a much wider set of solutions, within and across societies, that avoids the pitfalls of technosolutionism.