The DG Cities team is excited to be participating in Digital Leaders’ AI Week this month. Ed Houghton, Lara Suraci and Nima Karshenas will be jointly presenting on participatory approaches to AI on 23rd October [sign up details at the end!]. They’ll be examining how methods used in social research and service design can be applied to the development of AI tools to better serve the people they are intended to support. To underscore the importance of such collaboration, behavioural scientist Lara Suraci sets out the risks of bias - and the path to a more proactive, inclusive approach.
AI in public services: the promise and peril
Artificial intelligence (AI) is increasingly woven into the fabric of public services, from simple tasks like creating images for a presentation or recording meetings, all the way to complex triage tools evaluating support needs. While innovations such as these undoubtedly hold enormous potential to make services more efficient, consistent and responsive, they also present a profound ethical challenge: AI systems, and especially large language models (LLMs), are predominantly trained on data that reflects human behaviour, and so they frequently absorb and replicate societal biases and blind spots.
When used to inform decisions that affect people’s lives, as is the case in most public service contexts, these representational biases risk perpetuating inequality rather than reducing it. Currently, the communities most affected by biased tools are often those with the least ability to shape how they are developed or used. Ensuring equity, fairness and inclusivity requires a meaningful shift in how we think about ethical AI design: one that centres the voices of people with lived experience and treats them as experts in their own lives.
Where representation fails
Evidence across different AI applications shows that models often perform less accurately or equitably for historically marginalised groups.
In clinical contexts, for example, a study [Zack et al., 2024] evaluating GPT-4 on patient vignettes drawn from published literature and medical education material found that it can reinforce deeply harmful biases: GPT-4 was less likely to recommend advanced imaging for Black patients, while Hispanic and Asian populations were overrepresented in stereotyped conditions such as hepatitis B and tuberculosis and underrepresented elsewhere.
While adverse health outcomes are a particularly stark example of the potential harm of biased AI models, the problem extends far beyond healthcare.
[Image: AI rendering]
For instance, text-to-image generators have been shown to portray Indian culture in exoticised, inaccurate ways at the expense of everyday realities, flattening rich subcultures into clichés [Ghosh et al., 2024]. In another example, models tasked with generating images of disabled people fell back on narrow, negative tropes such as wheelchairs, sadness and inactivity [Mack et al., 2024]. Even seemingly neutral tools like sentiment analysis algorithms have been shown to exhibit nationality, gender and religious biases, for instance assigning systematically different sentiment scores to identical sentences depending on the group identity referenced [Das et al., 2024].
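To make that last point concrete, one common way such biases are surfaced is a counterfactual probe: the same sentence is scored repeatedly with only the identity term swapped, and any systematic gap in the scores points to bias. The short Python sketch below illustrates the idea using the Hugging Face transformers library; the model, template sentence and identity terms are illustrative assumptions, not the setup used in the cited audit.

from transformers import pipeline

# Illustrative only: any off-the-shelf sentiment model could be swapped in here.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

# Identical sentence, with only the group identity changed.
template = "My {identity} neighbour invited us over for dinner."
identities = ["Christian", "Muslim", "Hindu", "British", "Bangladeshi"]

for identity in identities:
    result = classifier(template.format(identity=identity))[0]
    # An unbiased model should give near-identical scores across identities,
    # because nothing about the sentence's sentiment has changed.
    print(f"{identity:12s} -> {result['label']} ({result['score']:.3f})")

Real audits of this kind run many such templates and test whether the score differences are systematic rather than noise.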
These examples matter for local authorities, charities, and other public sector bodies because the groups most affected are often the very people these institutions aim to support. If AI tools systematically fail to represent them fairly or reinforce stereotypes that already exist in society, they risk doing more harm than good.
The burden on users
In response to these risks, end-users are often told to simply “prompt better” – that is, to include explicit instructions against bias, to create detailed personas or to use so-called ‘chain-of-thought’ prompting to guide the model’s reasoning.
First of all, it’s important to note that efforts to correct these in-built biases through instructions alone are unlikely to succeed: LLMs lack the self-awareness, self-reflection and stable understanding of the world that would anchor their outputs [Hastings, 2024]. Perhaps even more importantly, these strategies shift the responsibility for ethical AI practice solely onto individuals, many of whom lack the AI literacy, time or confidence to identify and correct systemic bias.
In other words, this solution is not only unrealistic but also inequitable: those most affected by biased systems are often the least empowered to challenge or change them, and we cannot have fairness and inclusivity depend on a user’s ability to correct an algorithm’s mistakes.
So, what can we do instead?
Listen to Lara, Nima and Ed discuss participatory AI this month
Rather than asking people to adapt to biased systems, we should design systems that adapt to people – and that should include all people. Crucially, ethical AI cannot be achieved through better code alone: as many researchers have emphasised, we need a sociotechnical approach that considers the relevant actors across the whole process of AI design, from those who build the tools to those who train them, and ultimately those who decide whether they are working as they should.
Including community voices is essential – it cannot be an afterthought; it needs to be a principle built into the earliest stages of design and development. Participatory and deliberative methods, such as co-design workshops or community panels, will not only improve the quality and legitimacy of AI systems but also help build public trust from the start.
Looking Ahead
AI will continue to shape the future of public services, but whether it does so equitably is up to us. Ensuring fairness cannot rest on individual users adjusting prompts or developers adding disclaimers. It requires a shift in mindset: from the post hoc mitigation of harm to the proactive inclusion of the communities most affected.
Don’t miss our session! Find out more and book your spot online: https://aiweek.digileaders.com/talks/participatory-approaches-to-defining-ai-ethics-enabling-the-public-to-decide-whats-right/
References
Zack, Travis, et al. "Assessing the potential of GPT-4 to perpetuate racial and gender biases in health care: a model evaluation study." The Lancet Digital Health 6.1 (2024): e12-e22.
Ghosh, Sourojit, et al. "Do generative AI models output harm while representing non-Western cultures: Evidence from a community-centered approach." Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society. Vol. 7. 2024.
Mack, Kelly Avery, et al. "“They only care to show us the wheelchair”: disability representation in text-to-image AI models." Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems. 2024.
Das, Dipto, et al. "'The Colonial Impulse' of Natural Language Processing: An Audit of Bengali Sentiment Analysis Tools and Their Identity-based Biases." Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems. 2024.
Hastings, Janna. "Preventing harm from non-conscious bias in medical generative AI." The Lancet Digital Health 6.1 (2024): e2-e3.