AI Ethics and Online Harms

As more and more tasks and decisions are delegated to AI-enabled computers, mobile devices, and autonomous systems, it is crucial to understand the impact this has on people and to ensure that AI treats people ethically. Among other topics, we work on:

1) Value-based and explainable AI, where we develop AI models that can reason about human values and act in accordance with them. We also work on making AI models more transparent and explainable, so that users can better understand what the models do and why. We have shown that, in some specific recommendation domains, making AI value-aligned and explainable leads to more acceptable and satisfying recommendations; it also makes AI models easier to scrutinise in general (see the first sketch after this list).

2) AI discrimination, where users may be treated unfairly, or simply differently, based on personal characteristics (e.g. gender, ethnicity, religion). Interestingly, AI very often reproduces existing discrimination in the offline world, either by inheriting the biases of prior decision makers or by reflecting widespread prejudices in society. Developing methods to study AI discrimination therefore also helps us understand instances of human discrimination. For instance, we have applied our methods to natural language processing models to uncover dangerous prejudices in online communities through their own language, which often contains slang (see the second sketch after this list).
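
To illustrate the first line of work, here is a minimal sketch of value-aligned, explainable recommendation. It is not our actual system: the `Item` class, the `score` and `explain` functions, the value weights, and the blending parameter `alpha` are all hypothetical, chosen only to show how predicted utility can be combined with alignment to a user's stated values, and how that combination yields a simple explanation.

```python
from dataclasses import dataclass

@dataclass
class Item:
    name: str
    utility: float      # predicted user utility, e.g. from a rating model
    value_scores: dict  # alignment with each human value, in [0, 1]

def score(item: Item, value_weights: dict, alpha: float = 0.5) -> float:
    """Blend predicted utility with alignment to the user's declared values."""
    alignment = sum(w * item.value_scores.get(v, 0.0)
                    for v, w in value_weights.items())
    return alpha * item.utility + (1 - alpha) * alignment

def explain(item: Item, value_weights: dict) -> str:
    """State which of the user's values contributed most to the recommendation."""
    top = max(value_weights,
              key=lambda v: value_weights[v] * item.value_scores.get(v, 0.0))
    return f"'{item.name}' was recommended because it aligns with '{top}'."

# A user who weights privacy above convenience.
weights = {"privacy": 0.8, "convenience": 0.2}
items = [Item("App A", utility=0.9, value_scores={"privacy": 0.2, "convenience": 0.9}),
         Item("App B", utility=0.7, value_scores={"privacy": 0.9, "convenience": 0.4})]

best = max(items, key=lambda i: score(i, weights))
print(explain(best, weights))  # App B wins despite its lower raw utility
```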
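
To illustrate the second line of work, the following sketch shows one common way of surfacing biased associations in word embeddings, in the spirit of association tests such as WEAT. The `association_bias` function and the toy random vectors are illustrative assumptions, not our published method; in a real study the embeddings would be trained on a community's own posts.

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association_bias(emb: dict, targets_a: list, targets_b: list,
                     attributes: list) -> float:
    """Difference in mean cosine similarity between two target word sets
    (e.g. gendered words) and a set of attribute words (e.g. career terms).
    Values far from zero suggest the attribute set is more strongly
    associated with one target group in the embedding space."""
    def mean_sim(targets):
        return np.mean([cosine(emb[t], emb[a])
                        for t in targets for a in attributes])
    return mean_sim(targets_a) - mean_sim(targets_b)

# Toy random vectors keep the example self-contained; real analyses would
# use embeddings trained on the community's text (e.g. with word2vec).
rng = np.random.default_rng(seed=0)
emb = {w: rng.normal(size=50) for w in ["he", "she", "engineer", "nurse"]}

print(association_bias(emb, ["he"], ["she"], ["engineer"]))
```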

Our research in this domain often involves cross-disciplinary collaborations, including colleagues from the social sciences, digital humanities, law, ethics and policy/governance.

Related Projects
  • Discovering and Attesting Digital Discrimination (EPSRC) - DADD
  • National Research Centre on Privacy, Harm Reduction and Adversarial Influence Online (UKRI) - REPHRAIN
Selected Publications
  • Jazon Szabo, Natalia Criado, Jose Such, and Sanjay Modgil. Moral Uncertainty and the Problem of Fanaticism. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), vol. 38, pp. 19948–19955, 2024.
  • Vahid Ghafouri, Vibhor Agarwal, Yong Zhang, Nishanth Sastry, Jose Such, and Guillermo Suarez-Tangil. AI in the Gray: Exploring Moderation Policies in Dialogic Large Language Models vs. Human Answers in Controversial Topics. In The Conference on Information and Knowledge Management (CIKM), in press, 2023.
  • Mackenzie Jorgensen, Hannah Richert, Elizabeth Black, Natalia Criado, and Jose Such. Not So Fair: The Impact of Presumably Fair Machine Learning Models. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (AIES), pp. 297–311, 2023.       
  • Xavier Ferrer, Tom van Nuenen, Jose Such, and Natalia Criado. Discovering and Interpreting Conceptual Biases in Online Communities. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2021.
  • Francesca Mosca and Jose Such. ELVIRA: an Explainable Agent for Value and Utility-driven Multiuser Privacy. In International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), pp. 916–924, 2021.       
  • Xavier Ferrer, Tom van Nuenen, Jose Such, Mark Cote, and Natalia Criado. Bias and Discrimination in AI: a cross-disciplinary perspective. IEEE Technology and Society Magazine, 40(2):72–80, 2021.
  • Xavier Ferrer, Tom van Nuenen, Jose Such, and Natalia Criado. Discovering and Categorising Language Biases in Reddit. In The International AAAI Conference on Web and Social Media (ICWSM), 2021.       
  • Tom van Nuenen, Xavier Ferrer, Jose Such, and Mark Cote. Transparency for Whom? Assessing Discriminatory Artificial Intelligence. IEEE Computer, 53:36–44, 2020.       

See more publications on this topic here.