As autonomous agents increasingly rely on complex decision-making mechanisms, their behaviour becomes harder to interpret, validate, and trust. This tutorial provides a structured, agent-centric introduction to explainability, covering intuitive post-hoc techniques, their limitations, and the move toward causal and intention-oriented explanations.
The tutorial combines conceptual foundations with practical examples. Participants will see how explanation methods such as SHAP and LIME are commonly used in agent settings, why they are attractive in practice, and where they fall short when the goal is to explain behaviour rather than merely correlate features with outputs. From there, the tutorial introduces causal, contrastive, counterfactual, and higher-level behavioural explanations for autonomous agents.
Rather than promoting a single framework, the tutorial presents a structured view of the explanation landscape for autonomous agents and clarifies the assumptions, trade-offs, and explanatory scope of each family of methods.
Universitat Politècnica de Catalunya-BarcelonaTech (UPC), Barcelona Supercomputing Center (BSC)
Lecturer at UPC and Associate Researcher at the AI Institute of BSC, where he leads the Single- and Multi-Agent Research Team (SMART). His research focuses on agents and multi-agent systems, including normative systems, large-scale agent simulation, agent reliability, and explainability.
Barcelona Supercomputing Center (BSC), Universitat Politècnica de Catalunya-BarcelonaTech (UPC)
Researcher at BSC in the Single- and Multi-Agent Research Team (SMART), and PhD student and assistant lecturer at UPC. Her research interests include agents, explainability, learning, and improvement.
Barcelona Supercomputing Center (BSC), Universitat Politècnica de Catalunya-BarcelonaTech (UPC)
Researcher at BSC in the Single- and Multi-Agent Research Team (SMART), and PhD student and assistant lecturer at UPC. His research focuses on agents, learning, and explainability, especially the interplay between these areas and the role of intentions as behavioural insight.
This tutorial is intended for a broad AAMAS audience.
The tutorial is accessible to non-specialists in explainability. Basic familiarity with agents and decision-making systems is helpful, but no prior background in causal modelling or explainable AI is required. Familiarity with Python is recommended for the hands-on parts.
Motivation for explainability in autonomous agents and overview of the session.
Core definitions, perspectives for explaining agent behaviour, and why post-hoc methods are widely adopted.
Guided use of SHAP and LIME, with interpretation of example results.
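As a taste of what SHAP approximates in this hands-on session, Shapley values can be computed exactly for a tiny model by enumerating feature coalitions. The toy model, inputs, and baseline below are illustrative assumptions, not tutorial materials; a minimal sketch:

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values: each feature's marginal contribution,
    averaged over all coalitions, with absent features set to baseline."""
    n = len(x)

    def v(coalition):
        # Evaluate the model with only `coalition` features taken from x.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            for S in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                total += weight * (v(set(S) | {i}) - v(set(S)))
        phi.append(total)
    return phi

# Hypothetical agent value model with an interaction between two features.
model = lambda z: 2 * z[0] + z[0] * z[1]
x, baseline = [1.0, 3.0], [0.0, 0.0]
phi = shapley_values(model, x, baseline)
# Efficiency property: attributions sum to f(x) - f(baseline).
assert abs(sum(phi) - (model(x) - model(baseline))) < 1e-9
```

Libraries such as shap use sampling and model-specific shortcuts to scale this idea to realistic models, where exact enumeration over all 2^n coalitions is infeasible.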
Instability, correlation vs responsibility, and why causal reasoning matters for agent behaviour.
Intervention, causality, contrastive and counterfactual explanations, and their core assumptions.
Guided walkthrough of concrete causal and counterfactual examples.
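To give a flavour of this session, a counterfactual explanation for a simple agent policy can be found by brute-force search for the smallest input change that flips the chosen action. The threshold policy, state variables, and search grid below are illustrative assumptions, not material from the tutorial:

```python
from itertools import product

def policy(state):
    """Toy agent policy: recharge when battery is low and no task is urgent."""
    battery, urgency = state
    return "recharge" if battery < 0.3 and urgency < 0.5 else "work"

def counterfactual(state, step=0.05, max_steps=20):
    """Smallest L1 perturbation (on a grid) that changes the agent's action."""
    fact = policy(state)
    best, best_dist = None, float("inf")
    deltas = [k * step for k in range(-max_steps, max_steps + 1)]
    for d0, d1 in product(deltas, repeat=2):
        cand = (state[0] + d0, state[1] + d1)
        dist = abs(d0) + abs(d1)
        if policy(cand) != fact and dist < best_dist:
            best, best_dist = cand, dist
    return fact, best

fact, cf = counterfactual((0.2, 0.1))
# The agent recharges; the nearest state in which it would work instead
# answers the contrastive question "why recharge rather than work?".
```

The contrast case makes the explanation behavioural: it points at the condition (here, battery level) whose change would have altered the agent's decision, rather than at feature-output correlations.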
Comparing explanatory scope, from local action explanations to higher-level behavioural and goal-driven accounts.
Summary of key insights, practical recommendations, and open challenges.
Materials will be added here when available.
For questions about the tutorial, please contact:
sergio.alvarez-napagao@upc.edu
sara.montese@bsc.es
victor.gimenez@bsc.es