Tutorial: Approaches for Explainability in Autonomous Agents

AAMAS 2026 Tutorial
Paphos, Cyprus · Half-day tutorial · 3.5 hours

Overview

As autonomous agents increasingly rely on complex decision-making mechanisms, their behaviour becomes harder to interpret, validate, and trust. This tutorial provides a structured, agent-centric introduction to explainability, covering intuitive post-hoc techniques, their limitations, and the move toward causal and intention-oriented explanations.

The tutorial combines conceptual foundations with practical examples. Participants will see how explanation methods such as SHAP and LIME are commonly used in agent settings, why they are attractive in practice, and where they fall short when the goal is to explain behaviour rather than merely correlate features with outputs. From there, the tutorial introduces causal, contrastive, counterfactual, and higher-level behavioural explanations for autonomous agents.

Rather than promoting a single framework, the tutorial presents a structured view of the explanation landscape for autonomous agents and clarifies the assumptions, trade-offs, and explanatory scope of each family of methods.


Presenters

Sergio Alvarez-Napagao

Universitat Politècnica de Catalunya-BarcelonaTech (UPC), Barcelona Supercomputing Center (BSC)
Lecturer at UPC and Associate Researcher at the AI Institute of BSC, where he leads the Single- and Multi-Agent Research Team (SMART). His research focuses on agents and multi-agent systems, including normative systems, large-scale agent simulation, agent reliability, and explainability.

Sara Montese

Barcelona Supercomputing Center (BSC), Universitat Politècnica de Catalunya-BarcelonaTech (UPC)
Researcher at BSC in the Single- and Multi-Agent Research Team (SMART), and PhD student and assistant lecturer at UPC. Her research interests include agents, explainability, learning, and improvement.

Victor Gimenez-Abalos

Barcelona Supercomputing Center (BSC), Universitat Politècnica de Catalunya-BarcelonaTech (UPC)
Researcher at BSC in the Single- and Multi-Agent Research Team (SMART), and PhD student and assistant lecturer at UPC. His research focuses on agents, learning, and explainability, especially the interplay between these areas and the role of intentions as behavioural insight.


Audience

This tutorial is intended for a broad AAMAS audience.

The tutorial is accessible to non-specialists in explainability. Basic familiarity with agents and decision-making systems is helpful, but no prior background in causal modelling or explainable AI is required. Familiarity with Python is recommended for the hands-on parts.


Tutorial outline

1 · 15 min

Introduction

Motivation for explainability in autonomous agents and overview of the session.

2 · 30 min

Ante-hoc vs post-hoc explanations

Core definitions, perspectives for explaining agent behaviour, and why post-hoc methods are widely adopted.

3 · 30 min

Hands-on: post-hoc methods

Guided use of SHAP and LIME, with interpretation of example results.
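The hands-on session uses the actual shap and lime packages; as a taste of what such methods do, the following is a minimal, library-free sketch of the perturbation idea behind LIME. The `policy_score` function is a hypothetical stand-in for an agent's action preference, not a method from the tutorial: perturb the state, query the model, and fit a proximity-weighted local linear surrogate whose coefficients act as feature attributions.

```python
import numpy as np

# Hypothetical "agent policy": a scalar preference for an action,
# computed from two state features. Illustration only; the hands-on
# session uses the real shap and lime packages.
def policy_score(state):
    danger, reward = state
    return 2.0 * reward - 3.0 * danger

def lime_style_weights(f, x, n_samples=500, sigma=0.1, seed=0):
    """Fit a local linear surrogate to f around x (the core idea behind
    LIME): perturb the input, query the model, then solve a
    proximity-weighted least-squares problem for local feature weights."""
    rng = np.random.default_rng(seed)
    X = x + rng.normal(0.0, sigma, size=(n_samples, len(x)))
    y = np.array([f(z) for z in X])
    # Weight samples by closeness to x (Gaussian kernel).
    w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * sigma ** 2))
    A = np.hstack([X, np.ones((n_samples, 1))])  # add intercept column
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[:-1]  # feature weights (drop the intercept)

weights = lime_style_weights(policy_score, np.array([0.5, 0.8]))
print(weights)  # close to the true local gradient [-3.0, 2.0]
```

Because the toy policy is linear, the surrogate recovers its coefficients almost exactly; on a real agent the weights only describe behaviour in a neighbourhood of the queried state.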

4 · 30 min

Limits of post-hoc explanation

Instability of attributions, the gap between correlation and causal responsibility, and why causal reasoning matters for agent behaviour.
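The instability point can be demonstrated in a few lines. Below, a hypothetical nonlinear scoring function (chosen only to make the effect visible, not taken from the tutorial) is explained by a crude perturbation-based local linear fit with a deliberately small sample: rerunning the same explanation with different random seeds yields noticeably different feature weights for the same state.

```python
import numpy as np

# Hypothetical nonlinear scoring function standing in for an agent's
# value estimate; chosen only to make the instability visible.
def score(state):
    x, y = state
    return np.sin(5 * x) * y + x * y ** 2

def perturbation_weights(f, x, n_samples=50, sigma=0.3, seed=0):
    """Crude perturbation-based local linear fit (small sample size and
    wide perturbations on purpose, to expose seed-to-seed variance)."""
    rng = np.random.default_rng(seed)
    X = x + rng.normal(0.0, sigma, size=(n_samples, len(x)))
    y = np.array([f(z) for z in X])
    A = np.hstack([X, np.ones((n_samples, 1))])  # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1]

x = np.array([0.4, 0.7])
for seed in range(3):
    w = perturbation_weights(score, x, seed=seed)
    print(seed, np.round(w, 2))  # weights shift noticeably across seeds
```

The attributions are real outputs of the method each time; they simply depend on the sampling, which is one reason post-hoc feature importance alone is a fragile account of why an agent acted.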

Break · 30 min

5 · 30 min

Explanation methods with causal structure

Intervention, causality, contrastive and counterfactual explanations, and their core assumptions.

6 · 30 min

Hands-on: contrastive and counterfactual explanations

Guided walkthrough of concrete causal and counterfactual examples.
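As a flavour of this part of the session, here is a minimal counterfactual sketch over a hypothetical rule-based delivery agent (the agent, its features, and their ranges are invented for illustration): search the discrete state space for the state closest to the observed one in which the agent would have chosen a different action, yielding an explanation of the form "had the state been X instead, the agent would have done Y".

```python
from itertools import product

# Hypothetical rule-based agent: picks an action from two discrete
# state features. Illustration only; not a method from the tutorial.
def choose_action(battery, distance):
    if battery <= 1:
        return "recharge"
    return "deliver" if distance <= 3 else "wait"

def counterfactual(state, target_action, ranges):
    """Return the state closest to `state` (L1 distance) for which the
    agent would pick `target_action` -- a minimal counterfactual."""
    best, best_cost = None, None
    for cand in product(*ranges):
        if choose_action(*cand) != target_action:
            continue
        cost = sum(abs(a - b) for a, b in zip(cand, state))
        if best_cost is None or cost < best_cost:
            best, best_cost = cand, cost
    return best

state = (5, 6)                        # battery=5, distance=6
ranges = [range(11), range(11)]       # assumed feature domains
print(choose_action(*state))          # "wait"
print(counterfactual(state, "deliver", ranges))  # (5, 3)
```

Here the nearest "deliver" state keeps the battery at 5 and shrinks the distance to 3, so the contrastive answer to "why wait rather than deliver?" is that the destination was too far; exhaustive search only works for small discrete spaces, which is why practical methods add search heuristics and plausibility constraints.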

7 · 30 min

What different methods actually explain

Comparing explanatory scope, from local action explanations to higher-level behavioural and goal-driven accounts.

8 · 15 min

Discussion and conclusion

Summary of key insights, practical recommendations, and open challenges.


Materials and contact

Materials will be added here when available.

For questions about the tutorial, please contact: