January 10, 2025

Interpretable and Explainable Machine Learning Models

Dr. Claudio Pomo
Politecnico di Bari

Abstract

The course covers methods for interpreting and explaining machine learning (ML) models, including inherently interpretable approaches and post-hoc explanation techniques. It introduces the key concepts of interpretability, analyzes inherently interpretable models, and applies explanation methods to complex models. Existing techniques are critically evaluated in terms of fidelity, stability, fairness, and practical utility, and open challenges and future perspectives are discussed.
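
As a concrete taste of the post-hoc techniques mentioned above, the sketch below applies LIME (reference 2) to a scikit-learn classifier to explain a single prediction. It is a minimal illustration rather than course material: the dataset, model, and parameter choices are assumptions made for the example.

    # Minimal LIME sketch: explain one prediction of a black-box classifier.
    # Dataset, model, and hyperparameters are illustrative assumptions.
    from lime.lime_tabular import LimeTabularExplainer
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier

    data = load_breast_cancer()
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(data.data, data.target)

    # LIME fits a sparse linear surrogate around one instance and reports
    # the locally most influential features of the black-box model.
    explainer = LimeTabularExplainer(
        data.data,
        feature_names=list(data.feature_names),
        class_names=list(data.target_names),
        mode="classification",
    )
    explanation = explainer.explain_instance(
        data.data[0], model.predict_proba, num_features=5
    )
    print(explanation.as_list())  # top-5 (feature, weight) pairs

The printed weights approximate how each feature pushes this one prediction; assessing whether such local surrogates are faithful and stable is exactly the kind of question the course addresses.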

Schedule

The course will have a total duration of 10 hours, scheduled as follows:

  • March 20, 15:00-17:30 Aula II
  • March 21, 10:00-12:30 Aula F
  • March 24, 15:00-17:30 Aula II
  • March 25, 10:00-12:30 Sala Riunioni (2nd floor)

Exam

The final exam consists of a project that analyzes a case study using the techniques and tools covered in the course. The course is held in person; please contact me if you are interested in attending.

References

  1. Lundberg, S. M., and Lee, S.-I. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 2017.
  2. Ribeiro, M. T., Singh, S., and Guestrin, C. "Why should I trust you?" Explaining the predictions of any classifier. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016.
  3. Molnar, C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. 2nd edition, 2022.
  4. Doshi-Velez, F., and Kim, B. Towards a rigorous science of interpretable machine learning. arXiv preprint, 2017.
  5. Agarwal, C., Krishna, S., Saxena, E., Pawelczyk, M., Johnson, N., Puri, I., …, and Lakkaraju, H. OpenXAI: Towards a transparent evaluation of model explanations. Advances in Neural Information Processing Systems, 2022.