This repo contains the code for an evaluation framework that assesses three types of input feature explanations with respect to four metrics. For demonstration, ./examples contains three types of highlight explanations for a subset of SNLI, generated from BERT with the attention method.
- ./configs: base model configurations and explanation paths.
- ./data: gold explanation files and intermediate-step files.
- ./explain_interactions and ./tools: helper functions.
See the main scripts and their corresponding READMEs for the four metrics:
- faithfulness_eval.py (README_faithfulness.md)
- agreement_eval.py (README_agreement.md)
- complexity_eval.py (README_complexity.md)
- simulatability_eval.py (README_simulatability.md)
Please refer to https://github.com/copenlu/spanex for span pair explanation generation.
Please cite our paper if you use this repo in your work:
@inproceedings{sun2025evaluating,
  title={Evaluating Input Feature Explanations through a Unified Diagnostic Evaluation Framework},
  author={Sun, Jingyi and Atanasova, Pepa and Augenstein, Isabelle},
  booktitle={Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)},
  pages={10559--10577},
  year={2025}
}