

As explanations are increasingly used to understand the behavior of graph neural networks (GNNs), evaluating the quality and reliability of GNN explanations is crucial. However, assessing the quality of GNN explanations is challenging because existing graph datasets have no, or unreliable, ground-truth explanations. Here, we introduce a synthetic graph data generator, ShapeGGen, which can generate a variety of benchmark datasets (e.g., varying graph sizes, degree distributions, homophilic vs. heterophilic graphs) accompanied by ground-truth explanations. The flexibility to generate diverse synthetic datasets and corresponding ground-truth explanations allows ShapeGGen to mimic data from various real-world areas. We include ShapeGGen and several real-world graph datasets in a graph explainability library, GraphXAI. In addition to synthetic and real-world graph datasets with ground-truth explanations, GraphXAI provides data loaders, data processing functions, visualizers, GNN model implementations, and evaluation metrics to benchmark GNN explainability methods.

As graph neural networks (GNNs) are increasingly used for learning representations of graph-structured data in high-stakes applications, such as criminal justice [1], molecular chemistry [2,3], and biological networks [4,5], it becomes critical to ensure that the relevant stakeholders can understand and trust their functionality. To this end, previous work has developed several methods to explain predictions made by GNNs [6-14]. With the growing number of newly proposed GNN explanation methods, it is critical to ensure their reliability. However, explainability in graph machine learning is an emerging area lacking standardized evaluation strategies and reliable data resources with which to evaluate, test, and compare GNN explanations [15]. While several works have acknowledged this difficulty, they tend to base their analysis on specific real-world [2] and synthetic [16] datasets with limited ground-truth explanations.
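The text above describes ShapeGGen only at a high level. The following is a minimal, hypothetical sketch of the core idea: planting known motifs into a random base graph so that each node's motif membership doubles as its ground-truth explanation. It assumes networkx and numpy; the function name, parameters, and the choice of a five-node "house" motif are illustrative assumptions, not GraphXAI's actual API or ShapeGGen's implementation.

    # Hypothetical sketch of a ShapeGGen-style generator, NOT the GraphXAI code.
    import networkx as nx
    import numpy as np

    def generate_shape_graph(n_base=300, n_motifs=30, p=0.02, seed=0):
        """Plant 'house' motifs in a random base graph; return the graph,
        binary node labels, and the ground-truth explanation mask."""
        rng = np.random.default_rng(seed)
        g = nx.gnp_random_graph(n_base, p, seed=seed)
        labels = {v: 0 for v in g.nodes}    # 1 = node belongs to a planted motif
        gt_mask = {v: 0 for v in g.nodes}   # ground-truth explanation per node

        for _ in range(n_motifs):
            anchor = int(rng.integers(0, n_base))  # attachment point in base graph
            start = g.number_of_nodes()
            house = list(range(start, start + 5))  # five fresh node ids
            # Square "walls" plus a triangular "roof" form the house motif.
            walls = [(house[i], house[(i + 1) % 4]) for i in range(4)]
            roof = [(house[0], house[4]), (house[1], house[4])]
            g.add_nodes_from(house)
            g.add_edges_from(walls + roof + [(anchor, house[0])])
            for v in house:
                labels[v] = 1    # the planted structure determines the label...
                gt_mask[v] = 1   # ...so motif nodes are the true explanation
        return g, labels, gt_mask

    g, labels, gt_mask = generate_shape_graph()
    print(g.number_of_nodes(), "nodes,", sum(labels.values()), "of them in motifs")

Because the labels are produced by the planted structure itself, any explainer that recovers the motif nodes recovers the true reason for the label; this is what makes such generators usable as explanation benchmarks.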

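GraphXAI's evaluation metrics score an explainer's output against these ground-truth masks; the exact metric definitions live in the library. The snippet below, which reuses gt_mask from the sketch above, shows one plausible such comparison: a Jaccard overlap between thresholded predicted node importances and the ground truth. The function name and threshold are illustrative assumptions, not GraphXAI's exact definitions.

    # Hypothetical metric sketch, not GraphXAI's exact metric definition.
    import numpy as np

    def explanation_jaccard(pred_scores, gt_mask, threshold=0.5):
        """Jaccard overlap between the thresholded predicted node-importance
        scores and the binary ground-truth explanation mask."""
        pred = {v for v, s in pred_scores.items() if s >= threshold}
        truth = {v for v, m in gt_mask.items() if m == 1}
        if not pred and not truth:
            return 1.0  # both empty: treat as perfect agreement
        return len(pred & truth) / len(pred | truth)

    # A noisy mock explainer: high scores on motif nodes, random noise elsewhere.
    rng = np.random.default_rng(1)
    pred_scores = {v: 0.9 * m + rng.uniform(0.0, 0.6) for v, m in gt_mask.items()}
    print(f"Jaccard vs. ground truth: {explanation_jaccard(pred_scores, gt_mask):.2f}")

A score near 1 means the explainer recovers exactly the planted structure; because the ground truth is known by construction, such scores are directly comparable across explanation methods.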