Awesome Explainable AI

(Note XAI is highly related to Awesome Causal Inference and Physics with Machine Learning )

Controllable generative model



  • Explainable k-Means and k-Medians Clustering

Selected Papers



ICML 17 tutorial


Quanshi Zhang




Explainable Machine Learning


Simpson’s Paradox
1 Like

IJCAI 2019

Explaining Reinforcement Learning to Mere Mortals: An Empirical Study

Andrew Anderson, Jonathan Dodge, Amrita Sadarangani, Zoe Juozapaitis, Evan Newman, Jed Irvine, Souti Chattopadhyay, Alan Fern, Margaret Burnett

We present a user study to investigate the impact of explanations on non-experts? understanding of reinforcement learning (RL) agents. We investigate both a common RL visualization, saliency maps (the focus of attention), and a more recent explanation type, reward-decomposition bars (predictions of future types of rewards). We designed a 124 participant, four-treatment experiment to compare participants? mental models of an RL agent in a simple Real-Time Strategy (RTS) game. Our results show that the combination of both saliency and reward bars were needed to achieve a statistically significant improvement in mental model score over the control. In addition, our qualitative analysis of the data reveals a number of effects for further study.

Twin-Systems to Explain Artificial Neural Networks using Case-Based Reasoning: Comparative Tests of Feature-Weighting Methods in ANN-CBR Twins for XAI

Eoin M. Kenny, Mark T. Keane

In this paper, twin-systems are described to address the eXplainable artificial intelligence (XAI) problem, where a black box model is mapped to a white box “twin” that is more interpretable, with both systems using the same dataset. The framework is instantiated by twinning an artificial neural network (ANN; black box) with a case-based reasoning system (CBR; white box), and mapping the feature weights from the former to the latter to find cases that explain the ANN’s outputs. Using a novel evaluation method, the effectiveness of this twin-system approach is demonstrated by showing that nearest neighbor cases can be found to match the ANN predictions for benchmark datasets. Several feature-weighting methods are competitively tested in two experiments, including our novel, contributions-based method (called COLE) that is found to perform best. The tests consider the ”twinning” of traditional multilayer perceptron (MLP) networks and convolutional neural networks (CNN) with CBR systems. For the CNNs trained on image data, qualitative evidence shows that cases provide plausible explanations for the CNN’s classifications.

Technical, Hard and Explainable Question Answering (THE-QA)

Shailaja Sampat

The ability of an agent to rationally answer questions about a given task is the key measure of its intelligence. While we have obtained phenomenal performance over various language and vision tasks separately, ‘Technical, Hard and Explainable Question Answering’ (THE-QA) is a new challenging corpus which addresses them jointly. THE-QA is a question answering task involving diagram understanding and reading comprehension. We plan to establish benchmarks over this new corpus using deep learning models guided by knowledge representation methods. The proposed approach will envisage detailed semantic parsing of technical figures and text, which is robust against diverse formats. It will be aided by knowledge acquisition and reasoning module that categorizes different knowledge types, identify sources to acquire that knowledge and perform reasoning to answer the questions correctly. THE-QA data will present a strong challenge to the community for future research and will bridge the gap between state-of-the-art Artificial Intelligence (AI) and ‘Human-level’ AI.

Fair and Explainable Dynamic Engagement of Crowd Workers

Han Yu, Yang Liu, Xiguang Wei, Chuyu Zheng, Tianjian Chen, Qiang Yang, Xiong Peng

Years of rural-urban migration has resulted in a significant population in China seeking ad-hoc work in large urban centres. At the same time, many businesses face large fluctuations in demand for manpower and require more efficient ways to satisfy such demands. This paper outlines AlgoCrowd, an artificial intelligence (AI)-empowered algorithmic crowdsourcing platform. Equipped with an efficient explainable task-worker matching optimization approach designed to focus on fair treatment of workers while maximizing collective utility, the platform provides explainable task recommendations to workers’ personal work management mobile apps which are becoming popular, with the aim to address the above societal challenge.

Achieving Causal Fairness through Generative Adversarial Networks

Depeng Xu, Yongkai Wu, Shuhan Yuan, Lu Zhang, Xintao Wu

Achieving fairness in learning models is currently an imperative task in machine learning. Meanwhile, recent research showed that fairness should be studied from the causal perspective, and proposed a number of fairness criteria based on Pearl’s causal modeling framework. In this paper, we investigate the problem of building causal fairness-aware generative adversarial networks (CFGAN), which can learn a close distribution from a given dataset, while also ensuring various causal fairness criteria based on a given causal graph. CFGAN adopts two generators, whose structures are purposefully designed to reflect the structures of causal graph and interventional graph. Therefore, the two generators can respectively simulate the underlying causal model that generates the real data, as well as the causal model after the intervention. On the other hand, two discriminators are used for producing a close-to-real distribution, as well as for achieving various fairness criteria based on causal quantities simulated by generators. Experiments on a real-world dataset show that CFGAN can generate high quality fair data.

Co-Attentive Multi-Task Learning for Explainable Recommendation

Zhongxia Chen, Xiting Wang, Xing Xie, Tong Wu, Guoqing Bu, Yining Wang, Enhong Chen

Despite widespread adoption, recommender systems remain mostly black boxes. Recently, providing explanations about why items are recommended has attracted increasing attention due to its capability to enhance user trust and satisfaction. In this paper, we propose a co-attentive multi-task learning model for explainable recommendation. Our model improves both prediction accuracy and explainability of recommendation by fully exploiting the correlations between the recommendation task and the explanation task. In particular, we design an encoder-selector-decoder architecture inspired by human’s information-processing model in cognitive psychology. We also propose a hierarchical co-attentive selector to effectively model the cross knowledge transferred for both tasks. Our model not only enhances prediction accuracy of the recommendation task, but also generates linguistic explanations that are fluent, useful, and highly personalized. Experiments on three public datasets demonstrate the effectiveness of our model.

Explainable Fashion Recommendation: A Semantic Attribute Region Guided Approach

Min Hou, Le Wu, Enhong Chen, Zhi Li, Vincent W. Zheng, Qi Liu

In fashion recommender systems, each product usually consists of multiple semantic attributes (e.g., sleeves, collar, etc). When making cloth decisions, people usually show preferences for different semantic attributes (e.g., the clothes with v-neck collar). Nevertheless, most previous fashion recommendation models comprehend the clothing images with a global content representation and lack detailed understanding of users’ semantic preferences, which usually leads to inferior recommendation performance. To bridge this gap, we propose a novel Semantic Attribute Explainable Recommender System (SAERS). Specifically, we first introduce a fine-grained interpretable semantic space. We then develop a Semantic Extraction Network (SEN) and Fine-grained Preferences Attention (FPA) module to project users and items into this space, respectively. With SAERS, we are capable of not only providing cloth recommendations for users, but also explaining the reason why we recommend the cloth through intuitive visual attribute semantic highlights in a personalized manner. Extensive experiments conducted on real-world datasets clearly demonstrate the effectiveness of our approach compared with the state-of-the-art methods.

Counterfactuals in Explainable Artificial Intelligence (XAI): Evidence from Human Reasoning

Ruth M. J. Byrne

Counterfactuals about what could have happened are increasingly used in an array of Artificial Intelligence (AI) applications, and especially in explainable AI (XAI). Counterfactuals can aid the provision of interpretable models to make the decisions of inscrutable systems intelligible to developers and users. However, not all counterfactuals are equally helpful in assisting human comprehension. Discoveries about the nature of the counterfactuals that humans create are a helpful guide to maximize the effectiveness of counterfactual use in AI.

Explainable Deep Neural Networks for Multivariate Time Series Predictions

Roy Assaf, Anika Schumann

We demonstrate that CNN deep neural networks can not only be used for making predictions based on multivariate time series data, but also for explaining these predictions. This is important for a number of applications where predictions are the basis for decisions and actions. Hence, confidence in the prediction result is crucial. We design a two stage convolutional neural network architecture which uses particular kernel sizes. This allows us to utilise gradient based techniques for generating saliency maps for both the time dimension and the features. These are then used for explaining which features during which time interval are responsible for a given prediction, as well as explaining during which time intervals was the joint contribution of all features most important for that prediction. We demonstrate our approach for predicting the average energy production of photovoltaic power plants and for explaining these predictions.

Fair and Explainable Dynamic Engagement of Crowd Workers

Han Yu, Yang Liu, Xiguang Wei, Chuyu Zheng, Tianjian Chen, Qiang Yang, Xiong Peng

Years of rural-urban migration has resulted in a significant population in China seeking ad-hoc work in large urban centres. At the same time, many businesses face large fluctuations in demand for manpower and require more efficient ways to satisfy such demands. This paper outlines AlgoCrowd, an artificial intelligence (AI)-empowered algorithmic crowdsourcing platform. Equipped with an efficient explainable task-worker matching optimization approach designed to focus on fair treatment of workers while maximizing collective utility, the platform provides explainable task recommendations to workers’ personal work management mobile apps which are becoming popular, with the aim to address the above societal challenge.

RecoNet: An Interpretable Neural Architecture for Recommender Systems

Francesco Fusco, Michalis Vlachos, Vasileios Vasileiadis, Kathrin Wardatzky, Johannes Schneider

Neural systems offer high predictive accuracy but are plagued by long training times and low interpretability. We present a simple neural architecture for recommender systems that lifts several of these shortcomings. Firstly, the approach has a high predictive power that is comparable to state-of-the-art recommender approaches. Secondly, owing to its simplicity, the trained model can be interpreted easily because it provides the individual contribution of each input feature to the decision. Our method is three orders of magnitude faster than general-purpose explanatory approaches, such as LIME. Finally, thanks to its design, our architecture addresses cold-start issues, and therefore the model does not require retraining in the presence of new users.

Learning Interpretable Deep State Space Model for Probabilistic Time Series Forecasting

Longyuan Li, Junchi Yan, Xiaokang Yang, Yaohui Jin

Probabilistic time series forecasting involves estimating the distribution of future based on its history, which is essential for risk management in downstream decision-making. We propose a deep state space model for probabilistic time series forecasting whereby the non-linear emission model and transition model are parameterized by networks and the dependency is modeled by recurrent neural nets. We take the automatic relevance determination (ARD) view and devise a network to exploit the exogenous variables in addition to time series. In particular, our ARD network can incorporate the uncertainty of the exogenous variables and eventually helps identify useful exogenous variables and suppress those irrelevant for forecasting. The distribution of multi-step ahead forecasts are approximated by Monte Carlo simulation. We show in experiments that our model produces accurate and sharp probabilistic forecasts. The estimated uncertainty of our forecasting also realistically increases over time, in a spontaneous manner.

Learning Interpretable Relational Structures of Hinge-loss Markov Random Fields

Yue Zhang, Arti Ramesh

Statistical relational models such as Markov logic networks (MLNs) and hinge-loss Markov random fields (HL-MRFs) are specified using templated weighted first-order logic clauses, leading to the creation of complex, yet easy to encode models that effectively combine uncertainty and logic. Learning the structure of these models from data reduces the human effort of identifying the right structures. In this work, we present an asynchronous deep reinforcement learning algorithm to automatically learn HL-MRF clause structures. Our algorithm possesses the ability to learn semantically meaningful structures that appeal to human intuition and understanding, while simultaneously being able to learn structures from data, thus learning structures that have both the desirable qualities of interpretability and good prediction performance. The asynchronous nature of our algorithm further provides the ability to learn diverse structures via exploration, while remaining scalable. We demonstrate the ability of the models to learn semantically meaningful structures that also achieve better prediction performance when compared with a greedy search algorithm, a path-based algorithm, and manually defined clauses on two computational social science applications: i) modeling recovery in alcohol use disorder, and ii) detecting bullying.

CVPR 2020 PR

Interpretable and Accurate Fine-grained Recognition via Region Grouping

Zixuan Huang, Yin Li


We present an interpretable deep model for fine-grained visual recognition. At the core of our method lies the integration of region-based part discovery and attribution within a deep neural network. Our model is trained using image-level object labels, and provides an interpretation of its results via the segmentation of object parts and the identification of their contributions towards classification. To facilitate the learning of object parts without direct supervision, we explore a simple prior of the occurrence of object parts. We demonstrate that this prior, when combined with our region-based part discovery and attribution, leads to an interpretable model that remains highly accurate. Our model is evaluated on major fine-grained recognition datasets, including CUB-200 [55], CelebA [36] and iNaturalist [54]. Our results compares favourably to state-of-the-art methods on classification tasks, and outperforms previous approaches on the localization of object parts.

Towards Global Explanations of Convolutional Neural Networks with Concept Attribution

Weibin Wu, Yuxin Su∗, Xixian Chen, Shenglin Zhao, Irwin King, Michael R. Lyu, Yu-Wing Tai


With the growing prevalence of convolutional neural net-works (CNNs), there is an urgent demand to explain their behaviors. Global explanations contribute to understanding model predictions on a whole category of samples, and thus have attracted increasing interest recently. However, existing methods overwhelmingly conduct separate input attribution or rely on local approximations of models, making them fail to offer faithful global explanations of CNNs. To overcome such drawbacks, we propose a novel two-stage framework, Attacking for Interpretability (AfI), which ex-plains model decisions in terms of the importance of user-defined concepts. AfI first conducts a feature occlusion analysis, which resembles a process of attacking models to derive the category-wide importance of different features. We then map the feature importance to concept importance through ad-hoc semantic tasks. Experimental results con-firm the effectiveness of AfI and its superiority in providing more accurate estimations of concept importance than existing proposals

A Programmatic and Semantic Approach to Explaining and Debugging Neural Network Based Object Detectors

Edward Kim, Divya Gopinath, Corina S. P ̆as ̆areanu, and Sanjit A. Seshia


Even as deep neural networks have become very effective for tasks in vision and perception, it remains difficult to explain and debug their behavior. In this paper, we present a programmatic and semantic approach to explaining, understanding, and debugging the correct and incorrect behaviors of a neural network-based perception system. Our approach is semantic in that it employs a high-level representation of the distribution of environment scenarios that the detector is intended to work on. It is programmatic in that scenario representation is a program in a domain-specific probabilistic programming language which can be used to generate synthetic data to test a given perception module. Our framework assesses the performance of a perception module to identify correct and incorrect detections, extracts rules from those results that semantically characterizes the correct and incorrect scenarios, and then specializes the probabilistic program with those rules in order to more precisely characterize the scenarios in which the perception module operates correctly or not. We demonstrate our results using the SCENIC probabilistic programming language and a neural network-based object detector. Our experiments show that it is possible to automatically generate compact rules that significantly increase the correct detection rate (or conversely the incorrect detection rate) of the network and can thus help with understanding and debugging its behavior.

Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition

Ziyu Liu, Hongwen Zhang, Zhenghao Chen, Zhiyong Wang, Wanli Ouyang


Spatial-temporal graphs have been widely used by skeleton-based action recognition algorithms to model hu-man action dynamics. To capture robust movement patterns from these graphs, long-range and multi-scale context aggregation and spatial-temporal dependency modeling are critical aspects of a powerful feature extractor. However, existing methods have limitations in achieving (1) unbiased long-range joint relationship modeling under multi-scale operators and (2) unobstructed cross-spacetime in-formation flow for capturing complex spatial-temporal dependencies. In this work, we present (1) a simple method to disentangle multi-scale graph convolutions and (2) a unified spatial-temporal graph convolutional operator namedG3D. The proposed multi-scale aggregation scheme dis-entangles the importance of nodes in different neighbor-hoods for effective long-range modeling. The proposedG3D module leverages dense cross-spacetime edges as skip connections for direct information propagation across the spatial-temporal graph. By coupling these proposals, we develop a powerful feature extractor named MS-G3D based on which our model1outperforms previous state-of-the-art methods on three large-scale datasets: NTU RGB+D 60,NTU RGB+D 120, and Kinetics Skeleton 400.


Explainable Reinforcement Learning through a Causal Lens

Prashan Madumal, Tim Miller, Liz Sonenberg, Frank Vetere


Prominent theories in cognitive science propose that humans understand and represent the knowledge of the world through causal relationships. In making sense of the world, we build causal models in our mind to encode cause-effect relations of events and use these to explain why new events happen by referring to counterfactuals — things that did not happen. In this paper, we use causal models to derive causal explanations of the behaviour of model-free reinforcement learning agents. We present an approach that learns a structural causal model during reinforcement learning and encodes causal relationships between variables of interest. This model is then used to generate explanations of behaviour based on counterfactual analysis of the causal model. We computationally evaluate the model in 6 domains and measure performance and task prediction accuracy. We report on a study with 120 participants who observe agents playing a real-time strategy game (Starcraft II) and then receive explanations of the agents’ behaviour. We investigate: 1) participants’ understanding gained by explanations through task prediction; 2) explanation satisfaction and 3) trust. Our results show that causal model explanations perform better on these measures compared to two other baseline explanation models.

Towards Interpretation of Pairwise Learning

Mengdi Huai, Di Wang, Chenglin Miao, Aidong Zhang


Recently, there are increasingly more attentions paid to an important family of learning problems called pairwise learning, in which the associated loss functions depend on pairs of instances. Despite the tremendous success of pairwise learning in many real-world applications, the lack of transparency behind the learned pairwise models makes it difficult for users to understand how particular decisions are made by these models, which further impedes users from trusting the predicted results. To tackle this problem, in this paper, we study feature importance scoring as a specific approach to the problem of interpreting the predictions of black-box pairwise models. Specifically, we first propose a novel adaptive Shapley-value-based interpretation method, based on which a vector of importance scores associated with the underlying features of a testing instance pair can be adaptively calculated with the consideration of feature correlations, and these scores can be used to indicate which features make key contributions to the final prediction. Considering that Shapley-value-based methods are usually computationally challenging, we further propose a novel robust approximation interpretation method for pairwise models. This method is not only much more efficient but also robust to data noise. To the best of our knowledge, we are the first to investigate how to enable interpretation in pairwise learning. Theoretical analysis and extensive experiments demonstrate the effectiveness of the proposed methods.

Regional Tree Regularization for Interpretability in Deep Neural Networks

Mike Wu, Sonali Parbhoo, Michael Hughes, Ryan Kindle, Leo Celi, Maurizio Zazzi, Volker Roth, Finale Doshi-Velez


The lack of interpretability remains a barrier to adopting deep neural networks across many safety-critical domains. Tree regularization was recently proposed to encourage a deep neural network’s decisions to resemble those of a globally compact, axis-aligned decision tree. However, it is often unreasonable to expect a single tree to predict well across all possible inputs. In practice, doing so could lead to neither interpretable nor performant optima. To address this issue, we propose regional tree regularization – a method that encourages a deep model to be well-approximated by several separate decision trees specific to predefined regions of the input space. Across many datasets, including two healthcare applications, we show our approach delivers simpler explanations than other regularization schemes without compromising accuracy. Specifically, our regional regularizer finds many more “desirable” optima compared to global analogues.

AdaCare: Explainable Clinical Health Status Representation Learning via Scale-Adaptive Feature Extraction and Recalibration

Liantao Ma, Junyi Gao, Yasha Wang, Chaohe Zhang, Jiangtao Wang, Wenjie Ruan, Wen Tang, Xin Gao, Xinyu Ma


Deep learning-based health status representation learning and clinical prediction have raised much research interest in recent years. Existing models have shown superior performance, but there are still several major issues that have not been fully taken into consideration. First, the historical variation pattern of the biomarker in diverse time scales plays a vital role in indicating the health status, but it has not been explicitly extracted by existing works. Second, key factors that strongly indicate the health risk are different among patients. It is still challenging to adaptively make use of the features for patients in diverse conditions. Third, using prediction models as the black box will limit the reliability in clinical practice. However, none of the existing works can provide satisfying interpretability and meanwhile achieve high prediction performance. In this work, we develop a general health status representation learning model, named AdaCare. It can capture the long and short-term variations of biomarkers as clinical features to depict the health status in multiple time scales. It also models the correlation between clinical features to enhance the ones which strongly indicate the health status and thus can maintain a state-of-the-art performance in terms of prediction accuracy while providing qualitative interpretability. We conduct a health risk prediction experiment on two real-world datasets. Experiment results indicate that AdaCare outperforms state-of-the-art approaches and provides effective interpretability, which is verifiable by clinical experts.

MRI Reconstruction with Interpretable Pixel-Wise Operations Using Reinforcement Learning

Wentian Li, Xidong Feng, Haotian An, Xiang Yao Ng, Yu-Jin Zhang


Compressed sensing magnetic resonance imaging (CS-MRI) is a technique aimed at accelerating the data acquisition of MRI. While down-sampling in k-space proportionally reduces the data acquisition time, it results in images corrupted by aliasing artifacts and blur. To reconstruct images from the down-sampled k-space, recent deep-learning based methods have shown better performance compared with classical optimization-based CS-MRI methods. However, they usually use deep neural networks as a black-box, which directly maps the corrupted images to the target images from fully-sampled k-space data. This lack of transparency may impede practical usage of such methods. In this work, we propose a deep reinforcement learning based method to reconstruct the corrupted images with meaningful pixel-wise operations (e.g. edge enhancing filters), so that the reconstruction process is transparent to users. Specifically, MRI reconstruction is formulated as Markov Decision Process with discrete actions and continuous action parameters. We conduct experiments on MICCAI dataset of brain tissues and fastMRI dataset of knee images. Our proposed method performs favorably against previous approaches. Our trained model learns to select pixel-wise operations that correspond to the anatomical structures in the MR images. This makes the reconstruction process more interpretable, which would be helpful for further medical analysis.

KDD 2020 Proceeding

GRACE: Generating Concise and Informative Contrastive Sample to Explain Neural Network Model’s Prediction

Thai Le, Suhang Wang, Dongwon Lee

Despite the recent development in the topic of explainable AI/ML for image and text data, the majority of current solutions are not suitable to explain the prediction of neural network models when the datasets are tabular and their features are in high-dimensional vectorized formats. To mitigate this limitation, therefore, we borrow two notable ideas (i.e., “explanation by intervention” from causality and “explanation are contrastive” from philosophy) and propose a novel solution, named as GRACE, that better explains neural network models’ predictions for tabular datasets. In particular, given a model’s prediction as label X, GRACE intervenes and generates a minimally-modified contrastive sample to be classified as Y, with an intuitive textual explanation, answering the question of “Why X rather than Y?” We carry out comprehensive experiments using eleven public datasets of different scales and domains (e.g., # of features ranges from 5 to 216) and compare GRACE with competing baselines on different measures: fidelity, conciseness, info-gain, and influence. The user-studies show that our generated explanation is not only more intuitive and easy-to-understand but also facilitates end-users to make as much as 60% more accurate post-explanation decisions than that of Lime.

DETERRENT: Knowledge Guided Graph Attention Network for Detecting Healthcare Misinformation

Limeng Cui, Haeseung Seo, Maryam Tabar, Fenglong Ma, Suhang Wang, Dongwon Lee

To provide accurate and explainable misinformation detection, it is often useful to take an auxiliary source (e.g., social context and knowledge base) into consideration. Existing methods use social contexts such as users’ engagements as complementary information to improve detection performance and derive explanations. However, due to the lack of sufficient professional knowledge, users seldom respond to healthcare information, which makes these methods less applicable. In this work, to address these shortcomings, we propose a novel knowledge guided graph attention network for detecting health misinformation better. Our proposal, named as DETERRENT, leverages on the additional information from medical knowledge graph by propagating information along with the network, incorporates a Medical Knowledge Graph and an Article-Entity Bipartite Graph, and propagates the node embeddings through Knowledge Paths. In addition, an attention mechanism is applied to calculate the importance of entities to each article, and the knowledge guided article embeddings are used for misinformation detection. DETERRENT addresses the limitation on social contexts in the healthcare domain and is capable of providing useful explanations for the results of detection. Empirical validation using two real-world datasets demonstrated the effectiveness of DETERRENT. Comparing with the best results of eight competing methods, in terms of F1 Score, DETERRENT outperforms all methods by at least 4.78% on the diabetes dataset and 12.79% on cancer dataset. We release the source code of DETERRENT at:

xGAIL: Explainable Generative Adversarial Imitation Learning for Explainable Human Decision Analysis

Menghai Pan, Weixiao Huang, Yanhua Li, Xun Zhou, Jun Luo

To make daily decisions, human agents devise their own “strategies” governing their mobility dynamics (e.g., taxi drivers have preferred working regions and times, and urban commuters have preferred routes and transit modes). Recent research such as generative adversarial imitation learning (GAIL) demonstrates successes in learning human decision-making strategies from their behavior data using deep neural networks (DNNs), which can accurately mimic how humans behave in various scenarios, e.g., playing video games, etc. However, such DNN-based models are “black box” models in nature, making it hard to explain what knowledge the models have learned from human, and how the models make such decisions, which was not addressed in the literature of imitation learning. This paper addresses this research gap by proposing xGAIL, the first explainable generative adversarial imitation learning framework. The proposed xGAIL framework consists of two novel components, including Spatial Activation Maximization (SpatialAM) and Spatial Randomized Input Sampling Explanation (SpatialRISE), to extract both global and local knowledge from a well-trained GAIL model that explains how a human agent makes decisions. Especially, we take taxi drivers’ passenger-seeking strategy as an example to validate the effectiveness of the proposed xGAIL framework. Our analysis on a large-scale real-world taxi trajectory data shows promising results from two aspects: i) global explainable knowledge of what nearby traffic condition impels a taxi driver to choose a particular direction to find the next passenger, and ii) local explainable knowledge of what key (sometimes hidden) factors a taxi driver considers when making a particular

Knowing your FATE: Friendship, Action and Temporal Explanations for User Engagement Prediction on Social Apps

Xianfeng Tang, Yozen Liu, Neil Shah, Xiaolin Shi, Prasenjit Mitra, Suhang Wang

With the rapid growth and prevalence of social network applications (Apps) in recent years, understanding user engagement has become increasingly important, to provide useful insights for future App design and development. While several promising neural modeling approaches were recently pioneered for accurate user engagement prediction, their black-box designs are unfortunately limited in model explainability. In this paper, we study a novel problem of explainable user engagement prediction for social network Apps. First, we propose a flexible definition of user engagement for various business scenarios, based on future metric expectations. Next, we design an end-to-end neural framework, FATE, which incorporates three key factors that we identify to influence user engagement, namely friendships, user actions, and temporal dynamics to achieve explainable engagement predictions. FATE is based on a tensor-based graph neural network (GNN), LSTM and a mixture attention mechanism, which allows for (a) predictive explanations based on learned weights across different feature categories, (b) reduced network complexity, and © improved performance in both prediction accuracy and training/inference time. We conduct extensive experiments on two large-scale datasets from Snapchat, where FATE outperforms state-of-the-art approaches by 10% error and 20% runtime reduction. We also evaluate explanations from FATE, showing strong quantitative and qualitative performance.

Easy Perturbation EEG Algorithm for Spectral Importance (easyPEASI): A Simple Method to Identify Important Spectral Features of EEG in Deep Learning Models

David O. Nahmias, Kimberly L. Kontson

Efforts into understanding neurological differences between populations is an active area of research. Deep learning has recently shown promising results using EEG as input to distinguish recordings of subjects based on neurological activity. However, only about one quarter of these studies investigate the underlying neurophysiological implications. This work proposes and validates a method to investigate frequency bands important to EEG-driven deep learning models. Easy perturbation EEG algorithm for spectral importance (easyPEASI) is simpler than previous methods and requires only perturbations to input data. We validate easyPEASI on EEG pathology classification using the Temple University Health EEG Corpus. easyPEASI is further applied to characterize the effects of patients’ medications on brain rhythms. We investigate classifications of patients taking one of two anticonvulsant medications, Dilantin (phenytoin) and Keppra (levetiracetam), and subjects taking no medications. We find that for recordings of subjects with clinically-determined normal EEG that these medications effect the Theta and Alpha band most significantly. For recordings with clinically-determined abnormal EEG these medications affected the Delta, Theta, and Alpha bands most significantly. We also find the Beta band to be affected differently by the two medications. Results found here show promise for a method of obtaining explainable artificial intelligence and interpretable models from EEG-driven deep learning through a simpler more accessible method perturbing only input data. Overall, this work provides a fast, easy, and reproducible method to automatically determine salient spectral features of neural activity that have been learned by machine learning models, such as deep learning.

:smiley:XGNN: Towards Model-Level Explanations of Graph Neural Networks

Hao Yuan, Jiliang Tang, Xia Hu, Shuiwang Ji

Graphs neural networks (GNNs) learn node features by aggregating and combining neighbor information, which have achieved promising performance on many graph tasks. However, GNNs are mostly treated as black-boxes and lack human intelligible explanations. Thus, they cannot be fully trusted and used in certain application domains if GNN models cannot be explained. In this work, we propose a novel approach, known as XGNN, to interpret GNNs at the model-level. Our approach can provide high-level insights and generic understanding of how GNNs work. In particular, we propose to explain GNNs by training a graph generator so that the generated graph patterns maximize a certain prediction of the model. We formulate the graph generation as a reinforcement learning task, where for each step, the graph generator predicts how to add an edge into the current graph. The graph generator is trained via a policy gradient method based on information from the trained GNNs. In addition, we incorporate several graph rules to encourage the generated graphs to be valid. Experimental results on both synthetic and real-world datasets show that our proposed methods help understand and verify the trained GNNs. Furthermore, our experimental results indicate that the generated graphs can provide guidance on how to improve the trained GNNs.

INPREM: An Interpretable and Trustworthy Predictive Model for Healthcare

Xianli Zhang, Buyue Qian, Shilei Cao, Yang Li, Hang Chen, Yefeng Zheng, Ian Davidson

Building a predictive model based on historical Electronic Health Records (EHRs) for personalized healthcare has become an active research area. Benefiting from the powerful ability of feature extraction, deep learning (DL) approaches have achieved promising performance in many clinical prediction tasks. However, due to the lack of interpretability and trustworthiness, it is difficult to apply DL in real clinical cases of decision making. To address this, in this paper, we propose an interpretable and trustworthy predictive model~(INPREM) for healthcare. Firstly, INPREM is designed as a linear model for interpretability while encoding non-linear relationships into the learning weights for modeling the dependencies between and within each visit. This enables us to obtain the contribution matrix of the input variables, which is served as the evidence of the prediction result(s), and help physicians understand why the model gives such a prediction, thereby making the model more interpretable. Secondly, for trustworthiness, we place a random gate (which follows a Bernoulli distribution to turn on or off) over each weight of the model, as well as an additional branch to estimate data noises. With the help of the Monto Carlo sampling and an objective function accounting for data noises, the model can capture the uncertainty of each prediction. The captured uncertainty, in turn, allows physicians to know how confident the model is, thus making the model more trustworthy. We empirically demonstrate that the proposed INPREM outperforms existing approaches with a significant margin. A case study is also presented to show how the contribution matrix and the captured uncertainty are used to assist physicians in making robust decisions.

Malicious Attacks against Deep Reinforcement Learning Interpretations

Mengdi Huai, Jianhui Sun, Renqin Cai, Liuyi Yao, Aidong Zhang

The past years have witnessed the rapid development of deep reinforcement learning (DRL), which is a combination of deep learning and reinforcement learning (RL). However, the adoption of deep neural networks makes the decision-making process of DRL opaque and lacking transparency. Motivated by this, various interpretation methods for DRL have been proposed. However, those interpretation methods make an implicit assumption that they are performed in a reliable and secure environment. In practice, sequential agent-environment interactions expose the DRL algorithms and their corresponding downstream interpretations to extra adversarial risk. In spite of the prevalence of malicious attacks, there is no existing work studying the possibility and feasibility of malicious attacks against DRL interpretations. To bridge this gap, in this paper, we investigate the vulnerability of DRL interpretation methods. Specifically, we introduce the first study of the adversarial attacks against DRL interpretations, and propose an optimization framework based on which the optimal adversarial attack strategy can be derived. In addition, we study the vulnerability of DRL interpretation methods to the model poisoning attacks, and present an algorithmic framework to rigorously formulate the proposed model poisoning attack. Finally, we conduct both theoretical analysis and extensive experiments to validate the effectiveness of the proposed malicious attacks against DRL interpretations.

Graph Structural-topic Neural Network

Qingqing Long, Yilun Jin, Guojie Song, Yi Li, Wei Lin

Graph Convolutional Networks (GCNs) achieved tremendous success by effectively gathering local features for nodes. However, commonly do GCNs focus more on node features but less on graph structures within the neighborhood, especially higher-order structural patterns. However, such local structural patterns are shown to be indicative of node properties in numerous fields. In addition, it is not just single patterns, but the distribution over all these patterns matter, because networks are complex and the neighborhood of each node consists of a mixture of various nodes and structural patterns. Correspondingly, in this paper, we propose Graph Structural topic Neural Network, abbreviated GraphSTONE 1, a GCN model that utilizes topic models of graphs, such that the structural topics capture indicative graph structures broadly from a probabilistic aspect rather than merely a few structures. Specifically, we build topic models upon graphs using anonymous walks and Graph Anchor LDA, an LDA variant that selects significant structural patterns first, so as to alleviate the complexity and generate structural topics efficiently. In addition, we design multi-view GCNs to unify node features and structural topic features and utilize structural topics to guide the aggregation. We evaluate our model through both quantitative and qualitative experiments, where our model exhibits promising performance, high efficiency, and clear interpretability.

The Spectral Zoo of Networks: Embedding and Visualizing Networks with Spectral Moments

Shengmin Jin, Reza Zafarani

Network embedding methods have been widely and successfully used in network-based applications such as node classification and link prediction. However, an ideal network embedding should not only be useful for machine learning, but interpretable. We introduce a spectral embedding method for a network, its Spectral Point, which is basically the first few spectral moments of a network. Spectral moments are interpretable, where we prove their close relationships to network structure (e.g. number of triangles and squares) and various network properties (e.g. degree distribution, clustering coefficient, and network connectivity). Using spectral points, we introduce a visualizable and bounded 3D embedding space for all possible graphs, in which one can characterize various types of graphs (e.g., cycles), or real-world networks from different categories (e.g., social or biological networks). We demonstrate that spectral points can be used for network identification (i.e., what network is this subgraph sampled from?) and that by using just the first few moments one does not lose much predictive power.

Diverse Rule Sets

Guangyi Zhang, Aristides Gionis

While machine-learning models are flourishing and transforming many aspects of everyday life, the inability of humans to understand complex models poses difficulties for these models to be fully trusted and embraced. Thus, interpretability of models has been recognized as an equally important quality as their predictive power. In particular, rule-based systems are experiencing a renaissance owing to their intuitive if-then representation.

However, simply being rule-based does not ensure interpretability. For example, overlapped rules spawn ambiguity and hinder interpretation. Here we propose a novel approach of inferring diverse rule sets, by optimizing small overlap among decision rules with a 2-approximation guarantee under the framework of Max-Sum diversification. We formulate the problem as maximizing a weighted sum of discriminative quality and diversity of a rule set.

In order to overcome an exponential-size search space of association rules, we investigate several natural options for a small candidate set of high-quality rules, including frequent and accurate rules, and examine their hardness. Leveraging the special structure in our formulation, we then devise an efficient randomized algorithm, which samples rules that are highly discriminative and have small overlap. The proposed sampling algorithm analytically targets a distribution of rules that is tailored to our objective.

We demonstrate the superior predictive power and interpretability of our model with a comprehensive empirical study against strong baselines.

Interpretable Deep Graph Generation with Node-edge Co-disentanglement

:smiley:Xiaojie Guo, Liang Zhao, Zhao Qin, Lingfei Wu, Amarda Shehu, Yanfang Ye

Disentangled representation learning has recently attracted a significant amount of attention, particularly in the field of image representation learning. However, learning the disentangled representations behind a graph remains largely unexplored, especially for the attributed graph with both node and edge features. Disentanglement learning for graph generation has substantial new challenges including 1) the lack of graph deconvolution operations to jointly decode node and edge attributes; and 2) the difficulty in enforcing the disentanglement among latent factors that respectively influence: i) only nodes, ii) only edges, and iii) joint patterns between them. To address these challenges, we propose a new disentanglement enhancement framework for deep generative models for attributed graphs. In particular, a novel variational objective is proposed to disentangle the above three types of latent factors, with novel architecture for node and edge deconvolutions. Qualitative and quantitative experiments on both synthetic and real-world datasets demonstrate the effectiveness of the proposed model and its extensions.

Context-Aware Attentive Knowledge Tracing

Aritra Ghosh, Neil Heffernan, Andrew S. Lan

Knowledge tracing (KT) refers to the problem of predicting future learner performance given their past performance in educational applications. Recent developments in KT using flexible deep neural network-based models excel at this task. However, these models often offer limited interpretability, thus making them insufficient for personalized learning, which requires using interpretable feedback and actionable recommendations to help learners achieve better learning outcomes. In this paper, we propose attentive knowledge tracing (AKT), which couples flexible attention-based neural network models with a series of novel, interpretable model components inspired by cognitive and psychometric models. AKT uses a novel monotonic attention mechanism that relates a learner’s future responses to assessment questions to their past responses; attention weights are computed using exponential decay and a context-aware relative distance measure, in addition to the similarity between questions. Moreover, we use the Rasch model to regularize the concept and question embeddings; these embeddings are able to capture individual differences among questions on the same concept without using an excessive number of parameters. We conduct experiments on several real-world benchmark datasets and show that AKT outperforms existing KT methods (by up to 6% in AUC in some cases) on predicting future learner responses. We also conduct several case studies and show that AKT exhibits excellent interpretability and thus has potential for automated feedback and personalization in real-world educational settings.

Easy Perturbation EEG Algorithm for Spectral Importance (easyPEASI): A Simple Method to Identify Important Spectral Features of EEG in Deep Learning Models

David O. Nahmias, Kimberly L. Kontson

Efforts into understanding neurological differences between populations is an active area of research. Deep learning has recently shown promising results using EEG as input to distinguish recordings of subjects based on neurological activity. However, only about one quarter of these studies investigate the underlying neurophysiological implications. This work proposes and validates a method to investigate frequency bands important to EEG-driven deep learning models. Easy perturbation EEG algorithm for spectral importance (easyPEASI) is simpler than previous methods and requires only perturbations to input data. We validate easyPEASI on EEG pathology classification using the Temple University Health EEG Corpus. easyPEASI is further applied to characterize the effects of patients’ medications on brain rhythms. We investigate classifications of patients taking one of two anticonvulsant medications, Dilantin (phenytoin) and Keppra (levetiracetam), and subjects taking no medications. We find that for recordings of subjects with clinically-determined normal EEG that these medications effect the Theta and Alpha band most significantly. For recordings with clinically-determined abnormal EEG these medications affected the Delta, Theta, and Alpha bands most significantly. We also find the Beta band to be affected differently by the two medications. Results found here show promise for a method of obtaining explainable artificial intelligence and interpretable models from EEG-driven deep learning through a simpler more accessible method perturbing only input data. Overall, this work provides a fast, easy, and reproducible method to automatically determine salient spectral features of neural activity that have been learned by machine learning models, such as deep learning.

Attention based Multi-Modal New Product Sales Time-series Forecasting

Vijay Ekambaram, Kushagra Manglik, Sumanta Mukherjee, Surya Shravan Kumar Sajja, Satyam Dwivedi, Vikas Raykar

Trend driven retail industries such as fashion, launch substantial new products every season. In such a scenario, an accurate demand forecast for these newly launched products is vital for efficient downstream supply chain planning like assortment planning and stock allocation. While classical time-series forecasting algorithms can be used for existing products to forecast the sales, new products do not have any historical time-series data to base the forecast on. In this paper, we propose and empirically evaluate several novel attention-based multi-modal encoder-decoder models to forecast the sales for a new product purely based on product images, any available product attributes and also external factors like holidays, events, weather, and discount. We experimentally validate our approaches on a large fashion dataset and report the improvements in achieved accuracy and enhanced model interpretability as compared to existing k-nearest neighbor based baseline approaches.

Cracking the Black Box: Distilling Deep Sports Analytics

Xiangyu Sun, Jack Davis, Oliver Schulte, Guiliang Liu

This paper addresses the trade-off between Accuracy and Transparency for deep learning applied to sports analytics. Neural nets achieve great predictive accuracy through deep learning, and are popular in sports analytics. But it is hard to interpret a neural net model and harder still to extract actionable insights from the knowledge implicit in it. Therefore, we built a simple and transparent model that mimics the output of the original deep learning model and represents the learned knowledge in an explicit interpretable way. Our mimic model is a linear model tree, which combines a collection of linear models with a regression-tree structure. The tree version of a neural network achieves high fidelity, explains itself, and produces insights for expert stakeholders such as athletes and coaches. We propose and compare several scalable model tree learning heuristics to address the computational challenge from datasets with millions of data points.

SESSION: Tutorial Abstracts

:smiley:Explainable Classification of Brain Networks via Contrast Subgraphs

Tommaso Lanciano, Francesco Bonchi, Aristides Gionis

Mining human-brain networks to discover patterns that can be used to discriminate between healthy individuals and patients affected by some neurological disorder, is a fundamental task in neuro-science. Learning simple and interpretable models is as important as mere classification accuracy. In this paper we introduce a novel approach for classifying brain networks based on extracting contrast subgraphs, i.e., a set of vertices whose induced subgraphs are dense in one class of graphs and sparse in the other. We formally define the problem and present an algorithmic solution for extracting contrast subgraphs. We then apply our method to a brain-network dataset consisting of children affected by Autism Spectrum Disorder and children Typically Developed. Our analysis confirms the interestingness of the discovered patterns, which match background knowledge in the neuro-science literature. Further analysis on other classification tasks confirm the simplicity, soundness, and high explainability of our proposal, which also exhibits superior classification accuracy, to more complex state-of-the-art methods.

Intelligible and Explainable Machine Learning: Best Practices and Practical Challenges

Rich Caruana, Scott Lundberg, Marco Tulio Ribeiro, Harsha Nori, Samuel Jenkins

Learning methods such as boosting and deep learning have made ML models harder to understand and interpret. This puts data scientists and ML developers in the position of often having to make a tradeoff between accuracy and intelligibility. Research in IML (Interpretable Machine Learning) and XAI (Explainable AI) focus on minimizing this trade-off by developing more accurate interpretable models and by developing new techniques to explain black-box models. Such models and techniques make it easier for data scientists, engineers and model users to debug models and achieve important objectives such as ensuring the fairness of ML decisions and the reliability and safety of AI systems. In this tutorial, we present an overview of various interpretability methods and provide a framework for thinking about how to choose the right explanation method for different real-world scenarios. We will focus on the application of XAI in practice through a variety of case studies from domains such as healthcare, finance, and bias and fairness. Finally, we will present open problems and research directions for the data mining and machine learning community. What audience will learn: When and how to use a variety of machine learning interpretability methods through case studies of real-world situations. The difference between glass-box and black-box explanation methods and when to use them. How to use open source interpretability toolkits that are now available

Building Recommender Systems with PyTorch

Dheevatsa Mudigere, Maxim Naumov, Joe Spisak, Geeta Chauhan, Narine Kokhlikyan, Amanpreet Singh, Vedanuj Goswami

In this tutorial we show how to build deep learning recommendation systems and resolve the associated interpretability, integrity and privacy challenges. We start with an overview of the PyTorch framework, features that it offers and a brief review of the evolution of recommendation models. We delineate their typical components and build a proxy deep learning recommendation model (DLRM) in PyTorch. Then, we discuss how to interpret recommendation system results as well as how to address the corresponding integrity and quality challenges.

Tutorial on Human-Centered Explainability for Healthcare

Prithwish Chakraborty, Bum Chul Kwon, Sanjoy Dey, Amit Dhurandhar, Daniel Gruen, Kenney Ng, Daby Sow, Kush R. Varshney

In recent years, the rapid advances in Artificial Intelligence (AI) techniques along with an ever-increasing availability of healthcare data have made many novel analyses possible. Significant successes have been observed in a wide range of tasks such as next diagnosis prediction, AKI prediction, adverse event predictions including mortality and unexpected hospital re-admissions. However, there has been limited adoption and use in the clinical practice of these methods due to their black-box nature. A significant amount of research is currently focused on making such methods more interpretable or to make post-hoc explanations more accessible. However, most of this work is done at a very low level and as a result, may not have a direct impact at the point-of-care. This tutorial will provide an overview of the landscape of different approaches that have been developed for explainability in healthcare. Specifically, we will present the problem of explainability as it pertains to various personas involved in healthcare viz. data scientists, clinical researchers, and clinicians. We will chart out the requirements for such personas and present an overview of the different approaches that can address such needs. We will also walk-through several use-cases for such approaches. In this process, we will provide a brief introduction to explainability, charting its different dimensions as well as covering some relevant interpretability methods spanning such dimensions. We will touch upon some practical guides for explainability and provide a brief survey of open source tools such as the IBM AI Explainability 360 Open Source Toolkit.

Interpreting and Explaining Deep Neural Networks: A Perspective on Time Series Data

Jaesik Choi

Explainable and interpretable machine learning models and algorithms are important topics which have received growing attention from research, application and administration. Many complex Deep Neural Networks (DNNs) are often perceived as black-boxes. Researchers would like to be able to interpret what the DNN has learned in order to identify biases and failure models and improve models. In this tutorial, we will provide a comprehensive overview on methods to analyze deep neural networks and an insight how those interpretable and explainable methods help us understand time series data.