Model interpretability

In recent years, neural networks have advanced the state of the art in many domains, such as object detection and speech recognition. Despite this success, neural networks are typically still treated as black boxes: their internal workings are not fully understood, and the basis for their predictions is unclear. In an attempt to understand neural networks better, several methods have been proposed, e.g., Saliency, Deconvnet, GuidedBackprop, SmoothGrad, IntegratedGradients, LRP, PatternNet, and PatternAttribution. Because reference implementations are lacking, comparing these methods is a major effort. This library addresses this by providing a common interface and out-of-the-box implementations for many analysis methods. Our goal is to make analyzing neural networks’ predictions easy!


gradient: The gradient of the output neuron with respect to the input.
smoothgrad: SmoothGrad averages the gradient over a number of copies of the input with added noise.
deconvnet: DeConvNet applies a ReLU in the gradient computation instead of the gradient of a ReLU.
guided: Guided BackProp applies a ReLU in the gradient computation in addition to the gradient of a ReLU.
pattern.net: PatternNet estimates the input signal of the output neuron.
input_t_gradient: Input * Gradient
deep_taylor[.bounded]: DeepTaylor computes, for each neuron, a root point that is close to the input but whose output value is 0, and uses the difference between the two to estimate the attribution of each neuron recursively.
pattern.attribution: PatternAttribution applies Deep Taylor by searching root points along the signal direction of each neuron.
lrp.*: LRP recursively attributes relevance to each neuron’s inputs in proportion to their contribution to the neuron’s output.
integrated_gradients: IntegratedGradients integrates the gradient along a path from the input to a reference.
input: Returns the input.
random: Returns random Gaussian noise.
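
To make the descriptions above concrete, here is a minimal NumPy sketch of the gradient-based methods on a toy one-hidden-layer ReLU network. It is an illustration only and not this library's implementation: the model, its weights, and all function names and parameters (backward, smoothgrad, n_samples, noise_scale, steps, and so on) are assumptions made up for this example.

```python
import numpy as np

# Hypothetical toy model: f(x) = w2 . relu(W1 x + b1), one scalar output neuron.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
b1 = rng.normal(size=4)
w2 = rng.normal(size=4)


def model(x):
    return w2 @ np.maximum(W1 @ x + b1, 0.0)


def backward(x, rule="gradient"):
    """Backpropagate from the output to the input with a chosen ReLU rule."""
    pre = W1 @ x + b1   # pre-activations of the hidden layer
    upstream = w2       # signal arriving at the ReLU from the output neuron
    if rule == "gradient":      # gradient of the ReLU: mask by the forward pass
        hidden = upstream * (pre > 0)
    elif rule == "deconvnet":   # ReLU on the backward signal instead of the mask
        hidden = np.maximum(upstream, 0.0)
    elif rule == "guided":      # ReLU on the backward signal and the mask
        hidden = np.maximum(upstream, 0.0) * (pre > 0)
    else:
        raise ValueError(f"unknown rule: {rule}")
    return W1.T @ hidden


def smoothgrad(x, n_samples=64, noise_scale=0.1, seed=0):
    """Average the gradient over noisy copies of the input."""
    noise_rng = np.random.default_rng(seed)
    noisy = x + noise_rng.normal(scale=noise_scale, size=(n_samples, x.size))
    return np.mean([backward(xi) for xi in noisy], axis=0)


def input_t_gradient(x):
    """Element-wise product of the input and the gradient."""
    return x * backward(x)


def integrated_gradients(x, reference=None, steps=256):
    """Approximate the path integral of the gradient between reference and input."""
    if reference is None:
        reference = np.zeros_like(x)  # assumed all-zero reference
    alphas = (np.arange(steps) + 0.5) / steps  # midpoint rule on [0, 1]
    path = reference + alphas[:, None] * (x - reference)
    avg_grad = np.mean([backward(p) for p in path], axis=0)
    return (x - reference) * avg_grad


x = np.array([1.0, -0.5, 2.0])
for name, attribution in [
    ("gradient", backward(x)),
    ("deconvnet", backward(x, "deconvnet")),
    ("guided", backward(x, "guided")),
    ("smoothgrad", smoothgrad(x)),
    ("input_t_gradient", input_t_gradient(x)),
    ("integrated_gradients", integrated_gradients(x)),
]:
    print(name, attribution)
```

In this sketch the three backward rules differ only in how the signal passes through the ReLU, which is exactly the distinction drawn in the deconvnet and guided entries. For integrated gradients, the attributions should sum approximately to the difference between the model output at the input and at the reference, which is a useful sanity check on the number of steps.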

Official website

Tutorial and documentation
