Model Perception Analysis
To perform this analysis, go to the Dashboard view on the right and select Analyzer from the list at the top.
The Tensorleap platform tracks how each learned feature, within each layer, responds to each sample. From that information, it constructs a vector that captures how the model perceives each sample. This allows the platform to create a similarity map between samples as they are interpreted by the model. A more intuitive explanation is that similar samples would activate similar learned features within the model.
Population Exploration Analysis takes these vectors and runs dimension reduction algorithms in order to visualize the similarities in the UI and provide insights on how various samples are perceived by the model.
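As a rough illustration of the idea above, the following Python sketch concatenates per-layer activations into one vector per sample and projects those vectors to 2D. It is a minimal sketch under assumptions: the activations are synthetic, the function names `perception_vectors` and `pca_2d` are hypothetical, and PCA stands in for whichever dimension reduction algorithms the platform actually runs.

```python
import numpy as np

def perception_vectors(layer_activations):
    """Concatenate per-layer activations into one vector per sample.

    layer_activations: list of arrays, each shaped (n_samples, n_features_in_layer).
    """
    return np.concatenate(layer_activations, axis=1)

def pca_2d(vectors):
    """Project vectors onto their top two principal components."""
    centered = vectors - vectors.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)
# Synthetic activations (an assumption): two groups of samples that
# excite the learned features differently.
group_a = rng.normal(loc=0.0, scale=0.1, size=(20, 8))
group_b = rng.normal(loc=1.0, scale=0.1, size=(20, 8))
layer1 = np.vstack([group_a, group_b])
layer2 = np.vstack([group_a[:, :4] * 2, group_b[:, :4] * 2])

vectors = perception_vectors([layer1, layer2])
points = pca_2d(vectors)  # one 2D point per sample, ready for a scatter plot
```

Samples that activate similar features land close together in `points`, which is the property a similarity map between samples relies on.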
In this section, we will perform a Population Exploration analysis. For more information, see Population Exploration.
After each epoch in the training process, a Population Exploration analysis is performed automatically, the results of which can be found under the Analysis panel in the Dashboard view.
Each dot represents a sample (hover your mouse over it to see the sample's preview). By default, the dot's size and color represent the sample's loss. This can be changed to fit your preferences. For example, we can change the color to represent the ground truth label, metadata_gt. Doing so makes the clusters formed for each label easy to see. We can also see which label clusters are perceived as similar to one another, and which failing samples fall outside the right cluster.
Color Dots by Ground Truth
In the default view, the dot size represents the loss (error). A large dot is highly correlated with a failed prediction for that sample.
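To see why a large loss tends to go with a failed prediction, here is a minimal sketch that assumes a two-class output and a categorical cross-entropy loss. The probabilities are made up for illustration and `categorical_crossentropy` is a hand-written helper, not a platform API. With two classes, a wrong prediction means the true class received a probability of at most 0.5, so its loss is at least ln 2 (about 0.693).

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred), axis=1)

# Hypothetical outputs for three reviews; columns are [positive, negative].
y_true = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y_pred = np.array([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]])

losses = categorical_crossentropy(y_true, y_pred)
# A prediction is correct when the most probable class matches the label.
correct = y_true.argmax(axis=1) == y_pred.argmax(axis=1)
```

In this toy data only the second review is misclassified, and it is also the one with the largest loss.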
Exploring some samples with high loss reveals possible ambiguities and mislabeling. Once a sample dot is clicked, its details are displayed on the right. In the example below, we see a sample with high loss, located within the red cluster, which corresponds to positive reviews. This sample's ground truth is marked as negative, but reading the review shows it is clearly a positive one.
Focus on Positive Review marked as Negative
Below is another interesting sample, which is labelled as positive but predicted as negative.
Sample Ground Truth vs Prediction
Note that due to the randomness of the initial weights, your model could converge to a different state, thus rendering different results.
Analysis of a sample returns results from a variety of explainability algorithms. Details about these algorithms can be found at Sample Analysis.
Click a sample dot to select it, then click the button at the bottom right to analyze the selected sample.
By checking the most informative features that contribute to the prediction or the loss function, we can rank the features by impact and generate a heat map that highlights the areas of the sample that activate the most impactful features. These areas contribute the most to the prediction or to the loss.
Heat map for Negative
Ground Truth
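One simple way to build intuition for such heat maps is occlusion: mask one word at a time and record how much the output changes. The sketch below is only an illustration under assumptions. It does not reproduce the explainability algorithms the platform uses, and `WORD_WEIGHTS`, `positive_probability`, and `occlusion_heat_map` are hypothetical names standing in for a trained model.

```python
import numpy as np

# Hypothetical word weights standing in for a trained sentiment model.
WORD_WEIGHTS = {"great": 2.0, "boring": -2.5, "vampire": -1.0}

def positive_probability(tokens):
    """Toy model: sigmoid of the summed word weights."""
    score = sum(WORD_WEIGHTS.get(token, 0.0) for token in tokens)
    return 1.0 / (1.0 + np.exp(-score))

def occlusion_heat_map(tokens, mask_token="<pad>"):
    """Score each token by how much masking it changes the positive output."""
    baseline = positive_probability(tokens)
    heat = []
    for i in range(len(tokens)):
        occluded = tokens[:i] + [mask_token] + tokens[i + 1:]
        # Positive heat: the word pushed the output toward "positive".
        heat.append(baseline - positive_probability(occluded))
    return np.array(heat)

review = ["a", "great", "vampire", "movie"]
heat = occlusion_heat_map(review)
```

Here `heat` is positive for "great", negative for "vampire", and zero for the words the toy model ignores. A real model would be queried the same way, one occluded input per word.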
Another sample with high loss is about a vampire movie.
Sample with High Loss
After selecting the sample, scroll down to the sample's Details panel on the right and click
Once the analysis completes, we can explore the model's response to the sample. In the example, we can see that the model's prediction was negative, even though the ground truth is positive.
Ground Truth
Clicking on feature_map on the right will show a heat map correlated with each output. This shows what areas in the input had a high impact on the output.
Heat map for Positive Output
Heat map for Negative Output
From the analysis above, we see the words that get high attention for each input.
A few insights we might get about our model:
- It pays high attention to individual words in isolation from the rest of the sentence or their neighboring words. This might point to a limited contextual understanding.
- It has high attention to positive words, e.g.,
- The model gives high significance to negative words, e.g.,
- It has a bias against vampire movies. There is a lot of negative attention on words like
Up until now, we have used a model with Dense layers. Another approach is to build a model based on convolutional layers, i.e., a CNN (Convolutional Neural Network).
The motivation is that in some cases, convolutional layers capture the spatial relationships between words better.
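The following sketch illustrates that motivation with a hand-built example. Everything in it is an assumption made for illustration: the 2-dimensional embeddings, the hand-tuned kernel, and the averaged-embedding summary (which is not the dense model from this guide, only an example of an order-insensitive representation). It shows that a width-2 convolution can respond to "not" immediately followed by "good", while an average over word embeddings cannot tell the two word orders apart.

```python
import numpy as np

# Hypothetical 2-d embeddings; real embeddings are learned during training.
EMBEDDINGS = {"not": np.array([1.0, 0.0]), "good": np.array([0.0, 1.0])}
ZERO = np.zeros(2)

def embed(tokens):
    """Look up one embedding row per token (unknown words map to zeros)."""
    return np.stack([EMBEDDINGS.get(t, ZERO) for t in tokens])

def pooled_features(tokens):
    """Order-insensitive summary: average of the word embeddings."""
    return embed(tokens).mean(axis=0)

def conv1d_max(tokens, kernel, bias):
    """Convolution over adjacent words, ReLU, then global max pooling."""
    x = embed(tokens)
    width = kernel.shape[0]
    responses = [
        np.sum(kernel * x[t:t + width]) + bias
        for t in range(len(tokens) - width + 1)
    ]
    return max(0.0, max(responses))

# Kernel tuned by hand to fire on "not" immediately followed by "good".
kernel = np.array([[1.0, 0.0], [0.0, 1.0]])
bias = -1.0

adjacent = ["this", "is", "not", "good"]
shuffled = ["good", "this", "is", "not"]

same_summary = np.allclose(pooled_features(adjacent), pooled_features(shuffled))
conv_adjacent = conv1d_max(adjacent, kernel, bias)
conv_shuffled = conv1d_max(shuffled, kernel, bias)
```

Both sentences contain the same words, so `same_summary` is true, yet the convolution fires only when "not" and "good" are neighbors. In a trained model the kernel weights are learned rather than set by hand.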
The model presented in this section is based on the tensorleap_conv_model() found in Tensorleap's Examples Repository, specifically model_infer.py.
Although we could build and push the model the same way we handled the dense model, we will use Import Model this time.
For your convenience, you can download the model below.
After downloading the serialized model, follow these steps to import it:
- 1. Click on the left side of the Network view.
- 2. Set Revision Name to mnist-cnn, Model Name to mnist-cnn-imported, and File Type to H5_TF2, then select the model above for upload and click.
- 3. After the import is finished, you will see the added version on the left. You should see the model with the Conv1D layers in the Network view.
- 4. Click the Dataset Block, then on the Dataset Details panel, connect it to the imdb dataset instance. Then connect the block to the first Embedding layer.
- 5. Near the last layer, right-click on the background, select GroundTruth, then click and select Ground Truth - sentiment.
- 6. Right-click on the background and select Loss->CategoricalCrossentropy, and Optimizer->Adam.
- 7. Connect the GroundTruth and the last Dense layer to the CategoricalCrossentropy block, and connect that block to the Adam optimizer block.
- 8. Click on the Network view, select the Override Current Version checkbox, and click.
The image below shows the CNN model after completing the steps above.
Loss, Optimizer and Visualizers
In this section, we will run a sample analysis on the imported CNN model.
Follow these steps:
- 1. In the Versions view, expand the mnist-cnn version and click the button to add it to the Dashboard view.
- 2. Click, choose mnist-cnn-imported from the right, and for Selected Subset, select Validation. This step collects metrics about the data in order to perform the analysis.
- 3. In the Dashboard view, click, then.
- 4. Set Dataset Slice to Validation and Sample Index to 235, then click Analyze.
The images below show the resulting heat maps for
Heat map for Positive Output (CNN)
Heat map for Negative Output (CNN)
The CNN model pays particular attention to sentences and neighboring words. This enables the model to perceive additional context in each review.
In this section, we demonstrated how Tensorleap performs population exploration analysis and sample analysis using dense and CNN models.
Next, we will add custom metadata to help us find more correlations in our samples and model. When you're ready, go to Advanced Metrics.