Model Perception Analysis

To perform this analysis, go to the Dashboard view on the right, and select Analyzer from the list at the top.

Population Exploration Analysis

The Tensorleap platform tracks how each learned feature, within each layer, responds to each sample. From that information, it constructs a vector that captures how the model perceives each sample. This allows the platform to create a similarity map between samples as they are interpreted by the model. A more intuitive explanation is that similar samples would activate similar learned features within the model.
For more information on model evaluation and training, see Evaluate/Train Model.
Population Exploration Analysis takes these vectors and runs dimension reduction algorithms in order to visualize the similarities in the UI and provide insights on how various samples are perceived by the model.
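As a rough illustration of the idea, the sketch below projects per-sample activation vectors onto two dimensions so that samples with similar activations land close together. It uses PCA computed by power iteration as a stand-in; the text does not say which dimension reduction algorithms the platform runs. The activation vectors and all function names here are hypothetical.

```python
import math
import random

def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def top_component(rows, iters=200, seed=0):
    """Leading principal direction of centered rows, found by power iteration."""
    rng = random.Random(seed)
    d = len(rows[0])
    v = [rng.gauss(0, 1) for _ in range(d)]
    for _ in range(iters):
        # Multiply by X^T X without forming the covariance matrix.
        proj = [sum(r[j] * v[j] for j in range(d)) for r in rows]
        w = [sum(proj[i] * rows[i][j] for i in range(len(rows))) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w)) or 1.0
        v = [x / norm for x in w]
    return v

def project_2d(vectors):
    """Project vectors onto their first two principal directions."""
    x = center(vectors)
    d = len(x[0])
    v1 = top_component(x)
    s1 = [sum(r[j] * v1[j] for j in range(d)) for r in x]
    # Deflate: remove the first direction, then search for the next one.
    x2 = [[r[j] - s * v1[j] for j in range(d)] for r, s in zip(x, s1)]
    v2 = top_component(x2, seed=1)
    s2 = [sum(r[j] * v2[j] for j in range(d)) for r in x2]
    return list(zip(s1, s2))

# Hypothetical activation vectors: two groups of samples that
# activate different learned features, plus a little noise.
rng = random.Random(42)

def noisy(center_vec):
    return [c + rng.gauss(0, 0.1) for c in center_vec]

group_a = [noisy([1.0, 0.0, 0.0, 0.0]) for _ in range(10)]
group_b = [noisy([0.0, 1.0, 0.0, 0.0]) for _ in range(10)]
points = project_2d(group_a + group_b)
```

In this toy setup the two groups end up on opposite sides of the first axis, which is the kind of cluster separation the scatter plot in the UI makes visible.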
In this section, we will perform a Population Exploration analysis. For more information, see Population Exploration.
After each epoch in the training process, a Population Exploration analysis is performed automatically, the results of which can be found under the Analysis panel in the Dashboard view.
Each dot represents a sample (hover over a dot to preview the sample). By default, the dot's size and color represent the sample's loss, but this can be changed to fit your preferences. For example, we can set the color to represent the ground truth label, metadata_gt. This makes the clusters formed for each label easy to see. It also shows which label clusters the model perceives as similar to one another, and which failing samples fall outside their expected cluster.
Color Dots by Ground Truth
In the default view, the dot size represents the loss (error). Large dot size is highly correlated with a failed prediction for that sample.
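To see why a large loss tends to go with a failed prediction, here is a minimal sketch of a per-sample categorical cross-entropy. The class order and the probability values are made up for illustration and are not taken from the tutorial's model.

```python
import math

def categorical_crossentropy(one_hot_truth, predicted_probs, eps=1e-12):
    """Cross-entropy between a one-hot label and predicted class probabilities."""
    return -sum(
        t * math.log(max(p, eps))
        for t, p in zip(one_hot_truth, predicted_probs)
    )

# Hypothetical class order: [negative, positive]. All three samples are
# labeled positive; only the predicted probabilities differ.
samples = {
    "confident_correct": ([0.0, 1.0], [0.05, 0.95]),
    "uncertain_correct": ([0.0, 1.0], [0.45, 0.55]),
    "confident_wrong": ([0.0, 1.0], [0.90, 0.10]),
}

losses = {
    name: categorical_crossentropy(truth, probs)
    for name, (truth, probs) in samples.items()
}

for name, loss in losses.items():
    print(f"{name}: {loss:.3f}")
```

The confidently wrong sample gets by far the largest loss, so it would be drawn as the largest dot in the default view.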

Mislabeled Sample

Exploring some samples with high loss reveals possible ambiguities and mislabeling. Once a sample dot is clicked, its details are displayed on the right. In the example below, we see a sample with high loss, located within the red cluster, which correlates to positive reviews. This sample's ground truth is marked as negative, but reading the review shows it is clearly a positive one.
Mislabeled Sample
Focus on Positive Review marked as Negative
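The manual inspection described above can be approximated in code. The sketch below flags samples that have a high loss and whose nearest neighbors in the 2D map mostly carry a different label. This is a simple heuristic written for illustration, not a platform feature; the coordinates, labels, losses, and threshold are hypothetical.

```python
import math
from collections import Counter

def neighbor_label_vote(points, labels, index, k=3):
    """Most common label among the k nearest neighbors of points[index]."""
    others = [
        (math.dist(points[index], p), labels[i])
        for i, p in enumerate(points)
        if i != index
    ]
    others.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in others[:k])
    return votes.most_common(1)[0][0]

def suspicious_samples(points, labels, losses, loss_threshold, k=3):
    """Indices of high-loss samples whose neighbors disagree with their label."""
    flagged = []
    for i in range(len(points)):
        if losses[i] < loss_threshold:
            continue
        if neighbor_label_vote(points, labels, i, k) != labels[i]:
            flagged.append(i)
    return flagged

# Hypothetical 2D map: a positive cluster near (1, 1), a negative cluster
# near (-1, -1), and one "negative" sample sitting inside the positive cluster.
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (0.95, 1.05),
          (-1.0, -1.0), (-1.1, -0.9), (-0.9, -1.2)]
labels = ["positive", "positive", "positive", "negative",
          "negative", "negative", "negative"]
losses = [0.1, 0.2, 0.1, 2.5, 0.1, 0.3, 0.2]

flagged = suspicious_samples(points, labels, losses, loss_threshold=1.0)
```

A flagged sample is only a candidate: it may be mislabeled, or it may be a genuinely hard sample, so it still needs to be read by a person.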

Failing Samples

Below is another interesting sample, which is labeled as positive but predicted as negative.
Sample Ground Truth vs Prediction
Note that due to the randomness of the initial weights, your model could converge to a different state, thus rendering different results.

Sample Analysis

Analysis of a sample returns results from a variety of explainability algorithms. Details about these algorithms can be found at Sample Analysis.
Click a sample dot to select it, then click the button at the bottom right to analyze the selected sample.
By examining the most informative features that contribute to the prediction or to the loss function, we can rank the features by impact and generate a heat map that highlights the areas of the sample that activate those features. These are the areas that contribute the most to the prediction or to the loss.
Heat Map for Negative
Ground Truth
Prediction
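One simple way to build such a heat map for text is occlusion: remove one token at a time and record how much the output changes. The sketch below applies this to a toy bag-of-words scorer. The explainability algorithms the platform actually uses are documented under Sample Analysis and may differ; the word weights and function names here are hypothetical.

```python
import math

# Hypothetical per-word weights standing in for a trained sentiment model.
WORD_WEIGHTS = {"enjoyable": 2.0, "cool": 1.0, "bad": -2.0, "slow": -1.0}

def positive_score(tokens):
    """Probability of the positive class under the toy scorer."""
    logit = sum(WORD_WEIGHTS.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-logit))

def occlusion_heat_map(tokens):
    """Change in the positive score when each token is removed, one value per token."""
    base = positive_score(tokens)
    heat = []
    for i in range(len(tokens)):
        occluded = tokens[:i] + tokens[i + 1:]
        heat.append(base - positive_score(occluded))
    return heat

review = "the plot was slow but the cast was enjoyable".split()
heat = occlusion_heat_map(review)

for token, value in zip(review, heat):
    print(f"{token:>10} {value:+.3f}")
```

Positive values mark tokens that push the output toward the positive class, negative values mark tokens that push it away, and tokens with no influence score zero.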
Another sample with high loss is about a vampire movie.
Sample with High Loss
After selecting the sample, scroll down to the sample's Details panel on the right and click the button that starts the analysis.
Once the analysis completes, we can explore the model's response to the sample. In the example, we can see that the model's prediction was negative, even though the ground truth is positive.
Prediction
Ground Truth
Clicking on feature_map on the right will show a heat map correlated with each output. This shows what areas in the input had a high impact on the output.
Heat map for Positive Output
Heat map for Negative Output
From the analysis above, we see the words that get high attention for each input.
A few insights we might get about our model:
  • It gives high attention to individual words in isolation, without regard to the rest of the sentence or to neighboring words. This might point to a limited contextual understanding.
  • It gives high attention to positive words, e.g., enjoyable, definitely, cool, and likeable.
  • It gives high attention to negative words, e.g., disappointed, unfortunately, bad, and slow.
  • It has a bias against vampire movies. There is a lot of negative attention on words like vampire, vampires, and Dracula.
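An insight such as a bias against a topic is easier to check when heat map values are aggregated over many samples. The sketch below averages per-token attribution scores across reviews and lists the most negative words. The scores are invented purely to show the mechanics and are not results from the tutorial's model.

```python
from collections import defaultdict

def mean_attribution_per_word(attributed_reviews):
    """Average attribution score of each word over all reviews it appears in."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tokens, scores in attributed_reviews:
        for token, score in zip(tokens, scores):
            word = token.lower()
            totals[word] += score
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}

# Invented (tokens, attribution scores) pairs, one pair per review.
attributed_reviews = [
    (["vampire", "movie", "enjoyable"], [-0.4, 0.0, 0.5]),
    (["Vampire", "story", "cool"], [-0.6, 0.0, 0.3]),
    (["slow", "movie"], [-0.2, 0.0]),
]

means = mean_attribution_per_word(attributed_reviews)
most_negative = sorted(means, key=means.get)[:2]
```

A topic word that ranks among the most negative words regardless of the review's sentiment is a hint that the model may have learned a spurious association.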

Comparison to CNN

Up until now, we have used a model with Dense layers. Another approach is to build a model based on convolutional layers, i.e., a CNN (Convolutional Neural Network).
The motivation is that in some cases, the convolutional layers will catch the spatial connections between words better.
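The sketch below illustrates this motivation with a deliberately simplified comparison. Mean pooling over word embeddings ignores word order, while a 1D convolution over the word axis responds to specific neighboring words. The embeddings and the kernel are hypothetical, and the mean-pooling function is a stand-in for an order-insensitive model, not the dense model used earlier in this tutorial.

```python
def mean_pool(embeddings):
    """Average the word embeddings; the result does not depend on word order."""
    n = len(embeddings)
    dim = len(embeddings[0])
    return [sum(e[j] for e in embeddings) / n for j in range(dim)]

def conv1d(embeddings, kernel):
    """Valid 1D convolution over the word axis with a single filter.

    kernel[k][j] weights dimension j of the k-th word in each window.
    """
    width = len(kernel)
    outputs = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        outputs.append(sum(
            kernel[k][j] * window[k][j]
            for k in range(width)
            for j in range(len(window[k]))
        ))
    return outputs

# Hypothetical 2-dimensional word embeddings.
EMB = {"not": [1.0, 0.0], "good": [0.0, 1.0], "movie": [0.0, 0.0]}

# A filter that fires when "not" is immediately followed by "good".
kernel = [[1.0, 0.0], [0.0, 1.0]]

review_a = [EMB[w] for w in "not good movie".split()]
review_b = [EMB[w] for w in "good movie not".split()]

pooled_a, pooled_b = mean_pool(review_a), mean_pool(review_b)
conv_a, conv_b = conv1d(review_a, kernel), conv1d(review_b, kernel)
```

Both reviews contain the same words, so their pooled vectors are identical, but only the first review triggers the filter because only there does "not" directly precede "good".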

Importing the CNN

The model presented in this section is based on the tensorleap_conv_model() found in Tensorleap's Examples Repository, specifically
Although we can build/push the model similar to how we handled the dense model, we will use Import Model this time.
For your convenience, you can download the model below.
After downloading the serialized model, follow these steps to import it:
  1. Start the import from the left side of the Network view.
  2. Set Revision Name to mnist-cnn, Model Name to mnist-cnn-imported, and File Type to H5_TF2. Select the downloaded model file for upload and confirm.
  3. After the import is finished, the added version appears on the left. The Network view should show the model with its Conv1D layers.
  4. Click the Dataset Block, then, on the Dataset Details panel, connect it to the imdb dataset instance. Then connect the block to the first Embedding layer.
  5. Near the last layer, right-click on the background, select GroundTruth, then click and select Ground Truth - sentiment.
  6. Right-click on the background and select Loss->CategoricalCrossentropy, and Optimizer->Adam.
  7. Connect the GroundTruth and the last Dense layer to the CategoricalCrossentropy block, and connect that block to the Adam optimizer block.
  8. In the Network view, select the Override Current Version checkbox and confirm.
  9. Add the Visualizers as previously described in Model Integration.
The image below shows the CNN model after completing the steps above.
Loss, Optimizer and Visualizers
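Conceptually, the wiring in steps 5 to 7 mirrors an ordinary training loop: the ground truth and the output of the last Dense layer feed the loss, and the gradient of the loss feeds the optimizer. The sketch below shows this flow for a single sample with a categorical cross-entropy loss and a hand-written Adam update. For brevity it treats the two output logits themselves as the trainable parameters, which is a simplification and not how the imported model is trained.

```python
import math

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def crossentropy(truth, probs):
    """Categorical cross-entropy for a one-hot truth vector."""
    return -sum(t * math.log(p) for t, p in zip(truth, probs))

def adam_step(params, grads, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias-corrected first and second moment estimates."""
    state["t"] += 1
    t = state["t"]
    updated = []
    for i, (p, g) in enumerate(zip(params, grads)):
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        updated.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return updated

truth = [0.0, 1.0]    # hypothetical one-hot ground truth: [negative, positive]
logits = [1.0, -1.0]  # the "model output" starts out confidently wrong
state = {"t": 0, "m": [0.0, 0.0], "v": [0.0, 0.0]}

initial_loss = crossentropy(truth, softmax(logits))
for _ in range(20):
    probs = softmax(logits)
    # Gradient of softmax + cross-entropy with respect to the logits.
    grads = [p - t for p, t in zip(probs, truth)]
    logits = adam_step(logits, grads, state)
final_loss = crossentropy(truth, softmax(logits))
```

After a few updates the loss drops and the larger logit moves to the positive class, which is the behavior the Loss and Optimizer blocks drive during training.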

Sample Analysis (CNN)

In this section, we will run a sample analysis on the imported CNN model.
Follow these steps:
  1. On the Versions view, expand the mnist-cnn version and click the button that adds it to the Dashboard view.
  2. Choose mnist-cnn-imported from the right and, for Selected Subset, select Validation. This step collects metrics about the data in order to perform the analysis.
  3. In the Dashboard view, open the sample analysis options.
  4. Set Dataset Slice to Validation and Sample Index to 235, then click Analyze.
The images below show the resulting heat maps for positive and negative reviews.
Heat map for Positive Output (CNN)
Heat map for Negative Output (CNN)
The CNN model pays particular attention to sentences and neighboring words. This enables the model to perceive additional context in each review.

Up Next - Advanced Metrics

In this section, we demonstrated how Tensorleap performs population exploration analysis and sample analysis, using a dense model and a CNN model.
Next, we will add custom metadata to help us find more correlations in our samples and model. When you're ready, go to Advanced Metrics.