# Image Analysis

## The MNIST Dataset

In this example, we used the [**MNIST dataset**](http://yann.lecun.com/exdb/mnist/) (handwritten digits) to train a convolutional neural network (CNN) for image classification, and reached a validation accuracy of \~0.98 on our trained model.

{% hint style="info" %}
For more information on using Tensorleap with the MNIST dataset, see [**MNIST Guide**](/guides/full-guides/mnist-guide.md).
{% endhint %}

### Population Exploration <a href="#population-exploration" id="population-exploration"></a>

Below is a **population exploration** plot: a sample similarity map based on the model's latent space, built using the features extracted from the trained model.
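As a rough illustration of how a similarity map can be derived from latent features, the sketch below projects feature vectors onto two dimensions with PCA. This is an assumption made for illustration only: the page does not state which projection Tensorleap uses, and `project_to_2d` and the random `latent` array are hypothetical stand-ins for real extracted features.

```python
import numpy as np


def project_to_2d(features: np.ndarray) -> np.ndarray:
    """Project (n_samples, n_features) vectors onto their top two principal components."""
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Hypothetical stand-in for latent features extracted from a trained model.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 64))

coords = project_to_2d(latent)
print(coords.shape)  # (200, 2)
```

Each row of `coords` could then be drawn as one dot, colored by class and sized by loss, as in the plots in this section.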

#### Training Set:

The samples shown below were taken from the **training set** and are colored based on their **ground truth** class. The size of each dot represents the model's loss for that sample. The samples are clearly separated by class.

![Population Exploration Analysis of Training Samples](/files/uDRoX1meEF5cRYNHJv25)

#### Validation Set:

The plot below is a similarity map of the **validation set**. The classes are still well separated, but quite a few samples have a higher loss (larger dots). Interestingly, these misclassified samples are located within clusters of classes other than their **ground truth** class.

![Population Exploration of the Validation Set](/files/DS7KZkfkWLJq3QPttXRX)

#### Similar Classes

In the visualization below we can see that the samples from class **7** *(peach-colored)* and class **1** *(light-blue)* are relatively close to each other. This makes sense since the digit **7** is visually similar to the digit **1**.

![Close Similar Clusters of Classes 1 and 7](/files/JAn0I2L2qG0oTaCKV2YG)

### Error Analysis <a href="#error-analysis" id="error-analysis"></a>

The dot size in the visualization below represents the sample's loss. The selected sample is a challenging one and has a relatively high loss. The sample's metadata, shown in the right panel, gives a distance of 6.83 to class `6` and 7.71 to the ground truth class `5`, and the model's prediction was indeed class `6`.

![Prediction: 6](/files/LDoZ6gsvd0shmfmqa9zY) ![GT:5](/files/PlqijjbjY1OKSgORDMuK)

### Sample Analysis <a href="#sample-analysis" id="sample-analysis"></a>

The Tensorleap platform provides a way to further explore the model's response to specific data samples. For example, from the results below, we can see that the model's prediction was **9** while the ground truth is **8**. Additionally, we can use the heat map to see which features of the data contributed to each of the output classes.

![Heat Maps Correlated for each Output](/files/GCgMX7u87RiLAe7xLNmr)

### Fetch Similars <a href="#fetch-similars" id="fetch-similars"></a>

As explained, Tensorleap tracks how each learned feature responds to each sample. This allows finding samples that the model considers similar, as they *activate* similar learned features.

This functionality also makes it possible to automatically find **candidates for labeling** and to detect **ambiguities** in the data and **mislabeled** samples.
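The idea of finding samples that activate similar learned features can be sketched as a nearest-neighbor search over feature vectors. The snippet below is a minimal illustration using cosine similarity. It is not Tensorleap's implementation, and `fetch_similar` and the toy feature matrix are hypothetical.

```python
import numpy as np


def fetch_similar(features: np.ndarray, query_index: int, k: int = 5) -> np.ndarray:
    """Return indices of the k samples whose feature vectors are most similar to the query."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    normed = features / np.clip(norms, 1e-12, None)  # guard against zero vectors
    sims = normed @ normed[query_index]  # cosine similarity to the query
    sims[query_index] = -np.inf  # exclude the query itself
    return np.argsort(-sims)[:k]


# Toy feature vectors (assumed values, for illustration only).
toy = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
neighbors = fetch_similar(toy, query_index=0, k=2)
print(neighbors)  # [1 2]
```

Samples that rank high for a query but carry a different label are natural candidates to review for ambiguity or mislabeling.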

#### Fetch Similars to a Sample - Example

In the example below, we ran a **Fetch Similars** operation to find samples similar to the sample <img src="/files/TzmvsIzQGBCBeco0FLiT" alt="" data-size="line">. The result is a cluster of similar samples:

![Fetch Similar Results](/files/pOs3zMVFAYrRx80xUDLQ) ![Similar Samples](/files/QWZZBOYUU7YN17dpeX4r)

#### Example - Investigating the "Closed" 4 Digit

The MNIST dataset includes samples of the digit 4 written in two ways: <img src="/files/8W6JeyXIaTG9S0F3YkIO" alt="" data-size="line"> and <img src="/files/PF6BVJ02LLLzEKrsSnkK" alt="" data-size="line">. \
Running the Fetch Similars operation to find samples that are similar to <img src="/files/8W6JeyXIaTG9S0F3YkIO" alt="" data-size="line"> returns similar samples, presented with heat maps that highlight the features that define these similarities.

![Similar Samples Heat Map](/files/XESbIzFUyRoin1AtcccJ)

In the digit 4 class, the majority of the samples are written in the open form <img src="/files/PF6BVJ02LLLzEKrsSnkK" alt="" data-size="line">. The operation detected four samples that are similar to the closed form <img src="/files/8W6JeyXIaTG9S0F3YkIO" alt="" data-size="line">, while the other samples are mostly the digit `9`. This provides us with useful insights about the model and the data, and about how to improve them. For example, we can:

* Separate the 4 class into two classes - open and closed digits.
* Enrich the data with additional samples of closed digits.
* Remove the closed digits from the dataset and support only the open form.

A similar analysis can be done for the two ways of writing the digit seven: <img src="/files/pdIaEB5u1BbFQDuGk5Pj" alt="" data-size="line"> and <img src="/files/cChMn3mnY1GU3dai2GkG" alt="" data-size="line">.

### Metrics Analysis <a href="#results-analysis" id="results-analysis"></a>

The Tensorleap Dashboard allows users to visualize the training progress and performance metrics of their models. Additionally, users can explore correlations and trends between various metadata to gain insights into their data.

**Average Euclidean Distance Metadata vs Average Loss**

For each class, a 28x28 **mean image** is calculated by averaging each pixel across the samples of that class. For each sample, the Euclidean distance from the **mean image** of its class is then calculated and stored as metadata.
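The metadata described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the integration code used for this example: `distance_from_class_mean` is a hypothetical helper, and the toy images are made up to keep the arithmetic easy to check.

```python
import numpy as np


def distance_from_class_mean(images: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Euclidean distance of each (28, 28) image from the mean image of its class."""
    images = images.astype(np.float64)
    distances = np.empty(len(images))
    for cls in np.unique(labels):
        mask = labels == cls
        mean_image = images[mask].mean(axis=0)  # per-pixel average over the class
        diff = images[mask] - mean_image
        distances[mask] = np.sqrt((diff ** 2).sum(axis=(1, 2)))
    return distances


# Toy data: class 0 holds an all-0 and an all-2 image, class 1 holds two identical images.
images = np.zeros((4, 28, 28))
images[1] = 2.0
images[2] = images[3] = 5.0
labels = np.array([0, 0, 1, 1])

dist = distance_from_class_mean(images, labels)
print(dist)  # [28. 28.  0.  0.]
```

In the toy data, the class 0 mean image is all ones, so both class 0 samples differ from it by 1 at each of the 784 pixels, giving a distance of sqrt(784) = 28.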

The Dashboard allows us to visualize and investigate each metadata field interactively. The visualization below shows the average loss per class versus the Euclidean distance.

![Average Euclidean Distance vs Average Loss (click-to-zoom)](/files/NWIW2NLfmG0YrA92s3hc)

This plot reveals a strong correlation between the Euclidean distance and the average loss: as the Euclidean distance increases, so does the loss. This is expected, as the model is less capable of generalizing to highly irregular samples.
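One way to quantify a relationship like this outside the Dashboard is to average both quantities per class and compute their Pearson correlation. The sketch below assumes per-sample distances, losses, and labels are available as arrays. The helper names and the toy values are hypothetical and are not results from this example.

```python
import numpy as np


def per_class_mean(values: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Average `values` within each class, ordered by sorted class label."""
    return np.array([values[labels == c].mean() for c in np.unique(labels)])


def distance_loss_correlation(distances, losses, labels) -> float:
    """Pearson correlation between per-class mean distance and per-class mean loss."""
    mean_dist = per_class_mean(distances, labels)
    mean_loss = per_class_mean(losses, labels)
    return float(np.corrcoef(mean_dist, mean_loss)[0, 1])


# Made-up toy values in which loss grows linearly with distance.
labels = np.array([0, 0, 1, 1, 2, 2])
distances = np.array([1.0, 3.0, 4.0, 6.0, 7.0, 9.0])
losses = 0.1 * distances

r = distance_loss_correlation(distances, losses, labels)
```

A value of `r` close to 1 indicates that classes whose samples lie farther from their mean image also tend to have a higher average loss.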

**Loss and Accuracy Visualizations**

During training, metrics are collected and presented in the Dashboard.\
For example, the visualization below shows the model's Loss (left) and Accuracy (right) vs. batches:

![Loss/Accuracy vs Batch](/files/VanITKqcyFOJYhhGohZu)

As we can see, the model converges well and achieves an accuracy of \~0.98 on the **validation** set.

## Summary

The **Tensorleap** platform provides powerful tools for analyzing and understanding deep learning models. This example presented only a few of the types of insights that can be gained using the platform.

For more information, please refer to our additional [**Examples**](/tensorleap-integration/writing-integration-code/examples.md) and [**Guides**](/guides.md).

