# Active Learning

### What Is Active Learning and Why It Matters

**Active learning** is a training strategy where the model actively guides which data should be labeled next, rather than relying on random or exhaustive annotation. In real-world systems, large portions of data are redundant, easy, or already well understood by the model, while a small subset of samples drives most errors and uncertainty.
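As a rough illustration of how a model can guide what gets labeled next, the sketch below ranks unlabeled samples by predictive entropy, a classic uncertainty-sampling heuristic from the active learning literature. It is not TensorLeap's selection method; the function names and the softmax outputs are hypothetical and exist only for this example.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def most_uncertain(predictions, budget):
    """Return indices of the `budget` samples with the highest predictive entropy."""
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:budget]

# Hypothetical softmax outputs for four unlabeled samples (three classes each).
predictions = [
    [0.98, 0.01, 0.01],  # confident prediction: little to gain from a label
    [0.40, 0.35, 0.25],  # uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # close to uniform: most uncertain
]

selected = most_uncertain(predictions, budget=2)
print(selected)  # indices of the two samples to send for labeling first
```

Uncertainty alone tends to pick near-duplicate hard samples, which is why selection is usually combined with a diversity criterion.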

**TensorLeap enables this process** by automatically analyzing model representations to identify a diverse and informative subset of unlabeled data to prioritize for labeling. This allows teams to focus labeling resources on under-represented and impactful regions, reducing annotation cost and time.

### How TensorLeap Helps Prioritize Labeling and Scene Selection

TensorLeap’s latent space representation powers dataset curation by automatically selecting and suggesting samples from an unlabeled dataset for labeling.

<figure><img src="https://3509361326-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F9UXeOlFqlw8pl79U2HGU%2Fuploads%2Fgit-blob-a92aa3f9e45151034cbfe9fdec85a60c1247126a%2Factive-%3Dlearning-walt.gif?alt=media" alt=""><figcaption></figcaption></figure>

The resulting selection shows that even when the unlabeled data is densely concentrated (yellow circles), TensorLeap identifies samples that expand coverage across the full problem space (blue circles).

<figure><img src="https://3509361326-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F9UXeOlFqlw8pl79U2HGU%2Fuploads%2Fgit-blob-5f58bd2844a82ba48c7fcee1b8882983415e78a2%2FScreenshot%202025-12-14%20at%2014.06.59.png?alt=media" alt=""><figcaption></figcaption></figure>
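To make the idea of coverage-driven selection concrete, here is a minimal sketch of greedy k-center (farthest-point) selection over embedding vectors, one common way to pick a diverse subset from a dense cloud of points. This is an assumption chosen for illustration, not a description of TensorLeap's internal algorithm; `k_center_greedy`, `seed_index`, and the 2-D embeddings are hypothetical.

```python
import math

def k_center_greedy(embeddings, budget, seed_index=0):
    """Greedy farthest-point selection.

    Starting from `seed_index`, repeatedly pick the point that is farthest
    from every point selected so far, until `budget` points are chosen.
    """
    selected = [seed_index]
    # Distance from each point to its nearest already-selected point.
    min_dist = [math.dist(e, embeddings[seed_index]) for e in embeddings]
    while len(selected) < min(budget, len(embeddings)):
        next_index = max(range(len(embeddings)), key=lambda i: min_dist[i])
        selected.append(next_index)
        min_dist = [
            min(d, math.dist(e, embeddings[next_index]))
            for d, e in zip(min_dist, embeddings)
        ]
    return selected

# Hypothetical 2-D embeddings: a dense cluster near the origin plus two outliers.
embeddings = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),  # dense region
    (5.0, 5.0),                                       # sparse region
    (-4.0, 3.0),                                      # sparse region
]

selected = k_center_greedy(embeddings, budget=3)
print(selected)  # one point from the dense cluster, then both outliers
```

With a budget of three, the selection spends a single sample on the dense cluster and the rest on sparse regions, which mirrors the yellow and blue circles in the figure above.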

By combining active learning principles with TensorLeap’s latent space analysis, dataset curation becomes a fast, data-driven process rather than a manual trial-and-error effort. Instead of relying on intuition or random sampling, teams can systematically prioritize the most impactful unlabeled samples—improving coverage of the problem space, reducing redundancy, and making more efficient use of labeling resources. This results in faster iteration cycles and more reliable performance improvements with fewer labeled samples.

|                                             | Manual Approach                                           | With TensorLeap                                                       |
| ------------------------------------------- | --------------------------------------------------------- | --------------------------------------------------------------------- |
| **Sample selection strategy**               | Hand-picked, random, or intuition-based sampling          | Automatic, model-aware, based on learned representations              |
| **Data redundancy**                         | High redundancy in labeled data                           | Diverse samples that expand coverage of the data space                |
| **Coverage of edge cases**                  | Hard to identify, heavily dependent on domain experts     | Automatic, model-specific discovery                                   |
| **Automation level**                        | Requires manual analysis and repeated trial-and-error     | Automated dataset curation with minimal user effort                   |
| **Labeling efficiency**                     | Inefficient use of labeling budget                        | Focused labeling that maximizes impact per sample                     |
| **Use of metadata**                         | Heavy reliance on predefined metadata and heuristics      | Metadata used as complementary signal alongside model representations |
| **Choosing the number of samples to label** | Arbitrary, often over- or undershooting model needs       | Can be automatically inferred                                         |

### Active Learning Video Tutorial

{% embed url="<https://app.guidde.com/share/playbooks/aJQRZe7ewHgJGdQce1XzNU?mode=videoAndDoc&origin=k2buG3CvzZWUzfsWk7HPoOLDKpg2>" %}

