# Unlabeled Data

Labeling data samples is one of the most valuable things you can do as a data scientist: labeled data lets you build models that are more accurate and can be applied to real-world data. However, when large amounts of unlabeled data are available, it is hard to **prioritize which samples to label**.

Tensorleap constructs the model's most informative latent space and uses the features the model has learned to help you prioritize which samples to label efficiently.

#### Integration Script

The `unlabeled_preprocessing_func` *(custom name)* is a **preprocess** function that is called just once, before the data is read, similar to the [**Preprocess Function**](/tensorleap-integration/writing-integration-code/preprocess-function.md). It prepares the data for later use in **input encoders**.

```python
from code_loader import leap_binder
from code_loader.contract.datasetclasses import PreprocessResponse

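# Preprocessing function: called once to prepare the unlabeled data
def unlabeled_preprocessing_func() -> PreprocessResponse:
    # Load the unlabeled samples into `unlabeled_df` here (elided)
    ...
    return PreprocessResponse(length=len(unlabeled_df), data=unlabeled_df)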

leap_binder.set_unlabeled_data_preprocess(function=unlabeled_preprocessing_func)
```

This function returns a single [**`PreprocessResponse`**](/tensorleap-integration/python-api/code_loader/datasetclasses/preprocessresponse.md) object.

#### Fetch Similar

To prioritize unlabeled data, choose a sample in the [**Population Exploration**](/user-interface/dashboards/dashlets/sample-analysis.md#population-exploration) analysis that represents the cluster you are interested in, then request to fetch similar samples from the unlabeled data.

![Fetch Similar from Unlabeled Data (click-to-zoom)](/files/8lDUpV3ODifNy4glqkF3)

Once the `Fetch Similar` process has finished, a similarity map of the found samples is presented. You can set the color and size of the dots to `similarity` to indicate which samples were found to be most similar to the target sample.

![Target Sample (click-to-zoom)](/files/iWhQc6u4QJk65khEfaYk) ![Fetch Similar Results (click-to-zoom)](/files/yfxx4V3Tp26RUQjMwiaC)
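To build intuition for what a similarity score over latent-space vectors can look like, here is a minimal conceptual sketch. It is not Tensorleap's implementation: the use of cosine similarity, the helper names `cosine_similarity` and `rank_by_similarity`, and the toy three-dimensional embeddings are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(target, unlabeled, top_k=3):
    """Return up to top_k (sample_id, similarity) pairs, most similar first."""
    scored = [(sid, cosine_similarity(target, vec)) for sid, vec in unlabeled.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical latent vectors; real embeddings come from the model and are much larger.
target_embedding = [1.0, 0.0, 0.0]
unlabeled_embeddings = {
    "sample_a": [0.9, 0.1, 0.0],
    "sample_b": [0.0, 1.0, 0.0],
    "sample_c": [-1.0, 0.0, 0.0],
    "sample_d": [0.7, 0.7, 0.0],
}

# The highest-scoring unlabeled samples are the first candidates for labeling.
labeling_queue = rank_by_similarity(target_embedding, unlabeled_embeddings, top_k=2)
for sample_id, score in labeling_queue:
    print(f"{sample_id}: similarity={score:.3f}")
```

In this toy setup, the unlabeled samples whose vectors point in nearly the same direction as the target are ranked first, which mirrors how the `similarity` color and size settings highlight the closest matches in the similarity map.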

