# CelebA Object Detection (YoloV7)

This example demonstrates how to integrate [YoloV7](https://github.com/WongKinYiu/yolov7) into the Tensorleap system. The architecture used here is the [YoloV7-tiny](https://github.com/WongKinYiu/yolov7/blob/main/cfg/deploy/yolov7-tiny.yaml) model, trained on the full CelebA dataset with a single class (faces).

The starting point for this example is a set of trained `PyTorch` weights (.pt) produced with the YoloV7 repository.

## Dataset Script

To use the CelebA dataset for object detection, we arranged it into images and labels folders and created the corresponding .txt files according to the YOLOv7 specs.

The following entries describe the main components of our dataset script in depth, followed by the complete dataset script.

### Key components

Before going to each component in depth, two things are important to note:

{% hint style="info" %}

* The GT function reads YOLO-format labels \[class,X,Y,W,H] and outputs them as \[X,Y,W,H,class].
* The input function returns a channels-last image \[H,W,3].
  {% endhint %}
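To make the first point concrete, the following minimal sketch reorders one YOLO label line into the layout the GT function returns. The helper name `yolo_line_to_gt_row` is hypothetical and is not part of the dataset script.

```python
from typing import List


def yolo_line_to_gt_row(line: str) -> List[float]:
    """Reorder one YOLO label line from [class, X, Y, W, H] to [X, Y, W, H, class].

    Hypothetical helper, for illustration only.
    """
    values = [float(v) for v in line.split()]
    if len(values) != 5:
        raise ValueError(f"expected 5 values, got {len(values)}")
    class_idx, box = values[0], values[1:]
    # Move the class index from the first position to the last one.
    return box + [class_idx]


row = yolo_line_to_gt_row("0 0.5 0.4 0.2 0.3")
print(row)  # [0.5, 0.4, 0.2, 0.3, 0.0]
```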

#### YoloV7 utils

YOLOv7 requires a decoder (so that predictions can be visualized on the images), a custom loss (composed of object, class, and IOU losses), and a specific grid definition that maps the predictions to the priors. A complete description of these elements and their configuration can be found in the [helpers](/tensorleap-integration/python-api/helpers.md) section.

Here, we set up the YOLO utils with a YOLO-tiny config. This includes the loss config (overlap threshold, maximum matches, weights), the decoder config (NMS and confidence thresholds, top\_k, and the maximum number of bounding boxes to plot), and the grid config (head sizes, strides, and image size).

```python
from code_loader.helpers.detection.yolo.decoder import Decoder
from code_loader.helpers.detection.yolo.utils import scale_loc_prediction, reshape_output_list
from code_loader.helpers.detection.yolo.grid import Grid
from code_loader.helpers.detection.yolo.loss import YoloLoss
from code_loader.helpers.detection.utils import xywh_to_xyxy_format, xyxy_to_xywh_format, jaccard

# ------------------------------------- OD Functions ------------------------------------- #
CATEGORIES = ['face']  # class names
BACKGROUND_LABEL = 1  # label assigned to padded (empty) GT rows
MAX_BB_PER_IMAGE = 30
CLASSES = 1
IMAGE_SIZE = (640, 640)
FEATURE_MAPS = ((80, 80), (40, 40), (20, 20))
BOX_SIZES = (((10, 13), (16, 30), (33, 23)),
             ((30, 61), (62, 45), (59, 119)),
             ((116, 90), (156, 198), (373, 326)))  # tiny fd
NUM_FEATURES = len(FEATURE_MAPS)
NUM_PRIORS = len(BOX_SIZES[0]) * len(BOX_SIZES)  # 3 * 3
OFFSET = 0
STRIDES = (8, 16, 32)
CONF_THRESH = 0.35
NMS_THRESH = 0.65
OVERLAP_THRESH = 0.0625  # i.e. 1/16
BOXES_GENERATOR = Grid(image_size=IMAGE_SIZE, feature_maps=FEATURE_MAPS, box_sizes=BOX_SIZES,
                       strides=STRIDES, offset=OFFSET)
DEFAULT_BOXES = BOXES_GENERATOR.generate_anchors()
LOSS_FN = YoloLoss(num_classes=CLASSES, overlap_thresh=OVERLAP_THRESH,
                   default_boxes=DEFAULT_BOXES, background_label=BACKGROUND_LABEL,
                   from_logits=False, weights=[4.0, 1.0, 0.4], max_match_per_gt=10)
DECODER = Decoder(CLASSES,
                  background_label=BACKGROUND_LABEL,
                  top_k=20,
                  conf_thresh=CONF_THRESH,
                  nms_thresh=NMS_THRESH,
                  max_bb_per_layer=MAX_BB_PER_IMAGE,
                  max_bb=MAX_BB_PER_IMAGE)
```
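The grid values above are tied together: each feature map size is the image size divided by the matching stride. The sketch below re-derives `FEATURE_MAPS` from `IMAGE_SIZE` and `STRIDES` and counts the resulting anchors. It assumes one anchor per box size in every grid cell, which is an assumption about how the `Grid` helper lays out its priors rather than something taken from its documentation.

```python
IMAGE_SIZE = (640, 640)
STRIDES = (8, 16, 32)
BOX_SIZES_PER_HEAD = 3  # three anchor shapes per head, as in BOX_SIZES

# Each head sees the image downsampled by its stride.
feature_maps = tuple((IMAGE_SIZE[0] // s, IMAGE_SIZE[1] // s) for s in STRIDES)
print(feature_maps)  # ((80, 80), (40, 40), (20, 20))

# Assumption: one anchor per box size in every grid cell.
total_anchors = sum(h * w * BOX_SIZES_PER_HEAD for h, w in feature_maps)
print(total_anchors)  # 25200
```

If you change the input resolution, a check like this helps confirm that `FEATURE_MAPS` still matches the strides before you upload the dataset script.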

#### Preprocess

The following method downloads our input text files from the public cloud, reads them, and keeps the first NUM\_SAMPLES entries from each file.

```python
from pathlib import Path
from typing import List

from code_loader.contract.datasetclasses import PreprocessResponse

# _download and transform_image_list_to_labels are helper functions
# from the dataset script and are not shown here.


def subset_images_list() -> List[PreprocessResponse]:
    lists_base_path = Path('celebA/celeba_full/input_lists')
    lists_names = ["train.txt", "val.txt", "test.txt"]
    NUM_SAMPLES = 100
    lists_full_path = [lists_base_path / subset for subset in lists_names]
    lists_files = [_download(str(f)) for f in lists_full_path]
    subset_image_pths = [None] * 3
    subset_labels_pths = [None] * 3
    for i in range(len(lists_names)):
        # Each list file holds one image path per line
        with open(lists_files[i], 'r') as f:
            subset_image_pths[i] = f.read().strip().splitlines()
            subset_labels_pths[i] = transform_image_list_to_labels(subset_image_pths[i])
    # Keep only the first NUM_SAMPLES entries of each subset
    subset_image_pths = [sub_pth[:NUM_SAMPLES] for sub_pth in subset_image_pths]
    subset_labels_pths = [sub_pth[:NUM_SAMPLES] for sub_pth in subset_labels_pths]
    responses = [PreprocessResponse(length=len(img_pth), data={'img_path': img_pth, 'label_path': lab_pth})
                 for img_pth, lab_pth in zip(subset_image_pths, subset_labels_pths)]
    return responses
```
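The preprocess function relies on `transform_image_list_to_labels`, which is not shown on this page. The sketch below is a hypothetical stand-in, assuming the common YOLO layout in which each image under an `images` folder has a matching `.txt` file under a sibling `labels` folder. The function name and the example path are illustrative, and the real helper in the complete script may differ.

```python
from pathlib import PurePosixPath
from typing import List


def image_paths_to_label_paths(image_paths: List[str]) -> List[str]:
    """Map '<root>/images/<name>.jpg' to '<root>/labels/<name>.txt'.

    Hypothetical stand-in for transform_image_list_to_labels; the real
    helper in the complete script may use a different folder layout.
    """
    label_paths = []
    for image_path in image_paths:
        parts = list(PurePosixPath(image_path).parts)
        # Swap only the last 'images' directory component for 'labels'.
        for i in range(len(parts) - 2, -1, -1):
            if parts[i] == "images":
                parts[i] = "labels"
                break
        label_paths.append(str(PurePosixPath(*parts).with_suffix(".txt")))
    return label_paths


print(image_paths_to_label_paths(["data/images/train/000001.jpg"]))
# ['data/labels/train/000001.txt']
```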

#### Input Images

This method downloads an image from our cloud storage, loads it, and resizes it to IMAGE\_SIZE.

```python
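import numpy as np
from numpy.typing import NDArray
from PIL import Image


def input_image(idx: int, data: PreprocessResponse) -> NDArray[float]:
    """
    Returns a channels-last image resized to IMAGE_SIZE and scaled to [0, 1]
    """
    data = data.data
    filepath = data['img_path'][idx]
    fpath = _download(filepath)
    # PIL's resize expects (width, height); dividing by 255 rescales pixel values to [0, 1]
    image = np.array(Image.open(fpath).resize((IMAGE_SIZE[1], IMAGE_SIZE[0]), Image.BILINEAR)) / 255.
    return image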
```

#### Ground Truth

This method reads the YOLO-format label files and returns the ground truth encoded as \[X,Y,W,H,class\_idx], padded to MAX\_BB\_PER\_IMAGE instances per image.

```python
def get_bb(idx: int, data: PreprocessResponse) -> NDArray[np.double]:
    """
    Returns an array shaped (MAX_BB_PER_IMAGE, 5) where each row is [X,Y,W,H,class],
    with the box coordinates normalized to [0,1]
    """
    data = data.data
    filepath = data['label_path'][idx]
    fpath = _download(filepath)
    with open(fpath, 'r') as f:
        gt_list = [x.split() for x in f.read().strip().splitlines()]
    bboxes = np.zeros([MAX_BB_PER_IMAGE, 5])
    max_anns = min(MAX_BB_PER_IMAGE, len(gt_list))
    # Copy at most MAX_BB_PER_IMAGE annotations to avoid indexing past the array
    for i, ann in enumerate(gt_list[:max_anns]):
        bboxes[i, :4] = np.array(ann[1:]).astype(float)  # [X,Y,W,H]
        bboxes[i, 4] = float(ann[0])  # class index
    # Mark the remaining (padding) rows as background
    bboxes[max_anns:, 4] = BACKGROUND_LABEL
    return bboxes
```
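YOLO labels store each box as a normalized center point plus a width and a height. As a sanity check when debugging the ground truth, the sketch below converts one such box to pixel corner coordinates. The function name is hypothetical; the integration itself uses the `code_loader` helpers imported earlier, whose exact signatures are not shown on this page.

```python
from typing import Tuple


def xywh_norm_to_xyxy_pixels(
    box: Tuple[float, float, float, float], image_size: Tuple[int, int]
) -> Tuple[float, float, float, float]:
    """Convert a normalized (x_center, y_center, w, h) box to pixel corners.

    image_size is (height, width), matching IMAGE_SIZE in this example.
    """
    x_c, y_c, w, h = box
    height, width = image_size
    x_min = (x_c - w / 2) * width
    y_min = (y_c - h / 2) * height
    x_max = (x_c + w / 2) * width
    y_max = (y_c + h / 2) * height
    return x_min, y_min, x_max, y_max


print(xywh_norm_to_xyxy_pixels((0.5, 0.5, 0.25, 0.5), (640, 640)))
# (240.0, 160.0, 400.0, 480.0)
```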

### Complete Dataset Script

The complete Face Detection dataset script can be found [here](https://storage.googleapis.com/example-datasets-47ml982d/celebA/celeba_full/YOLO%20example/yolov7_celeb.py).

## Exporting an ONNX Model

After the `PyTorch` training is finished, an `ONNX` model should be exported using the YOLOv7 export script. YOLOv7 has multiple export options, but the one that allows the easiest integration with the Tensorleap system is exporting the model **without** NMS, but with the decoder.

To export the `PyTorch` model to `ONNX` you should execute the following command:

> python export.py --weights WEIGHTS\_PATH --grid --simplify --img-size 640 640 --max-wh 640

Where `WEIGHTS_PATH` is the path to the `.pt` weights file and 640 is the resolution of the input images.
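If the export is part of a scripted pipeline, the same command can be assembled in Python. This sketch only builds the argument list from the flags shown above. The helper name is hypothetical, `WEIGHTS_PATH` remains a placeholder, and the command is assumed to run from the root of the YOLOv7 repository.

```python
import sys
from typing import List


def build_export_command(weights_path: str, img_size: int = 640) -> List[str]:
    """Build the export.py argument list used above (hypothetical helper)."""
    size = str(img_size)
    return [
        sys.executable, "export.py",
        "--weights", weights_path,
        "--grid", "--simplify",
        "--img-size", size, size,
        "--max-wh", size,
    ]


cmd = build_export_command("WEIGHTS_PATH")
print(" ".join(cmd[1:]))
# export.py --weights WEIGHTS_PATH --grid --simplify --img-size 640 640 --max-wh 640
# To run it from the YOLOv7 repository root: subprocess.run(cmd, check=True)
```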

<figure><img src="/files/blQqNViYc3jQEoLtpEnH" alt=""><figcaption><p>The resulting .ONNX file should have a similar head structure to the above Netron Image</p></figcaption></figure>

### Example ONNX model

Our YoloV7 exported model can be found [here](https://storage.googleapis.com/example-datasets-47ml982d/celebA/celeba_full/YOLO%20example/yolo_trained_fd.onnx).

## Model Integration

Following the [import model](/user-interface/project/versions/import-model.md) guide we can now upload the ONNX model to the platform.

### Removing last node

A redundant `node` is added to this model at upload time and should be removed.

<figure><img src="/files/EEHwLxYy6JnQ1lt6yDq1" alt=""><figcaption><p>Removing the last node</p></figcaption></figure>

### Setting up the model

To set up the model, we first need to move the dataset node from the left-most part of the model to the right.

We should then select the YOLO parsed dataset on the dataset node and connect several nodes:

* The GT visualizer (visualize GT BB)
* Prediction visualizer (visualize prediction BB)
* Image visualizer (visualize input)
* Custom Loss

{% hint style="info" %}
Don't forget to choose the loss from the dropdown menu after adding the loss node.
{% endhint %}

* Optimizer
* Metrics

<figure><img src="/files/7LDI9dLnzJDE6dWFem0B" alt=""><figcaption><p>A valid object detection model's head</p></figcaption></figure>

After connecting these nodes, save the model (by overriding the current version).

<figure><img src="/files/Z4n0xF0YsWWnstI7Pzhy" alt=""><figcaption></figcaption></figure>

At this point, both the dataset and the model are integrated into the platform. You can now run evaluation and training.


