# YOLO

This package contains helpers and methods for the YOLO object detection task:

## YoloLoss

`YoloLoss` is an implementation of the [loss](https://github.com/WongKinYiu/yolov7/blob/main/utils/loss.py) used by the YOLOv7 repository.

We expose an interface to the loss that can be configured using the following arguments:

```python
from code_loader.helpers.detection.yolo.loss import YoloLoss

LOSS_FN = YoloLoss(num_classes: int, default_boxes: List[NDArray[np.int32]],
                   overlap_thresh: float, background_label: int,
                   features: List[Tuple[int, int]] = [],
                   anchors: Optional[NDArray[np.int32]] = None,
                   from_logits: bool = True, weights: List[float] = [4.0, 1.0, 0.4],
                   max_match_per_gt: int = 10,
                   image_size: Union[Tuple[int, int], int] = (640, 640),
                   cls_w: float = 0.3, obj_w: float = 0.7, box_w: float = 0.05,
                   yolo_match: bool = False)
```

<table><thead><tr><th width="158.46928201888204">Args</th><th></th></tr></thead><tbody><tr><td>num_classes</td><td><em>The number of classes in the dataset</em></td></tr><tr><td>default_boxes</td><td>A list of NDArrays representing the model's grid.<br>See Grid.generate_anchors()</td></tr><tr><td>overlap_thresh</td><td>The Matcher overlap threshold that determines what constitutes a match. The YOLO default is 0.0625</td></tr><tr><td>background_label</td><td>If no background label was used during training, set this to NUM_CLASSES+1</td></tr><tr><td>features</td><td>Only required if yolo_match is True.<br>The sizes of the prediction heads your model uses: [[H1,W1],[H2,W2],...]</td></tr><tr><td>anchors</td><td>Only required if yolo_match is True.<br>The anchors used in your model</td></tr><tr><td>from_logits</td><td>True if the model was exported without a sigmoid. False if the model was exported as <a href="https://app.gitbook.com/o/BAmahBxGiWBlO37RuZf2/s/9UXeOlFqlw8pl79U2HGU/~/changes/465/guides/integration-script/examples/celeba-object-detection-yolov7">recommended</a> by us</td></tr><tr><td>weights</td><td>The <a href="https://github.com/WongKinYiu/yolov7/blob/2fdc7f14395f6532ad05fb3e6970150a6a83d290/utils/loss.py#L576">weights</a> used to scale the object loss</td></tr><tr><td>max_match_per_gt</td><td>The number of prior matches per ground-truth (GT) box. The YOLOv7 default is 10</td></tr><tr><td>image_size</td><td>The size of your images</td></tr><tr><td>cls_w</td><td>The classification loss weight</td></tr><tr><td>obj_w</td><td>The object loss weight</td></tr><tr><td>box_w</td><td>The regression loss weight</td></tr><tr><td>yolo_match</td><td>When yolo_match is True, we use the same Matcher as YOLOv7.<br>When yolo_match is False, we use a slightly faster matcher that approximates the YOLOv7 matcher.</td></tr></tbody></table>
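
To make the relationship between the weighting arguments more concrete, here is a minimal sketch that follows the convention of the linked YOLOv7 loss: `weights` balances the object loss across heads, while `box_w`, `obj_w` and `cls_w` scale the three terms. The helper `combine_yolo_terms` and its input values are hypothetical and are not part of `code_loader`; whether `YoloLoss` applies the weights internally in exactly this way is an assumption.

```python
from typing import Sequence, Tuple


def combine_yolo_terms(
    box_loss: float,
    obj_loss_per_head: Sequence[float],
    cls_loss: float,
    weights: Sequence[float] = (4.0, 1.0, 0.4),
    box_w: float = 0.05,
    obj_w: float = 0.7,
    cls_w: float = 0.3,
) -> Tuple[float, float, float]:
    """Hypothetical helper: scale raw loss terms in the YOLOv7 style."""
    if len(obj_loss_per_head) != len(weights):
        raise ValueError("expected one object-loss weight per head")
    # Balance the object loss across heads, then sum the heads.
    obj_loss = sum(w * loss for w, loss in zip(weights, obj_loss_per_head))
    # Scale each term by its own weight.
    return box_w * box_loss, obj_w * obj_loss, cls_w * cls_loss


# Illustrative raw values only, not real training output.
scaled_box, scaled_obj, scaled_cls = combine_yolo_terms(
    box_loss=2.0, obj_loss_per_head=[0.5, 1.0, 2.0], cls_loss=1.0
)
total = scaled_box + scaled_obj + scaled_cls
```

With the default `weights`, the first head contributes ten times more to the object loss than the last one for the same raw value.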

This loss has a `__call__` method that computes the yolo\_loss:

```python
iou_loss, obj_loss, class_loss = LOSS_FN(
    y_true: tf.Tensor, y_pred: Tuple[List[tf.Tensor], List[tf.Tensor]])
```

<table><thead><tr><th width="158.46928201888204">Args</th><th></th></tr></thead><tbody><tr><td>y_true</td><td>The ground truth encoded into a shape of [MAX_BB,5]. The 5 channels represent [X,Y,W,H,class]</td></tr><tr><td>y_pred</td><td>A tuple (loc,class) composed of:<br>   - loc: a list with one element per head. Each element is of size [Batch,#BB,4]. The channels represent [X,Y,W,H]<br>   - class: a list with one element per head. Each element is of size [Batch,#BB,#classes+1]</td></tr></tbody></table>
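
The `[MAX_BB,5]` layout of `y_true` can be illustrated with a small, library-independent sketch. The helper `encode_ground_truth`, the constants, and the box values below are hypothetical. Padding unused rows with zero boxes tagged with the background label is an assumption, as is the coordinate convention of the `[X,Y,W,H]` values.

```python
from typing import List, Sequence

MAX_BB = 4                          # hypothetical cap on ground-truth boxes per image
NUM_CLASSES = 3                     # hypothetical number of classes
BACKGROUND_LABEL = NUM_CLASSES + 1  # as suggested for background_label above


def encode_ground_truth(boxes: Sequence[Sequence[float]],
                        labels: Sequence[int]) -> List[List[float]]:
    """Pack [X, Y, W, H] boxes and class ids into a fixed [MAX_BB, 5] layout."""
    if len(boxes) != len(labels):
        raise ValueError("boxes and labels must have the same length")
    # One row per box: four coordinates followed by the class id.
    rows = [[*box, float(label)] for box, label in zip(boxes, labels)][:MAX_BB]
    # Assumption: unused rows are zero boxes tagged with the background label.
    padding = [[0.0, 0.0, 0.0, 0.0, float(BACKGROUND_LABEL)]
               for _ in range(MAX_BB - len(rows))]
    return rows + padding


y_true = encode_ground_truth(
    boxes=[[0.5, 0.5, 0.2, 0.3], [0.1, 0.2, 0.05, 0.1]],
    labels=[0, 2],
)
```

The nested list can then be converted to an array, for example with `np.asarray(y_true, dtype=np.float32)`.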

The call returns the three losses of the YOLO repository (IOU loss, object loss, and classification loss).

## Decoder

Since we recommend exporting the model without the `NMS` and `top-k` components, we need a Decoder that can serve as a head and keep only the most confident bounding boxes.

We expose an interface for our default decoder:

```python
from code_loader.helpers.detection.yolo.decoder import Decoder

DECODER = Decoder(num_classes: int, background_label: int, top_k: int,
                  conf_thresh: float, nms_thresh: float,
                  max_bb_per_layer: int = 20, max_bb: int = 20)
```

<table><thead><tr><th width="158.46928201888204">Args</th><th></th></tr></thead><tbody><tr><td>num_classes</td><td><em>The number of classes in the dataset</em></td></tr><tr><td>background_label</td><td>If no background label was used during training, set this to NUM_CLASSES+1</td></tr><tr><td>top_k</td><td>The number of bounding boxes (BB) kept by the top_k selection, per layer and per class</td></tr><tr><td>conf_thresh</td><td>The confidence threshold. BB with confidence lower than this are not shown by the decoder</td></tr><tr><td>nms_thresh</td><td>The NMS threshold for the IOU-overlap calculation. See TensorFlow's <a href="https://www.tensorflow.org/api_docs/python/tf/image/non_max_suppression">non_max_suppression</a> iou_threshold parameter for more details.</td></tr><tr><td>max_bb_per_layer</td><td>The maximum number of BB selected per layer</td></tr><tr><td>max_bb</td><td>The maximum number of BB selected overall</td></tr></tbody></table>
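
To illustrate what `conf_thresh`, `nms_thresh` and `max_bb` control, here is a minimal pure-Python sketch of confidence filtering followed by greedy non-maximum suppression. It is not the `Decoder` implementation: the function names are hypothetical, boxes are assumed to be in corner format `(x1, y1, x2, y2)` for simplicity, and the per-layer and per-class handling is omitted.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed corner format (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_boxes(boxes: Sequence[Box], scores: Sequence[float],
                 conf_thresh: float, nms_thresh: float, max_bb: int) -> List[int]:
    """Return the indices of the boxes that survive both thresholds."""
    # Drop low-confidence candidates and visit the rest from highest score down.
    order = sorted((i for i, s in enumerate(scores) if s >= conf_thresh),
                   key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if len(kept) == max_bb:
            break
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in kept):
            kept.append(i)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = filter_boxes(boxes, scores, conf_thresh=0.25, nms_thresh=0.5, max_bb=20)
```

In this toy input the second box overlaps the first with an IOU of about 0.68 and is suppressed, while the last box falls below the confidence threshold.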

This decoder has a `__call__` method that returns a list of the selected bounding boxes:

<pre class="language-python"><code class="lang-python"><strong>bounding_boxes = DECODER(loc_data: List[tf.Tensor], conf_data: List[tf.Tensor],
</strong>                         prior_data: List[NDArray[np.float32]],
                         from_logits: bool = True, decoded: bool = False)
</code></pre>

<table><thead><tr><th width="158.46928201888204">Args</th><th></th></tr></thead><tbody><tr><td>loc_data</td><td>A list with one element per head. Each element is of size [Batch,#BB,4]. The channels represent [X,Y,W,H]</td></tr><tr><td>conf_data</td><td>A list with one element per head. Each element is of size [Batch,#BB,#classes+1]</td></tr><tr><td>prior_data</td><td>A list of NDArrays representing the model's grid.<br>See Grid.generate_anchors()</td></tr><tr><td>from_logits</td><td>True if the model was exported without a sigmoid. False if the model was exported as <a href="https://app.gitbook.com/o/BAmahBxGiWBlO37RuZf2/s/9UXeOlFqlw8pl79U2HGU/~/changes/465/guides/integration-script/examples/celeba-object-detection-yolov7">recommended</a> by us</td></tr><tr><td>decoded</td><td>True if the model was exported with a decoder, as <a href="https://app.gitbook.com/o/BAmahBxGiWBlO37RuZf2/s/9UXeOlFqlw8pl79U2HGU/~/changes/465/guides/integration-script/examples/celeba-object-detection-yolov7">recommended</a> by us.<br>False otherwise (i.e. the predictions are still relative to the anchors and are not in image coordinates)</td></tr></tbody></table>
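
For context on the `decoded` flag, the sketch below shows how a single anchor-relative prediction is mapped to image coordinates in the YOLOv7 repository's detection layer. The function `decode_cell` and the example values are hypothetical, and whether `Decoder` applies exactly this formula when `decoded=False` is an assumption.

```python
import math
from typing import Sequence, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def decode_cell(raw: Sequence[float], cell_xy: Tuple[int, int],
                anchor_wh: Tuple[float, float],
                stride: int) -> Tuple[float, float, float, float]:
    """Map one raw (tx, ty, tw, th) prediction to (cx, cy, w, h) in pixels."""
    tx, ty, tw, th = (sigmoid(v) for v in raw)
    # The center is an offset from the grid cell, scaled by the head's stride.
    cx = (tx * 2.0 - 0.5 + cell_xy[0]) * stride
    cy = (ty * 2.0 - 0.5 + cell_xy[1]) * stride
    # Width and height rescale the anchor size.
    w = (tw * 2.0) ** 2 * anchor_wh[0]
    h = (th * 2.0) ** 2 * anchor_wh[1]
    return cx, cy, w, h


# Illustrative values: a zero logit in cell (10, 5) of a stride-8 head.
box = decode_cell(raw=(0.0, 0.0, 0.0, 0.0), cell_xy=(10, 5),
                  anchor_wh=(12.0, 16.0), stride=8)
```

A zero logit lands on the cell's center and reproduces the anchor size unchanged, which makes it a convenient sanity check.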

## Grid

This class represents the YOLO priors grid.

<pre class="language-python"><code class="lang-python">from code_loader.helpers.detection.yolo.grid import Grid

<strong>BOXES_GENERATOR = Grid(image_size: Tuple[int, int], feature_maps: Tuple[Tuple[int, int], ...],
</strong>                       box_sizes: Tuple[Tuple[float, ...], ...], strides: Tuple[int, ...],
                       offset: int)
</code></pre>

<table><thead><tr><th width="158.46928201888204">Args</th><th></th></tr></thead><tbody><tr><td>image_size</td><td>The image size we use for inference</td></tr><tr><td>feature_maps</td><td>The shapes of the model heads: ((H1,W1),(H2,W2),...)</td></tr><tr><td>box_sizes</td><td>The shapes of the anchors as set in the <a href="https://github.com/WongKinYiu/yolov7/blob/2fdc7f14395f6532ad05fb3e6970150a6a83d290/cfg/deploy/yolov7.yaml#L7">Yolov7</a> YAML</td></tr><tr><td>strides</td><td>The strides that connect the head size to the image size (IMAGE_SIZE/HEAD_SIZE)</td></tr><tr><td>offset</td><td>0 if the grid starts from (0,0), as expected by the YOLO repo</td></tr></tbody></table>
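
Because `strides` and `feature_maps` are tied together by IMAGE_SIZE/HEAD_SIZE, one can be derived from the other. The following sketch assumes a 640x640 input and strides of 8, 16 and 32; the helper `feature_maps_from_strides` is hypothetical and not part of `code_loader`.

```python
from typing import Tuple


def feature_maps_from_strides(
    image_size: Tuple[int, int], strides: Tuple[int, ...]
) -> Tuple[Tuple[int, int], ...]:
    """Derive each head's (H, W) from the image size and the head's stride."""
    height, width = image_size
    for stride in strides:
        if height % stride or width % stride:
            raise ValueError(f"image size {image_size} is not divisible by stride {stride}")
    return tuple((height // stride, width // stride) for stride in strides)


IMAGE_SIZE = (640, 640)
STRIDES = (8, 16, 32)  # assumed values for a three-head model
FEATURE_MAPS = feature_maps_from_strides(IMAGE_SIZE, STRIDES)
```

The resulting tuple can be passed as `feature_maps`, and the same sizes apply to the `features` argument of `YoloLoss` when `yolo_match` is True.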

This class has a `generate_anchors()` method that creates the grid used by the loss and decoder:

```python
DEFAULT_BOXES = BOXES_GENERATOR.generate_anchors()
```

DEFAULT\_BOXES is of type List\[NDArray\[np.float32]]. Each entry is an array of shape (#head-BB,4) holding the coordinates of the bounding boxes located in the corresponding head.
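
To give a feel for the per-head sizes, here is a rough pure-Python sketch of how such a prior grid could be laid out. It is not the `Grid` implementation: the function `sketch_priors`, the `(x, y, w, h)` pixel units, the ordering of the boxes, and the anchor values are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Prior = Tuple[float, float, float, float]  # assumed (x, y, w, h) in pixels


def sketch_priors(feature_maps: Sequence[Tuple[int, int]],
                  box_sizes: Sequence[Sequence[float]],
                  strides: Sequence[int],
                  offset: int = 0) -> List[List[Prior]]:
    """Build one list of priors per head: one box per anchor per grid cell."""
    heads: List[List[Prior]] = []
    for (rows, cols), sizes, stride in zip(feature_maps, box_sizes, strides):
        # Assumption: sizes is a flat (w1, h1, w2, h2, ...) tuple for this head.
        anchors = list(zip(sizes[0::2], sizes[1::2]))
        head = [((col + offset) * stride, (row + offset) * stride, w, h)
                for w, h in anchors
                for row in range(rows)
                for col in range(cols)]
        heads.append(head)
    return heads


priors = sketch_priors(
    feature_maps=((80, 80), (40, 40), (20, 20)),
    box_sizes=((12, 16, 19, 36, 40, 28),
               (36, 75, 76, 55, 72, 146),
               (142, 110, 192, 243, 459, 401)),
    strides=(8, 16, 32),
)
boxes_per_head = [len(head) for head in priors]
```

Each head contributes H x W x (number of anchors) boxes, which corresponds to the #head-BB dimension described above.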


