Utils for the Yolo Object Detection task

This package contains helpers for the YOLO object detection task: a loss (YoloLoss), a decoder (Decoder), and a priors grid (Grid).


YoloLoss is an implementation of the loss used by the YOLOv7 repository.

We expose an interface to the loss that can be configured with the following parameters:

from code_loader.helpers.detection.yolo.loss import YoloLoss

LOSS_FN = YoloLoss(num_classes: int, default_boxes: List[NDArray[np.int32]],
                   overlap_thresh: float, background_label: int,
                   features: List[Tuple[int, int]] = [],
                   anchors: Optional[NDArray[np.int32]] = None,
                   from_logits: bool = True, weights: List[float] = [4.0, 1.0, 0.4],
                   max_match_per_gt: int = 10,
                   image_size: Union[Tuple[int, int], int] = (640, 640),
                   cls_w: float = 0.3, obj_w: float = 0.7, box_w: float = 0.05,
                   yolo_match: bool = False)

The loss object has a __call__ method that computes the YOLO loss:

iou_loss, obj_loss, class_loss = LOSS_FN(y_true: tf.Tensor,
                                         y_pred: Tuple[List[tf.Tensor], List[tf.Tensor]])

The call returns the three losses of the YOLO repo: the IoU loss, the object loss, and the classification loss.
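The cls_w, obj_w and box_w parameters correspond to the per-term gains of the YOLOv7 repository, which scales each component before summing them. The sketch below shows only that weighting idea. combine_yolo_losses is a hypothetical helper, not part of code_loader, and this page does not say whether the values returned by YoloLoss are already scaled, so check that before applying the weights a second time.

```python
def combine_yolo_losses(iou_loss: float, obj_loss: float, class_loss: float,
                        box_w: float = 0.05, obj_w: float = 0.7,
                        cls_w: float = 0.3) -> float:
    """Hypothetical helper: weighted sum of the three loss components.

    Assumes the inputs are unweighted scalars; the defaults mirror the
    box_w / obj_w / cls_w defaults listed for YoloLoss.
    """
    return box_w * iou_loss + obj_w * obj_loss + cls_w * class_loss


# Made-up component values, used only to show the arithmetic.
total = combine_yolo_losses(iou_loss=2.0, obj_loss=1.0, class_loss=4.0)
print(total)
```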


We recommend exporting the model without the NMS and top-k components. A Decoder is therefore needed as a head that keeps only the most confident bounding boxes.

We expose an interface for our default decoder:

from code_loader.helpers.detection.yolo.decoder import Decoder

DECODER = Decoder(num_classes: int, background_label: int, top_k: int,
                  conf_thresh: float, nms_thresh: float,
                  max_bb_per_layer: int = 20, max_bb: int = 20)

The decoder has a __call__ method that returns a list of the selected bounding boxes:

bounding_boxes = DECODER(loc_data: List[tf.Tensor], conf_data: List[tf.Tensor],
                         prior_data: List[NDArray[np.float32]],
                         from_logits: bool = True, decoded: bool = False)
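To illustrate the roles of conf_thresh, top_k and nms_thresh, here is a minimal NumPy sketch of confidence filtering followed by greedy non-maximum suppression. It is not the Decoder implementation: filter_boxes and iou_one_to_many are hypothetical names, boxes are assumed to be in (x1, y1, x2, y2) format, and a single class is assumed.

```python
import numpy as np


def iou_one_to_many(box: np.ndarray, boxes: np.ndarray) -> np.ndarray:
    """IoU between one (x1, y1, x2, y2) box and an (N, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0.0, None) * np.clip(y2 - y1, 0.0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def filter_boxes(boxes: np.ndarray, scores: np.ndarray, conf_thresh: float,
                 nms_thresh: float, top_k: int) -> list:
    """Hypothetical stand-in for the decoder's filtering; returns kept indices."""
    # 1. Drop low-confidence candidates.
    candidates = np.flatnonzero(scores > conf_thresh)
    # 2. Sort the rest by descending score and keep at most top_k of them.
    order = candidates[np.argsort(-scores[candidates])][:top_k]
    keep = []
    # 3. Greedy NMS: keep the best box, drop boxes that overlap it too much.
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[overlaps <= nms_thresh]
    return keep


# Toy data: boxes 0 and 1 overlap heavily, box 3 has a low score.
boxes = np.array([[0.10, 0.10, 0.50, 0.50],
                  [0.12, 0.12, 0.52, 0.52],
                  [0.60, 0.60, 0.90, 0.90],
                  [0.00, 0.00, 0.20, 0.20]], dtype=np.float32)
scores = np.array([0.90, 0.80, 0.75, 0.10], dtype=np.float32)
kept = filter_boxes(boxes, scores, conf_thresh=0.3, nms_thresh=0.5, top_k=10)
print(kept)  # [0, 2]
```

Box 3 is removed by the confidence threshold, and box 1 is suppressed because its IoU with the higher-scoring box 0 exceeds nms_thresh.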


The Grid class represents the YOLO priors grid.

from code_loader.helpers.detection.yolo.grid import Grid

BOXES_GENERATOR = Grid(image_size: Tuple[int, int], feature_maps: Tuple[Tuple[int, int], ...],
                       box_sizes: Tuple[Tuple[float, ...], ...], strides: Tuple[int, ...],
                       offset: int)

This class has a generate_anchors() method that creates the grid used by the loss and decoder.


DEFAULT_BOXES is of type List[NDArray[np.float32]]. Each entry is an array of shape (#head-BB, 4) holding the coordinates of the bounding boxes located in the corresponding head.
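The per-head layout described above can be sketched with NumPy alone. This is an illustration of the data structure, not the Grid implementation. make_grid_boxes is a hypothetical function, the (cx, cy, w, h) normalized coordinate format is an assumption, and the anchor sizes are example values given as (width, height) pixel pairs, which may differ from the format Grid expects for box_sizes.

```python
from typing import List, Sequence, Tuple

import numpy as np


def make_grid_boxes(image_size: Tuple[int, int],
                    feature_maps: Sequence[Tuple[int, int]],
                    anchor_sizes: Sequence[Sequence[Tuple[float, float]]]
                    ) -> List[np.ndarray]:
    """Hypothetical sketch: one (#head-BB, 4) float32 array per head."""
    img_h, img_w = image_size
    heads = []
    for (fm_h, fm_w), sizes in zip(feature_maps, anchor_sizes):
        # Centers of the feature-map cells, normalized to [0, 1].
        cy, cx = np.meshgrid((np.arange(fm_h) + 0.5) / fm_h,
                             (np.arange(fm_w) + 0.5) / fm_w, indexing="ij")
        centers = np.stack([cx.ravel(), cy.ravel()], axis=1)  # (cells, 2)
        # Anchor widths and heights, normalized by the image size.
        wh = np.array(sizes, dtype=np.float32) / np.array([img_w, img_h],
                                                          dtype=np.float32)
        # Pair every cell center with every anchor size.
        boxes = np.concatenate([np.repeat(centers, len(wh), axis=0),
                                np.tile(wh, (len(centers), 1))], axis=1)
        heads.append(boxes.astype(np.float32))
    return heads


default_boxes = make_grid_boxes(
    image_size=(640, 640),
    feature_maps=((80, 80), (40, 40), (20, 20)),
    anchor_sizes=(((12, 16), (19, 36), (40, 28)),
                  ((36, 75), (76, 55), (72, 146)),
                  ((142, 110), (192, 243), (459, 401))),
)
for head in default_boxes:
    print(head.shape, head.dtype)
```

With three anchors per cell, the 80x80, 40x40 and 20x20 heads hold 19200, 4800 and 1200 boxes respectively, which is the #head-BB dimension of each entry.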
