YOLO
Utils for the YOLO Object Detection task
This package contains multiple helpers and methods for the YOLO object detection task:
YoloLoss
YoloLoss is an implementation of the loss used by the YOLOv7 repository.
We expose an interface to the loss that can be configured using the following properties:
from code_loader.helpers.detection.yolo.loss import YoloLoss
LOSS_FN = YoloLoss(num_classes: int, default_boxes: List[NDArray[np.int32]],
overlap_thresh: float, background_label: int,
features: List[Tuple[int, int]] = [],
anchors: Optional[NDArray[np.int32]] = None,
from_logits: bool = True, weights: List[float] = [4.0, 1.0, 0.4],
max_match_per_gt: int = 10,
image_size: Union[Tuple[int, int], int] = (640, 640),
cls_w: float = 0.3, obj_w: float = 0.7, box_w: float = 0.05,
yolo_match: bool = False):
num_classes
The number of classes in the dataset
default_boxes
A list of NDArray representing the model's grid. See Grid.generate_anchors()
overlap_thresh
The Matcher overlap threshold that sets what constitutes a match. The YOLO default is 0.0625
background_label
If no background label was used during training, this should be set to NUM_CLASSES+1
features
Only required if yolo_match is True. The sizes of the prediction heads your model uses: [[H1,W1],[H2,W2],...]
anchors
Only required if yolo_match is True. The anchors used in your model
from_logits
True if the model was exported without a sigmoid. False if the model was exported as recommended by us
weights
The weights used to scale the object loss at each head
max_match_per_gt
The number of prior matches per GT. The YOLOv7 default is 10
image_size
The size of your images.
cls_w
Classification loss weight
obj_w
The object loss weight
box_w
The regression loss weight
yolo_match
When yolo_match is True, we use the same matcher as YOLOv7. When yolo_match is False, we use a slightly faster matcher that approximates the YOLOv7 matcher. A construction sketch follows below.
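Putting this together, here is a minimal sketch of constructing the loss for a hypothetical 20-class, 640x640, three-head model. The feature-map sizes and anchor values below are illustrative assumptions in the style of a YOLOv7 YAML, not package defaults:
from code_loader.helpers.detection.yolo.grid import Grid
from code_loader.helpers.detection.yolo.loss import YoloLoss

NUM_CLASSES = 20  # hypothetical dataset
# Illustrative grid for a 640x640 model with three heads at strides 8/16/32;
# see the Grid section below for details
grid = Grid(image_size=(640, 640),
            feature_maps=((80, 80), (40, 40), (20, 20)),
            box_sizes=((12, 16, 19, 36, 40, 28),          # flattened (w, h) pairs per head,
                       (36, 75, 76, 55, 72, 146),         # in the style of a YOLOv7 YAML
                       (142, 110, 192, 243, 459, 401)),   # (assumed values)
            strides=(8, 16, 32),
            offset=0)
DEFAULT_BOXES = grid.generate_anchors()

LOSS_FN = YoloLoss(num_classes=NUM_CLASSES,
                   default_boxes=DEFAULT_BOXES,
                   overlap_thresh=0.0625,             # the YOLO default noted above
                   background_label=NUM_CLASSES + 1,  # no background label during training
                   from_logits=True,                  # model exported without a sigmoid
                   image_size=(640, 640))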
This loss has a __call__ method that computes the YOLO loss:
iou_loss, obj_loss, class_loss =
LOSS_FN(y_true: tf.Tensor, y_pred: Tuple[List[tf.Tensor], List[tf.Tensor]])
y_true
The ground truth encoded into a shape of [MAX_BB,5]. The 5 channels represent [X,Y,W,H,class]
y_pred
A tuple (loc, class) composed of:
- loc: A list the size of the number of heads. Each element is of size [Batch,#BB,4]. The channels represent [X,Y,W,H]
- class: A list the size of the number of heads. Each element is of size [Batch,#BB,#classes+1]
This returns the three losses of the YOLO repo (IOU loss, Object loss, Classification loss)
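Continuing the sketch above (NUM_CLASSES and LOSS_FN as constructed there), a call on dummy zero tensors might look like this; the per-head box counts assume three anchors per grid cell:
import tensorflow as tf

BATCH, MAX_BB = 8, 50
BB_PER_HEAD = (80 * 80 * 3, 40 * 40 * 3, 20 * 20 * 3)  # assumes three anchors per grid cell

y_true = tf.zeros((BATCH, MAX_BB, 5))                               # [X,Y,W,H,class], zero-padded
loc = [tf.zeros((BATCH, n, 4)) for n in BB_PER_HEAD]                # [X,Y,W,H] per head
cls = [tf.zeros((BATCH, n, NUM_CLASSES + 1)) for n in BB_PER_HEAD]  # class scores per head
iou_loss, obj_loss, class_loss = LOSS_FN(y_true, (loc, cls))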
Decoder
Since we recommend exporting the model without the NMS and top-k components, we need a Decoder model that can serve as a head to filter only the most confident bounding boxes.
We expose an interface for our default decoder:
from code_loader.helpers.detection.yolo.decoder import Decoder
DECODER = Decoder(num_classes: int, background_label: int, top_k: int,
                  conf_thresh: float, nms_thresh: float,
                  max_bb_per_layer: int = 20, max_bb: int = 20)
num_classes
The number of classes in the dataset
background_label
If no background label was used during training, this should be set to NUM_CLASSES+1
top_k
The number of bounding boxes kept by the top-k selection, per layer and per class.
conf_thresh
A threshold for the confidence. Bounding boxes with confidence lower than this will not be returned by the decoder
nms_thresh
The NMS threshold for the IOU-overlap calculation. See TensorFlow's non_max_suppression IOU suppression parameter for more details.
max_bb_per_layer
The maximum number of bounding boxes selected per layer
max_bb
The maximum number of bounding boxes selected overall. A construction sketch follows below.
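For example, a minimal sketch of constructing the decoder; the thresholds below are illustrative values, not recommendations:
from code_loader.helpers.detection.yolo.decoder import Decoder

NUM_CLASSES = 20  # hypothetical dataset, as in the loss example above
DECODER = Decoder(num_classes=NUM_CLASSES,
                  background_label=NUM_CLASSES + 1,  # no background label during training
                  top_k=20,                          # per-layer, per-class top-k (illustrative)
                  conf_thresh=0.4,                   # confidence cutoff (illustrative)
                  nms_thresh=0.5,                    # NMS IOU threshold (illustrative)
                  max_bb_per_layer=20,
                  max_bb=20)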
This decoder has a __call__ function that returns a list of the selected bounding boxes:
bounding_boxes = DECODER(loc_data: List[tf.Tensor], conf_data: List[tf.Tensor],
prior_data: List[NDArray[np.float32]],
from_logits: bool = True, decoded: bool = False)
loc_data
A list the size of the number of heads. Each element is of size [Batch,#BB,4]. The channels represent [X,Y,W,H]
conf_data
A list the size of the number of heads. Each element is of size [Batch,#BB,#classes+1]
prior_data
A list of NDArray representing the model's grid. See Grid.generate_anchors()
from_logits
True if the model was exported without a sigmoid. False if the model was exported as recommended by us
decoded
True if the model was exported with a decoder, as recommended by us. False otherwise (i.e. the predictions are still relative to anchors and are not in image coordinates)
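Continuing the sketches above (loc, cls, and DEFAULT_BOXES as built in the loss examples), a call to the decoder might look like:
bounding_boxes = DECODER(loc_data=loc,
                         conf_data=cls,
                         prior_data=DEFAULT_BOXES,
                         from_logits=True,  # model exported without a sigmoid
                         decoded=False)     # predictions still relative to anchors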
Grid
This class represents the YOLO priors grid.
from code_loader.helpers.detection.yolo.grid import Grid
BOXES_GENERATOR = Grid(image_size: Tuple[int, int], feature_maps: Tuple[Tuple[int, int], ...],
box_sizes: Tuple[Tuple[float, ...], ...], strides: Tuple[int, ...],
offset: int)
image_size
The image size used for inference
feature_maps
The shapes of the model heads: ((H1,W1),(H2,W2),...)
box_sizes
The shapes of the anchors as set in the YOLOv7 YAML
strides
The strides that connect the head size to the image size (IMAGE_SIZE/HEAD_SIZE)
offset
0 if the grid starts from (0,0) as expected by the YOLO repo
This class has a generate_anchors() method that creates the grid used by the loss and decoder.
DEFAULT_BOXES = BOXES_GENERATOR.generate_anchors()
DEFAULT_BOXES is of type List[NDArray[np.float32]]. Each entry is an array sized (#head-BB,4) representing the coordinates of the bounding boxes located in each head.
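A minimal sketch, assuming a 640x640 model with three heads at strides 8, 16, and 32 and anchor sizes in the style of the YOLOv7 YAML (all values illustrative):
from code_loader.helpers.detection.yolo.grid import Grid

BOXES_GENERATOR = Grid(image_size=(640, 640),
                       feature_maps=((80, 80), (40, 40), (20, 20)),
                       box_sizes=((12, 16, 19, 36, 40, 28),          # flattened (w, h) pairs
                                  (36, 75, 76, 55, 72, 146),         # per head, in the style
                                  (142, 110, 192, 243, 459, 401)),   # of the YOLOv7 YAML
                       strides=(8, 16, 32),
                       offset=0)
DEFAULT_BOXES = BOXES_GENERATOR.generate_anchors()
for head_boxes in DEFAULT_BOXES:
    print(head_boxes.shape)  # (#head-BB, 4); e.g. (19200, 4) for the 80x80 head with 3 anchors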