YOLO
Utils for the YOLO object detection task
This package contains multiple helpers and methods for the YOLO object detection task:
YoloLoss
The YoloLoss is an implementation of the loss used by the YOLOv7 repository.
We expose an interface to the loss that can be configured using the following properties:
num_classes
The number of classes in the dataset
default_boxes
A list of NDArray representing the model's grid. See Grid.generate_anchors()
overlap_thresh
The matcher overlap threshold that sets what constitutes a match. The YOLO default is 0.0625
background_label
If no background label was used during training, this should be set to NUM_CLASSES+1
features
Only required if yolo_match is True. The sizes of the prediction heads your model uses: [[H1,W1],[H2,W2],...]
anchors
Only required if yolo_match is True. The anchors used in your model
from_logits
True if the model was exported without a sigmoid. False if the model was exported as recommended by us
weights
The weights used to scale the object loss
max_match_per_gt
The maximum number of prior matches per GT. The YOLOv7 default is 10
image_size
The size of your images
cls_w
Classification loss weight
obj_w
The object loss weight
box_w
The regression loss weight
yolo_match
When yolo_match is True we use the same matcher as YOLOv7. When yolo_match is False we use a slightly faster matcher that approximates the YOLOv7 matcher.
This loss has a __call__ method that computes the yolo_loss:
y_true
The ground truth, encoded with shape [MAX_BB,5]. The 5 channels represent [X,Y,W,H,class]
y_pred
A tuple (loc, class) composed of:
- loc: a list the size of the number of heads. Each element is of size [Batch,#BB,4]. The channels represent [X,Y,W,H]
- class: a list the size of the number of heads. Each element is of size [Batch,#BB,#classes+1]
This returns the three losses of the YOLO repo (IOU loss, Object loss, Classification loss)
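To make the expected shapes concrete, here is a minimal sketch of how y_true and y_pred could be laid out. All sizes (MAX_BB, the head sizes, the anchors-per-cell count) are illustrative assumptions, not package defaults:

```python
import numpy as np

# Hypothetical sizes -- illustrative only, not package defaults.
MAX_BB = 30                                  # maximum ground-truth boxes per image
NUM_CLASSES = 3
BATCH = 2
HEAD_SIZES = [(80, 80), (40, 40), (20, 20)]  # e.g. three YOLO heads
ANCHORS_PER_CELL = 3

# y_true: [MAX_BB, 5] with channels [X, Y, W, H, class]; unused rows zero-padded.
y_true = np.zeros((MAX_BB, 5), dtype=np.float32)
y_true[0] = [0.5, 0.5, 0.2, 0.3, 1.0]        # one box: center, size, class 1

# y_pred: a tuple (loc, class_scores), one entry per head.
loc = [np.zeros((BATCH, h * w * ANCHORS_PER_CELL, 4), dtype=np.float32)
       for h, w in HEAD_SIZES]
class_scores = [np.zeros((BATCH, h * w * ANCHORS_PER_CELL, NUM_CLASSES + 1), dtype=np.float32)
                for h, w in HEAD_SIZES]
y_pred = (loc, class_scores)

print(y_true.shape)           # (30, 5)
print(loc[0].shape)           # (2, 19200, 4)
print(class_scores[0].shape)  # (2, 19200, 4)
```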
Decoder
Since we recommend exporting the model without the NMS and top-k components, we need a Decoder model that can serve as a head to keep only the most confident bounding boxes.
We expose an interface for our default decoder:
num_classes
The number of classes in the dataset
background_label
If no background label was used during training, this should be set to NUM_CLASSES+1
top_k
The number of bounding boxes kept by the top_k parameter, applied per layer and per class.
conf_thresh
A threshold for the confidence. Bounding boxes with confidence lower than this will not be returned by the decoder
nms_thresh
The NMS threshold for IoU-overlap calculation. See TensorFlow's non_max_suppression IoU threshold parameter for more details.
max_bb_per_layer
The maximum amount of BB selected per layer
max_bb
The maximum amount of BB selected overall
This decoder has a __call__ function that returns a list of the selected bounding boxes:
loc_data
A list the size of the number of heads. Each element is of size [Batch,#BB,4]. The channels represent [X,Y,W,H]
conf_data
A list the size of the number of heads. Each element is of size [Batch,#BB,#classes+1]
prior_data
A list of NDArray representing the model's grid. See Grid.generate_anchors()
from_logits
True if the model was exported without a sigmoid. False if the model was exported as recommended by us
decoded
True if the model was exported with a decoder, as recommended by us. False otherwise (i.e. the predictions are still relative to anchors and are not in image coordinates)
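Conceptually, the decoder chains confidence filtering, top-k selection, and NMS. The following is a rough self-contained sketch of that pipeline for a single class, not the package's implementation; the function names and the corner-format [x1,y1,x2,y2] boxes are assumptions made for illustration:

```python
import numpy as np

def iou(box, boxes):
    """IoU of one box against many; boxes are [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def decode_single_class(boxes, scores, conf_thresh=0.25, nms_thresh=0.5, top_k=200):
    # 1) drop boxes below the confidence threshold
    keep = scores >= conf_thresh
    boxes, scores = boxes[keep], scores[keep]
    # 2) keep only the top_k most confident boxes
    order = np.argsort(-scores)[:top_k]
    boxes, scores = boxes[order], scores[order]
    # 3) greedy NMS: repeatedly keep the best box, discard heavy overlaps
    selected = []
    while len(boxes):
        selected.append((boxes[0], scores[0]))
        mask = iou(boxes[0], boxes[1:]) <= nms_thresh
        boxes, scores = boxes[1:][mask], scores[1:][mask]
    return selected

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)
selected = decode_single_class(boxes, scores)
print(len(selected))  # 2 -- the second box overlaps the first (IoU > 0.5) and is suppressed
```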
Grid
This class represents the YOLO priors grid.
image_size
The image size used for inference
feature_maps
The shapes of the model heads: ((H1,W1),(H2,W2),...)
box_sizes
The shapes of the anchors as set in the YOLOv7 YAML
strides
The strides that connect the head_size to the image_size (IMAGE_SIZE/HEAD_SIZE)
offset
0 if the grid starts from (0,0) as expected by the YOLO repo
This class has a generate_anchors() method that creates the grid used by the loss and decoder.
DEFAULT_BOXES is of type List[NDArray[np.float32]]. Each entry is an array of shape (#head-BB,4) representing the coordinates of the bounding boxes located in each head.
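A minimal sketch of such a prior-grid generator, assuming one prior per cell per anchor size and [X,Y,W,H] channels in pixel coordinates. This is an illustration of the idea, not the package's exact generate_anchors; the coordinate convention and row ordering are assumptions, and the anchor sizes below are YOLO-style example values:

```python
import numpy as np

def generate_anchors(image_size, feature_maps, box_sizes, offset=0.0):
    """Sketch of a YOLO-style prior grid. For each head, every grid cell
    gets one prior per anchor size, giving an array of shape
    (H * W * num_anchors, 4) with channels [X, Y, W, H]."""
    default_boxes = []
    for (h, w), sizes in zip(feature_maps, box_sizes):
        # stride connects the head size to the image size (IMAGE_SIZE / HEAD_SIZE)
        stride_y, stride_x = image_size[0] / h, image_size[1] / w
        ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
        centers_x = (xs + offset) * stride_x
        centers_y = (ys + offset) * stride_y
        head_boxes = []
        for (bw, bh) in sizes:
            xy = np.stack([centers_x, centers_y], axis=-1).astype(np.float32)
            wh = np.full((h, w, 2), (bw, bh), dtype=np.float32)
            head_boxes.append(np.concatenate([xy, wh], axis=-1).reshape(-1, 4))
        default_boxes.append(np.concatenate(head_boxes, axis=0))
    return default_boxes

grids = generate_anchors(
    image_size=(640, 640),
    feature_maps=((80, 80), (40, 40), (20, 20)),
    box_sizes=(((10, 13), (16, 30), (33, 23)),      # per-head anchor sizes
               ((30, 61), (62, 45), (59, 119)),     # (YOLO-style example values)
               ((116, 90), (156, 198), (373, 326))),
)
print([g.shape for g in grids])  # [(19200, 4), (4800, 4), (1200, 4)]
```

Each returned array matches the List[NDArray[np.float32]] layout described above, with one (#head-BB, 4) entry per head.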