This returns the three losses used in the YOLO repo: IoU loss, object loss, and classification loss.
Decoder
Since we recommend exporting the model without the NMS and top-k components, a Decoder is needed to serve as a head that keeps only the most confident bounding boxes.
We expose an interface for our default decoder:
from code_loader.helpers.detection.yolo.decoder import Decoder
DECODER = Decoder(self, num_classes: int, background_label: int, top_k: int, conf_thresh: float, nms_thresh: float, max_bb_per_layer: int = 20, max_bb: int = 20)
This decoder has a __call__ function that returns a list of the selected bounding boxes.
DEFAULT_BOXES is of type List[NDArray[np.float32]]. Each entry is an array of shape (#head-BB,4) holding the coordinates of the bounding boxes located in that head.
The number of priors matched per ground-truth (GT) box. The YOLOv7 default is 10.
Image_size
The size of your images.
cls_w
Classification loss weight
obj_w
The object loss weight
box_w
The regression loss weight
yolo_match
When yolo_match is True we use the same Matcher as YoloV7.
When yolo_match is False we use a slightly faster matcher that approximates the YoloV7 matcher.
y_true
The ground truth, encoded into an array of shape [MAX_BB,5]. The 5 channels represent [X,Y,W,H,class].
y_pred
A tuple (loc,class) composed of:
- loc. A list with one element per head. Each element has shape [Batch,#BB,4]. The channels represent [X,Y,W,H].
- class. A list with one element per head. Each element has shape [Batch,#BB,#classes+1].
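As a rough illustration of the shapes described above, the following sketch builds placeholder y_true and y_pred arrays with NumPy. All sizes (MAX_BB, BATCH, NUM_CLASSES, HEAD_BOXES) and the box values are hypothetical, and leaving unused y_true rows as zeros is an assumption about padding, not something the library is known to require.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
MAX_BB = 20                      # maximum number of ground-truth boxes per image
BATCH = 2
NUM_CLASSES = 3
HEAD_BOXES = [4800, 1200, 300]   # assumed number of boxes (#BB) in each head

# y_true: one row per ground-truth box, channels are [X, Y, W, H, class].
y_true = np.zeros((MAX_BB, 5), dtype=np.float32)
y_true[0] = [0.5, 0.5, 0.2, 0.3, 1.0]   # a single placeholder box of class 1

# y_pred: a (loc, class) tuple, each a list with one array per head.
loc = [np.zeros((BATCH, n, 4), dtype=np.float32) for n in HEAD_BOXES]
cls = [np.zeros((BATCH, n, NUM_CLASSES + 1), dtype=np.float32) for n in HEAD_BOXES]
y_pred = (loc, cls)

for head_loc, head_cls in zip(*y_pred):
    print(head_loc.shape, head_cls.shape)
```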
num_classes
The number of classes in the dataset
background_label
If no background_label was used during training, this should be set to NUM_CLASSES+1.
top_k
The number of bounding boxes (BB) kept by the top-k selection, applied per layer and per class.
conf_thresh
The confidence threshold. BB with a confidence lower than this will not be returned by the decoder.
nms_thresh
The NMS threshold for the IoU-overlap calculation. See the iou_threshold parameter of TensorFlow's non_max_suppression for more details.
max_bb_per_layer
The maximum number of BB selected per layer.
max_bb
The maximum number of BB selected overall.
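To make the roles of conf_thresh, nms_thresh, top_k and max_bb more concrete, here is a minimal NumPy sketch of confidence filtering followed by greedy NMS. It is not the Decoder's actual implementation: the function names (iou_xyxy, filter_boxes) are hypothetical, boxes are assumed to be in corner format [x1, y1, x2, y2] rather than [X,Y,W,H], and the per-layer and per-class handling is left out.

```python
import numpy as np

def iou_xyxy(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)

def filter_boxes(boxes, scores, conf_thresh, nms_thresh, top_k, max_bb):
    """Hypothetical helper: threshold, keep top_k, then greedy NMS up to max_bb."""
    keep = scores >= conf_thresh                 # drop low-confidence boxes
    boxes, scores = boxes[keep], scores[keep]
    order = np.argsort(-scores)[:top_k]          # best top_k candidates first
    selected = []
    while order.size > 0 and len(selected) < max_bb:
        best, rest = order[0], order[1:]
        selected.append(best)
        overlaps = iou_xyxy(boxes[best], boxes[rest])
        order = rest[overlaps <= nms_thresh]     # suppress heavy overlaps
    selected = np.array(selected, dtype=int)
    return boxes[selected], scores[selected]

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11],
                  [20, 20, 30, 30], [0, 0, 5, 5]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7, 0.1], dtype=np.float32)
kept_boxes, kept_scores = filter_boxes(
    boxes, scores, conf_thresh=0.25, nms_thresh=0.5, top_k=10, max_bb=20)
print(kept_boxes, kept_scores)
```

In this example the last box falls below conf_thresh, and the second box is suppressed because its IoU with the first (about 0.68) exceeds nms_thresh, so two boxes remain.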
loc_data
A list with one element per head. Each element has shape [Batch,#BB,4]. The channels represent [X,Y,W,H].
conf_data
A list with one element per head. Each element has shape [Batch,#BB,#classes+1].
prior_data
A List of NDArray representing the model's grid.
See Grid.generate_anchors().
from_logits
True if the model was exported without a sigmoid. False if the model was exported as we recommend.
decoded
True if the model was exported with a decoder, as recommended by us.
False otherwise (i.e. the predictions are still relative to the anchors and are not in image coordinates).
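The effect of from_logits can be pictured with a small sketch: when the exported model has no final sigmoid, raw scores need to pass through one before they can be compared against a confidence threshold. The helper name to_confidences is hypothetical, and this only illustrates the idea, not how the Decoder handles it internally.

```python
import numpy as np

def to_confidences(conf, from_logits):
    """Hypothetical helper: map raw scores to [0, 1] confidences if needed."""
    conf = np.asarray(conf, dtype=np.float32)
    if from_logits:
        # The model was exported without a sigmoid, so apply it here.
        return 1.0 / (1.0 + np.exp(-conf))
    # The model already outputs confidences; leave them untouched.
    return conf

raw = [-2.0, 0.0, 2.0]
print(to_confidences(raw, from_logits=True))
print(to_confidences([0.1, 0.5, 0.9], from_logits=False))
```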
image_size
The image size used for inference.
feature_maps
The shapes of the model heads, e.g. ((H1,W1),(H2,W2),...).
box_sizes
The shapes of the anchors as set in the YOLOv7 YAML.
strides
The strides that connect the head size to the image size (IMAGE_SIZE/HEAD_SIZE).
offset
0 if the grid starts from (0,0), as expected by the YOLO repo.
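The relationship between image_size, feature_maps, strides and offset can be sketched in NumPy as follows. The concrete sizes are assumed values for illustration, and cell_origins is a hypothetical helper; the output of Grid.generate_anchors() may be laid out differently.

```python
import numpy as np

image_size = (640, 640)                         # (H, W), assumed value
feature_maps = ((80, 80), (40, 40), (20, 20))   # head shapes, assumed values

# Each stride is IMAGE_SIZE / HEAD_SIZE for the corresponding head.
strides = [image_size[0] / h for h, _ in feature_maps]

def cell_origins(feature_map, offset=0.0):
    """Hypothetical helper: (x, y) index of every grid cell, shifted by offset."""
    h, w = feature_map
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return np.stack([xs, ys], axis=-1).reshape(-1, 2) + offset

print(strides)
grid = cell_origins((2, 3), offset=0.0)   # tiny 2x3 head for readability
print(grid)
```

With offset set to 0 the first cell sits at (0,0); multiplying a cell index by its head's stride gives the corresponding position in image coordinates.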