util

Functions

`compute_average_precision`	Compute average precision from dataframe with precision and recall values.
`confusion_matrix`	Compute the confusion matrix for a given DataFrame.
`construct_matches_df`	From a dataframe with targets and predictions, all concatenated together, construct a list of match pairs between prediction and targets.
`display_confusion_matrix`	Display a ConfusionMatrixDisplay object for a given Dataframe.
`get_ious`	From two dataframes of annotations, generate a matrix of IoU values of size N x M, where N is the number of predictions and M is the number of targets.
`get_matches`	Get the best matching target for every prediction, and return the matching target (if any) for every prediction and the matching prediction (if any) for every target. Predictions are either reordered by confidence or assumed to be already ordered.
`pr_curve`	Construct a precision-recall curve from a results dataframe and a minimum IoU below which a detection is considered invalid.
`resample_count`	Take a sequence of confidence values and resample it, assuming that one object is added at each original confidence value.

compute_average_precision(pr_curve: DataFrame) → float

Compute average precision from dataframe with precision and recall values. Precision values are averaged over recall values.

Note

We compute the right Riemann sum, i.e. each recall interval is weighted by the precision value at its right end.

Parameters:

pr_curve – Dataframe with precision and recall columns.

Returns:

Average precision for this particular PR curve.
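The right Riemann sum described in the note can be sketched as follows. This is an illustrative re-implementation, not the library's code: the function name `average_precision_right_riemann` is hypothetical, and the sketch assumes that the first recall interval starts at recall 0.

```python
import pandas as pd


def average_precision_right_riemann(pr_curve: pd.DataFrame) -> float:
    """Right Riemann sum of precision over recall (illustrative sketch)."""
    curve = pr_curve.sort_values("recall")
    # Width of each recall interval; the first one is assumed to start at 0.
    recall_steps = curve["recall"].diff().fillna(curve["recall"].iloc[0])
    # Each interval is weighted by the precision at its right end.
    return float((recall_steps * curve["precision"]).sum())


pr = pd.DataFrame({"recall": [0.25, 0.5, 1.0], "precision": [1.0, 0.5, 0.5]})
ap = average_precision_right_riemann(pr)
# Intervals: 0.25 * 1.0 + 0.25 * 0.5 + 0.5 * 0.5
```

With these three points, the interval widths are 0.25, 0.25 and 0.5, so the sum is 0.625.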

confusion_matrix(matches: DataFrame) → DataFrame

Compute the confusion matrix for a given DataFrame.

Parameters:

matches –

DataFrame containing the matches between groundtruth and predictions. It is expected to have the following columns:

prediction_label
groundtruth_label

corresponding to the predicted and groundtruth labels, respectively, in order to compute the confusion matrix.

Returns:

A confusion matrix as a DataFrame, with class names as both column names and row ids.
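As a rough illustration of the expected input and of the kind of table returned, the same counting can be done with `pandas.crosstab`. This is only a sketch: putting groundtruth labels on the rows and predicted labels on the columns is an assumption, and the handling of unmatched boxes is not covered here.

```python
import pandas as pd

matches = pd.DataFrame(
    {
        "prediction_label": ["cat", "dog", "dog", "cat"],
        "groundtruth_label": ["cat", "dog", "cat", "cat"],
    }
)

# Count every (groundtruth, prediction) label pair.
# Assumption: rows are groundtruth labels, columns are predicted labels.
cm = pd.crosstab(matches["groundtruth_label"], matches["prediction_label"])
```

Here `cm.loc["cat", "dog"]` counts the boxes labelled `cat` in the groundtruth that were predicted as `dog`.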

construct_matches_df(predictions_targets_df: DataFrame, min_iou: float = 0) → DataFrame

From a dataframe with targets and predictions, all concatenated together, construct a list of match pairs between predictions and targets. Unmatched predictions or targets get a <NA> match id. Note that all bounding boxes in the input dataframe are assumed to belong to the same category and the same image: the user must group the data accordingly before calling this function.

Parameters:

predictions_targets_df –
DataFrame comprising target and prediction info. It must have the following columns:
- groundtruth: boolean value indicating whether the row is a target or a prediction
- box_x_min, box_y_min, box_width, box_height: bounding box information used to compute IoU
min_iou – IoU above which the detection is considered valid. The bound is exclusive: a detection whose IoU is exactly min_iou is not valid. Defaults to 0.

Returns:

DataFrame of matches. It contains the prediction_id and groundtruth_id columns. Index is irrelevant. Each prediction id and target id appears once and only once. As such, in the worst case (no match at all), the dataframe has N+M rows, with N the number of predictions and M the number of targets.
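Because the input must already be restricted to one image and one category, a typical preparation step is a group-by. The sketch below only builds the per-group inputs and does not call the library. The `image_id` column name is hypothetical, and `groundtruth` being `True` for targets is an assumption.

```python
import pandas as pd

boxes = pd.DataFrame(
    {
        "image_id": [0, 0, 0, 1],  # hypothetical grouping column
        "category_id": [1, 1, 2, 1],
        "groundtruth": [True, False, False, True],  # assumed True for targets
        "box_x_min": [0.0, 1.0, 5.0, 2.0],
        "box_y_min": [0.0, 1.0, 5.0, 2.0],
        "box_width": [10.0, 10.0, 4.0, 3.0],
        "box_height": [10.0, 10.0, 4.0, 3.0],
    }
)

# One group per (image, category) pair: each group has the layout
# described above and could be processed independently.
groups = {
    tuple(int(k) for k in key): group
    for key, group in boxes.groupby(["image_id", "category_id"])
}
group_sizes = {key: len(group) for key, group in groups.items()}
```

In this toy data, only the group `(0, 1)` contains both a target and a prediction. The other two groups would each yield a single unmatched row.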

display_confusion_matrix(confusion_matrix: DataFrame, title: str = '')

Display a ConfusionMatrixDisplay object for a given Dataframe.

Parameters:

confusion_matrix – Dataframe containing the confusion matrix data as computed by confusion_matrix()
title – Confusion matrix’s title

get_ious(groundtruth: DataFrame, predictions: DataFrame) → DataFrame

From two dataframes of annotations, generate a matrix of IoU values of size N x M, where N is the number of predictions and M is the number of targets. Rows are sorted by prediction confidence.

Note that this does not check the category_id, only the bounding box coordinates.

The matrix is then wrapped in a dataframe whose index and columns are the prediction ids and target ids, respectively.

Parameters:

groundtruth – DataFrame comprising bounding box targets data. Must include at least box_x_min, box_y_min, box_width, box_height
predictions – DataFrame comprising bounding box prediction data. Must include same columns as groundtruth, plus the confidence column.

Returns:

DataFrame comprising iou values between groundtruth and predictions. Index is prediction id, column name is target id
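For reference, an N x M IoU matrix can be computed with NumPy broadcasting as sketched below. This is an independent illustration, not the library's implementation: the function name `iou_matrix` is hypothetical, and sorting rows by descending confidence is an assumption.

```python
import numpy as np
import pandas as pd


def iou_matrix(groundtruth: pd.DataFrame, predictions: pd.DataFrame) -> pd.DataFrame:
    """Pairwise IoU, rows = predictions, columns = targets (illustrative sketch)."""
    # Assumption: rows are sorted by descending confidence.
    predictions = predictions.sort_values("confidence", ascending=False)

    def corners(df):
        x0 = df["box_x_min"].to_numpy(dtype=float)
        y0 = df["box_y_min"].to_numpy(dtype=float)
        x1 = x0 + df["box_width"].to_numpy(dtype=float)
        y1 = y0 + df["box_height"].to_numpy(dtype=float)
        return x0, y0, x1, y1

    px0, py0, px1, py1 = (a[:, None] for a in corners(predictions))  # (N, 1)
    tx0, ty0, tx1, ty1 = (a[None, :] for a in corners(groundtruth))  # (1, M)

    # Overlap along each axis, clipped at 0 for disjoint boxes.
    inter_w = np.clip(np.minimum(px1, tx1) - np.maximum(px0, tx0), 0, None)
    inter_h = np.clip(np.minimum(py1, ty1) - np.maximum(py0, ty0), 0, None)
    inter = inter_w * inter_h
    union = (px1 - px0) * (py1 - py0) + (tx1 - tx0) * (ty1 - ty0) - inter
    iou = np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)
    return pd.DataFrame(iou, index=predictions.index, columns=groundtruth.index)


targets = pd.DataFrame(
    {"box_x_min": [0], "box_y_min": [0], "box_width": [10], "box_height": [10]},
    index=["t0"],
)
preds = pd.DataFrame(
    {
        "box_x_min": [0, 5],
        "box_y_min": [0, 0],
        "box_width": [10, 10],
        "box_height": [10, 10],
        "confidence": [0.5, 0.9],
    },
    index=["p0", "p1"],
)
ious = iou_matrix(targets, preds)
```

In this example `p1` overlaps half of `t0` (intersection 50, union 150), and it comes first because it has the higher confidence.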

get_matches(iou_df: DataFrame, confidence: Series | None = None, min_iou: float = 0) → tuple[DataFrame, DataFrame]

Get the best matching target for every prediction, and return the matching target (if any) for every prediction and the matching prediction (if any) for every target. Predictions are either reordered by confidence or assumed to be already ordered.

Parameters:

iou_df – IoU values matrix encapsulated in a dataframe to index rows with prediction ids and columns with target ids
confidence – Series with the same number of rows as iou_df. It is used to reorder the rows of iou_df in descending order of confidence. If not given, iou_df is assumed to be already ordered.
min_iou – Minimum IoU value above which a match is considered valid.

Returns:

Two dataframes of matching ids with the corresponding IoU values. The first is indexed by prediction id, the second by target id.
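One common way to obtain such matches is a greedy pass over predictions in confidence order, sketched below. This illustrates the idea and is not necessarily the strategy used here: the function name `greedy_match` is hypothetical, a target is assumed to be matched at most once, and the `min_iou` comparison is assumed to be strict.

```python
import pandas as pd


def greedy_match(iou_df: pd.DataFrame, min_iou: float = 0.0) -> dict:
    """Map each prediction id to a target id, or None (illustrative sketch)."""
    taken = []  # targets already claimed by a more confident prediction
    matches = {}
    for pred_id, row in iou_df.iterrows():  # rows assumed sorted by confidence
        # Keep targets that are still free and overlap strictly more than min_iou.
        candidates = row[(row > min_iou) & ~row.index.isin(taken)]
        if candidates.empty:
            matches[pred_id] = None
        else:
            best = candidates.idxmax()
            taken.append(best)
            matches[pred_id] = best
    return matches


iou_df = pd.DataFrame(
    [[0.9, 0.1], [0.8, 0.0], [0.2, 0.6]],
    index=["p0", "p1", "p2"],
    columns=["t0", "t1"],
)
matched = greedy_match(iou_df)
```

Here `p1` ends up unmatched: its best target `t0` is already taken by `p0`, and its IoU with `t1` is not strictly above 0.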

pr_curve(results: DataFrame, min_iou: float = 0, betas: Iterable[float] = (1,), reindex_series: Series | None = None) → DataFrame

Construct a precision-recall curve from a results dataframe and a minimum IoU below which a detection is considered invalid.

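Additionally, computes the F-score for each of the given \(\beta\) values, using the standard definition:

\[F_\beta = (1 + \beta^2) \cdot \frac{\text{precision} \cdot \text{recall}}{\beta^2 \cdot \text{precision} + \text{recall}}\]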

Parameters:

results – Dataframe modelling detections, with the corresponding confidence and groundtruth (whether the detection is a true positive or a false positive). It should include the columns groundtruth, iou and confidence, and its rows should be sorted by confidence.
min_iou – Value below which the detection is considered invalid. In other words, the groundtruth becomes False. The prediction becomes a False Positive, and the corresponding groundtruth is a False negative. Defaults to 0.
betas – beta values to compute the F-Score with. Must be an iterable of floats. Defaults to (1,)
reindex_series – Recall bins used to reindex the curve before returning it.

Returns:

Precision Recall curve dataframe. Columns are precision, recall, f{beta}_score and confidence_threshold, where betas are the given \(\beta\) values in betas (see equation above). Index is irrelevant.
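The general construction of such a curve can be sketched with cumulative sums over detections sorted by confidence. This is a simplified illustration, not the library's implementation: the function name `simple_pr_curve` and its `n_targets` argument are hypothetical, the `min_iou` filtering and reindexing steps are left out, and the `f{beta}_score` column is formatted as `f1_score` for \(\beta = 1\) by assumption.

```python
import numpy as np
import pandas as pd


def simple_pr_curve(results: pd.DataFrame, n_targets: int, beta: float = 1.0) -> pd.DataFrame:
    """Precision, recall and F-beta at each confidence threshold (sketch)."""
    # Assumption: most confident detections come first.
    ordered = results.sort_values("confidence", ascending=False)
    true_positives = ordered["groundtruth"].astype(int).cumsum().to_numpy()
    n_detections = np.arange(1, len(ordered) + 1)

    precision = true_positives / n_detections
    recall = true_positives / n_targets
    denominator = beta**2 * precision + recall
    # F-beta is set to 0 where precision and recall are both 0.
    f_score = np.divide(
        (1 + beta**2) * precision * recall,
        denominator,
        out=np.zeros_like(denominator),
        where=denominator > 0,
    )
    return pd.DataFrame(
        {
            "precision": precision,
            "recall": recall,
            f"f{beta:g}_score": f_score,
            "confidence_threshold": ordered["confidence"].to_numpy(),
        }
    )


results = pd.DataFrame(
    {
        "confidence": [0.9, 0.8, 0.7, 0.6],
        "groundtruth": [True, False, True, True],
    }
)
curve = simple_pr_curve(results, n_targets=4)
```

At the lowest threshold, three of the four detections are true positives and three of the four targets are found, so precision, recall and F1 are all 0.75.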

resample_count(original_confidences: Iterable[float], new_confidences: Iterable[float]) → Series

Take a sequence of confidence values and resample it, assuming that one object is added at each original confidence value.

Result is the number of objects that would have been detected for each value in new confidence.

Note

new_confidences must be sorted unique values.

Parameters:

original_confidences – Original set of confidence values. Each confidence value corresponds to one detected object.
new_confidences – New set of confidence values to resample the number of detected objects from. Usually, a range of N elements, from 0 to 1.

Returns:

Series named count with the same length as new_confidences, index set as new_confidences, named confidence, and values set to count values corresponding to confidence threshold given in the index.
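The counting described above can be sketched with `numpy.searchsorted`. This is an illustrative re-implementation under stated assumptions: the function name `count_above` is hypothetical, and an object is assumed to be counted when its confidence is greater than or equal to the threshold.

```python
import numpy as np
import pandas as pd


def count_above(original_confidences, new_confidences) -> pd.Series:
    """Number of objects kept at each confidence threshold (illustrative sketch)."""
    original = np.sort(np.asarray(list(original_confidences), dtype=float))
    thresholds = np.asarray(list(new_confidences), dtype=float)
    # searchsorted gives how many original values are strictly below each
    # threshold; the remainder are at or above it (assumed inclusive).
    counts = len(original) - np.searchsorted(original, thresholds, side="left")
    return pd.Series(
        counts, index=pd.Index(thresholds, name="confidence"), name="count"
    )


counts = count_above([0.9, 0.8, 0.3], [0.0, 0.5, 1.0])
```

With thresholds 0.0, 0.5 and 1.0, the three original detections give counts of 3, 2 and 0.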