remove_invalid_annotations

Dataset.remove_invalid_annotations(allow_keypoints: bool = False, remove_related_images: bool = False, remove_emptied_images: bool = False) → Self

Remove invalid annotations from the dataset.

Optionally, also remove images that have at least one invalid annotation, or images that have only invalid annotations.

Parameters:
  • allow_keypoints – If set to True, keypoints (bounding boxes whose height and width are both 0) are kept. Otherwise, they are removed as invalid. Defaults to False.

  • remove_related_images – If set to True, will remove any image that has at least one invalid annotation, along with all of its annotations. Defaults to False.

  • remove_emptied_images – If set to True, will remove images that are left empty once the invalid annotations are removed. In other words, images whose annotations are all invalid are removed. Images that were already empty beforehand are not removed. Defaults to False.

Returns:

The same dataset, without the invalid annotations and optionally without their related and/or emptied images.
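The difference between remove_related_images and remove_emptied_images can be sketched in plain Python. This is only an illustration of the behaviour documented above, not the library's implementation: the helper names is_invalid and images_to_remove are hypothetical, and the validity rule (negative sizes are invalid, zero-size boxes are invalid unless keypoints are allowed) is an assumption based on the parameter descriptions and the example below.

```python
# Plain-Python sketch of the documented behaviour; NOT the lours implementation.
# `is_invalid` and `images_to_remove` are hypothetical helper names.


def is_invalid(box_width, box_height, allow_keypoints=False):
    """Assumed rule: a negative size is invalid; a zero-size box (keypoint)
    is invalid unless allow_keypoints is True."""
    if box_width < 0 or box_height < 0:
        return True
    if box_width == 0 and box_height == 0:
        return not allow_keypoints
    return False


def images_to_remove(annotations, allow_keypoints=False,
                     remove_related_images=False, remove_emptied_images=False):
    """Return the ids of the images that would be dropped.

    `annotations` is a list of (image_id, box_width, box_height) tuples.
    Images without any annotation never appear in it, so they are never dropped.
    """
    valid_count = {}
    invalid_count = {}
    for image_id, width, height in annotations:
        invalid = is_invalid(width, height, allow_keypoints)
        counts = invalid_count if invalid else valid_count
        counts[image_id] = counts.get(image_id, 0) + 1

    removed = set()
    if remove_related_images:
        # Any image with at least one invalid annotation.
        removed |= set(invalid_count)
    if remove_emptied_images:
        # Only images whose annotations are all invalid.
        removed |= {i for i in invalid_count if i not in valid_count}
    return removed


# Image 1 has one invalid and one valid annotation.
one_invalid = [(1, -1.0, 353.3), (0, 358.7, 116.3), (0, 525.3, 41.7), (1, 36.1, 442.9)]
# Image 1 has only invalid annotations.
two_invalid = [(1, -1.0, 353.3), (0, 358.7, 116.3), (0, 525.3, 41.7), (1, -1.0, 442.9)]

print(images_to_remove(one_invalid, remove_related_images=True))  # {1}
print(images_to_remove(one_invalid, remove_emptied_images=True))  # set()
print(images_to_remove(two_invalid, remove_emptied_images=True))  # {1}
```

With a single invalid annotation, image 1 is dropped only when remove_related_images is set, because it still holds a valid annotation. Once both of its annotations are invalid, remove_emptied_images drops it as well.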

Example

>>> from lours.utils.doc_utils import dummy_dataset
>>> example = dummy_dataset(2, 4, seed=1)
>>> example.annotations.loc[0, "box_width"] = -1
>>> example
Dataset object containing 2 images and 4 objects
Name :
    shake_effort_many
Images root :
    care/suggest
Images :
    width  height        relative_path  type  split
id
0     955     229  determine/story.jpg  .jpg   eval
1     131     840       air/method.bmp  .bmp  train
Annotations :
    image_id category_str  category_id  ...   box_y_min   box_width  box_height
id                                      ...
0          1     marriage           15  ...  276.974642   -1.000000  353.331683
1          0       listen           14  ...   64.213606  358.653949  116.336568
2          0        reach           22  ...   69.431616  525.305264   41.677117
3          1       listen           14  ...  380.938227   36.133726  442.881021

[4 rows x 8 columns]
Label map :
{14: 'listen', 15: 'marriage', 22: 'reach'}
>>> example.remove_invalid_annotations()
Removed 1 annotation, in 1 image
Dataset object containing 2 images and 3 objects
Name :
    shake_effort_many
Images root :
    care/suggest
Images :
    width  height        relative_path  type  split
id
0     955     229  determine/story.jpg  .jpg   eval
1     131     840       air/method.bmp  .bmp  train
Annotations :
    image_id category_str  category_id  ...   box_y_min   box_width  box_height
id                                      ...
1          0       listen           14  ...   64.213606  358.653949  116.336568
2          0        reach           22  ...   69.431616  525.305264   41.677117
3          1       listen           14  ...  380.938227   36.133726  442.881021

[3 rows x 8 columns]
Label map :
{14: 'listen', 15: 'marriage', 22: 'reach'}
>>> example.remove_invalid_annotations(remove_related_images=True)
Removed 1 image with invalid annotations
Dataset object containing 1 image and 2 objects
Name :
    shake_effort_many
Images root :
    care/suggest
Images :
    width  height        relative_path  type split
id
0     955     229  determine/story.jpg  .jpg  eval
Annotations :
    image_id category_str  category_id  ...  box_y_min   box_width  box_height
id                                      ...
1          0       listen           14  ...  64.213606  358.653949  116.336568
2          0        reach           22  ...  69.431616  525.305264   41.677117

[2 rows x 8 columns]
Label map :
{14: 'listen', 15: 'marriage', 22: 'reach'}
>>> from lours.utils.doc_utils import dummy_dataset
>>> example = dummy_dataset(2, 4, seed=1)
>>> example.annotations.loc[[0, 3], "box_width"] = -1
>>> example
Dataset object containing 2 images and 4 objects
Name :
    shake_effort_many
Images root :
    care/suggest
Images :
    width  height        relative_path  type  split
id
0     955     229  determine/story.jpg  .jpg   eval
1     131     840       air/method.bmp  .bmp  train
Annotations :
    image_id category_str  category_id  ...   box_y_min   box_width  box_height
id                                      ...
0          1     marriage           15  ...  276.974642   -1.000000  353.331683
1          0       listen           14  ...   64.213606  358.653949  116.336568
2          0        reach           22  ...   69.431616  525.305264   41.677117
3          1       listen           14  ...  380.938227   -1.000000  442.881021

[4 rows x 8 columns]
Label map :
{14: 'listen', 15: 'marriage', 22: 'reach'}
>>> example.remove_invalid_annotations(remove_emptied_images=True)
Removed 2 annotations, in 1 image
Dataset object containing 1 image and 2 objects
Name :
    shake_effort_many
Images root :
    care/suggest
Images :
    width  height        relative_path  type split
id
0     955     229  determine/story.jpg  .jpg  eval
Annotations :
    image_id category_str  category_id  ...  box_y_min   box_width  box_height
id                                      ...
1          0       listen           14  ...  64.213606  358.653949  116.336568
2          0        reach           22  ...  69.431616  525.305264   41.677117

[2 rows x 8 columns]
Label map :
{14: 'listen', 15: 'marriage', 22: 'reach'}