remove_invalid_images

Dataset.remove_invalid_images(load_images: bool = True) → Self

Remove invalid images from the dataset.

Parameters:

load_images – If set to True, checks not only that images are valid files, but also that they can be loaded (i.e. are not corrupted files) and that their sizes match the ones recorded in the images dataframe. Note that this makes the function significantly slower. Defaults to True.

Returns:

The same dataset, without the invalid images and their related annotations.

Example

>>> from pathlib import Path
>>> from lours.utils.doc_utils import dummy_dataset
>>> example = dummy_dataset(2, 2, seed=1, generate_real_images=True)
>>> example.images.loc[0, "relative_path"] = Path("bad_path.jpg")
>>> example
Dataset object containing 2 images and 2 objects
Name :
    shake_effort_many
Images root :
    /tmp/care/suggest
Images :
    width  height        relative_path  type  split
id
0     955     229         bad_path.jpg  .jpg  train
1     131     840       air/method.bmp  .bmp  train
Annotations :
    image_id category_str  category_id  ...   box_y_min   box_width  box_height
id                                      ...
0          1       listen           14  ...  276.974642    9.718823  184.684056
1          0        reach           22  ...    6.311037  123.141689  174.239136

[2 rows x 8 columns]
Label map :
{14: 'listen', 15: 'marriage', 22: 'reach'}
>>> example.remove_invalid_images()
Removed 1 image, with 1 annotation
Dataset object containing 1 image and 1 object
Name :
    shake_effort_many
Images root :
    /tmp/care/suggest
Images :
    width  height   relative_path  type  split
id
1     131     840  air/method.bmp  .bmp  train
Annotations :
    image_id category_str  category_id  ...   box_y_min  box_width  box_height
id                                      ...
0          1       listen           14  ...  276.974642   9.718823  184.684056

[1 rows x 8 columns]
Label map :
{14: 'listen', 15: 'marriage', 22: 'reach'}
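
The two validation levels selected by load_images can be pictured with the sketch below. It is not the library's implementation: the helpers `read_png_size`, `is_valid_image` and `write_gray_png` are hypothetical, "valid file" is assumed to mean that the path points to an existing file, and reading the PNG header stands in for fully decoding the image.

```python
import struct
import tempfile
import zlib
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_size(path: Path) -> tuple[int, int]:
    """Return (width, height) read from a PNG header, or raise ValueError."""
    with path.open("rb") as f:
        header = f.read(24)
    # Signature (8 bytes), chunk length (4), chunk type (4), width (4), height (4)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a readable PNG")
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def is_valid_image(root: Path, row: dict, load_images: bool = True) -> bool:
    """Hypothetical per-image check mirroring the documented behaviour."""
    path = root / row["relative_path"]
    if not path.is_file():  # cheap check, always performed (assumption)
        return False
    if not load_images:
        return True
    try:
        size = read_png_size(path)  # stand-in for actually loading the image
    except ValueError:
        return False
    return size == (row["width"], row["height"])


def write_gray_png(path: Path, width: int, height: int) -> None:
    """Write a minimal black 8-bit grayscale PNG (test fixture only)."""
    def chunk(tag: bytes, data: bytes) -> bytes:
        crc = zlib.crc32(tag + data) & 0xFFFFFFFF
        return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)

    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    pixels = b"".join(b"\x00" + b"\x00" * width for _ in range(height))
    path.write_bytes(
        PNG_SIGNATURE
        + chunk(b"IHDR", ihdr)
        + chunk(b"IDAT", zlib.compress(pixels))
        + chunk(b"IEND", b"")
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    write_gray_png(root / "good.png", 4, 3)
    (root / "corrupted.png").write_bytes(b"not an image")
    rows = {
        "good": {"relative_path": "good.png", "width": 4, "height": 3},
        "corrupted": {"relative_path": "corrupted.png", "width": 4, "height": 3},
        "wrong_size": {"relative_path": "good.png", "width": 10, "height": 10},
        "missing": {"relative_path": "bad_path.png", "width": 4, "height": 3},
    }
    fast = {k: is_valid_image(root, r, load_images=False) for k, r in rows.items()}
    full = {k: is_valid_image(root, r, load_images=True) for k, r in rows.items()}
```

In this sketch the fast pass only rejects the missing file, while the full pass also rejects the corrupted file and the row whose recorded size does not match, which is the trade-off between speed and thoroughness described for load_images.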