remap_from_other

Dataset.remap_from_other(other: Dataset, remove_not_mapped: bool = False, remove_emptied_images: bool = False) → Self

Try to remap classes of dataset to match the ones in another dataset by retrieving categories with the same name.

This is useful when trying to merge two datasets with incompatible label maps.

The mapping is constructed so that no category id represents different category labels in the other dataset and in the remapped dataset.

This function first maps every category whose name also appears in the other dataset to the id used there. It then reassigns the remaining categories so that the ids don’t overlap: a category whose name is present only in the current dataset and whose id is already used by a category in the other dataset is iteratively set to the lowest unoccupied category id across all label maps.
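The two steps described above can be sketched in plain Python on label maps alone. This is a minimal sketch and not the library's implementation: `build_remap` is a hypothetical helper, and starting the search for a free id at 1 is an assumption chosen to be consistent with the example on this page.

```python
def build_remap(current: dict[int, str], other: dict[int, str]) -> dict[int, int]:
    """Hypothetical helper: build an old-id -> new-id mapping for `current`."""
    other_id_by_name = {name: cat_id for cat_id, name in other.items()}
    # Ids already used by either label map; newly assigned ids are added below.
    occupied = set(current) | set(other)
    remap: dict[int, int] = {}

    # Step 1: categories sharing a name with `other` take the id used in `other`.
    for cat_id, name in current.items():
        if name in other_id_by_name:
            remap[cat_id] = other_id_by_name[name]

    # Step 2: categories only present in `current`.
    for cat_id in sorted(current):
        if cat_id in remap:
            continue
        if cat_id in other:
            # The id clashes with a different category in `other`:
            # take the lowest unoccupied id (assumption: search starts at 1).
            new_id = 1
            while new_id in occupied:
                new_id += 1
            occupied.add(new_id)
            remap[cat_id] = new_id
        else:
            # No clash: the id can stay as it is.
            remap[cat_id] = cat_id
    return remap


remap = build_remap(
    {1: "car", 2: "person", 3: "truck"},
    {1: "train", 2: "car", 3: "person"},
)
```

With these two label maps, `car` and `person` take the ids they have in the other label map, and `truck`, whose id 3 clashes with `person`, moves to the first free id.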

Note

The name of a category is ambiguous. Another method of class remapping should be preferred if possible.

See related tutorial

Parameters:
  • other – Other dataset to align the output’s label map with.

  • remove_not_mapped – If set to True, will remove classes that are in self, but not in other dataset’s class mapping. Otherwise, keep them as is. Defaults to False.

  • remove_emptied_images – If set to True, will remove from self.images the images that are now empty of annotation. Note that it will keep the images that were empty before the remapping. Defaults to False.
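The interaction between the two flags can be illustrated with a toy model that uses plain Python containers instead of a Dataset. Everything below (the variable names and the data layout) is hypothetical and only mirrors the behaviour described in the parameter list; it is not how the library stores images or annotations.

```python
# Toy data: image 2 has no annotation even before any remapping.
images = [0, 1, 2]
annotations = [
    {"image_id": 0, "category_str": "truck"},
    {"image_id": 1, "category_str": "person"},
]
other_names = {"train", "car", "person"}

annotated_before = {a["image_id"] for a in annotations}

# remove_not_mapped=True: drop annotations whose class is absent from `other`.
kept_annotations = [a for a in annotations if a["category_str"] in other_names]
annotated_after = {a["image_id"] for a in kept_annotations}

# remove_emptied_images=True: drop only images that lost all their annotations.
# Images that were already empty (image 2 here) are kept.
emptied = annotated_before - annotated_after
kept_images = [image_id for image_id in images if image_id not in emptied]
```

In this toy model only image 0 is removed, because it is the only image whose annotations were all dropped.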

Raises:

AssertionError – Raised if the label map of either dataset doesn’t have unique category names.
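Since the method relies on category names being unique within each label map, it can help to check this before calling it. The helper below is a hypothetical sketch that works on a plain `{id: name}` dictionary; it is not part of the library.

```python
from collections import Counter


def duplicated_names(label_map: dict[int, str]) -> list[str]:
    """Hypothetical helper: names used by more than one category id."""
    counts = Counter(label_map.values())
    return sorted(name for name, count in counts.items() if count > 1)


ok = duplicated_names({1: "car", 2: "person", 3: "truck"})
bad = duplicated_names({1: "car", 2: "person", 3: "car"})
```

An empty list means the label map has unique names; a non-empty list names the categories to deduplicate first.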

Returns:

New dataset with classes remapped to match the ones in other.

Return type:

Dataset

Example

The current dataset has label map {1: car, 2: person, 3: truck} and the other dataset has label map {1: train, 2: car, 3: person}. This method constructs the mapping dictionary {1: 2, 2: 3, 3: 4}, so that the remapped dataset has the label map {2: car, 3: person, 4: truck}, which is now compatible with the other dataset’s label map (no id refers to two different names).

If you then merge the two datasets, the resulting merged label map will be: {1: train, 2: car, 3: person, 4: truck}
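The arithmetic of this example can be checked on plain dictionaries. This is only an illustration of the label-map bookkeeping, with hypothetical variable names; merging actual datasets is done by the library and is not shown here.

```python
current = {1: "car", 2: "person", 3: "truck"}
other = {1: "train", 2: "car", 3: "person"}
remap = {1: 2, 2: 3, 3: 4}  # mapping dictionary from the example

# Apply the mapping to the current label map.
remapped = {remap[cat_id]: name for cat_id, name in current.items()}

# Compatibility: no shared id may carry two different names.
conflicts = {
    cat_id
    for cat_id in remapped.keys() & other.keys()
    if remapped[cat_id] != other[cat_id]
}

# With no conflict, the merged label map is the union of both maps.
merged = {**other, **remapped}
```

Here `conflicts` is empty, so the union is well defined.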

>>> from lours.utils.doc_utils import dummy_dataset
>>> example1 = dummy_dataset(
...     n_imgs=2,
...     n_annot=2,
...     label_map={1: "car", 2: "person", 3: "truck"},
...     seed=3,
... )
>>> example1
Dataset object containing 2 images and 2 objects
Name :
    have_page_personal
Images root :
    draw/name
Images :
    width  height   relative_path  type  split
id
0     830     261  add/police.bmp  .bmp  train
1     177     313    ok/event.jpg  .jpg  train
Annotations :
    image_id category_str  category_id  ...   box_y_min   box_width  box_height
id                                      ...
0          0        truck            3  ...  102.110558  531.572263   22.921831
1          1       person            2  ...   49.998280   56.543521  111.741397

[2 rows x 8 columns]
Label map :
{1: 'car', 2: 'person', 3: 'truck'}
>>> example2 = dummy_dataset(
...     n_imgs=2,
...     n_annot=2,
...     label_map={1: "train", 2: "car", 3: "person"},
...     seed=1,
... )
>>> example2
Dataset object containing 2 images and 2 objects
Name :
    shake_effort_many
Images root :
    care/suggest
Images :
    width  height        relative_path  type  split
id
0     525     779   reach/marriage.jpg  .jpg  train
1     560     955  determine/story.jpg  .jpg  train
Annotations :
    image_id category_str  category_id  ...   box_y_min   box_width  box_height
id                                      ...
0          0       person            3  ...  586.986712  124.825174   57.793609
1          0       person            3  ...  318.766127  207.777851  100.447514

[2 rows x 8 columns]
Label map :
{1: 'train', 2: 'car', 3: 'person'}
>>> example1.remap_from_other(example2)
Using the following class remapping dictionary :
{1: 2, 2: 3, 3: 4}
Dataset object containing 2 images and 2 objects
Name :
    have_page_personal
Images root :
    draw/name
Images :
    width  height   relative_path  type  split
id
0     830     261  add/police.bmp  .bmp  train
1     177     313    ok/event.jpg  .jpg  train
Annotations :
    image_id category_str  category_id  ...   box_y_min   box_width  box_height
id                                      ...
0          0        truck            4  ...  102.110558  531.572263   22.921831
1          1       person            3  ...   49.998280   56.543521  111.741397

[2 rows x 8 columns]
Label map :
{2: 'car', 3: 'person', 4: 'truck'}
>>> example1.remap_from_other(example2, remove_not_mapped=True)
Using the following class remapping dictionary :
{1: 2, 2: 3}
Dataset object containing 2 images and 1 object
Name :
    have_page_personal
Images root :
    draw/name
Images :
    width  height   relative_path  type  split
id
0     830     261  add/police.bmp  .bmp  train
1     177     313    ok/event.jpg  .jpg  train
Annotations :
    image_id category_str  category_id  ... box_y_min  box_width  box_height
id                                      ...
1          1       person            3  ...  49.99828  56.543521  111.741397

[1 rows x 8 columns]
Label map :
{2: 'car', 3: 'person'}
>>> example1.remap_from_other(
...     example2, remove_not_mapped=True, remove_emptied_images=True
... )
Using the following class remapping dictionary :
{1: 2, 2: 3}
Dataset object containing 1 image and 1 object
Name :
    have_page_personal
Images root :
    draw/name
Images :
    width  height relative_path  type  split
id
1     177     313  ok/event.jpg  .jpg  train
Annotations :
    image_id category_str  category_id  ... box_y_min  box_width  box_height
id                                      ...
1          1       person            3  ...  49.99828  56.543521  111.741397

[1 rows x 8 columns]
Label map :
{2: 'car', 3: 'person'}