reset_index_from_mapping

Dataset.reset_index_from_mapping(images_index_map: dict[int, int] | DataFrame | Series | None = None, annotations_index_map: dict[int, int] | DataFrame | Series | None = None, remove_unmapped: bool = False) -> Self

Reset the index of the images and annotations dataframes using index maps (index -> new_index), where each value is the new index to apply to its key.

Each mapping can be a dictionary, a pandas Series, or a DataFrame with exactly one column. If the DataFrame has more than one column, this function raises an error.
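As an illustration, the same mapping can be written in each of the three accepted forms. The sketch below uses only pandas. The helper `as_index_dict` is hypothetical and not part of lours. It assumes that, for a Series or a DataFrame, the index holds the original ids and the values hold the new ids. The exception type is also an assumption.

```python
import pandas as pd


def as_index_dict(mapping):
    """Hypothetical helper: normalize a mapping to a plain dict.

    Assumes the pandas index holds the original ids and the values the new ids.
    """
    if isinstance(mapping, pd.DataFrame):
        if mapping.shape[1] != 1:
            # The exception type is an assumption made for this sketch.
            raise ValueError("DataFrame mapping must have exactly one column")
        mapping = mapping.iloc[:, 0]  # reduce the single column to a Series
    if isinstance(mapping, pd.Series):
        return {int(old): int(new) for old, new in mapping.items()}
    return dict(mapping)


# Three equivalent ways to express "0 becomes 1, 2 becomes 0".
as_dict = {0: 1, 2: 0}
as_series = pd.Series([1, 0], index=[0, 2])
as_frame = as_series.to_frame(name="new_id")

assert as_index_dict(as_series) == as_dict
assert as_index_dict(as_frame) == as_dict
```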

Parameters:
  • images_index_map – Mapping from original image index to new image index. If it is a DataFrame, it must have only one column. If set to None, will apply the identity mapping. Defaults to None.

  • annotations_index_map – Same as images_index_map, but applied to annotations. If set to None, will apply the identity mapping. Defaults to None.

  • remove_unmapped – If set to True, will remove the entries of the original dataframes whose index is not present in the given mappings. Otherwise, each mapping is completed with default values so that it is bijective: indices missing from the mapping are assigned a range index starting at the highest mapped index + 1. Defaults to False.
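The completion rule for unmapped indices can be sketched in plain Python. The function `complete_mapping` is hypothetical and only mirrors the behaviour described above, not the library's actual implementation. Starting at 0 when the mapping is empty is an assumption.

```python
def complete_mapping(index, mapping=None, remove_unmapped=False):
    """Hypothetical sketch of the mapping completion described above."""
    if mapping is None:
        # No mapping given: every id keeps its value.
        return {i: i for i in index}
    result = dict(mapping)
    if remove_unmapped:
        # Only explicitly mapped ids survive.
        return result
    # Unmapped ids receive consecutive values after the highest mapped one.
    # Starting at 0 for an empty mapping is an assumption of this sketch.
    next_id = max(mapping.values(), default=-1) + 1
    for old in index:
        if old not in result:
            result[old] = next_id
            next_id += 1
    return result


images = complete_mapping([0, 1, 2], {0: 1, 2: 0})
annotations = complete_mapping([0, 1, 2], {1: 2, 2: 0})
print(images)       # {0: 1, 2: 0, 1: 2}
print(annotations)  # {1: 2, 2: 0, 0: 3}
```

In the second call, id 0 is sent to 3 rather than to the free value 1, because the default range starts after the highest mapped value (2).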

Returns:

A new dataset instance whose images and annotations dataframes have remapped indices. Annotations belonging to removed images are filtered out, and the “image_id” column is updated to match the new image index.

Return type:

Dataset
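How annotations follow the image mapping can be illustrated without any dataframe. In this minimal sketch, `remap_annotations` is a hypothetical helper and annotations are reduced to a dict from annotation id to image id. It assumes both maps already list every entry to keep, which corresponds to the remove_unmapped=True case.

```python
def remap_annotations(image_ids, image_map, annotation_map):
    """Hypothetical sketch: image_ids maps annotation id -> image id.

    Anything absent from either map is dropped.
    """
    remapped = {}
    for ann_id, image_id in image_ids.items():
        if ann_id not in annotation_map or image_id not in image_map:
            continue  # the annotation itself, or its image, was removed
        # New annotation id -> new image id.
        remapped[annotation_map[ann_id]] = image_map[image_id]
    return remapped


# Annotation id -> image_id, as in the example dataset below.
image_ids = {0: 1, 1: 2, 2: 2}
print(remap_annotations(image_ids, {0: 1, 2: 0}, {1: 2, 2: 0}))
# {2: 0, 0: 0}
```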

Example

>>> from lours.utils.doc_utils import dummy_dataset
>>> example = dummy_dataset(3, 3, seed=2)
>>> example
Dataset object containing 3 images and 3 objects
Name :
    argue_be_structure
Images root :
    what/way
Images :
    width  height            relative_path   type  split
id
0     368     506        police/enter.jpeg  .jpeg  train
1     472     182          also/policy.gif   .gif  train
2     832     401  cold/responsibility.png   .png    val
Annotations :
    image_id  category_str  category_id  ...   box_y_min   box_width  box_height
id                                       ...
0          1  relationship            3  ...   27.311332   69.768824   97.006466
1          2        simply           25  ...  157.041558   20.174848   16.443389
2          2  relationship            3  ...   75.088280  337.101681  193.299936

[3 rows x 8 columns]
Label map :
{3: 'relationship', 7: 'table', 25: 'simply'}

Note that unmapped indices are remapped to a range index starting after the highest mapped value. Hence, in the call below, annotation id “0” gets mapped to “3” even though index “1” was available.

>>> example.reset_index_from_mapping(
...     images_index_map={0: 1, 2: 0}, annotations_index_map={1: 2, 2: 0}
... )
Dataset object containing 3 images and 3 objects
Name :
    argue_be_structure
Images root :
    what/way
Images :
    width  height            relative_path   type  split
id
1     368     506        police/enter.jpeg  .jpeg  train
0     832     401  cold/responsibility.png   .png    val
2     472     182          also/policy.gif   .gif  train
Annotations :
    image_id  category_str  category_id  ...   box_y_min   box_width  box_height
id                                       ...
2          0        simply           25  ...  157.041558   20.174848   16.443389
0          0  relationship            3  ...   75.088280  337.101681  193.299936
3          2  relationship            3  ...   27.311332   69.768824   97.006466

[3 rows x 8 columns]
Label map :
{3: 'relationship', 7: 'table', 25: 'simply'}
>>> example.reset_index_from_mapping(
...     images_index_map={0: 1, 2: 0},
...     annotations_index_map={1: 2, 2: 0},
...     remove_unmapped=True,
... )
Dataset object containing 2 images and 2 objects
Name :
    argue_be_structure
Images root :
    what/way
Images :
    width  height            relative_path   type  split
id
1     368     506        police/enter.jpeg  .jpeg  train
0     832     401  cold/responsibility.png   .png    val
Annotations :
    image_id  category_str  category_id  ...   box_y_min   box_width  box_height
id                                       ...
2          0        simply           25  ...  157.041558   20.174848   16.443389
0          0  relationship            3  ...   75.088280  337.101681  193.299936

[2 rows x 8 columns]
Label map :
{3: 'relationship', 7: 'table', 25: 'simply'}