match_index
- Dataset.match_index(other_images: DataFrame | Dataset, on: str = 'relative_path', remove_unmatched: bool = False) -> Self
Reindex a dataset from another images DataFrame.
The column given by the on argument is used to retrieve the index values from the reference images dataframe.
Note
If some rows have a value in the on column that does not match any row in other_images, the DataFrame’s index will be reset to a range index without sorting it.
- Parameters:
other_images – images DataFrame taken from another dataset. Must have the column specified in on.
on – name of the column to use to retrieve indexes. Must be present in the columns of both self.images and other_images. Defaults to “relative_path”.
remove_unmatched – if set to True, will remove images from dataset that don’t match any row in the other_images dataframe. The corresponding annotations will also be removed.
- Returns:
Dataset with updated image indexes, along with values in the image_id column of annotations.
Example
>>> from lours.utils.doc_utils import dummy_dataset
>>> example = dummy_dataset(5, 5, seed=2)
>>> example
Dataset object containing 5 images and 5 objects
Name : argue_be_structure
Images root : what/way
Images :
    width  height            relative_path   type  split
id
0     368     401        police/enter.jpeg  .jpeg  train
1     472     640          also/policy.gif   .gif    val
2     832     831  cold/responsibility.png   .png  train
3     506     755        increase/pull.jpg   .jpg  train
4     182     993            Mr/trade.tiff  .tiff  train
Annotations :
    image_id  category_str  category_id  ...   box_y_min   box_width  box_height
id                                       ...
0          0        simply           25  ...  273.908994  168.756932    4.288302
1          4         table            7  ...  106.456857   19.340529  282.426602
2          0        simply           25  ...   41.921967   38.506811   33.166314
3          2         table            7  ...  167.785089  242.139038  119.708224
4          1        simply           25  ...  327.082223  234.360304  238.965568
[5 rows x 8 columns]
Label map :
{3: 'relationship', 7: 'table', 25: 'simply'}
>>> images_modified = example.images.iloc[::2].reset_index(drop=True)
>>> images_modified
   width  height            relative_path   type  split
0    368     401        police/enter.jpeg  .jpeg  train
1    832     831  cold/responsibility.png   .png  train
2    182     993            Mr/trade.tiff  .tiff  train
>>> example.match_index(images_modified)
Dataset object containing 5 images and 5 objects
Name : argue_be_structure
Images root : what/way
Images :
    width  height            relative_path   type  split
id
0     368     401        police/enter.jpeg  .jpeg  train
1     832     831  cold/responsibility.png   .png  train
2     182     993            Mr/trade.tiff  .tiff  train
3     472     640          also/policy.gif   .gif    val
4     506     755        increase/pull.jpg   .jpg  train
Annotations :
    image_id  category_str  category_id  ...   box_y_min   box_width  box_height
id                                       ...
0          0        simply           25  ...  273.908994  168.756932    4.288302
1          2         table            7  ...  106.456857   19.340529  282.426602
2          0        simply           25  ...   41.921967   38.506811   33.166314
3          1         table            7  ...  167.785089  242.139038  119.708224
4          3        simply           25  ...  327.082223  234.360304  238.965568
[5 rows x 8 columns]
Label map :
{3: 'relationship', 7: 'table', 25: 'simply'}
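The remapping in the example above can be approximated with plain pandas. This is only an illustrative sketch, not the lours implementation: match_index_sketch is a hypothetical helper, it assumes integer image ids and unique values in the on column, and it assumes unmatched images receive fresh ids following the largest reference id (which is what the example output shows). It uses the relative_path and image_id values from the example and omits the other columns.

```python
import numpy as np
import pandas as pd


def match_index_sketch(images, annotations, other_images,
                       on="relative_path", remove_unmatched=False):
    """Hypothetical pandas-only helper mimicking the documented behaviour."""
    # For each image, look up the index of the reference row with the same `on` value.
    lookup = pd.Series(other_images.index.to_numpy(), index=other_images[on].to_numpy())
    new_ids = images[on].map(lookup).to_numpy(dtype=float)  # NaN where unmatched
    matched = ~np.isnan(new_ids)

    if remove_unmatched:
        # Drop unmatched images and the annotations pointing to them.
        images = images[matched]
        new_ids = new_ids[matched]
        annotations = annotations[annotations["image_id"].isin(images.index)]
    else:
        # Assumption: unmatched images get fresh ids after the largest reference id.
        start = int(other_images.index.max()) + 1
        new_ids[~matched] = np.arange(start, start + (~matched).sum())

    # Old image id -> new image id, used for both images and annotations.
    old_to_new = pd.Series(new_ids.astype(int), index=images.index)
    images = images.copy()
    images.index = pd.Index(old_to_new.to_numpy(), name=images.index.name)
    images = images.sort_index()  # sorted here only to mirror the displayed order
    annotations = annotations.assign(image_id=annotations["image_id"].map(old_to_new))
    return images, annotations


images = pd.DataFrame(
    {"relative_path": ["police/enter.jpeg", "also/policy.gif",
                       "cold/responsibility.png", "increase/pull.jpg",
                       "Mr/trade.tiff"]},
    index=pd.Index(range(5), name="id"),
)
annotations = pd.DataFrame(
    {"image_id": [0, 4, 0, 2, 1]}, index=pd.Index(range(5), name="id")
)
reference = images.iloc[::2].reset_index(drop=True)

matched_images, matched_annotations = match_index_sketch(images, annotations, reference)
print(matched_annotations["image_id"].tolist())  # [0, 2, 0, 1, 3]
```

Calling the helper with remove_unmatched=True on the same data keeps only the three images present in reference and drops the annotation attached to also/policy.gif.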