from_template

Dataset.from_template(reset_booleanized: bool = False, **kwargs) → Self

Create a new Dataset object from an existing Dataset.

Optionally, give new values for images_root, images, annotations or label_map by providing additional keyword arguments, which are passed to Dataset’s __init__ function.

Note

  • Although the returned Dataset is a new object, its dataframes are NOT cloned: dataframes that are not replaced are shared with the existing dataset.

  • Booleanized columns are carried over from the existing dataset to the new one.
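
The "new object, shared dataframes" behaviour in the note above can be illustrated with a minimal sketch. MiniDataset is a hypothetical stand-in written for this illustration only; it is not the lours implementation, and it only mimics the idea of rebuilding an object from an existing one while overriding some fields.

```python
from dataclasses import dataclass, replace

import pandas as pd


@dataclass
class MiniDataset:
    """Hypothetical stand-in for Dataset, for illustration only."""

    images: pd.DataFrame
    annotations: pd.DataFrame

    def from_template(self, **kwargs) -> "MiniDataset":
        # Build a new object; fields that are not overridden keep
        # pointing at the very same dataframe objects (no copy).
        return replace(self, **kwargs)


images = pd.DataFrame({"width": [342, 377]}, index=[0, 1])
annotations = pd.DataFrame({"image_id": [0, 0]}, index=[0, 1])
original = MiniDataset(images, annotations)

new_annotations = pd.DataFrame({"image_id": [0, 1]}, index=[2, 3])
derived = original.from_template(annotations=new_annotations)

# The container is new, but the untouched dataframe is shared.
shares_images = derived.images is original.images

# An in-place edit made through one object is visible through the other.
derived.images.loc[0, "width"] = 1
leaked_width = int(original.images.loc[0, "width"])

# Pass an explicit copy when the two objects must be independent.
independent = original.from_template(images=original.images.copy())
```

If the same sharing applies to the real Dataset, as the note suggests, passing an explicit copy (for example images=dataset.images.copy()) is one way to avoid unintended in-place side effects.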

Parameters:
  • reset_booleanized – If True, reset the booleanized columns of the dataframes that were changed (and only of those). Otherwise, the self.booleanized_columns dictionary of sets is only updated so that columns that are no longer present are removed. Defaults to False.

  • **kwargs – keyword arguments that override the existing dataset’s data; they are passed to the Dataset constructor.
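
The rule described for reset_booleanized can be sketched in plain Python. update_booleanized is a hypothetical helper, not part of lours, and the dictionary keys ("images", "annotations") and column names used here are assumptions chosen for the illustration.

```python
def update_booleanized(booleanized, columns, changed, reset_booleanized=False):
    """Hypothetical helper mirroring the rule described above.

    booleanized: dict mapping a dataframe name to its set of booleanized columns
    columns: dict mapping a dataframe name to the columns of the new dataframe
    changed: names of the dataframes that were replaced through kwargs
    """
    updated = {}
    for name, cols in booleanized.items():
        if reset_booleanized and name in changed:
            # Replaced dataframe: forget its booleanized columns entirely.
            updated[name] = set()
        else:
            # Otherwise, only drop the columns that no longer exist.
            updated[name] = cols & set(columns[name])
    return updated


# Assumed state: "occluded" disappears from the replaced annotations dataframe.
booleanized = {"images": {"tags"}, "annotations": {"attributes", "occluded"}}
columns = {
    "images": ["width", "height", "tags"],
    "annotations": ["image_id", "category_id", "attributes"],
}
changed = {"annotations"}

kept = update_booleanized(booleanized, columns, changed)
reset = update_booleanized(booleanized, columns, changed, reset_booleanized=True)
```

With the default, only the vanished column is dropped from the "annotations" entry; with reset_booleanized=True, the whole entry of the changed dataframe is emptied while "images" is left untouched.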

Returns:

Resulting dataset, constructed from the existing dataset’s data and any overriding values given as keyword arguments.

Example

>>> import pandas as pd
>>> from lours.dataset import Dataset
>>> from lours.utils.doc_utils import dummy_dataset
>>> example = dummy_dataset(2, 2, seed=0)
>>> example
Dataset object containing 2 images and 2 objects
Name :
    inside_else_memory
Images root :
    such/serious
Images :
    width  height      relative_path   type  split
id
0     342     136       help/me.jpeg  .jpeg  train
1     377     167  whatever/wait.png   .png  train
Annotations :
    image_id category_str  category_id  ...  box_y_min   box_width  box_height
id                                      ...
0          0         step           15  ...  73.932999   71.552480   42.673983
1          0          why           19  ...   4.567638  248.551257  122.602211

[2 rows x 8 columns]
Label map :
{15: 'step', 19: 'why', 25: 'interview'}
>>> annotations = pd.DataFrame(
...     data={
...         "image_id": [0, 1],
...         "category_id": [12, 21],
...         "box_x_min": [10, 20],
...         "box_y_min": [30, 40],
...         "box_width": [100, 200],
...         "box_height": [200, 300],
...     },
...     index=[2, 3],
... )
>>> Dataset.from_template(example, annotations=annotations)
Dataset object containing 2 images and 2 objects
Name :
    inside_else_memory
Images root :
    such/serious
Images :
    width  height      relative_path   type  split
id
0     342     136       help/me.jpeg  .jpeg  train
1     377     167  whatever/wait.png   .png  train
Annotations :
    image_id category_str  category_id  ... box_y_min  box_width  box_height
id                                      ...
2          0           12           12  ...      30.0      100.0       200.0
3          1           21           21  ...      40.0      200.0       300.0

[2 rows x 8 columns]
Label map :
{12: '12', 15: 'step', 19: 'why', 21: '21', 25: 'interview'}