fiftyone_convert

Functions

annotations_to_fiftyone

Convert an annotations frame into a DataFrame with the same index, holding a Detection or Keypoint object in the "fo_detection" column along with additional fiftyone columns.

create_fo_dataset

Generic function to create a fiftyone dataset from images and annotations DataFrames.

make_fiftyone_compatible

Make column names compatible with fiftyone.

annotations_to_fiftyone(annotations_frame: DataFrame, attribute_columns: Sequence[str] = (), bbox_column: str = 'bbox', allow_keypoints: bool = True) → DataFrame

Convert an annotations frame into a DataFrame with the same index, holding a Detection or Keypoint object in the “fo_detection” column along with additional fiftyone columns.

Parameters:
  • annotations_frame – DataFrame containing annotation information, most likely taken from a Dataset object.

  • attribute_columns – Columns describing real attributes. These attributes are distinct from functional metadata such as image_id. Defaults to ().

  • bbox_column – Column containing the bounding box coordinate lists. They must already be converted to fiftyone’s xywh format. Defaults to “bbox”.

  • allow_keypoints – If set to True, keypoints are deduced from bounding boxes of size 0. Otherwise, every bounding box is treated as a detection. Defaults to True.

Returns:

DataFrame with the same index as annotations_frame, containing fo_detection, fo_id and is_keypoint columns.

  • fo_detection is the fiftyone object describing the detection, to be added to the related image sample. It can be either a Detection or a Keypoint object.

  • fo_id is the UUID given by fiftyone to identify the detection in the database.

  • is_keypoint is a boolean value indicating whether the fiftyone object in the fo_detection column is a Keypoint rather than a Detection object. This makes filtering much faster than looking up the type of each row.
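As a rough illustration of the allow_keypoints rule described above, the sketch below flags xywh boxes whose width and height are both zero. It is a plain-Python approximation, not the library's implementation: the helper names is_keypoint_box and flag_keypoints are hypothetical, and reading "size 0" as width == 0 and height == 0 is an assumption.

```python
from typing import List, Sequence


def is_keypoint_box(bbox: Sequence[float]) -> bool:
    """Return True when an [x, y, width, height] box has zero width and height.

    Assumption: "size 0" means both dimensions are exactly zero.
    """
    _x, _y, width, height = bbox
    return width == 0 and height == 0


def flag_keypoints(
    bboxes: Sequence[Sequence[float]], allow_keypoints: bool = True
) -> List[bool]:
    """Hypothetical helper mimicking the is_keypoint column.

    When allow_keypoints is False, every box counts as a detection.
    """
    return [allow_keypoints and is_keypoint_box(bbox) for bbox in bboxes]


# One regular box and one zero-size box, both in xywh order.
boxes = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.5, 0.0, 0.0]]
flags = flag_keypoints(boxes)
flags_detections_only = flag_keypoints(boxes, allow_keypoints=False)
```

Precomputing such a boolean list once is what lets later code filter keypoints without inspecting the type of each object.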

create_fo_dataset(name: str, images_root: Path, images: DataFrame, annotations: dict[str, DataFrame], bounding_box_formats: dict[str, str] | None = None, label_map: dict[int, str] | None = None, image_tag_columns: Sequence[str] = (), annotations_attributes_columns: Sequence[str] | dict[str, Sequence[str]] = (), allow_keypoints: bool = False, existing: Literal['error', 'update', 'erase'] = 'error') → tuple[Dataset, Series, dict[str, DataFrame]]

Generic function to create a fiftyone dataset from images and annotations DataFrames. See dataset_to_fiftyone() and evaluator_to_fiftyone() for more specific functions.

Parameters:
  • name – Name of the fiftyone dataset to add the samples to. If the dataset does not exist, it will be created.

  • images_root – Root folder for images. Fiftyone will try to load the images by concatenating this path and the value in the relative_path column of images.

  • images – DataFrame comprising image information. Must have at least the relative_path column and the columns specified in image_tag_columns.

  • annotations – Dictionary of DataFrames comprising detection annotation information. Each entry must have at least the image_id, category_id and category_str (if the label_map argument is None) columns, plus the columns required for bbox conversion in the corresponding input format (see convert_bbox()).

  • bounding_box_formats – Dictionary of format strings used to convert the bounding boxes of the given annotations. For each entry in the annotations dictionary, if its key is not included in this dictionary, the format “XYWH” (COCO cAIpy) will be assumed. If set to None, the format “XYWH” will always be assumed. Defaults to None.

  • label_map – dictionary comprising the category id -> category string correspondence, similar to Dataset and Evaluator label maps. If given, will populate the category_str of each annotation DataFrame. If set to None, will assume the column is already present. Defaults to None.

  • image_tag_columns – List of column names to use for sample attributes in given images DataFrame when creating fiftyone samples. Defaults to ().

  • annotations_attributes_columns – Either a list of column names or a dictionary of lists of column names. Used to give attributes to the detections added to the created dataset. If it is a dictionary, each annotation set in the annotations dictionary gets its own list of columns to use as attributes. If not, the same list of columns is used for all annotations. Defaults to ().

  • allow_keypoints – If set to True, bounding boxes of size (0, 0) are converted to keypoints. Defaults to False.

  • existing – What to do if a fiftyone dataset with the same name already exists.

    • “error”: raise an error.

    • “erase”: erase the existing dataset before uploading this one.

    • “update”: try to update the dataset by merging samples with the same “relative_path”.

    Defaults to “error”.

Returns:

tuple with three elements
  • Fiftyone dataset that can then be used to launch the web app with fiftyone.launch_app().

  • Series with the same index as the images input DataFrame, containing the fiftyone index of each image’s corresponding sample. Useful when the image needs to be modified.

  • Dictionary of DataFrames, with the same keys as the annotations input dictionary. Each value is a DataFrame with the same index as its corresponding value in annotations. Its columns are fo_id and is_keypoint.

    • fo_id is the fiftyone index of each annotation (whether it is a Keypoint or a Detection)

    • is_keypoint is a bool column indicating if the annotation is a Keypoint object or simply a Detection object.
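The per-annotation-set defaults described for bounding_box_formats and annotations_attributes_columns can be sketched in plain Python as follows. This is only an approximation under stated assumptions, not the library's code: resolve_bbox_formats and resolve_attribute_columns are hypothetical helpers, "XYXY" is a placeholder format string (the accepted values are those of convert_bbox()), and returning an empty list for a key missing from the attributes dictionary is a guess.

```python
from typing import Dict, List, Mapping, Optional, Sequence, Union


def resolve_bbox_formats(
    annotation_keys: Sequence[str],
    bounding_box_formats: Optional[Mapping[str, str]] = None,
    default: str = "XYWH",
) -> Dict[str, str]:
    """Pick one format per annotation set, falling back to "XYWH"."""
    formats = bounding_box_formats or {}
    return {key: formats.get(key, default) for key in annotation_keys}


def resolve_attribute_columns(
    annotation_keys: Sequence[str],
    columns: Union[Sequence[str], Mapping[str, Sequence[str]]] = (),
) -> Dict[str, List[str]]:
    """Give each annotation set its own list of attribute columns."""
    if isinstance(columns, Mapping):
        # Assumption: a set absent from the dictionary gets no attributes.
        return {key: list(columns.get(key, ())) for key in annotation_keys}
    # A plain sequence is shared by every annotation set.
    return {key: list(columns) for key in annotation_keys}


keys = ["ground_truth", "predictions"]
formats = resolve_bbox_formats(keys, {"predictions": "XYXY"})
shared = resolve_attribute_columns(keys, ["occluded"])
per_set = resolve_attribute_columns(keys, {"ground_truth": ["occluded"]})
```

Here "ground_truth" falls back to "XYWH" because it has no entry in the formats dictionary, while the shared list of attribute columns is copied for each annotation set.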

make_fiftyone_compatible(input_df: DataFrame, column_names: Sequence[str] = (), replacement_string: str = '->') → tuple[DataFrame, list[str]]

Make column names compatible with fiftyone.

Fiftyone is incompatible with names containing a “.”, so each “.” is replaced with a suitable string (see replacement_string).

Fiftyone is also incompatible with names starting with ‘attributes’, which is the case for libia.Dataset.annotations attributes columns, so we replace the string ‘attributes’ with ‘attr’ in each column name.

Note

If no name in column_names has a forbidden character, this function simply returns its inputs.

Parameters:
  • input_df – DataFrame for which column names will be replaced

  • column_names – Column names to rename. If the names are not present in the input, no error will be raised. Defaults to ().

  • replacement_string – String used to replace forbidden characters. Defaults to “->”.

Returns:

tuple with 2 elements
  • DataFrame with modified column names from input

  • List of modified names.
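The renaming rules above can be approximated on plain strings, without a DataFrame. The sketch below is illustrative only: compatible_names is a hypothetical helper, and applying the ‘attributes’ → ‘attr’ substitution before the “.” replacement is an assumption about ordering that does not affect the result for these inputs.

```python
from typing import Dict, Sequence


def compatible_names(
    column_names: Sequence[str], replacement_string: str = "->"
) -> Dict[str, str]:
    """Map each name that needs changing to its fiftyone-friendly form.

    Names without a forbidden pattern are left out of the mapping,
    mirroring the note that clean inputs are returned unchanged.
    """
    renamed = {}
    for name in column_names:
        # Shorten 'attributes' to 'attr', then replace every '.'.
        new_name = name.replace("attributes", "attr")
        new_name = new_name.replace(".", replacement_string)
        if new_name != name:
            renamed[name] = new_name
    return renamed


mapping = compatible_names(["attributes.color", "image_id"])
underscored = compatible_names(["attributes.color"], replacement_string="_")
```

A mapping of this shape could then be passed to pandas' DataFrame.rename(columns=...) to produce the renamed frame.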