coco

Functions

dataset_to_coco

Save dataset to COCO format.

from_coco

Load a COCO json file into a Dataset object.

from_coco_keypoints

Special COCO loading function for crowds. It assumes a point box format (either XY or xy) and a single category with an id of 0, whose name is given by category_name (or read from the COCO file when category_name is None).

dataset_to_coco(dataset: Dataset, output_path: Path | str, copy_images: bool = False, to_jpg: bool = True, overwrite_images: bool = False, overwrite_labels: bool = False, add_split_suffix: bool | None = None, box_format: str = 'XYWH', version: str = '0', contributor: str = 'XXII') -> None

Save dataset to COCO format. Note that by default, no image or image path is modified.

Parameters:
  • dataset – Dataset object to save

  • output_path – output folder where to save the json file, and optionally the images. Can also be the name of the output json file when there is only one split value.

  • copy_images – If True, will copy the images linked by annotations into a “data” folder, similar to fiftyone. Defaults to False.

  • to_jpg – If True and copy_images is True, will convert images to jpg if needed. Defaults to True.

  • overwrite_images – If False and copy_images is True, will skip images that are already copied. Defaults to False.

  • overwrite_labels – If False, will skip annotations that are already created. Defaults to False.

  • add_split_suffix – If True, will add the split name to the name of the json file. Cannot be False if the dataset has multiple splits. If not set, will add the suffix only if the dataset has multiple splits.

  • box_format – box format of the annotations written to the json file. Boxes are converted from XYWH. Defaults to XYWH.

  • version – Arbitrary version number for dataset metadata. Defaults to “0”.

  • contributor – Arbitrary contributor info for dataset metadata. Defaults to “XXII”.
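To make the target layout concrete, here is a minimal sketch of a COCO object-detection document with one XYWH box. The file name, image size, and category are made up for illustration, and the exact set of fields that dataset_to_coco writes is an assumption. Only the general COCO layout (info, images, categories, annotations) comes from the COCO format itself.

```python
import json

# Hypothetical minimal COCO object-detection document.
# The values below are illustrative; the fields actually written by
# dataset_to_coco may differ.
coco_document = {
    # Dataset metadata, matching the default "version" and "contributor" arguments
    "info": {"version": "0", "contributor": "XXII"},
    "images": [
        {"id": 1, "file_name": "data/0001.jpg", "width": 640, "height": 480},
    ],
    "categories": [{"id": 0, "name": "person"}],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 0,
            # XYWH: x and y of the top-left corner, then width and height, in pixels
            "bbox": [100.0, 120.0, 50.0, 80.0],
            "area": 50.0 * 80.0,
            "iscrowd": 0,
        },
    ],
}

# Serialize the document the way a json file on disk would store it
serialized = json.dumps(coco_document, indent=2)
```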

from_coco(coco_json: Path | str, images_root: Path | str | None = None, dataset_name: str | None = None, split: str | None = None, label_map: dict[int, str] | None = None, box_format: str = 'XYWH', drop_columns: Iterable[str] = ('iscrowd', 'segmentation')) -> Dataset

Load a COCO json file into a Dataset object. Note that there is only one split per file, and the caller needs to provide it. See the COCO specifications (object detection only).

Notes

  • from_coco is compatible with bounding box annotations that have no category_id field, but the label map must then contain exactly one entry, which will be assigned to every bounding box.

  • If the split value is not given, from_coco will try to deduce it from the file name. More specifically, it will search for a <name>_<split>.json pattern and assign name to the dataset name and split to the split value.
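The file name pattern described above can be sketched as follows. The helper deduce_name_and_split is hypothetical and not part of the library. It also assumes that the last underscore separates the name from the split, which may differ from the rule from_coco actually applies.

```python
import re
from pathlib import Path


def deduce_name_and_split(coco_json):
    """Hypothetical helper: split '<name>_<split>.json' into (name, split).

    Assumes the last underscore in the file stem separates name and split.
    Returns (stem, None) when the stem contains no such separator.
    """
    stem = Path(coco_json).stem
    match = re.fullmatch(r"(?P<name>.+)_(?P<split>[^_]+)", stem)
    if match is None:
        return stem, None
    return match.group("name"), match.group("split")


name, split = deduce_name_and_split("annotations/crowd_train.json")
```

Passing split explicitly avoids relying on the file name at all, which is the safer choice when file names do not follow this pattern.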

Parameters:
  • coco_json – path of json file

  • images_root – folder that the file_name of each image is relative to

  • dataset_name – If specified, will be the dataset name, used when showing the dataset or exporting in other formats such as fiftyone. If not specified, the dataset name will be deduced from the name of the json file.

  • split – split of given json file. If not set, will try to deduce from filename. Defaults to None.

  • label_map – Optional dictionary to specify the name of each category id. If not set, will try to deduce it from the json itself, in the field categories at its root.

  • box_format – box format of the annotations in the json file. Boxes are converted back to XYWH. Defaults to XYWH.

  • drop_columns – names of the columns to drop from the parsed json dictionary. Defaults to ('iscrowd', 'segmentation').

Returns:

Loaded dataset object
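As an illustration of the label map behaviour described in the notes and in the label_map parameter, here is a minimal sketch in plain Python. Both helpers are hypothetical and only mimic the documented rules: the label map is read from the root categories field, and annotations without category_id fall back to the single label map entry. The library's own implementation may differ.

```python
def build_label_map(coco_document):
    """Hypothetical helper: map category id to category name from 'categories'."""
    return {
        category["id"]: category["name"]
        for category in coco_document.get("categories", [])
    }


def resolve_category_ids(annotations, label_map):
    """Hypothetical helper: pick a category id for every annotation.

    Annotations without 'category_id' receive the only key of the label map,
    which is only well defined when the label map has exactly one entry.
    """
    resolved = []
    for annotation in annotations:
        if "category_id" in annotation:
            resolved.append(annotation["category_id"])
        elif len(label_map) == 1:
            resolved.append(next(iter(label_map)))
        else:
            raise ValueError(
                "annotation has no category_id and the label map is not a single entry"
            )
    return resolved


document = {
    "categories": [{"id": 3, "name": "car"}],
    "annotations": [
        {"id": 1, "bbox": [0, 0, 10, 10]},
        {"id": 2, "bbox": [5, 5, 20, 20], "category_id": 3},
    ],
}
label_map = build_label_map(document)
category_ids = resolve_category_ids(document["annotations"], label_map)
```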

from_coco_keypoints(coco_json: Path | str, images_root: Path | str | None = None, dataset_name: str | None = None, split: str | None = None, box_format: str = 'XY', category_name: str | None = 'head')

Special COCO loading function for crowds. It assumes a point box format (either XY or xy) and a single category with an id of 0, whose name is given by category_name (or read from the COCO file when category_name is None).

Parameters:
  • coco_json – path of json file

  • images_root – folder that the file_name of each image is relative to

  • dataset_name – If specified, will be the dataset name, used when showing the dataset or exporting in other formats such as fiftyone. If not specified, the dataset name will be deduced from the name of the json file.

  • split – split of given json file. If not set, will try to deduce from filename. Defaults to None.

  • box_format – box format of the annotations in the json file. Boxes are converted back to XYWH, with box width and height set to 0. Defaults to XY.

  • category_name – name of the only category of this COCO json file. from_coco_keypoints then calls from_coco with the label map set to {0: category_name}. If set to None, the name is deduced from the COCO file. Defaults to “head”.

Returns:

Loaded dataset object
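The point-to-box conversion described for box_format can be sketched as below. The helper point_to_xywh is hypothetical, and the sample coordinates are made up. It only illustrates the documented rule that a point becomes an XYWH box with zero width and height, alongside the single-entry label map {0: category_name}.

```python
def point_to_xywh(point):
    """Hypothetical helper: turn an (x, y) point into a zero-sized XYWH box."""
    x, y = point
    return [float(x), float(y), 0.0, 0.0]


# Single-category label map, as built from the category_name argument
category_name = "head"
label_map = {0: category_name}

# Illustrative head positions in pixel coordinates
head_points = [(12, 30), (45.5, 18)]
boxes = [point_to_xywh(point) for point in head_points]
```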