from_caipy_generic
- from_caipy_generic(images_folder: Path | str | None, annotations_folder: Path | str, dataset_name: str | None = None, split: str | None = None, splits_to_read: str | Iterable[str] | None = None, use_schema: bool = False, json_schema: dict | str | Path | None = 'default', booleanize: bool = True) -> Dataset
Load a dataset stored in the cAIpy format, with the images folder and the annotations folder specified separately rather than as a single folder containing Images and Annotations sub-folders. This gives much more flexibility, especially when working with predictions and annotation variations.
See specifications
This will raise an error if:
- two annotations have the same category_id but not the same category_str
- two annotations have a different category_id but the same category_str
- two images have the same file_name, but not the same id
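The three consistency rules above can be sketched in plain Python. This is only an illustration of the rules as described, not the library's implementation: the function name check_consistency and the list-of-dicts input layout are assumptions made for the example.

```python
def check_consistency(annotations, images):
    """Raise ValueError on any of the three inconsistencies described above.

    Hypothetical helper: `annotations` and `images` are assumed to be
    lists of dicts carrying the fields named in the description.
    """
    id_to_str = {}
    str_to_id = {}
    for ann in annotations:
        cat_id, cat_str = ann["category_id"], ann["category_str"]
        # Rule 1: a given category_id must always map to the same category_str.
        if id_to_str.setdefault(cat_id, cat_str) != cat_str:
            raise ValueError(f"category_id {cat_id} has several category_str values")
        # Rule 2: a given category_str must always map to the same category_id.
        if str_to_id.setdefault(cat_str, cat_id) != cat_id:
            raise ValueError(f"category_str {cat_str!r} has several category_id values")

    name_to_id = {}
    for img in images:
        # Rule 3: a given file_name must always map to the same image id.
        if name_to_id.setdefault(img["file_name"], img["id"]) != img["id"]:
            raise ValueError(f"file_name {img['file_name']!r} has several ids")


# Consistent data passes silently.
check_consistency(
    [{"category_id": 1, "category_str": "car"},
     {"category_id": 2, "category_str": "person"}],
    [{"id": 0, "file_name": "a.jpg"}],
)
```

Using one dictionary per direction keeps each rule a single lookup, so the whole check is linear in the number of annotations and images.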
- Parameters:
images_folder – folder root of images.
annotations_folder – folder root of annotations.
dataset_name – If specified, will be the dataset name, used when showing the dataset or exporting in other formats such as fiftyone.
split – if data is at the root of the Images and Annotations folders, the split will be set to this option. Defaults to None.
splits_to_read – if given, will only read the specified splits. Useful for faster loading.
use_schema – If set to True, and json_schema is not None, will use the schema for validation and formatting (see option json_schema).
json_schema – schema dictionary, or path to a schema, that JSON files will be tested against for compliance. If it is not a dictionary, it can be either a URL or a local path. If set to None, or if use_schema is set to False, no test is performed. Defaults to the default schema.
booleanize – If some attributes are arrays of enums with unique elements, they will be booleanized (see booleanize()). Note that this option is only used if json_schema is not None and use_schema is set to True. Defaults to True.
- Raises:
ValueError – Inconsistency between two annotations or images (see description above)
- Returns:
Loaded dataset object
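The interaction between use_schema, json_schema and booleanize can be summarized with a small sketch. The helper resolve_schema_options is hypothetical and exists only to restate the rules from the parameter descriptions; its defaults mirror the signature above, and it does not reflect how the library is implemented internally.

```python
def resolve_schema_options(use_schema=False, json_schema="default", booleanize=True):
    """Return which schema-related steps would run for a given set of options.

    Hypothetical helper restating the documented rules:
    - validation runs only if use_schema is True and json_schema is not None
    - booleanization runs only if validation runs and booleanize is True
    """
    validate = bool(use_schema) and json_schema is not None
    return {"validate": validate, "booleanize": validate and bool(booleanize)}


# With the default arguments, use_schema is False, so nothing is validated
# and booleanize has no effect even though it defaults to True.
print(resolve_schema_options())
# {'validate': False, 'booleanize': False}

# Enabling use_schema with the default schema turns both steps on.
print(resolve_schema_options(use_schema=True))
# {'validate': True, 'booleanize': True}

# Passing json_schema=None disables validation regardless of use_schema.
print(resolve_schema_options(use_schema=True, json_schema=None))
# {'validate': False, 'booleanize': False}
```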