from_caipy
- from_caipy(dataset_path: Path | str, dataset_name: str | None = None, split: str | None = None, splits_to_read: str | Iterable[str] | None = None, use_schema: bool = False, json_schema: dict | str | Path | None = 'default', booleanize: bool = True) → Dataset
Load a dataset stored in the cAIpy format.
See the cAIpy format specifications.
This will raise an error if:
- two annotations have the same category_id but not the same category_str
- two annotations have a different category_id but the same category_str
- two images have the same file_name but not the same id
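The three conditions above can be illustrated with a small standalone check. This is a minimal sketch, not the library's implementation: the function name check_caipy_consistency and the plain-dict inputs are assumptions made for illustration only.

```python
def check_caipy_consistency(annotations, images):
    """Raise ValueError on the three inconsistencies described above.

    Hypothetical helper: `annotations` is an iterable of dicts with
    `category_id` and `category_str`, and `images` is an iterable of dicts
    with `file_name` and `id`. The real loader may store these differently.
    """
    id_to_str = {}
    str_to_id = {}
    for ann in annotations:
        cid, cstr = ann["category_id"], ann["category_str"]
        # Same category_id must always map to the same category_str.
        if id_to_str.setdefault(cid, cstr) != cstr:
            raise ValueError(
                f"category_id {cid!r} maps to both {id_to_str[cid]!r} and {cstr!r}"
            )
        # Same category_str must always map to the same category_id.
        if str_to_id.setdefault(cstr, cid) != cid:
            raise ValueError(
                f"category_str {cstr!r} maps to both {str_to_id[cstr]!r} and {cid!r}"
            )

    name_to_id = {}
    for img in images:
        # Same file_name must always map to the same image id.
        if name_to_id.setdefault(img["file_name"], img["id"]) != img["id"]:
            raise ValueError(
                f"file_name {img['file_name']!r} is used by more than one image id"
            )
```

Repeating an identical (category_id, category_str) pair or an identical (file_name, id) pair is accepted; only conflicting pairs raise.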
- Parameters:
dataset_path – folder root of dataset. Should contain the folders “Images” and “Annotations”.
dataset_name – If specified, will be the dataset name, used when showing the dataset or exporting in other formats such as fiftyone. If not specified, the dataset name will be the name of the root folder.
split – if data is at the root of the Images and Annotations folders, the split will be set to this option. Defaults to None.
splits_to_read – if given, only the specified splits will be read. Useful for faster loading.
use_schema – If set to True and json_schema is not None, the schema will be used for validation and formatting (see option json_schema).
json_schema – schema dictionary, or path to a schema, that JSON files will be tested against for compliance. If it is not a dictionary, it can be either a URL or a local path. If set to None, or if use_schema is set to False, no test is performed. Defaults to the default schema.
booleanize – If some attributes are arrays of enums with unique elements, they will be booleanized (see booleanize()). Note that this option is only used if json_schema is not None and use_schema is set to True. Defaults to True.
- Raises:
ValueError – Inconsistency between two annotations or images (see description above)
- Returns:
Loaded dataset object
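As a usage illustration, the sketch below checks the folder layout described for dataset_path before delegating to the loader. The wrapper name load_caipy_checked is hypothetical, and the loader is passed in as an argument so that the sketch does not assume any particular import path for from_caipy.

```python
from pathlib import Path


def load_caipy_checked(loader, dataset_path, **kwargs):
    """Verify the expected folder layout, then call a from_caipy-style loader.

    Hypothetical wrapper: `loader` is expected to accept the dataset root as
    its first argument and the keyword parameters documented above.
    """
    root = Path(dataset_path)
    # The dataset root should contain both "Images" and "Annotations".
    missing = [name for name in ("Images", "Annotations") if not (root / name).is_dir()]
    if missing:
        raise FileNotFoundError(f"{root} is missing required folder(s): {missing}")
    return loader(root, **kwargs)
```

With from_caipy imported from wherever it lives in your installation, a call could look like load_caipy_checked(from_caipy, "path/to/dataset", splits_to_read=["train"], use_schema=False), which reads only the train split and skips schema validation.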