optical.converter.createml.CreateML
-
class optical.converter.createml.CreateML(root: Union[str, os.PathLike])
Bases: optical.converter.base.FormatSpec
Class to handle CreateML JSON annotation transformations.
- Parameters
root (Union[str, os.PathLike]) – path to the root directory. Expects the root directory to have either of the following layouts:

root
├── images
│   ├── train
│   │   ├── 1.jpg
│   │   ├── 2.jpg
│   │   │   ...
│   │   └── n.jpg
│   ├── valid (...)
│   └── test (...)
│
└── annotations
    ├── train.json
    ├── valid.json
    └── test.json

or,

root
├── images
│   ├── 1.jpg
│   ├── 2.jpg
│   │   ...
│   └── n.jpg
│
└── annotations
    └── label.json
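To make the second, single-split layout concrete, the following minimal sketch builds it in a temporary directory using only the Python standard library. The helper name make_flat_layout is hypothetical, and the content written to label.json assumes the Create ML object-detection schema (a JSON list with one entry per image, each holding "image" and "annotations" keys); neither is taken from optical itself.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def make_flat_layout(root: Path, image_names: List[str]) -> Path:
    """Create root/images/* and root/annotations/label.json (hypothetical helper)."""
    images_dir = root / "images"
    annotations_dir = root / "annotations"
    images_dir.mkdir(parents=True, exist_ok=True)
    annotations_dir.mkdir(parents=True, exist_ok=True)

    for name in image_names:
        # Empty placeholder files; a real dataset holds actual image data here.
        (images_dir / name).touch()

    # Assumed Create ML-style schema: one entry per image, no boxes yet.
    entries = [{"image": name, "annotations": []} for name in image_names]
    label_file = annotations_dir / "label.json"
    label_file.write_text(json.dumps(entries, indent=2))
    return label_file


with tempfile.TemporaryDirectory() as tmp:
    make_flat_layout(Path(tmp), ["1.jpg", "2.jpg"])
    # Collect every created path relative to the root, for inspection.
    created = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*"))
```

A directory built this way is the kind of path that would be passed as root; the multi-split layout differs only in the train, valid and test subfolders under images and the per-split JSON files under annotations.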
-
__init__(root: Union[str, os.PathLike])
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__(root) – Initialize self.
bbox_scatter([split, category, limit]) – plots a scatter of the width and height of bounding boxes
bbox_stats([split, category]) – computes descriptive bbox stats, e.g., mean, std, etc.
convert(to[, output_dir, save_under, …])
describe() – shows the basic data distribution across the different splits
save(output_dir[, export_to, copy_images]) – Just another API for convert.
show_distribution() – Plots the distribution of labels in different splits of the dataset
split([test_size, stratified, random_state]) – splits the dataset into train and validation sets
Attributes
format
splits
-
bbox_scatter(split: Optional[str] = None, category: Optional[str] = None, limit: int = 1000) → altair.vegalite.v4.api.Chart
plots a scatter of the width and height of bounding boxes
- Parameters
split (Optional[str]) – split of the dataset, e.g., train, valid, etc. Defaults to None.
category (Optional[str]) – category to filter out. Defaults to None.
limit (int, optional) – number of samples to plot. Defaults to 1000.
-
bbox_stats(split: Optional[str] = None, category: Optional[str] = None) → pandas.core.frame.DataFrame
computes descriptive bbox stats, e.g., mean, std, etc.
- Parameters
split (Optional[str]) – split of the dataset, e.g., train, valid, etc. Defaults to None.
category (Optional[str]) – category to filter out. Defaults to None.
- Returns
stats of the bounding boxes
- Return type
pd.DataFrame
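As a rough illustration of the kind of summary bbox_stats describes, the sketch below computes count, mean, standard deviation, min and max of box widths and heights with the standard library only. It is not the library's implementation: the function name bbox_size_stats and the sample records are hypothetical, the record shape assumes Create ML-style annotations ("label" plus "coordinates" with "width" and "height"), and category is read here as "keep only this category", which is an assumption about the parameter's meaning.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional

# Hypothetical sample data in an assumed Create ML-style shape.
records = [
    {"image": "1.jpg", "annotations": [
        {"label": "cat", "coordinates": {"x": 50, "y": 40, "width": 20, "height": 10}},
        {"label": "dog", "coordinates": {"x": 30, "y": 30, "width": 40, "height": 30}},
    ]},
    {"image": "2.jpg", "annotations": [
        {"label": "cat", "coordinates": {"x": 10, "y": 10, "width": 40, "height": 30}},
    ]},
]


def bbox_size_stats(records: List[dict], category: Optional[str] = None) -> Dict[str, dict]:
    """Summarize box widths and heights, optionally for a single category."""
    sizes: Dict[str, list] = {"width": [], "height": []}
    for record in records:
        for ann in record["annotations"]:
            # Assumption: a given category restricts the stats to that label.
            if category is not None and ann["label"] != category:
                continue
            for key in sizes:
                sizes[key].append(ann["coordinates"][key])

    stats: Dict[str, dict] = {}
    for key, values in sizes.items():
        if not values:
            # Nothing matched; report an empty summary instead of failing.
            stats[key] = {"count": 0}
            continue
        stats[key] = {
            "count": len(values),
            "mean": mean(values),
            "std": pstdev(values),  # population standard deviation
            "min": min(values),
            "max": max(values),
        }
    return stats


cat_stats = bbox_size_stats(records, category="cat")
all_stats = bbox_size_stats(records)
```

The documented method returns a pandas DataFrame; a plain dictionary is used here only to keep the sketch dependency-free.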
-
describe() → pandas.core.frame.DataFrame
shows the basic data distribution across the different splits
-
save(output_dir: Optional[Union[str, os.PathLike]], export_to: Optional[str] = None, copy_images: bool = True)
Just another API for convert. Similar to export.
-
show_distribution() → altair.vegalite.v4.api.Chart
Plots the distribution of labels in different splits of the dataset
-
split(test_size: float = 0.2, stratified: bool = False, random_state: int = 42)
splits the dataset into train and validation sets
- Parameters
test_size (float, optional) – Fraction of total images to be kept for validation. Defaults to 0.2.
stratified (bool, optional) – Whether to stratify the split. Defaults to False.
random_state (int, optional) – random state for the split. Defaults to 42.
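The interplay of these three parameters can be sketched with the standard library alone. This is an illustration of a seeded, optionally stratified hold-out split, not the code optical runs: the function split_items and the sample data are hypothetical, and the sketch assumes exactly one label per image, which real detection datasets often violate.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def split_items(
    items: Sequence[str],
    labels: Sequence[str],
    test_size: float = 0.2,
    stratified: bool = False,
    random_state: int = 42,
) -> Tuple[List[str], List[str]]:
    """Return (train, valid) lists; hypothetical helper for illustration."""
    rng = random.Random(random_state)  # fixed seed makes the split repeatable

    groups: Dict[object, List[str]] = defaultdict(list)
    for item, label in zip(items, labels):
        # Without stratification, everything falls into one shared group.
        groups[label if stratified else None].append(item)

    train: List[str] = []
    valid: List[str] = []
    for group in groups.values():
        shuffled = list(group)
        rng.shuffle(shuffled)
        # Hold out the requested fraction of each group for validation.
        n_valid = round(len(shuffled) * test_size)
        valid.extend(shuffled[:n_valid])
        train.extend(shuffled[n_valid:])
    return train, valid


images = [f"{i}.jpg" for i in range(1, 11)]
labels = ["cat"] * 5 + ["dog"] * 5
train, valid = split_items(images, labels, test_size=0.2, stratified=True)
```

With stratification, the held-out fraction is taken from each label group separately, so the validation set keeps roughly the same label proportions as the full dataset.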
- Returns
an instance of the FormatSpec class
- Return type