optical.converter.tfrecord.Tfrecord
- class optical.converter.tfrecord.Tfrecord(root: Union[str, os.PathLike])
Bases: optical.converter.base.FormatSpec
Represents a tfrecord annotation object.
- Parameters
root (Union[str, os.PathLike]) – path to root directory. Expects the root directory to have either of the following layouts:
root
├── train.tfrecord
├── test.tfrecord
├── valid.tfrecord
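As a rough illustration of the layout above, the following sketch checks which split files sit directly under a root directory. The helper `find_tfrecord_splits` and the constant `EXPECTED_SPLITS` are hypothetical names introduced here; they are not part of `optical`, and whether the library requires all three files is not stated in this page.

```python
import tempfile
from pathlib import Path

# Split names taken from the layout above; the helper itself is hypothetical.
EXPECTED_SPLITS = ("train", "test", "valid")


def find_tfrecord_splits(root):
    """Map each split name to its .tfrecord file, if present directly under root."""
    root = Path(root)
    found = {}
    for name in EXPECTED_SPLITS:
        candidate = root / f"{name}.tfrecord"
        if candidate.is_file():
            found[name] = candidate
    return found


# Demo on a throwaway directory holding two empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("train", "valid"):
        (Path(tmp) / f"{name}.tfrecord").touch()
    demo_splits = sorted(find_tfrecord_splits(tmp))
```

Running a check like this before constructing the object can make a missing or misnamed split file easier to spot.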
Methods
__init__(root)
bbox_scatter([split, category, limit]) – plots scatter of width and height of bounding boxes
bbox_stats([split, category]) – computes bbox descriptive stats e.g., mean, std etc.
convert(to[, output_dir, save_under, ...])
describe() – shows basic data distribution in different split
save(output_dir[, export_to, copy_images]) – Just another api for convert.
show_distribution() – Plots distribution of labels in different splits of the dataset
split([test_size, stratified, random_state]) – splits the dataset into train and validation sets
Attributes
format
splits
- bbox_scatter(split: Optional[str] = None, category: Optional[str] = None, limit: int = 1000) → altair.vegalite.v4.api.Chart
plots scatter of width and height of bounding boxes
- Parameters
split (Optional[str]) – split of the dataset e.g., train, valid etc. Defaults to None.
category (Optional[str]) – category to filter out. Defaults to None.
limit (int, optional) – number of samples to plot. Defaults to 1000.
- bbox_stats(split: Optional[str] = None, category: Optional[str] = None) → pandas.core.frame.DataFrame
computes bbox descriptive stats e.g., mean, std etc.
- Parameters
split (Optional[str]) – split of the dataset e.g., train, valid etc. Defaults to None.
category (Optional[str]) – category to filter out. Defaults to None.
- Returns
stats of the bounding boxes
- Return type
pd.DataFrame
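To make "descriptive stats" concrete, here is a minimal standard-library sketch of the kind of summary such a method might report. It is not the library's implementation: the `boxes` data and the `bbox_summary` helper are hypothetical, and the sketch assumes `category` restricts the statistics to that label, which this page does not spell out.

```python
import statistics

# Hypothetical boxes as (category, width, height); not read from a real tfrecord.
boxes = [
    ("cat", 10.0, 20.0),
    ("cat", 30.0, 40.0),
    ("dog", 50.0, 60.0),
]


def bbox_summary(boxes, category=None):
    """Return count/mean/std/min/max of widths and heights, optionally for one category."""
    rows = [b for b in boxes if category is None or b[0] == category]
    if not rows:
        return {}
    summary = {}
    for label, idx in (("width", 1), ("height", 2)):
        values = [row[idx] for row in rows]
        summary[label] = {
            "count": len(values),
            "mean": statistics.fmean(values),
            # Sample standard deviation needs at least two values.
            "std": statistics.stdev(values) if len(values) > 1 else 0.0,
            "min": min(values),
            "max": max(values),
        }
    return summary


cat_stats = bbox_summary(boxes, category="cat")
```

The actual method returns a pandas DataFrame; a nested dict is used here only to keep the sketch dependency-free.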
- describe() → pandas.core.frame.DataFrame
shows basic data distribution in different splits
- save(output_dir: Optional[Union[str, os.PathLike]], export_to: Optional[str] = None, copy_images: bool = True)
Just another API for convert. Similar to export.
- show_distribution() → altair.vegalite.v4.api.Chart
Plots distribution of labels in different splits of the dataset
- split(test_size: float = 0.2, stratified: bool = False, random_state: int = 42)
splits the dataset into train and validation sets
- Parameters
test_size (float, optional) – Fraction of total images to be kept for validation. Defaults to 0.2.
stratified (bool, optional) – Whether to stratify the split. Defaults to False.
random_state (int, optional) – random state for the split. Defaults to 42.
- Returns
Returns an instance of FormatSpec class
- Return type
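
The three parameters of split can be illustrated with a small standard-library sketch of a seeded, optionally stratified train/validation split. This is an assumption-laden illustration, not the library's code: `split_indices` is a hypothetical helper, and it assumes one label per image, whereas a detection dataset may carry several labels per image.

```python
import random
from collections import defaultdict


def split_indices(labels, test_size=0.2, stratified=False, random_state=42):
    """Return (train, valid) index lists for the given per-image labels."""
    rng = random.Random(random_state)  # seeded, so repeated calls agree
    if stratified:
        # One bucket per label, so each label keeps roughly the same fraction.
        groups = defaultdict(list)
        for idx, label in enumerate(labels):
            groups[label].append(idx)
        buckets = list(groups.values())
    else:
        buckets = [list(range(len(labels)))]

    train, valid = [], []
    for bucket in buckets:
        rng.shuffle(bucket)
        n_valid = round(len(bucket) * test_size)
        valid.extend(bucket[:n_valid])
        train.extend(bucket[n_valid:])
    return sorted(train), sorted(valid)


# Hypothetical labels: ten "cat" images followed by five "dog" images.
labels = ["cat"] * 10 + ["dog"] * 5
train_idx, valid_idx = split_indices(labels, test_size=0.2, stratified=True)
```

With stratification, the validation set here takes two "cat" images and one "dog" image, and the fixed `random_state` makes the partition reproducible across runs.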