optical.converter.tfrecord.Tfrecord

class optical.converter.tfrecord.Tfrecord(root: Union[str, os.PathLike])

Represents a dataset whose annotations are stored in TFRecord format.

Parameters

root (Union[str, os.PathLike]) – Path to the root directory. The root directory is expected to have the following layout:

root
├──train.tfrecord
├──test.tfrecord
└──valid.tfrecord
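
To make the expected layout concrete, the following is a minimal sketch that checks which of the split files are present under a root directory. The helper `find_tfrecord_splits` and the constant `EXPECTED_SPLITS` are hypothetical names introduced for illustration; they are not part of optical, and the class may discover its splits differently.

```python
import os
from pathlib import Path
from typing import Dict, Union

# Split names taken from the layout shown above.
# This helper is a hypothetical illustration, not part of optical.
EXPECTED_SPLITS = ("train", "test", "valid")


def find_tfrecord_splits(root: Union[str, os.PathLike]) -> Dict[str, Path]:
    """Map each split name to its .tfrecord file, skipping absent splits."""
    root = Path(root)
    found = {}
    for name in EXPECTED_SPLITS:
        candidate = root / f"{name}.tfrecord"
        if candidate.is_file():
            found[name] = candidate
    return found
```

Running such a check before constructing `Tfrecord(root)` can give a clearer error message when a split file is missing or misnamed.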

Methods

`__init__`(root)
`bbox_scatter`([split, category, limit])	Plots a scatter chart of bounding-box widths and heights.
`bbox_stats`([split, category])	Computes descriptive bounding-box statistics, e.g. mean and standard deviation.
`convert`(to[, output_dir, save_under, ...])
`describe`()	Shows the basic data distribution across the splits.
`save`(output_dir[, export_to, copy_images])	An alternative interface to convert.
`show_distribution`()	Plots the distribution of labels across the splits of the dataset.
`split`([test_size, stratified, random_state])	Splits the dataset into train and validation sets.

Attributes

`format`
`splits`

bbox_scatter(split: Optional[str] = None, category: Optional[str] = None, limit: int = 1000) → altair.vegalite.v4.api.Chart

Plots a scatter chart of bounding-box widths and heights.

Parameters

split (Optional[str]) – Split of the dataset, e.g. train or valid. Defaults to None.
category (Optional[str]) – Category to filter by. Defaults to None.
limit (int, optional) – Maximum number of samples to plot. Defaults to 1000.

bbox_stats(split: Optional[str] = None, category: Optional[str] = None) → pandas.core.frame.DataFrame

Computes descriptive bounding-box statistics, e.g. mean and standard deviation.

Parameters

split (Optional[str]) – Split of the dataset, e.g. train or valid. Defaults to None.
category (Optional[str]) – Category to filter by. Defaults to None.

Returns

Descriptive statistics of the bounding boxes.

Return type

pd.DataFrame
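
As an illustration of the kind of summary `bbox_stats` describes, here is a minimal standard-library sketch. It is not the library's implementation: the `(category, width, height)` record layout and the `bbox_summary` helper are assumptions made for this example, and it assumes that passing `category` restricts the statistics to that category.

```python
import statistics
from typing import Dict, List, Optional, Tuple

# Hypothetical record layout: (category, width, height).
# This is not the library's internal representation.
Box = Tuple[str, float, float]


def bbox_summary(
    boxes: List[Box], category: Optional[str] = None
) -> Dict[str, Dict[str, float]]:
    """Summarise box widths and heights, optionally for a single category."""
    if category is not None:
        # Assumption: a given category keeps only boxes of that category.
        boxes = [box for box in boxes if box[0] == category]
    if not boxes:
        return {}
    summary = {}
    for name, index in (("width", 1), ("height", 2)):
        values = [box[index] for box in boxes]
        summary[name] = {
            "count": len(values),
            "mean": statistics.fmean(values),
            # The sample standard deviation needs at least two values.
            "std": statistics.stdev(values) if len(values) > 1 else 0.0,
            "min": min(values),
            "max": max(values),
        }
    return summary
```

The real method returns a pandas DataFrame; the nested dictionary here only mirrors the idea of one row of statistics per measured quantity.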

describe() → pandas.core.frame.DataFrame

Shows the basic data distribution across the splits.

save(output_dir: Optional[Union[str, os.PathLike]], export_to: Optional[str] = None, copy_images: bool = True)

An alternative interface to convert, similar to export.

show_distribution() → altair.vegalite.v4.api.Chart

Plots the distribution of labels across the splits of the dataset.

split(test_size: float = 0.2, stratified: bool = False, random_state: int = 42)

Splits the dataset into train and validation sets.

Parameters

test_size (float, optional) – Fraction of total images to be kept for validation. Defaults to 0.2.
stratified (bool, optional) – Whether to stratify the split. Defaults to False.
random_state (int, optional) – Random seed for the split, used for reproducibility. Defaults to 42.

Returns

An instance of the FormatSpec class.

Return type

FormatSpec
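
The roles of `test_size`, `stratified` and `random_state` can be sketched with the standard library alone. The function below is a simplified illustration, not the library's implementation: `train_valid_split` is a hypothetical name, it assumes exactly one label per image (real detection datasets often have several), and it returns index lists rather than a FormatSpec.

```python
import random
from collections import defaultdict
from typing import List, Sequence, Tuple


def train_valid_split(
    labels: Sequence[str],
    test_size: float = 0.2,
    stratified: bool = False,
    random_state: int = 42,
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, valid_indices) for items described by labels."""
    rng = random.Random(random_state)  # a fixed seed makes the split repeatable
    if stratified:
        # One bucket per label, so each label keeps roughly the same
        # validation fraction.
        groups = defaultdict(list)
        for index, label in enumerate(labels):
            groups[label].append(index)
        buckets = list(groups.values())
    else:
        buckets = [list(range(len(labels)))]

    train, valid = [], []
    for bucket in buckets:
        rng.shuffle(bucket)
        n_valid = round(len(bucket) * test_size)
        valid.extend(bucket[:n_valid])
        train.extend(bucket[n_valid:])
    return sorted(train), sorted(valid)
```

With `stratified=True`, a label that makes up half of the images also makes up about half of the validation set, which a purely random split does not guarantee for small or imbalanced datasets.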