optical.converter.pascal.Pascal

class optical.converter.pascal.Pascal(root: Union[str, os.PathLike])

Bases: optical.converter.base.FormatSpec

Represents a Pascal annotation object.

Parameters

root (Union[str, os.PathLike]) –

path to root directory. Expects the root directory to have either of the following layouts:

    root
    ├── images
    │   ├── train
    │   │   ├── 1.jpg
    │   │   ├── 2.jpg
    │   │   │   ...
    │   │   └── n.jpg
    │   ├── valid (...)
    │   └── test (...)
    │
    └── annotations
        ├── train
        │   ├── 1.xml
        │   ├── 2.xml
        │   │   ...
        │   └── n.xml
        ├── valid (...)
        └── test (...)

or,

    root
    ├── images
    │   ├── 1.jpg
    │   ├── 2.jpg
    │   │   ...
    │   └── n.jpg
    │
    └── annotations
        ├── 1.xml
        ├── 2.xml
        │   ...
        └── n.xml
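
To make the two layouts concrete, here is a minimal sketch of how one might check which layout a directory follows before passing it to Pascal. The helper name detect_layout is hypothetical and is not part of optical; it only inspects the images directory with the standard library and assumes that any subdirectory under images (train, valid, test) indicates the first layout.

```python
import os
from pathlib import Path
from typing import Union


def detect_layout(root: Union[str, os.PathLike]) -> str:
    """Return "split" or "flat" for a root directory (hypothetical helper)."""
    root = Path(root)
    images = root / "images"
    annotations = root / "annotations"
    if not images.is_dir() or not annotations.is_dir():
        raise FileNotFoundError(
            "root must contain 'images' and 'annotations' directories"
        )
    # Subdirectories such as train/valid/test under images indicate
    # the first (per-split) layout; otherwise images sit directly inside.
    has_subdirs = any(p.is_dir() for p in images.iterdir())
    return "split" if has_subdirs else "flat"
```

A call such as detect_layout("path/to/root") returns "split" for the first layout and "flat" for the second.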

__init__(root: Union[str, os.PathLike])

Methods

__init__(root)

bbox_scatter([split, category, limit])

Plots a scatter of the widths and heights of the bounding boxes.

bbox_stats([split, category])

Computes descriptive statistics for the bounding boxes, e.g., mean and standard deviation.

convert(to[, output_dir, save_under, ...])

describe()

Shows the basic data distribution across the splits.

save(output_dir[, export_to, copy_images])

An alternative API for convert.

show_distribution()

Plots the distribution of labels across the splits of the dataset.

split([test_size, stratified, random_state])

Splits the dataset into train and validation sets.

Attributes

format

splits

bbox_scatter(split: Optional[str] = None, category: Optional[str] = None, limit: int = 1000) → altair.vegalite.v4.api.Chart

Plots a scatter of the widths and heights of the bounding boxes.

Parameters
  • split (Optional[str]) – split of the dataset e.g., train, valid etc. Defaults to None.

  • category (Optional[str]) – category to filter out. Defaults to None.

  • limit (int, optional) – number of samples to plot. Defaults to 1000.
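
The plotted quantities are the width and height of each box. As a rough illustration, and not optical's own implementation, the sketch below derives them from a single Pascal VOC XML annotation using the standard library. It assumes the usual object/bndbox fields (xmin, ymin, xmax, ymax), and it assumes that passing a category keeps only boxes with that label. The function name bbox_sizes is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple


def bbox_sizes(
    xml_text: str, category: Optional[str] = None, limit: int = 1000
) -> List[Tuple[float, float]]:
    """Return up to `limit` (width, height) pairs from one VOC annotation."""
    root = ET.fromstring(xml_text)
    sizes = []
    for obj in root.iter("object"):
        # Assumption for this sketch: a category keeps only matching boxes.
        if category is not None and obj.findtext("name") != category:
            continue
        box = obj.find("bndbox")
        if box is None:
            continue  # skip objects without a bounding box
        xmin, ymin = float(box.findtext("xmin")), float(box.findtext("ymin"))
        xmax, ymax = float(box.findtext("xmax")), float(box.findtext("ymax"))
        sizes.append((xmax - xmin, ymax - ymin))
    return sizes[:limit]
```

Each returned pair corresponds to one point in a width-versus-height scatter.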

bbox_stats(split: Optional[str] = None, category: Optional[str] = None) → pandas.core.frame.DataFrame

Computes descriptive statistics for the bounding boxes, e.g., mean and standard deviation.

Parameters
  • split (Optional[str]) – split of the dataset e.g., train, valid etc. Defaults to None.

  • category (Optional[str]) – category to filter out. Defaults to None.

Returns

stats of the bounding boxes

Return type

pd.DataFrame
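
To illustrate the kind of summary meant by "descriptive stats", the following sketch computes mean, standard deviation, minimum, and maximum over a list of (width, height) pairs with the standard library. It is only an approximation of the idea: the actual columns of the returned DataFrame are not specified here, and the name describe_sizes is hypothetical.

```python
import statistics
from typing import Dict, List, Sequence, Tuple


def describe_sizes(
    sizes: Sequence[Tuple[float, float]]
) -> Dict[str, Dict[str, float]]:
    """Summarize box widths and heights (requires at least one box)."""

    def summary(values: List[float]) -> Dict[str, float]:
        return {
            "mean": statistics.mean(values),
            # Sample standard deviation needs at least two values.
            "std": statistics.stdev(values) if len(values) > 1 else 0.0,
            "min": min(values),
            "max": max(values),
        }

    widths = [w for w, _ in sizes]
    heights = [h for _, h in sizes]
    return {"width": summary(widths), "height": summary(heights)}
```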

describe() → pandas.core.frame.DataFrame

Shows the basic data distribution across the splits.

save(output_dir: Optional[Union[str, os.PathLike]], export_to: Optional[str] = None, copy_images: bool = True)

An alternative API for convert, similar to export.

show_distribution() → altair.vegalite.v4.api.Chart

Plots the distribution of labels across the splits of the dataset.

split(test_size: float = 0.2, stratified: bool = False, random_state: int = 42)

Splits the dataset into train and validation sets.

Parameters
  • test_size (float, optional) – Fraction of total images to be kept for validation. Defaults to 0.2.

  • stratified (bool, optional) – Whether to stratify the split. Defaults to False.

  • random_state (int, optional) – random state for the split. Defaults to 42.

Returns

An instance of the FormatSpec class.

Return type

FormatSpec
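
The roles of test_size and random_state can be sketched with a plain, non-stratified shuffle split. This is an illustrative stand-in and not the library's implementation: the helper train_valid_split is hypothetical, it works on a list of image names, and it returns two lists instead of a FormatSpec.

```python
import random
from typing import List, Sequence, Tuple


def train_valid_split(
    image_ids: Sequence[str], test_size: float = 0.2, random_state: int = 42
) -> Tuple[List[str], List[str]]:
    """Shuffle reproducibly, then hold out a fraction for validation."""
    ids = list(image_ids)
    # A seeded generator makes the shuffle repeatable for a given random_state.
    random.Random(random_state).shuffle(ids)
    n_valid = int(round(len(ids) * test_size))
    return ids[n_valid:], ids[:n_valid]
```

With ten images and the default test_size of 0.2, two images go to validation and eight stay in train; calling the helper again with the same random_state gives the same partition.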