Dataset and Loading Functions

Module: cutoop.data_loader.

class Dataset

Bases: object

This class can be used as a base class for dataloader construction.

method glob_prefix(root: str): list[str]

Recursively find the list of data prefixes by searching for RGB images.

Because the search uses the glob package, you may use the asterisk symbol * in root for pattern matching. However, the ** pattern is already used internally, so using ** in root yourself is discouraged: it can produce duplicated prefixes in the output.

Parameters: root – the root directory for searching.

>>> cutoop.data_loader.Dataset.glob_prefix("../../misc")
['../../misc/sample/0000_', '../../misc/sample/0001_', '../../misc/sample/0002_', '../../misc/sample_real/000000_']
>>> cutoop.data_loader.Dataset.glob_prefix("../../*/sample")
['../../misc/sample/0000_', '../../misc/sample/0001_', '../../misc/sample/0002_']
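
The prefix search can be pictured with the standard glob module. The sketch below is not the cutoop implementation: it assumes, hypothetically, that each RGB image is stored as `<prefix>color.png`, and `find_prefixes` is an illustrative name rather than part of the library.

```python
import glob
import os


def find_prefixes(root: str, suffix: str = "color.png") -> list[str]:
    """Return sorted, de-duplicated prefixes of files ending in `suffix` under `root`.

    Assumption: RGB images are named `<prefix>color.png`; the naming
    scheme actually used by cutoop may differ.
    """
    # With recursive=True, `**` matches zero or more nested directories.
    # `root` itself may contain `*` wildcards.
    pattern = os.path.join(root, "**", "*" + suffix)
    paths = glob.glob(pattern, recursive=True)
    # Strip the suffix to obtain the prefix; the set drops repeated matches.
    return sorted({path[: -len(suffix)] for path in paths})
```

Collecting the results in a set is one way to stay robust against a root pattern that matches the same file more than once.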

method load_color(path: str): ndarray

Load the RGB image, with channels in RGB order.

method load_coord(path: str): ndarray

Read the NOCS image (PNG). This function does the following:

  1. Read the image, normalize it, and convert it to RGB order.
  2. For historical reasons, transform the B channel to 1 - B after reading the image, a transformation also common to NOCS images in prior datasets.
  3. Subtract 0.5 to shift the range of pixel values from [0, 1] to [-0.5, 0.5].

Returns: float array of shape (Height, Width, 3) with values in [-0.5, 0.5], denoting xyz coordinates in NOCS space.
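
The three steps above can be sketched with NumPy alone. This is a minimal illustration, not the library's code: it assumes the PNG has already been read into a uint8 array in BGR order (as an OpenCV-style reader would return it), and `decode_nocs` is a hypothetical name.

```python
import numpy as np


def decode_nocs(bgr: np.ndarray) -> np.ndarray:
    """Convert a uint8 BGR(A) image into NOCS coordinates in [-0.5, 0.5].

    Assumption: `bgr` has shape (H, W, C) with C >= 3 and 8-bit channels.
    """
    # Step 1: keep three channels, normalize to [0, 1], reorder BGR -> RGB.
    img = bgr[..., :3].astype(np.float32) / 255.0
    img = img[..., ::-1].copy()
    # Step 2: flip the B channel (index 2 in RGB order) to 1 - B.
    img[..., 2] = 1.0 - img[..., 2]
    # Step 3: shift the range from [0, 1] to [-0.5, 0.5].
    return img - 0.5
```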

method load_depth(path: str): ndarray

Read the depth image, selecting the first channel if multiple channels are detected.

Returns: a 2D float array of shape (Height, Width). For Omni6DPose, pixel values are in meters.
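
The channel selection described above amounts to a small array operation. The sketch below assumes the depth image is already in memory as a NumPy array; `first_channel` is a hypothetical helper, not a cutoop function.

```python
import numpy as np


def first_channel(depth: np.ndarray) -> np.ndarray:
    """Reduce a depth image to a 2D float array of shape (H, W)."""
    # A 3D array has a channel axis: keep channel 0 only.
    if depth.ndim == 3:
        depth = depth[:, :, 0]
    return depth.astype(np.float32)
```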

method load_ir(path: str): ndarray

Load the rendered IR image.

method load_mask(path: str): ndarray

Load the mask image.

Returns: uint8 array of shape (Height, Width), whose values are related to the objects’ mask ids (image_meta.ObjectPoseInfo.mask_id).
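
As a hedged illustration of how such a mask might be consumed, the sketch below extracts the pixels of one object. It assumes that a pixel belonging to an object stores exactly that object's mask id, which the description above only loosely implies; `object_mask` is a hypothetical helper.

```python
import numpy as np


def object_mask(mask: np.ndarray, mask_id: int) -> np.ndarray:
    """Return a boolean (H, W) array marking the pixels labelled `mask_id`.

    Assumption: pixel value == mask id of the object covering that pixel.
    """
    return mask == mask_id
```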

method load_mask_sam(path: str): tuple[ndarray, ndarray]

Load the mask generated by SAM.

Returns: (masks, mask_ids), where masks is a bool array of shape (n_objects, Height, Width) holding one binary mask per object, and mask_ids is an array of shape (n_objects,) giving the mask id that corresponds to each mask.
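
Because the two returned arrays are aligned along their first axis, they can be paired row by row. The sketch below builds a lookup from mask id to binary mask; `masks_by_id` is a hypothetical helper and the inputs are assumed to follow the shapes described above.

```python
import numpy as np


def masks_by_id(masks: np.ndarray, mask_ids: np.ndarray) -> dict[int, np.ndarray]:
    """Map each mask id to its (H, W) boolean mask.

    Row i of `masks` is paired with element i of `mask_ids`.
    """
    return {int(mask_id): m for mask_id, m in zip(mask_ids, masks)}
```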

method load_meta(path: str): ImageMetaData

method load_normal(path: str): ndarray

Read the normal image (EXR).

Returns: float array of shape (H, W, 3) with values in [-1, 1], containing normal vectors.
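
A normal map in [-1, 1] cannot be displayed directly as an image. One conventional way to inspect it, shown here as a sketch that is not part of cutoop (`normals_to_rgb` is a hypothetical name), is to map each component linearly to an 8-bit color channel.

```python
import numpy as np


def normals_to_rgb(normal: np.ndarray) -> np.ndarray:
    """Map normal components from [-1, 1] to uint8 colors in [0, 255]."""
    # Shift and scale [-1, 1] to [0, 1].
    unit = (normal + 1.0) * 0.5
    # Scale to [0, 255], round to nearest by adding 0.5, and clip before casting.
    return np.clip(unit * 255.0 + 0.5, 0, 255).astype(np.uint8)
```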

method visualize(prefix: str, out_path = './visual.png', show_id = True, show_box = True, show_nocs_pred = False, show_axes_len = None)

A convenient helper for data visualization.

>>> cutoop.data_loader.Dataset.visualize(
...     prefix="../../misc/sample_real/000000_",
...     out_path="source/_static/gr_5.png",
... )