VolumeCollection
A collection of volumes sharing the same modality or acquisition type (e.g., all T1-weighted scans). Provides pandas-like indexing with iloc and loc accessors.
radiobject.VolumeCollection
TileDB-backed volume collection indexed by obs_id. Supports uniform or heterogeneous shapes.
iloc
cached
property
Integer-location based indexing for selecting volumes by position.
index
property
Volume index for bidirectional ID/position lookups.
is_uniform
property
Whether all volumes in this collection have the same shape.
is_view
property
True if this VolumeCollection is a filtered view of another.
loc
cached
property
Label-based indexing for selecting volumes by obs_id.
name
property
Collection name (if set during creation).
obs
property
Observational metadata per volume.
obs_ids
property
All obs_id values in index order (respects view filter).
obs_subject_ids
property
Get obs_subject_id values for this collection (respects view filter).
shape
property
Volume dimensions (X, Y, Z) if uniform, None if heterogeneous.
subjects
cached
property
Subject-level index (obs_subject_id) for this collection.
uri
property
URI of the underlying storage (raises if view without storage).
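Because shape is None for heterogeneous collections, code that unpacks it may want to check is_uniform first. The following is a minimal sketch that relies only on the is_uniform and shape properties listed above; `shape_summary` and the `SimpleNamespace` stand-ins are hypothetical and exist only so the snippet runs without any storage behind it.

```python
from types import SimpleNamespace


def shape_summary(collection) -> str:
    """Describe a collection's shape without assuming it is uniform."""
    if collection.is_uniform:
        x, y, z = collection.shape  # (X, Y, Z) when all volumes match
        return f"uniform {x}x{y}x{z}"
    return "heterogeneous (shape is None)"


# Stand-ins for real collections (hypothetical values).
uniform = SimpleNamespace(is_uniform=True, shape=(240, 240, 155))
mixed = SimpleNamespace(is_uniform=False, shape=None)

print(shape_summary(uniform))
print(shape_summary(mixed))
```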
__getitem__(key)
Index by int, str, slice, or list. Slices/lists return views.
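The dispatch on key type can be pictured with a small stand-in. This is a toy model over a plain list of obs_id strings, not RadiObject's implementation: `ToyCollection` is hypothetical, it returns strings where the real class returns volumes, and its handling of lists that mix positions and labels is an assumption.

```python
from __future__ import annotations


class ToyCollection:
    """Toy stand-in: int/str keys return one item, slices/lists return a view."""

    def __init__(self, obs_ids: list[str]) -> None:
        self.obs_ids = list(obs_ids)

    def __len__(self) -> int:
        return len(self.obs_ids)

    def __getitem__(self, key):
        if isinstance(key, int):      # position -> single item
            return self.obs_ids[key]
        if isinstance(key, str):      # label -> single item
            if key not in self.obs_ids:
                raise KeyError(key)
            return key
        if isinstance(key, slice):    # slice -> new view
            return ToyCollection(self.obs_ids[key])
        if isinstance(key, list):     # list of positions or labels -> new view
            return ToyCollection([self[k] for k in key])
        raise TypeError(f"unsupported key type: {type(key).__name__}")


toy = ToyCollection(["sub-101_T1w", "sub-102_T1w", "sub-103_T1w"])
first = toy[0]                        # single item by position
by_label = toy["sub-102_T1w"]         # single item by label
tail_view = toy[1:]                   # view of the last two items
picked = toy[[0, "sub-103_T1w"]]      # view built from a list of keys
```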
__iter__()
Iterate over volumes in index order (respects view filter).
__len__()
Number of volumes in collection (respects view filter).
__repr__()
Concise representation of the VolumeCollection.
append(niftis=None, dicom_dirs=None, reorient=None, progress=False)
Append new volumes atomically.
Volume data and obs metadata are written together to maintain consistency.
Cannot be called on views; call write() first to materialize the view as a stored collection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `niftis` | `Sequence[tuple[str \| Path, str]] \| None` | List of (nifti_path, obs_subject_id) tuples. | `None` |
| `dicom_dirs` | `Sequence[tuple[str \| Path, str]] \| None` | List of (dicom_dir, obs_subject_id) tuples. | `None` |
| `reorient` | `bool \| None` | Reorient to canonical orientation (None uses config default). | `None` |
| `progress` | `bool` | Show tqdm progress bar during volume writes. | `False` |
Example
Append new NIfTI files:

    radi.T1w.append(
        niftis=[
            ("sub101_T1w.nii.gz", "sub-101"),
            ("sub102_T1w.nii.gz", "sub-102"),
        ],
    )
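When many files are involved, the `niftis` argument can be assembled from a directory listing. The sketch below is one way to do that with the standard library; `collect_niftis` is a hypothetical helper, and the file-naming convention it parses (`sub<digits>_...` mapped to `sub-<digits>`) is an assumption taken from the example above rather than a rule of the library.

```python
from __future__ import annotations

import re
from pathlib import Path

# Assumed naming convention: "sub<digits>_<anything>.nii" or ".nii.gz".
_SUBJECT_RE = re.compile(r"^sub-?(\d+)_.*\.nii(\.gz)?$")


def collect_niftis(directory: str | Path) -> list[tuple[Path, str]]:
    """Return sorted (nifti_path, obs_subject_id) tuples for matching files."""
    pairs = []
    for path in sorted(Path(directory).iterdir()):
        match = _SUBJECT_RE.match(path.name)
        if match:  # files that do not follow the convention are skipped
            pairs.append((path, f"sub-{match.group(1)}"))
    return pairs


# Usage with an existing collection (the directory name is hypothetical):
# radi.T1w.append(niftis=collect_niftis("incoming"))
```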
copy()
Create detached copy of this collection (views remain views).
filter(expr)
Filter volumes using TileDB QueryCondition on obs. Returns view.
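TileDB query conditions are commonly written as Python-like boolean expressions over attribute names. Assuming `filter` accepts such an expression as a string, a condition can be composed programmatically as sketched below. `build_condition` is a hypothetical helper, and `age` and `site` are hypothetical obs columns.

```python
def build_condition(clauses):
    """Join (column, operator, value) clauses with 'and'."""
    allowed = {"==", "!=", "<", "<=", ">", ">="}
    parts = []
    for column, op, value in clauses:
        if op not in allowed:
            raise ValueError(f"unsupported operator: {op!r}")
        # repr() quotes strings and leaves numbers bare.
        parts.append(f"{column} {op} {value!r}")
    if not parts:
        raise ValueError("at least one clause is required")
    return " and ".join(parts)


expr = build_condition([("age", ">=", 40), ("site", "==", "A")])
print(expr)

# Usage with an existing collection (assumed to accept a string expression):
# view = collection.filter(expr)
```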
from_dicoms(uri, dicom_dirs, obs=None, reorient=None, validate_dimensions=True, valid_subject_ids=None, name=None, ctx=None, progress=False)
classmethod
Create VolumeCollection from DICOM series with full metadata capture.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `uri` | `str` | Target URI for the VolumeCollection. | required |
| `dicom_dirs` | `Sequence[tuple[str \| Path, str]]` | List of (dicom_dir, obs_subject_id) tuples. | required |
| `obs` | `DataFrame \| None` | Per-volume metadata with custom obs_id values. Positionally aligned with input files. Requires obs_id and obs_subject_id columns. Imaging metadata is always extracted; raises ValueError on column collisions. | `None` |
| `reorient` | `bool \| None` | Reorient to canonical orientation (None uses config default). | `None` |
| `validate_dimensions` | `bool` | Raise if dimensions are inconsistent. | `True` |
| `valid_subject_ids` | `set[str] \| None` | Optional whitelist for FK validation. | `None` |
| `name` | `str \| None` | Collection name (stored in metadata). | `None` |
| `ctx` | `Ctx \| None` | TileDB context. | `None` |
| `progress` | `bool` | Show tqdm progress bar during volume writes. | `False` |
Returns:
| Type | Description |
|---|---|
| `VolumeCollection` | VolumeCollection with obs containing DICOM metadata. |
from_niftis(uri, niftis, obs=None, reorient=None, validate_dimensions=True, valid_subject_ids=None, name=None, ctx=None, progress=False)
classmethod
Create VolumeCollection from NIfTI files with full metadata capture.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `uri` | `str` | Target URI for the VolumeCollection. | required |
| `niftis` | `Sequence[tuple[str \| Path, str]]` | List of (nifti_path, obs_subject_id) tuples. | required |
| `obs` | `DataFrame \| None` | Per-volume metadata with custom obs_id values. Positionally aligned with input files. Requires obs_id and obs_subject_id columns. Imaging metadata is always extracted; raises ValueError on column collisions. | `None` |
| `reorient` | `bool \| None` | Reorient to canonical orientation (None uses config default). | `None` |
| `validate_dimensions` | `bool` | Raise if dimensions are inconsistent. | `True` |
| `valid_subject_ids` | `set[str] \| None` | Optional whitelist for FK validation. | `None` |
| `name` | `str \| None` | Collection name (stored in metadata). | `None` |
| `ctx` | `Ctx \| None` | TileDB context. | `None` |
| `progress` | `bool` | Show tqdm progress bar during volume writes. | `False` |
Returns:
| Type | Description |
|---|---|
| `VolumeCollection` | VolumeCollection with obs containing NIfTI metadata. |
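The `obs` argument has to line up row for row with the input files and carry obs_id and obs_subject_id columns. The sketch below builds such rows as plain dictionaries in input order. `build_obs_records` is a hypothetical helper, the obs_id scheme (file name without its NIfTI suffix) and the `age` column are illustrative assumptions, and the uniqueness check reflects the assumption that obs_id values must not repeat.

```python
from __future__ import annotations

from pathlib import Path


def build_obs_records(
    niftis: list[tuple[str, str]], extra: dict[str, list] | None = None
) -> list[dict]:
    """Build one record per input file, in the same order as `niftis`."""
    extra = extra or {}
    for column, values in extra.items():
        if len(values) != len(niftis):
            raise ValueError(
                f"column {column!r} has {len(values)} values, expected {len(niftis)}"
            )
    records = []
    for i, (path, subject_id) in enumerate(niftis):
        # Hypothetical obs_id scheme: file name without .nii or .nii.gz.
        name = Path(path).name
        obs_id = name[: -len(".nii.gz")] if name.endswith(".nii.gz") else Path(name).stem
        record = {"obs_id": obs_id, "obs_subject_id": subject_id}
        record.update({column: values[i] for column, values in extra.items()})
        records.append(record)
    obs_ids = [r["obs_id"] for r in records]
    if len(set(obs_ids)) != len(obs_ids):
        raise ValueError("obs_id values must be unique")
    return records


niftis = [("sub101_T1w.nii.gz", "sub-101"), ("sub102_T1w.nii.gz", "sub-102")]
records = build_obs_records(niftis, extra={"age": [34, 51]})
```

The resulting list can be turned into a DataFrame with `pandas.DataFrame(records)` and passed as `obs` alongside the same `niftis` list.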
get_obs_row_by_obs_id(obs_id)
Get observation row by obs_id string identifier.
groupby_subject()
Group volumes by obs_subject_id. Yields (subject_id, view) pairs.
head(n=5)
Return view of first n volumes.
lazy()
Enter lazy mode for deferred transform pipelines.
map(fn)
Apply fn(volume, obs_row) to each volume eagerly. Returns EagerQuery for chaining.
map_batches(fn, batch_size=8)
Apply fn to batches of (volume, obs_row) pairs. Returns EagerQuery.
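To make the batching concrete, the sketch below shows how a stream of items could be chunked into groups of `batch_size`. It is an illustration of the general technique, not the library's implementation; `iter_batches` is hypothetical, and whether `map_batches` keeps a smaller final batch is an assumption here.

```python
from itertools import islice


def iter_batches(pairs, batch_size=8):
    """Yield lists of up to batch_size items; the last list may be shorter."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(pairs)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 20 placeholder items stand in for (volume, obs_row) pairs.
sizes = [len(batch) for batch in iter_batches(range(20), batch_size=8)]
print(sizes)
```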
sample(n=5, seed=None)
Return view of n randomly sampled volumes.
sel(*, subject)
Select volumes by obs_subject_id.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `subject` | `str \| list[str]` | Single subject ID (returns Volume if exactly one match) or list of subject IDs (returns VolumeCollection view). | required |
tail(n=5)
Return view of last n volumes.
to_dataset(patch_size=None, labels=None, transform=None)
Create PyTorch Dataset from this collection.
Convenience method for ML training integration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `patch_size` | `tuple[int, int, int] \| None` | If provided, extract random patches of this size. | `None` |
| `labels` | `DataFrame \| dict \| str \| None` | Label source. Can be: - str: Column name in this collection's | `None` |
| `transform` | `Callable[..., Any] \| None` | Transform function applied to each sample. MONAI dict transforms (e.g., RandFlipd) work directly. | `None` |
Returns:
| Type | Description |
|---|---|
| `VolumeCollectionDataset` | VolumeCollectionDataset ready for use with |
Examples:
Full volumes with labels from obs column:

    dataset = radi.CT.to_dataset(labels="has_tumor")

Patch extraction:

    dataset = radi.CT.to_dataset(patch_size=(64, 64, 64), labels="grade")

With MONAI transforms:

    from monai.transforms import NormalizeIntensityd

    dataset = radi.CT.to_dataset(
        labels="has_tumor",
        transform=NormalizeIntensityd(keys="image"),
    )
to_obs()
Return obs DataFrame (respects view filter).
validate()
Validate internal consistency of obs vs volume metadata.
write(uri=None, name=None, ctx=None)
Write this collection (or view) to new storage.
Creates a new VolumeCollection at the target URI containing all volumes in this view. For views, only the filtered volumes are written.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `uri` | `str \| None` | Target URI. If None, generates adjacent to source collection. | `None` |
| `name` | `str \| None` | Collection name. Also used to derive URI when uri is None. | `None` |
| `ctx` | `Ctx \| None` | TileDB context. | `None` |