RadiObject
The top-level container for multi-modal radiology data. A RadiObject contains multiple VolumeCollections (e.g., T1-weighted, T2-weighted, segmentations) and subject-level metadata.
radiobject.RadiObject
Top-level container for multi-collection radiology data with subject metadata.
A RadiObject is either "attached" (backed by storage at a URI) or a "view" (a filtered subset that references a source RadiObject). Filtering operations create views, which read data from their source with the filters applied.
Views are immutable. To persist a view, use write(uri).
Examples:
Attached (has URI):
radi = RadiObject("s3://bucket/dataset")
radi.is_view # False
radi.uri # "s3://bucket/dataset"
View (filtered, no URI):
subset = radi.filter("age > 40")
subset.is_view # True
subset.uri # None
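The attached/view distinction can be illustrated with a small toy model. This is a sketch, not RadiObject's implementation: `ToyDataset`, `visible_rows`, the `storage` dict, and the `mem://` URIs are hypothetical names chosen for illustration, and a plain Python predicate stands in for the query string.

```python
class ToyDataset:
    """Hypothetical stand-in for a RadiObject (illustration only)."""

    def __init__(self, rows=None, uri=None, source=None, predicate=None):
        self._rows = dict(rows or {})  # subject ID -> metadata; used when attached
        self.uri = uri                 # stays None for views
        self._source = source          # parent dataset, set only for views
        self._predicate = predicate    # filter applied lazily on read

    @property
    def is_view(self) -> bool:
        return self._source is not None

    def filter(self, predicate):
        # A view copies nothing: it keeps a reference to its source and a predicate.
        return ToyDataset(source=self, predicate=predicate)

    def visible_rows(self) -> dict:
        if not self.is_view:
            return dict(self._rows)
        parent_rows = self._source.visible_rows()
        return {sid: meta for sid, meta in parent_rows.items() if self._predicate(meta)}

    def write(self, uri, storage):
        # Persisting materializes only what is visible and returns an attached object.
        storage[uri] = self.visible_rows()
        return ToyDataset(rows=storage[uri], uri=uri)


storage = {}  # hypothetical in-memory "storage backend"
full = ToyDataset(
    rows={"s1": {"age": 35}, "s2": {"age": 52}, "s3": {"age": 61}},
    uri="mem://full",
)
older = full.filter(lambda meta: meta["age"] > 40)  # view: no URI, nothing copied
saved = older.write("mem://older", storage)         # attached: only the filtered rows
```

In this toy, `older` holds no data of its own, which mirrors why a view must be written before it exists independently of its source.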
all_obs_ids
cached
property
All obs_ids across all collections (for uniqueness checks).
collection_names
cached
property
Names of all VolumeCollections.
iloc
cached
property
Integer-location based indexing for selecting subjects by position.
index
property
Subject index for bidirectional ID/position lookups.
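As a rough illustration of what a bidirectional ID/position lookup involves, the sketch below pairs an ordered list with a reverse dictionary. `SubjectIndex`, `position_of`, and `id_at` are hypothetical names; the actual index object may expose a different interface.

```python
class SubjectIndex:
    """Hypothetical bidirectional index between subject IDs and positions."""

    def __init__(self, subject_ids):
        self._ids = list(subject_ids)                              # position -> ID
        self._positions = {sid: i for i, sid in enumerate(self._ids)}  # ID -> position
        if len(self._positions) != len(self._ids):
            raise ValueError("subject IDs must be unique")

    def position_of(self, subject_id: str) -> int:
        return self._positions[subject_id]

    def id_at(self, position: int) -> str:
        return self._ids[position]

    def __len__(self) -> int:
        return len(self._ids)


index = SubjectIndex(["BraTS001", "BraTS002", "BraTS003"])
```

Both directions are constant-time lookups, which is what makes label-based and position-based selection equally cheap in a design like this.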
is_view
property
True if this RadiObject is a filtered view of another.
loc
cached
property
Label-based indexing for selecting subjects by obs_subject_id.
n_collections
property
Number of VolumeCollections.
obs_meta
property
Subject-level observational metadata.
Returns a Dataframe for an attached RadiObject and a pd.DataFrame for a view.
obs_subject_ids
property
All obs_subject_id values in index order.
uri
property
URI of this RadiObject, or None if this is a view.
__getattr__(name)
Attribute access to collections (e.g., radi.T1w).
__getitem__(key)
Bracket indexing for subjects by obs_subject_id.
Alias for .loc[] - allows radi["BraTS001"] as shorthand for radi.loc["BraTS001"].
__iter__()
Iterate over collection names.
__len__()
Number of subjects.
__repr__()
Concise representation of the RadiObject.
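The special methods above measure different things: iteration yields collection names, while `len()` counts subjects. The toy container below sketches that protocol so the difference is easy to see. `ToyContainer` and its string "collections" are hypothetical stand-ins, not the library's classes.

```python
class ToyContainer:
    """Hypothetical container mimicking the documented special methods."""

    def __init__(self, subject_ids, collections):
        self._subject_ids = list(subject_ids)
        self._collections = dict(collections)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails; go through __dict__
        # to avoid recursing back into __getattr__.
        collections = self.__dict__.get("_collections", {})
        if name in collections:
            return collections[name]
        raise AttributeError(name)

    def __iter__(self):
        # Iteration yields collection names, like iterating a dict.
        return iter(self._collections)

    def __len__(self):
        # Length is the number of subjects, not the number of collections.
        return len(self._subject_ids)


toy = ToyContainer(
    subject_ids=["BraTS001", "BraTS002", "BraTS003"],
    collections={"T1w": "t1w-volumes", "seg": "seg-volumes"},
)
names = list(toy)      # collection names
n_subjects = len(toy)  # subject count
```

With this layout, `len(toy)` and `len(list(toy))` differ whenever the number of subjects differs from the number of collections.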
add_collection(name, vc)
Register an existing VolumeCollection into this RadiObject.
Links the collection if it's already at the expected URI ({uri}/collections/{name}), otherwise copies it. Updates obs_meta with any new subjects found in the collection.
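The link-or-copy rule can be sketched as a simple URI comparison. The helper names below are hypothetical, and ignoring a trailing slash is an assumption of this sketch; the library may compare URIs differently.

```python
def expected_collection_uri(radi_uri: str, name: str) -> str:
    """Build the documented layout: {uri}/collections/{name}."""
    return f"{radi_uri.rstrip('/')}/collections/{name}"


def link_or_copy(radi_uri: str, name: str, collection_uri: str) -> str:
    """Return "link" if the collection already sits at the expected URI, else "copy"."""
    expected = expected_collection_uri(radi_uri, name)
    # Assumption: a trailing slash does not change which location is meant.
    return "link" if collection_uri.rstrip("/") == expected else "copy"


in_place = link_or_copy("s3://bucket/dataset", "CT", "s3://bucket/dataset/collections/CT")
elsewhere = link_or_copy("s3://bucket/dataset", "CT", "s3://bucket/scratch/CT")
```

Writing a derived collection directly to `{uri}/collections/{name}` is therefore the way to avoid a second copy when registering it.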
append(images, obs_meta=None, reorient=None, format_hint=None, progress=False)
Append new subjects and their volumes atomically.
collection(name)
Get a VolumeCollection by name.
copy()
Create an independent in-memory copy, detached from the view chain.
Useful when you want to break the reference to the source RadiObject. Note: This does NOT persist data. Call write(uri) to write to storage.
describe()
Return a summary: subjects, collections, shapes, and label distributions.
filter(expr)
Filter subjects using a query expression on obs_meta.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `expr` | `str` | TileDB QueryCondition string (e.g., "tumor_grade == 'HGG' and age > 40") | required |
Returns:
| Type | Description |
|---|---|
| `RadiObject` | RadiObject view filtered to matching subjects |
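When filter values come from variables, it can help to assemble the expression string in one place. The helpers below are a hypothetical sketch that only reproduces the quoting style of the example above (single quotes around string values, conditions joined with `and`). They are not part of RadiObject or TileDB, and they reject values containing quotes rather than guess at escaping rules.

```python
def eq(column: str, value) -> str:
    """Format an equality condition, quoting string values with single quotes."""
    if isinstance(value, str):
        if "'" in value or '"' in value:
            # Escaping rules are not covered here, so refuse instead of guessing.
            raise ValueError("quote characters in values are not supported by this sketch")
        return f"{column} == '{value}'"
    return f"{column} == {value}"


def all_of(*conditions: str) -> str:
    """Join conditions so that every one must hold."""
    return " and ".join(conditions)


min_age = 40
expr = all_of(eq("tumor_grade", "HGG"), f"age > {min_age}")
# expr can then be passed as the `expr` argument of filter()
```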
from_collections(uri, collections, obs_meta=None, ctx=None)
classmethod
Create RadiObject from existing VolumeCollections.
Links collections without copying when they're already at expected URIs ({uri}/collections/{name}). Copies collections that are elsewhere.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `uri` | `str` | Target URI for RadiObject. | required |
| `collections` | `dict[str, VolumeCollection \| str]` | Dict mapping collection names to VolumeCollection objects or URIs. | required |
| `obs_meta` | `DataFrame \| None` | Subject-level metadata keyed by obs_subject_id (one row per subject). If None, derived from collections. | `None` |
| `ctx` | `Ctx \| None` | TileDB context. | `None` |
Examples:
Collections already at expected locations (no copy):
ct_vc = radi.CT.map(transform).write(uri=f"{URI}/collections/CT")
seg_vc = radi.seg.map(transform).write(uri=f"{URI}/collections/seg")
radi = RadiObject.from_collections(
    uri=URI,
    collections={"CT": ct_vc, "seg": seg_vc},
)
Collections from elsewhere (will be copied):
radi = RadiObject.from_collections(
    uri="./new_dataset",
    collections={"T1w": existing_t1w_collection},
)
from_images(uri, images, validate_alignment=False, obs_meta=None, reorient=None, format_hint=None, ctx=None, progress=False)
classmethod
Create RadiObject from NIfTI or DICOM images with auto-format detection.
Each collection's format is auto-detected (or set via format_hint).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `uri` | `str` | Target URI for RadiObject. | required |
| `images` | `dict[str, str \| Path \| Sequence[tuple[str \| Path, str]]]` | Dict mapping collection names to image sources. Sources can be a glob pattern, directory path, or pre-resolved list of (path, subject_id). | required |
| `validate_alignment` | `bool` | If True, verify all collections have same subject IDs. | `False` |
| `obs_meta` | `DataFrame \| None` | Subject-level metadata keyed by obs_subject_id (one row per subject). | `None` |
| `reorient` | `bool \| None` | Reorient to canonical orientation (None uses config default). | `None` |
| `format_hint` | `dict[str, str] \| None` | Dict mapping collection names to format strings ("nifti" or "dicom"). | `None` |
| `ctx` | `Ctx \| None` | TileDB context. | `None` |
| `progress` | `bool` | Show tqdm progress bar. | `False` |
Examples:
NIfTI with glob patterns:
radi = RadiObject.from_images(
    uri="./dataset",
    images={
        "CT": "./imagesTr/*.nii.gz",
        "seg": "./labelsTr/*.nii.gz",
    },
)
DICOM with pre-resolved tuples:
from pathlib import Path

radi = RadiObject.from_images(
    uri="./ct_study",
    images={
        "CT_head": [(Path("/dicom/sub01/head"), "sub-01")],
    },
)
Mixed format with explicit hints:
radi = RadiObject.from_images(
    uri="./study",
    images={"CT": "/dicom_dir/", "seg": "/labels/*.nii.gz"},
    format_hint={"CT": "dicom"},
)
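For the pre-resolved form of `images`, the (path, subject_id) pairs have to be built by the caller. The sketch below shows one way to do that and to check subject alignment before ingestion. All helper names are hypothetical, and deriving the subject ID from the filename is an assumption that may not match a given dataset's naming scheme.

```python
from pathlib import Path


def subject_id_from_filename(path: Path) -> str:
    """Assumption: the subject ID is the filename without its NIfTI suffix."""
    name = path.name
    for suffix in (".nii.gz", ".nii"):
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return path.stem


def resolve_sources(directory, pattern="*.nii.gz"):
    """Return a sorted list of (path, subject_id) tuples for one collection."""
    paths = sorted(Path(directory).glob(pattern))
    return [(p, subject_id_from_filename(p)) for p in paths]


def missing_subjects(images):
    """Map each collection name to the subject IDs it lacks relative to the others."""
    id_sets = {name: {sid for _, sid in pairs} for name, pairs in images.items()}
    all_ids = set().union(*id_sets.values())
    return {
        name: sorted(all_ids - ids)
        for name, ids in id_sets.items()
        if all_ids - ids
    }
```

A dict such as `{"CT": resolve_sources("./imagesTr"), "seg": resolve_sources("./labelsTr")}` has the shape described for `images`, and an empty result from `missing_subjects` means every collection covers the same subjects.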
get_obs_row_by_obs_subject_id(obs_subject_id)
Get obs_meta row by obs_subject_id string identifier.
get_volume(obs_id)
Get a volume by obs_id from any collection.
head(n=5)
Return view of first n subjects.
rename_collection(old_name, new_name)
Rename a collection.
sample(n=5, seed=None)
Return view of n randomly sampled subjects.
sel(*, subject)
Select subjects by obs_subject_id. Named-parameter alias for .loc[].
select_collections(names)
Create a view with only specified collections.
tail(n=5)
Return view of last n subjects.
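The selection semantics of `head`, `sample`, and `tail` can be sketched on a plain list of subject IDs. These free functions are hypothetical and return lists rather than views. Clamping `n` to the number of subjects and returning samples in draw order are assumptions of this sketch, not documented behavior.

```python
import random


def head(ids, n=5):
    """First n subject IDs."""
    return ids[:n]


def tail(ids, n=5):
    """Last n subject IDs; guards n == 0, since ids[-0:] would return everything."""
    return ids[-n:] if n > 0 else []


def sample(ids, n=5, seed=None):
    """n IDs drawn without replacement; the same seed gives the same draw."""
    rng = random.Random(seed)  # local generator, leaves global random state untouched
    return rng.sample(ids, min(n, len(ids)))


ids = [f"BraTS{i:03d}" for i in range(1, 11)]
first_two = head(ids, 2)
last_two = tail(ids, 2)
draw_a = sample(ids, 3, seed=7)
draw_b = sample(ids, 3, seed=7)
```

Passing a fixed `seed` is what makes a sampled subset reproducible across runs.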
validate()
Validate internal consistency of the RadiObject and all collections.
write(uri, ctx=None)
Write this RadiObject (or view) to storage.
For attached RadiObjects, this copies the entire dataset. For views, this writes only the filtered subset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `uri` | `str` | Target URI for the new RadiObject | required |
| `ctx` | `Ctx \| None` | TileDB context | `None` |
Returns:
| Type | Description |
|---|---|
| `RadiObject` | New attached RadiObject at the target URI |