Dataframe

TileDB-backed DataFrame for subject and observation metadata. Supports lazy loading and efficient queries on large datasets.

radiobject.Dataframe

TileDB-backed sparse dataframe for observation metadata.

Used internally for obs_meta (subject-level, one index dimension) and obs (volume-level, two index dimensions) storage. Index dimensions are set via create() or from_pandas() and are read back from the TileDB schema at runtime.

Example

df = dataframe.read(columns=["age"], value_filter="age > 40")

all_columns property

All column names including index columns.

columns property

Attribute column names (excluding index columns).

dtypes cached property

Column data types (attributes only).

index_columns cached property

Dimension column names, read from TileDB schema.

shape property

Dimensions of the dataframe as an (n_rows, n_columns) tuple.

add_column(name, dtype, fill=None)

Add a new attribute column to the dataframe.

Parameters:

name (str, required): Column name (must not conflict with index columns or existing columns).

dtype (dtype | type, required): NumPy dtype for the column.

fill (object, default None): If provided, write this value to all existing rows.
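The rules for add_column can be illustrated with a small standard-library model. This is a sketch, not radiobject's implementation: the list-of-dicts storage, the "subject_id" index name, and the ValueError are all assumptions made for illustration. The dtype argument is left out because the model does not type its values.

```python
# Toy model of the add_column rules; NOT radiobject's implementation.
from typing import Any, Optional


def add_column(
    rows: list[dict[str, Any]],
    index_columns: list[str],
    columns: list[str],
    name: str,
    fill: Optional[Any] = None,
) -> None:
    """Register a new attribute column and optionally backfill existing rows."""
    # The name must not clash with an index column or an existing attribute.
    # Raising ValueError is an assumption; the real error type is not documented here.
    if name in index_columns or name in columns:
        raise ValueError(f"column {name!r} already exists")
    columns.append(name)
    # Backfill only when a fill value was given.
    if fill is not None:
        for row in rows:
            row[name] = fill


index_columns = ["subject_id"]  # hypothetical index column name
columns = ["age"]
rows = [{"subject_id": "s1", "age": 41}, {"subject_id": "s2", "age": 37}]

add_column(rows, index_columns, columns, "site", fill="A")
```

After the call, the model's attribute list contains "site" and both existing rows carry the fill value. What existing rows hold when fill is omitted depends on the storage layer and is not modeled here.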

create(uri, schema, ctx=None, index_columns=INDEX_COLUMNS) classmethod

Create an empty sparse Dataframe with configurable index dimensions.

delete(cond)

Delete rows matching a TileDB query condition.

drop_column(name)

Remove an attribute column from the dataframe.

from_pandas(uri, df, ctx=None, index_columns=INDEX_COLUMNS) classmethod

Create a new Dataframe from a pandas DataFrame. The DataFrame must contain the index columns.

read(columns=None, value_filter=None, include_index=True)

Read data with optional column selection and value filtering.

update(df)

Upsert rows from a pandas DataFrame.

Rows at existing coordinates are overwritten; rows at new coordinates are appended. All non-index columns in df must already exist in the schema.
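The upsert behavior can be sketched with a plain dictionary keyed by coordinate tuples. This is a toy model under stated assumptions, not the library's code: the "subject_id" index name, the ValueError, and the choice to replace a matching row wholesale are all illustrative.

```python
# Toy model of upsert semantics; NOT radiobject's implementation.
from typing import Any

Row = dict[str, Any]


def upsert(
    table: dict[tuple, Row],
    new_rows: list[Row],
    index_columns: tuple[str, ...],
    attribute_columns: set[str],
) -> None:
    """Overwrite rows at existing coordinates and append rows at new ones."""
    # Validate everything first so a bad row cannot cause a partial write.
    for row in new_rows:
        unknown = set(row) - set(index_columns) - attribute_columns
        if unknown:
            # ValueError is an assumption made for this sketch.
            raise ValueError(f"columns not in schema: {sorted(unknown)}")
    for row in new_rows:
        # The coordinate is the tuple of index-column values.
        coord = tuple(row[name] for name in index_columns)
        # Assigning by key overwrites an existing coordinate or adds a new one.
        table[coord] = {k: v for k, v in row.items() if k not in index_columns}


# Hypothetical one-dimensional table keyed by a made-up "subject_id" column.
table = {("s1",): {"age": 41}, ("s2",): {"age": 37}}
upsert(
    table,
    [{"subject_id": "s2", "age": 38}, {"subject_id": "s3", "age": 52}],
    index_columns=("subject_id",),
    attribute_columns={"age"},
)
```

In this model the row for "s2" is overwritten, the row for "s3" is appended, and "s1" is left untouched. A two-dimensional table works the same way with a two-element index_columns tuple.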