Summary
`DynamicTable.addColumn` currently works at the individual dataset level, but users often need to add logical columns that come with dependent datasets or attributes. Two cases stand out:
- ragged columns, where a primary column depends on one or more `*_index` datasets
- enum columns, where `EnumData` depends on its `elements` reference and often also benefits from a sibling `*_elements` dataset for interoperability with PyNWB
I think it would be valuable to add explicit convenience APIs for these cases rather than expecting users to assemble all companion objects manually and pass them through addColumn.
Current pain points
Ragged columns
Today users either construct `VectorData`/`DynamicTableRegion` plus one or more `VectorIndex` objects manually, or use `util.create_indexed_column` as a separate helper.
Relevant examples:
- `util.create_indexed_column`
- `tutorials/dynamic_tables.mlx`
- `tutorials/icephys.mlx`
This works, but it means the API for adding a logical ragged column is split across multiple objects and helper functions.
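For illustration, the manual route looks roughly like the sketch below. Type names follow MatNWB's `types.hdmf_common` namespace; the exact constructor arguments and the `addColumn` call shape are assumptions from memory and may differ across MatNWB versions:

```matlab
% Manual construction of a ragged column: every row of the table owns a
% variable-length slice of one flat data vector.
data = types.hdmf_common.VectorData( ...
    'description', 'spike times per unit', ...
    'data', [0.1, 0.2, 0.3, 0.4, 0.5]);  % flat storage for all rows

% The companion *_index dataset records the exclusive end offset of each
% row's slice into 'data' (row 1 -> elements 1:2, row 2 -> elements 3:5).
dataIndex = types.hdmf_common.VectorIndex( ...
    'description', 'index into spike_times', ...
    'target', types.untyped.ObjectView(data), ...
    'data', [2; 5]);

% Both objects must then be handed to addColumn together, even though
% only 'spike_times' is a user-facing column.
table.addColumn('spike_times', data, 'spike_times_index', dataIndex);
```

The point is that the caller has to know about the index object, its `target` wiring, and the `_index` naming convention just to add one logical column.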
Enum columns
`EnumData` already models the enum relationship via its required `elements` reference, but users still need to manually manage the supporting `VectorData` dataset that stores the actual element values. The tutorial also notes that the `<name>_elements` dataset layout is useful/required for compatibility with PyNWB.
Relevant example:
- `tutorials/dynamic_tables.mlx`
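Again for illustration, the manual enum construction looks roughly like this (`EnumData` lives in MatNWB's `types.hdmf_experimental` namespace; the exact constructor arguments are assumptions and may differ from the current release):

```matlab
% The elements dataset stores each distinct category label exactly once.
elements = types.hdmf_common.VectorData( ...
    'description', 'enum elements', ...
    'data', {'low'; 'medium'; 'high'});

% EnumData stores integer codes per row plus an object reference to the
% elements dataset that gives those codes meaning.
condition = types.hdmf_experimental.EnumData( ...
    'description', 'condition per trial', ...
    'elements', types.untyped.ObjectView(elements), ...
    'data', [0; 2; 1; 0]);  % 0-based indices into elements
```

Here too, the caller must manage the support dataset, the object reference, and the `<name>_elements` naming convention by hand.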
addColumn is not a great fit for these cases
`addColumn` currently behaves like a low-level dataset insertion helper. That makes some cases awkward:
- `*_index` datasets are companions of a logical column rather than user-facing columns
- `*_elements` datasets are support datasets and should not participate in row-height validation the same way as true columns
- some `DynamicTable` subclasses define schema-backed column properties, while others define non-column properties, so property-name handling also needs care
Proposal
Add explicit convenience methods on `DynamicTable`:
- `addRaggedColumn(...)`
- `addEnumColumn(...)`
These methods would build and install the necessary typed objects and companion datasets in a consistent way, while leaving `addColumn` available as the lower-level API for already-constructed typed objects.
Possible design direction
addRaggedColumn
Accept row-wise data and construct/store:
- the primary column (`VectorData` or `DynamicTableRegion`)
- one or more `VectorIndex` companions as needed
Possible behavior:
- update `colnames` only for the primary logical column
- use the terminal index for row-height validation
- support both simple and nested ragged columns
- optionally support table references similar to `util.create_indexed_column(..., table)`
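A possible call shape, purely hypothetical (method name from this proposal; all parameter names are placeholders):

```matlab
% Hypothetical: the method accepts row-wise cell data and internally
% builds the flat VectorData plus the VectorIndex companion, so the
% caller never touches the *_index dataset directly.
table.addRaggedColumn('spike_times', ...
    'data', {[0.1, 0.2], [0.3, 0.4, 0.5]}, ...  % one cell per row
    'description', 'spike times per unit');
```

A nested cell array could map naturally to doubly-ragged columns, and an optional `'table'` argument could switch the primary column to a `DynamicTableRegion`, mirroring `util.create_indexed_column(..., table)`.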
addEnumColumn
Accept enum values plus the allowed element set and construct/store:
- the `EnumData` primary column
- the backing elements dataset
- the `elements` object reference
Possible behavior:
- keep the enum column itself as the logical user-facing column
- optionally store `<name>_elements` explicitly for documented compatibility with PyNWB
- avoid treating the support dataset as a row-aligned table column for validation purposes
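A possible call shape, again purely hypothetical (parameter names are placeholders):

```matlab
% Hypothetical: the method derives the integer codes and the backing
% elements dataset from categorical-style input, wiring up the
% elements object reference internally.
table.addEnumColumn('condition', ...
    'data', {'low'; 'high'; 'low'; 'low'}, ...       % row-wise labels
    'elements', {'low'; 'medium'; 'high'}, ...       % allowed values
    'description', 'condition per trial');
```

If `'elements'` were omitted, the method could infer the element set from the unique values in `'data'`.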
Why separate helpers seem preferable
I think dedicated methods are clearer than making `addColumn` infer too much from heterogeneous inputs.
That would preserve a clean split:
- `addColumn` for low-level typed-object insertion
- `addRaggedColumn` for ragged logical columns
- `addEnumColumn` for enum logical columns
This feels easier to understand and document than expanding `addColumn` until it tries to infer all companion-dataset semantics from arbitrary name/value pairs.
Open questions
- Should `addRaggedColumn` support doubly-ragged columns from the start, or just single-index ragged columns initially?
- Should `addEnumColumn` always materialize `<name>_elements`, or make that optional?
- Should `util.create_indexed_column` remain as a lower-level utility, or eventually delegate to the new helper(s)?
I’m opening this as an enhancement suggestion based on the current `DynamicTable` behavior and tutorial patterns, since these workflows already exist conceptually but are not yet first-class in the table API.
Written by GPT-5.4