Core

Data(*, store_kind, uuid='', description='')

Bases: Generic[T]

Connects the user-facing containers to data in stores. It is generically typed so that it can interface with any store-supported data type (e.g., NumPy arrays and polars DataFrames).

description = description

label = None

parent_chain

Chain of object references to get from the Project and any other Containers to this Data object. This does not include the Data object itself or the user-specified run_id.

An atomea Project is the root container for all data in a specific project. Containers are often nested to group related information and provide an intuitive interface. However, only the root Project keeps track of the Store backends for arrays and tables, in its Project._stores attribute. This ensures that there is only one storage interface per project. Every nested container holds a reference to its parent in Container._parent.

Thus, Data.parent_chain follows these Container._parent attributes backwards until it reaches the root Project, which gives access to Project._stores. The chain itself is stored in access order, from the Project down to the container that holds this Data object. For example, (Project, Ensemble, Microstates).
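The traversal described above can be sketched as follows. This is a minimal illustration, not atomea's implementation: the `Container` and `Project` classes here are simplified stand-ins, and `build_parent_chain` is a hypothetical helper name. Only the `_parent` and `_stores` attribute names come from the description above.

```python
from __future__ import annotations


class Container:
    """Hypothetical stand-in for a container that references its parent."""

    def __init__(self, name: str, parent: Container | None = None) -> None:
        self.name = name
        self._parent = parent


class Project(Container):
    """Hypothetical root container; the only one that holds store backends."""

    def __init__(self, name: str) -> None:
        super().__init__(name, parent=None)
        # Placeholder objects standing in for real array and table stores.
        self._stores = {"arrays": object(), "tables": object()}


def build_parent_chain(container: Container) -> tuple[Container, ...]:
    """Walk `_parent` links up to the root, then return them root-first."""
    chain = []
    node = container
    while node is not None:
        chain.append(node)
        node = node._parent
    # The walk collects containers leaf-first; reverse to get access order.
    chain.reverse()
    return tuple(chain)


project = Project("project")
ensemble = Container("ensemble", parent=project)
microstates = Container("microstates", parent=ensemble)

chain = build_parent_chain(microstates)
names = [c.name for c in chain]
```

Because the chain is root-first, its first element is the Project, so the stores can be reached with `chain[0]._stores` in this sketch.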

store_kind = store_kind

uuid = uuid

__repr__()

__set_name__(owner, name)

append(data, run_id=None, **kwargs)

bind_to_container(container)

Explicitly bind this Data object to a container.

get(run_id=None, **kwargs)

Get the store-specific object that represents the data stored here.

For example, if the data is stored on disk using some type of memory map, this would return the memory map object, not the in-memory data. If you want to guarantee the data is loaded into memory, use values.
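The difference between a store-specific handle and in-memory data can be illustrated with the standard library alone. The sketch below does not use atomea; it assumes a plain file mapped with `mmap` as a stand-in for whatever object a store might return, and shows that an explicit copy is needed to hold the data in memory.

```python
import mmap
import os
import tempfile

# Write a few bytes to a temporary file to stand in for on-disk data.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    path = tmp.name

with open(path, "rb") as fh:
    # Lazy handle: the file is mapped, not copied into a Python object.
    mapped = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
    # Slicing the map copies the selected bytes into an in-memory object.
    in_memory = mapped[:]
    mapped.close()

# The copy remains usable after the map is closed and the file is removed.
os.remove(path)
```

Here `mapped` plays the role of the object a getter might hand back, while `in_memory` plays the role of fully loaded values.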

get_store_info(run_id=None)

Determines information needed to access this data from a store.

RETURNS DESCRIPTION
Store

Table or Array store from the project owning this data.

str

Path needed to get this data out of the store.

iter(run_id=None, elements=None, view=None, chunk_size=1, **kwargs)

Yield chunks of data instead of reading all into memory.

PARAMETER DESCRIPTION

run_id

TYPE: str | None DEFAULT: None

elements

Slice for the first dimension. Usually used to specify certain microstates or all microstates.

TYPE: OptionalSliceSpec DEFAULT: None

view

Slice spec for all but the first dimension.

TYPE: OptionalSliceSpec DEFAULT: None

chunk_size

Number of data points along the first axis to yield at a time.
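How `elements`, `view`, and `chunk_size` interact can be sketched on a plain nested list. This is an assumption-laden illustration rather than the library's code: `iter_chunks` is a hypothetical function, the data is restricted to two dimensions, and both `elements` and `view` are taken to be ordinary `slice` objects.

```python
from typing import Iterator, Optional, Sequence


def iter_chunks(
    rows: Sequence[Sequence[float]],
    elements: Optional[slice] = None,
    view: Optional[slice] = None,
    chunk_size: int = 1,
) -> Iterator[list]:
    """Yield `chunk_size` rows at a time from a 2-D nested sequence."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    # Resolve the first-axis selection to indices without copying any rows.
    first_axis = range(len(rows))
    if elements is not None:
        first_axis = first_axis[elements]
    for start in range(0, len(first_axis), chunk_size):
        chunk = []
        for i in first_axis[start : start + chunk_size]:
            row = rows[i]
            # Apply the slice for the remaining dimension to each row.
            chunk.append(list(row[view]) if view is not None else list(row))
        yield chunk


data = [[i, i * 10, i * 100] for i in range(5)]
chunks = list(
    iter_chunks(data, elements=slice(1, 5), view=slice(0, 2), chunk_size=2)
)
```

Only the rows belonging to the current chunk are materialized at each step, which is the point of iterating instead of reading everything at once.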

next_micro_id(ens_id, run_id)

Determine the next micro_id by adding one to the current largest micro_id for this ens_id and run_id.

PARAMETER DESCRIPTION

path

Path to table.

ens_id

ID of the ensemble to query.

TYPE: str

run_id

ID of the run to query.

TYPE: str

RETURNS DESCRIPTION
int

The next micro_id.
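The "largest plus one" rule can be sketched on a list of dictionaries standing in for table rows. This is not the method's actual implementation: `next_micro_id_sketch` is a hypothetical name, the row layout is assumed, and starting at 0 when no rows match is an assumption the description above does not confirm.

```python
from typing import Iterable, Mapping


def next_micro_id_sketch(
    rows: Iterable[Mapping[str, object]], ens_id: str, run_id: str
) -> int:
    """Return one more than the largest micro_id for this ens_id and run_id."""
    matching = [
        int(row["micro_id"])
        for row in rows
        if row["ens_id"] == ens_id and row["run_id"] == run_id
    ]
    # Assumption: numbering starts at 0 when no rows match.
    return max(matching) + 1 if matching else 0


table = [
    {"ens_id": "a", "run_id": "r1", "micro_id": 0},
    {"ens_id": "a", "run_id": "r1", "micro_id": 1},
    {"ens_id": "a", "run_id": "r2", "micro_id": 7},
    {"ens_id": "b", "run_id": "r1", "micro_id": 4},
]

next_existing = next_micro_id_sketch(table, "a", "r1")
next_empty = next_micro_id_sketch(table, "b", "r2")
```

Filtering on both identifiers keeps the counters for different ensembles and runs independent of each other.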

prep_dataframe(ens_id, run_id, micro_id_next, data)

read(view=None, run_id=None, **kwargs)

write(data, view=None, run_id=None, **kwargs)