Datasets

Datasets are the data abstraction in pondrs. Every piece of data flowing through a pipeline — whether it’s a CSV file, an in-memory value, or a hardware register — is a dataset.

The `Dataset` trait

pub trait Dataset: serde::Serialize {
    type LoadItem;
    type SaveItem;
    type Error;

    fn load(&self) -> Result<Self::LoadItem, Self::Error>;
    fn save(&self, output: Self::SaveItem) -> Result<(), Self::Error>;
    fn is_param(&self) -> bool { false }
}

`LoadItem`: the type produced when loading (e.g. `DataFrame`, `String`, `f64`).
`SaveItem`: the type accepted when saving (often the same as `LoadItem`).
`Error`: the error type for I/O operations. Use `core::convert::Infallible` for datasets that never fail (like `Param`).
`is_param()`: returns `true` for read-only parameter datasets. The pipeline validator uses this to prevent writing to params.
`Serialize` supertrait: enables automatic YAML serialization of dataset configuration for the viz and catalog indexer.
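To make the associated types concrete, here is a minimal sketch of two implementations. It is an illustration under stated assumptions, not pondrs code: the trait is redeclared locally without the `serde::Serialize` supertrait so the example compiles with the standard library alone, and `TextFileDataset` and `FixedParam` are hypothetical names invented for this example. How the real `Param` handles `save` is not stated here, so the no-op below is an assumption.

```rust
use core::convert::Infallible;
use std::fs;
use std::io;
use std::path::PathBuf;

// Local stand-in for the trait shown above. The `serde::Serialize`
// supertrait is omitted so this sketch needs no external crates.
pub trait Dataset {
    type LoadItem;
    type SaveItem;
    type Error;

    fn load(&self) -> Result<Self::LoadItem, Self::Error>;
    fn save(&self, output: Self::SaveItem) -> Result<(), Self::Error>;
    fn is_param(&self) -> bool {
        false
    }
}

// Hypothetical file-backed dataset: loads and saves the whole file as a String.
pub struct TextFileDataset {
    pub path: PathBuf,
}

impl Dataset for TextFileDataset {
    type LoadItem = String;
    type SaveItem = String;
    type Error = io::Error; // file I/O can fail

    fn load(&self) -> Result<String, io::Error> {
        fs::read_to_string(&self.path)
    }

    fn save(&self, output: String) -> Result<(), io::Error> {
        fs::write(&self.path, output)
    }
}

// Hypothetical read-only parameter: loading clones the stored value.
pub struct FixedParam<T: Clone>(pub T);

impl<T: Clone> Dataset for FixedParam<T> {
    type LoadItem = T;
    type SaveItem = T;
    type Error = Infallible; // loading an in-memory constant cannot fail

    fn load(&self) -> Result<T, Infallible> {
        Ok(self.0.clone())
    }

    // Assumption: a validator rejects writes to params before they run,
    // so this sketch simply discards the value.
    fn save(&self, _output: T) -> Result<(), Infallible> {
        Ok(())
    }

    fn is_param(&self) -> bool {
        true
    }
}
```

The file-backed type uses `io::Error` because both operations touch the filesystem, while the parameter type uses `Infallible` so callers can see from the signature that loading never fails.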

Datasets in the minimal example

The catalog uses three dataset types:

#[derive(Serialize, Deserialize)]
struct Catalog {
    readings: PolarsCsvDataset,
    summary: MemoryDataset<f64>,
    report: JsonDataset,
}

`PolarsCsvDataset`

Reads and writes CSV files as Polars DataFrames. Requires the polars feature. Configured with a file path and optional CSV options like separator:

readings:
  path: data/readings.csv
  separator: ","

`MemoryDataset<T>`

Thread-safe in-memory storage for intermediate values. It starts empty, so loading before any save returns DatasetNotLoaded. Requires the std feature. Uses Arc<Mutex<Option<T>>> internally, so it works safely with the ParallelRunner.

summary: {}
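The storage pattern described above can be sketched with the standard library alone. This is not the pondrs implementation: `SketchMemoryDataset`, `SketchError`, and `save_from_worker` are hypothetical names, and only the `Arc<Mutex<Option<T>>>` layout and the "load before save fails" behavior are taken from the description.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical error type; the variant name mirrors the one mentioned above.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    DatasetNotLoaded,
}

// Hypothetical stand-in for an in-memory dataset: a shared, lockable slot
// that holds `None` until the first save.
pub struct SketchMemoryDataset<T> {
    slot: Arc<Mutex<Option<T>>>,
}

impl<T: Clone> SketchMemoryDataset<T> {
    pub fn new() -> Self {
        Self { slot: Arc::new(Mutex::new(None)) }
    }

    // Returns a clone of the stored value, or an error if nothing was saved yet.
    pub fn load(&self) -> Result<T, SketchError> {
        let guard = self.slot.lock().unwrap();
        guard.clone().ok_or(SketchError::DatasetNotLoaded)
    }

    // Overwrites the slot with a new value.
    pub fn save(&self, value: T) -> Result<(), SketchError> {
        *self.slot.lock().unwrap() = Some(value);
        Ok(())
    }
}

// Cloning shares the same slot instead of copying the data.
impl<T> Clone for SketchMemoryDataset<T> {
    fn clone(&self) -> Self {
        Self { slot: Arc::clone(&self.slot) }
    }
}

// Saves from a worker thread, then loads from the calling thread.
pub fn save_from_worker() -> Result<f64, SketchError> {
    let ds = SketchMemoryDataset::<f64>::new();
    let writer = ds.clone();
    thread::spawn(move || writer.save(21.5)).join().unwrap()?;
    ds.load()
}
```

Because every handle points at the same mutex-guarded slot, a value saved by one thread is visible to any other thread that loads afterwards, which is the property a parallel runner relies on.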

`JsonDataset`

Reads and writes JSON files as serde_json::Value. Requires the json feature.

report:
  path: data/report.json

Further reading

The `Dataset` trait

`PolarsCsvDataset`

`MemoryDataset<T>`

`JsonDataset`