YAML Configuration
pondrs uses YAML files for catalog and parameter configuration. The App framework loads, patches, and deserializes these files automatically.
Catalog YAML
The catalog file maps dataset field names to their configuration:
# catalog.yml
readings:
  path: data/readings.csv
  separator: ","
summary: {}
report:
  path: data/report.json
Each top-level key corresponds to a field in your catalog struct. The configuration under each key is deserialized into the dataset type for that field.
- File-backed datasets need at least a path
- In-memory datasets use {}
- Nested catalog structs create nested YAML sections
Nested catalogs
#[derive(Serialize, Deserialize)]
struct InputData {
    raw: PolarsCsvDataset,
    reference: YamlDataset,
}

#[derive(Serialize, Deserialize)]
struct Catalog {
    input: InputData,
    output: JsonDataset,
}
# catalog.yml
input:
  raw:
    path: data/raw.csv
    separator: ","
  reference:
    path: data/ref.yml
output:
  path: data/output.json
Parameters YAML
# params.yml
threshold: 0.5
model:
  learning_rate: 0.01
  epochs: 100
Each key maps to a Param<T> field. Nested structs create nested YAML sections.
CLI overrides
Both catalog and parameter values can be overridden from the command line using dot notation:
# Override parameters
$ my_app run --params threshold=0.8
$ my_app run --params model.learning_rate=0.001
# Override catalog configuration
$ my_app run --catalog output.path=/tmp/result.json
$ my_app run --catalog 'readings.separator=;'
# Multiple overrides
$ my_app run --params threshold=0.8 --params model.epochs=200
How overrides work
- The YAML file is loaded into a serde_yaml::Value tree
- Each KEY=VALUE override is parsed and applied to the tree using dot notation
- Values are parsed as YAML scalars (auto-detecting numbers, bools, strings, null)
- The patched tree is deserialized into the target struct
Overrides create missing intermediate keys if needed, so you can override deeply nested values even if the parent keys don't exist in the file.
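The patching steps above can be sketched in plain Rust. This is an illustration only, not pondrs's actual code: the `Value` enum is a small stand-in for serde_yaml::Value, and `parse_scalar`, `apply_override`, and `lookup` are hypothetical helper names. Scalar detection is deliberately simpler than a real YAML parser, and the sketch assumes that descending through an existing non-mapping value is an error.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a YAML value tree (assumption: the real
/// framework uses serde_yaml::Value instead).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Int(i64),
    Float(f64),
    Str(String),
    Map(BTreeMap<String, Value>),
}

/// Guess the scalar type of a raw override value.
/// Simplified: a real YAML parser recognises more forms.
fn parse_scalar(raw: &str) -> Value {
    match raw {
        "null" | "~" => return Value::Null,
        "true" => return Value::Bool(true),
        "false" => return Value::Bool(false),
        _ => {}
    }
    if let Ok(i) = raw.parse::<i64>() {
        return Value::Int(i);
    }
    // Require a digit so words such as "inf" or "nan" stay strings.
    if raw.chars().any(|c| c.is_ascii_digit()) {
        if let Ok(f) = raw.parse::<f64>() {
            return Value::Float(f);
        }
    }
    Value::Str(raw.to_string())
}

/// Apply one `KEY=VALUE` override, where KEY is a dot-separated path.
fn apply_override(root: &mut Value, spec: &str) -> Result<(), String> {
    // Split on the first '=' only, so the value may itself contain '='.
    let (key, raw) = spec
        .split_once('=')
        .ok_or_else(|| format!("expected KEY=VALUE, got `{spec}`"))?;

    let mut node = root;
    for segment in key.split('.') {
        // A missing (null) node becomes an empty mapping; this is what
        // creates intermediate keys that are absent from the file.
        if matches!(*node, Value::Null) {
            *node = Value::Map(BTreeMap::new());
        }
        node = match node {
            Value::Map(map) => map.entry(segment.to_string()).or_insert(Value::Null),
            // Sketch assumption: refuse to descend into an existing scalar.
            _ => return Err(format!("`{segment}` has a non-mapping parent in `{key}`")),
        };
    }
    *node = parse_scalar(raw);
    Ok(())
}

/// Read a value back using the same dot notation.
fn lookup<'a>(root: &'a Value, key: &str) -> Option<&'a Value> {
    let mut node = root;
    for segment in key.split('.') {
        match node {
            Value::Map(map) => node = map.get(segment)?,
            _ => return None,
        }
    }
    Some(node)
}
```

Starting from an empty `Value::Map`, calling `apply_override(&mut root, "model.epochs=200")` creates the `model` mapping on the way down, and `lookup(&root, "model.epochs")` then returns `Some(&Value::Int(200))`. In the real framework the final step would be deserializing the patched tree into the target struct.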
File paths
Default paths (when using App::from_args or App::from_cli):
- Catalog: conf/base/catalog.yml
- Params: conf/base/parameters.yml
Override with CLI flags:
$ my_app --catalog-path my/catalog.yml --params-path my/params.yml run
Or specify paths directly with App::from_yaml:
App::from_yaml("conf/catalog.yml", "conf/params.yml")?
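The default-then-flag behaviour of the CLI path options can be pictured with a small standalone sketch. It is not how pondrs parses its arguments; `resolve_paths` is a hypothetical helper, and only the flag names and default paths are taken from the text above.

```rust
use std::path::PathBuf;

// Defaults as documented for App::from_args / App::from_cli.
const DEFAULT_CATALOG: &str = "conf/base/catalog.yml";
const DEFAULT_PARAMS: &str = "conf/base/parameters.yml";

/// Hypothetical helper: resolve (catalog_path, params_path) from raw
/// CLI arguments, falling back to the defaults when a flag is absent.
fn resolve_paths(args: &[&str]) -> (PathBuf, PathBuf) {
    let mut catalog = PathBuf::from(DEFAULT_CATALOG);
    let mut params = PathBuf::from(DEFAULT_PARAMS);

    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        match *arg {
            // Each flag consumes the argument that follows it.
            "--catalog-path" => {
                if let Some(p) = iter.next() {
                    catalog = PathBuf::from(*p);
                }
            }
            "--params-path" => {
                if let Some(p) = iter.next() {
                    params = PathBuf::from(*p);
                }
            }
            // Anything else (subcommands, other flags) is ignored here.
            _ => {}
        }
    }
    (catalog, params)
}
```

With only `["run"]` as input both defaults are returned; passing `--catalog-path my/catalog.yml` replaces just the catalog path and leaves the params path at its default.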