Nodes
A Node is a single computation unit in the pipeline. It connects a function to its input and output datasets. For a deeper look at NodeInput/NodeOutput and other details, see Pipelines & Nodes — Nodes.
Definition
pub struct Node<F, Input: NodeInput, Output: NodeOutput>
where
F: StableFn<Input::Args>,
F::Output: CompatibleOutput<Output::Output>,
{
pub name: &'static str,
pub func: F,
pub input: Input,
pub output: Output,
}
name— a human-readable label used in logging, hooks, and visualization.func— the function to execute. Can be a closure or a named function.input— a tuple of dataset references to load before callingfunc.output— a tuple of dataset references to save the return value to.
In the minimal example
The “summarize” node reads a CSV file, computes the mean, and stores it in memory:
Node {
name: "summarize",
func: |df: DataFrame| {
let mean = df.column("value").unwrap().f64().unwrap().mean().unwrap();
(mean,)
},
input: (&cat.readings,),
output: (&cat.summary,),
},
The “report” node reads the mean and a parameter, then writes a JSON report:
Node {
name: "report",
func: |mean: f64, threshold: f64| {
(json!({ "mean": mean, "passed": mean >= threshold }),)
},
input: (&cat.summary, ¶ms.threshold),
output: (&cat.report,),
},
Input and output tuples
Inputs and outputs are tuples of dataset references. The function’s arguments must match the LoadItem types of the input datasets, and its return value must match the SaveItem types of the output datasets.
// Single input, single output
input: (&cat.readings,),
output: (&cat.summary,),
// Multiple inputs, single output
input: (&cat.summary, ¶ms.threshold),
output: (&cat.report,),
// No inputs (side-effect node)
input: (),
output: (&cat.result,),
Tuples of up to 10 elements are supported for both inputs and outputs.
Return values
Node functions return tuples matching the output datasets. A single-output node returns a 1-tuple:
func: |x: i32| (x * 2,), // note the trailing comma
Multi-output nodes return larger tuples:
func: |x: i32| (x + 1, x * 2),
output: (&cat.incremented, &cat.doubled),
Fallible nodes
Node functions can return Result to signal errors:
Node {
name: "parse",
func: |text: String| -> Result<(i32,), PondError> {
let n = text.trim().parse::<i32>().map_err(|e| PondError::Custom(e.to_string()))?;
Ok((n,))
},
input: (&cat.raw_text,),
output: (&cat.parsed_value,),
}
The CompatibleOutput trait allows both bare tuples and Result<tuple, E> as return types. Type mismatches between the function return and the output datasets are caught at compile time.
See Node Errors for more details.
Type safety
The Node struct uses compile-time checks to ensure:
- The function’s argument types match the input datasets’
LoadItemtypes - The function’s return type matches the output datasets’
SaveItemtypes - Mismatches produce compile errors at node construction, not at runtime