Keyboard shortcuts

Press ← or β†’ to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Pipeline

The Pipeline struct groups related steps into a named container with declared input/output dataset contracts.

Definition

pub struct Pipeline<S: StepInfo, Input: NodeInput, Output: NodeOutput> {
    pub name: &'static str,
    pub steps: S,
    pub input: Input,
    pub output: Output,
}
  • name β€” label for logging, hooks, and visualization
  • steps β€” a tuple of nodes (and/or nested pipelines)
  • input β€” datasets this pipeline expects to be available when it runs
  • output β€” datasets this pipeline guarantees to produce

Pipelines are containers β€” runners never call them directly. Instead, they recurse into the pipeline’s steps. The PipelineInfo::is_leaf() method returns false for pipelines, signaling the runner to descend into children.

Example

fn pipeline<'a>(cat: &'a Catalog, params: &'a Params) -> impl Steps<PondError> + 'a {
    (
        Node { name: "load", func: |v| (v,), input: (&params.input,), output: (&cat.raw,) },
        Pipeline {
            name: "processing",
            steps: (
                Node { name: "clean", func: |v| (v,), input: (&cat.raw,), output: (&cat.clean,) },
                Node { name: "transform", func: |v| (v * 2,), input: (&cat.clean,), output: (&cat.result,) },
            ),
            input: (&cat.raw,),
            output: (&cat.result,),
        },
    )
}

Input/output contracts

The input and output declarations are validated by check():

  • Every declared input must be consumed by at least one child node
  • Every declared output must be produced by at least one child node

If these contracts are violated, check() returns CheckError::UnusedPipelineInput or CheckError::UnproducedPipelineOutput.

Nesting

Pipelines can be nested arbitrarily. The validator and runners recurse through the tree:

Pipeline {
    name: "outer",
    steps: (
        Pipeline {
            name: "inner",
            steps: (/* nodes */),
            input: (/* ... */),
            output: (/* ... */),
        },
        Node { /* ... */ },
    ),
    input: (/* ... */),
    output: (/* ... */),
}

Hooks and visualization

Pipeline boundaries fire their own hook events (before_pipeline_run, after_pipeline_run, on_pipeline_error) distinct from node events. In the visualization, pipelines appear as expandable containers that group their child nodes.

When to use Pipeline vs flat tuples

Use a flat tuple of nodes when the pipeline is simple and linear. Use Pipeline when you want to:

  • Name a group of related steps for logging and visualization
  • Declare input/output contracts that are validated by check()
  • Organize large pipelines into logical sections