Joins and Cycles¶
Numaflow Pipeline Edges can be defined such that multiple Vertices can forward messages to a single vertex.
Quick Start¶
Please see the following examples:
Why do we need JOIN¶
Without JOIN¶
Without JOIN, Numaflow could only allow users to build pipelines where vertices could only read from previous one vertex. This meant that Numaflow could only support simple pipelines or tree-like pipelines. Supporting pipelines where you had to read from multiple sources or UDFs were cumbersome and required creating redundant vertices.
With JOIN¶
Join vertices allow users the flexibility to read from multiple sources, process data from multiple UDFs, and even write to a single sink. The Pipeline Spec doesn't change at all with JOIN, now you can create multiple Edges that have the same “To” Vertex, which would have otherwise been prohibited.
There is no limitation on which vertices can be joined. For instance, one can join Map or Reduce vertices as shown below:
Benefits¶
The introduction of Join Vertex allows users to eliminate redundancy in their pipelines. It supports many-to-one data flow without needing multiple vertices performing the same job.
Examples¶
Join on Sink Vertex¶
By joining the sink vertices, we now only need a single vertex responsible for sending to the data sink.
Example¶
Join on Map Vertex¶
Two different Sources containing similar data that can be processed the same way can now point to a single vertex.
Example¶
Join on Reduce Vertex¶
This feature allows for efficient aggregation of data from multiple sources.
Example¶
Cycles¶
A special case of a "Join" is a Cycle (a Vertex which can send either to itself or to a previous Vertex.) An example use of this is a Map UDF which does some sort of reprocessing of data under certain conditions such as a transient error.
Cycles are permitted, except in the case that there's a Reduce Vertex at or downstream of the cycle. (This is because a cycle inevitably produces late data, which would get dropped by the Reduce Vertex. For this reason, cycles should be used sparingly.)
The following examples are of Cycles: