Scale Subresource is enabled in
Vertex Custom Resource, which makes it possible to scale vertex pods. To be specifically, it is enabled by adding following comments to
Vertex struct model, and then corresponding CRD definition is automatically generated.
Pods management is done by vertex controller.
scale subresource implemented,
vertex object can be scaled by either horizontal or vertical pod autoscaling.
The out of box Numaflow autoscaling is done by a
scaling component running in the controller manager, you can find the source code here. The autoscaling strategy is implemented according to different type of vertices.
For source vertices, we define a target time (in seconds) to finish processing the pending messages based on the processing rate (tps) of the vertex.
pendingMessages / processingRate = targetSeconds
For example, if
targetSeconds is 3, current replica number is
tps is 10000/second, and the pending messages is 60000, so we calculate the desired replica number as following:
desiredReplicas = 60000 / (3 * (10000 / 2)) = 4
Numaflow autoscaling does not work for those source vertices that can not calculate pending messages.
UDF and Sink Vertices¶
Pending messages of a UDF or Sink vertex does not represent the real number because of the restrained writing caused by back pressure, so we use a different model to achieve autoscaling for them.
For each of the vertices, we calculate the available buffer length, and consider it is contributed by all the replicas, so that we can get each replica's contribution.
availableBufferLength = totalBufferLength * bufferLimit(%) - pendingMessages singleReplicaContribution = availableBufferLength / currentReplicas
We define a target available buffer length, and then calculate how many replicas are needed to achieve the target.
desiredReplicas = targetAvailableBufferLength / singleReplicaContribution
Back Pressure Impact¶
Back pressure is considered during autoscaling (which is only available for Source and UDF vertices).
We measure the back pressure by defining a threshold of the buffer usage. For example, the total buffer length is 50000, buffer limit is 80%, and the back pressure threshold is 90%, if in the past period of time, the average pending messages is more than
36000 (50000 * 80% * 90%), we consider there's back pressure.
When the calculated desired replicas is greater than current replicas:
- For vertices which have back pressure from the directly connected vertices, instead of increasing the replica number, we decrease it by 1;
- For vertices which have back pressure in any of its downstream vertices, the replica number remains unchanged.
Numaflow autoscaling can be tuned by updating some parameters, find the details at the doc.