Maximum Message Size

The default maximum message size is 1MB. You can raise this limit, but consider the consequences before doing so. Enabling compression is often the safer option.

The max message size is determined by:

  • Max message size supported by gRPC (default value is 64MB in Numaflow).
  • Max message size supported by the Inter-Step Buffer implementation.

If JetStream is used as the Inter-Step Buffer implementation, its default max message size is 1MB. You can change it by setting max_payload under spec.jetstream.settings in the InterStepBufferService specification.

apiVersion: numaflow.numaproj.io/v1alpha1
kind: InterStepBufferService
metadata:
  name: default
spec:
  jetstream:
    settings: |
      max_payload: 8388608 # 8MB

Values above 8388608 (8MB) are not recommended, although max_payload can be set as high as 67108864 (64MB).

Please be aware that if you increase the max message size of the InterStepBufferService, you will probably need to change some other limits as well. For example, if each message is as large as 8MB, then 100 messages flowing in the pipeline will require at least 800MB of disk space in each Inter-Step Buffer to store them. Memory consumption will also be high, which may cause the Inter-Step Buffer Service to crash. In that case, you might need to update the retention policy in the Inter-Step Buffer Service so that messages are not stored for too long. Check out the Inter-Step Buffer Service documentation for more details.
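The sizing arithmetic above can be sketched in a few lines of Python. This is only a rough lower bound under stated assumptions: the helper name min_buffer_bytes and the constant MIB are illustrative and not part of Numaflow, and the estimate ignores any per-message overhead or replication that the Inter-Step Buffer may add.

```python
MIB = 1024 * 1024  # bytes per MB as used by max_payload (8MB == 8388608)


def min_buffer_bytes(message_bytes: int, messages_in_flight: int) -> int:
    """Lower bound on the storage one buffer needs to hold the in-flight messages.

    Hypothetical helper for illustration only; it counts payload bytes and
    nothing else.
    """
    return message_bytes * messages_in_flight


if __name__ == "__main__":
    max_payload = 8 * MIB  # the value used in the InterStepBufferService example
    needed = min_buffer_bytes(max_payload, 100)
    print(f"max_payload = {max_payload} bytes")                      # 8388608
    print(f"100 messages need at least {needed // MIB} MB per buffer")  # 800
```

Plugging in your own message size and expected backlog gives a first estimate of how much disk each buffer needs before you raise max_payload.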

Enable Compression

Numaflow supports automatic compression when writing messages to and reading them from the Inter-Step Buffer (ISB). This helps reduce the storage and network cost of the ISB. Enabling compression improves ISB stability and should be used if the payload is large (e.g., > 1MB). It is transparent to user-defined functions: Numaflow handles compression before writing to the ISB and decompression after reading from it.

Available compression types are:

  • none (default)
  • gzip
  • zstd
  • lz4

Performance Numbers

The tests were run with CPU fixed at 300m, using random 1KB payloads.

Compression   Throughput (msg/s)   Disk Usage by ISB (GB)
None          1000                 7 ~ 7.2
GZIP          132                  1.2 ~ 1.4
ZSTD          900                  4.5 ~ 4.7
LZ4           1000                 2.8 ~ 3

gzip gives the best compression (least disk usage) but has by far the lowest throughput. lz4 has the best throughput, matching the uncompressed rate while using less than half the disk space, and zstd sits between gzip and lz4 on throughput. If you want to use gzip, you might need to increase the CPU of the numa container to get better performance.
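To compare the options on the same scale, the figures from the table above can be normalized against the uncompressed baseline. The following is a minimal sketch: the names RESULTS, midpoint and relative_to_none are illustrative, and taking the midpoint of each disk usage range is an assumption made here for simplicity, not part of the original tests.

```python
# Figures copied from the table above:
# compression type -> (throughput in msg/s, (low, high) disk usage in GB).
RESULTS = {
    "none": (1000, (7.0, 7.2)),
    "gzip": (132, (1.2, 1.4)),
    "zstd": (900, (4.5, 4.7)),
    "lz4": (1000, (2.8, 3.0)),
}


def midpoint(bounds: tuple[float, float]) -> float:
    """Middle of a (low, high) range; an assumed way to summarize the range."""
    low, high = bounds
    return (low + high) / 2


def relative_to_none(name: str) -> tuple[float, float]:
    """Return (throughput ratio, disk usage ratio) against the 'none' baseline."""
    base_throughput, base_disk = RESULTS["none"]
    throughput, disk = RESULTS[name]
    return throughput / base_throughput, midpoint(disk) / midpoint(base_disk)


if __name__ == "__main__":
    for name in ("gzip", "zstd", "lz4"):
        throughput_ratio, disk_ratio = relative_to_none(name)
        print(f"{name}: {throughput_ratio:.0%} of baseline throughput, "
              f"{disk_ratio:.0%} of baseline disk usage")
```

These ratios only describe the 300m CPU, 1KB payload test above; results with other payloads or CPU settings may differ.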

Configuration

You can enable compression by setting the compression field under interStepBuffer in the Pipeline specification, replacing COMPRESSION_TYPE with none, gzip, zstd, or lz4.

apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: my-pipeline
spec:
  interStepBuffer:
    compression:
      type: COMPRESSION_TYPE