Gateway Capacity Planning

This page explains how to use Eden's current gateway capacity-knee benchmarks for planning Eve interlays and direct gateway deployments.

Use it when you need to answer:

How much traffic can one Eve gateway instance handle for a specific workload shape?
Which result should be used for production sizing?
When is the backend the bottleneck instead of Eve?
What should operators watch as traffic approaches saturation?

These numbers are controlled benchmark results, not universal hardware guarantees. Use them as starting points, then validate against your own payload sizes, client behavior, backend configuration, network path, telemetry settings, and SLOs.

Benchmark Scope

The current benchmark set measures Eve gateway behavior across 1, 2, 4, and 8 vCPU container budgets. Synthetic rows remove backend storage and execution work so they are closer to the gateway forwarding limit. Real-backend rows include the behavior of the database or application workload behind Eve.

The source benchmark snapshot is `benchmark/GATEWAY_VCPU_SCALING_REPORT_2026-06-21.md`.

Terms

Term	Meaning
Clean knee	The highest measured offered load that completed without benchmark errors, Eden shedding, or status failures.
First overload	The first measured phase above the clean knee in which the gateway, backend, or load generator reported backpressure, errors, shedding, or long queue behavior.
Peak completed throughput	The highest completed throughput observed after overload began. Useful for diagnosis, but not a production target.
Real backend	A benchmark that includes a real database or application workload behind Eve.
Synthetic backend	A fixed-response wire-protocol backend that removes most backend execution and storage work.

For production planning, size from the clean knee. Treat peak completed throughput as saturation evidence, not as usable capacity.
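The three terms can be derived mechanically from a load sweep. The following is a minimal sketch, not the benchmark harness's actual logic: the `Phase` record and `summarize_sweep` function are hypothetical names, and the sketch assumes each phase reports counts for errors, shedding, and status failures. It does not model "long queue behavior", which would need a latency or queue-depth signal.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Phase:
    """One measured sweep phase (hypothetical record shape)."""
    offered: float          # offered load, e.g. req/s
    completed: float        # completed throughput, e.g. req/s
    errors: int = 0         # benchmark/client errors
    shed: int = 0           # requests shed by the gateway
    status_failures: int = 0

    @property
    def clean(self) -> bool:
        return self.errors == 0 and self.shed == 0 and self.status_failures == 0


def summarize_sweep(
    phases: List[Phase],
) -> Tuple[Optional[float], Optional[float], Optional[float]]:
    """Return (clean_knee, first_overload, peak_completed_after_overload)."""
    clean_knee = None
    first_overload = None
    peak_completed = None
    for phase in sorted(phases, key=lambda p: p.offered):
        if first_overload is None and phase.clean:
            # Still below overload: this is the highest clean load so far.
            clean_knee = phase.offered
            continue
        if first_overload is None:
            first_overload = phase.offered
        # Everything from the first overload phase onward counts toward the peak.
        if peak_completed is None or phase.completed > peak_completed:
            peak_completed = phase.completed
    return clean_knee, first_overload, peak_completed
```

When no phase overloads, the sketch returns `None` for the last two values, which corresponds to the "not reached" cells in the results table.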

Current Clean Knees

Endpoint workload	1 vCPU clean knee	2 vCPU clean knee	4 vCPU clean knee	8 vCPU clean knee	8 vCPU first overload	8 vCPU peak completed	Planning read
Redis, real backend, mixed GET/SET	200k req/s	400k req/s	400k req/s	400k req/s	500k req/s	501k req/s	The single Redis backend caps clean throughput after 2 vCPU.
Redis, synthetic RESP backend	below 400k req/s	500k req/s	550k req/s	600k req/s	650k req/s	616k req/s	Eve continues scaling past 400k; above 600k the bounded queue begins shedding.
Postgres, real TPC-B workload	3.3k TPS	4.2k TPS	4.2k TPS	4.2k TPS	not reached	4.2k TPS	The real workload and backend dominate after 2 vCPU.
Postgres, synthetic select-only	17.4k TPS	35.5k TPS	67.2k TPS	120.2k TPS	not reached	120.2k TPS	Gateway forwarding scales through 8 vCPU when backend work is removed.
Mongo, synthetic OP_MSG find	8.3k req/s	13.9k req/s	23.2k req/s	27.6k req/s	not reached	27.6k req/s	The Mongo gateway path flattens before the synthetic backend does.
LLM, synthetic OpenAI chat	2.0k req/s	3.0k req/s	4.0k req/s	5.0k req/s	6.0k req/s	5.6k req/s	The direct HTTP/LLM path queues quickly after the clean knee.

How To Use The Table

1. Match The Workload Shape

Pick the row that most closely matches the production traffic path:

Use Redis real-backend numbers when sizing a Redis interlay connected to a similar single-backend Redis deployment.
Use Redis synthetic numbers when evaluating Eve's Redis protocol forwarding ceiling independent of backend storage.
Use Postgres real-backend numbers for transaction-heavy PostgreSQL workloads.
Use Postgres synthetic numbers when profiling gateway forwarding overhead for prepared select-only traffic.
Use Mongo synthetic numbers as current direct-path gateway diagnostics, not as a full application benchmark.
Use LLM synthetic numbers only for small-response, non-streaming OpenAI-compatible request-rate sizing.

If your workload has larger payloads, streaming responses, high fanout, command mirroring, request analysis, migration comparison, or heavier telemetry export, run a workload-specific benchmark.

2. Size From Clean Capacity

Treat the clean knee as the highest load that ran cleanly for that benchmark shape, not as a target operating point. Production should normally run below it, with enough headroom for:

traffic bursts,
backend latency variation,
connection churn,
larger payloads,
telemetry export,
migration mirroring or comparison,
noisy neighbors on shared infrastructure,
and planned failover capacity.

Do not size from the peak completed column. A gateway that completes more requests under overload may still be violating the customer SLO through queue growth, request shedding, client errors, or latency spikes.
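As a rough illustration of sizing from the clean knee, the sketch below converts a peak offered load into an instance count. The 30% headroom default and the one spare instance for failover are placeholder assumptions, not Eden recommendations; `instances_needed` is a hypothetical helper. It also assumes load spreads evenly across instances, which you should confirm for your deployment.

```python
import math


def instances_needed(
    peak_offered_load: float,
    clean_knee: float,
    headroom: float = 0.3,      # assumed fraction of the clean knee kept in reserve
    spare_instances: int = 1,   # assumed failover capacity (N+1)
) -> int:
    """Instances required so each one stays at or below (1 - headroom) * clean_knee."""
    if clean_knee <= 0:
        raise ValueError("clean_knee must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_per_instance = clean_knee * (1 - headroom)
    return math.ceil(peak_offered_load / usable_per_instance) + spare_instances
```

For example, with the real-backend Redis clean knee of 400k req/s, `instances_needed(500_000, 400_000)` budgets 280k req/s per instance and returns 3: two instances to carry the load plus one spare. In that row the single Redis backend is the limit, so adding gateway instances alone would not raise clean throughput.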

3. Separate Gateway Limits From Backend Limits

Compare real-backend and synthetic rows:

If the real-backend row is much lower than the synthetic row, scale or tune the backend first.
If the synthetic row is low compared with the direct synthetic backend, profile the gateway protocol path.
If both rows flatten together, inspect the local network, file descriptor limits, CPU pinning, client concurrency, and load-generator capacity.
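The comparison above can be expressed as a small triage function. This is a sketch under stated assumptions: the function names are hypothetical, "much lower" is taken to mean below 75% of the other value, and "flattened" to mean less than 5% growth across the last vCPU step. Both thresholds are arbitrary and should be tuned to your own measurement noise.

```python
from typing import Optional, Sequence


def flattened(knees: Sequence[float], min_growth: float = 0.05) -> bool:
    """True when the last vCPU step grew the clean knee by less than min_growth."""
    if len(knees) < 2:
        return False
    return knees[-1] < knees[-2] * (1 + min_growth)


def likely_bottleneck(
    real_knees: Sequence[float],        # clean knees by increasing vCPU, real backend
    synthetic_knees: Sequence[float],   # clean knees by increasing vCPU, synthetic backend
    direct_backend: Optional[float] = None,  # synthetic backend measured directly, if known
    much_lower: float = 0.75,           # assumed threshold for "much lower"
) -> str:
    if flattened(real_knees) and flattened(synthetic_knees):
        # Both rows stall together: suspect the test environment.
        return "environment"
    if real_knees[-1] < much_lower * synthetic_knees[-1]:
        return "backend"
    if direct_backend is not None and synthetic_knees[-1] < much_lower * direct_backend:
        return "gateway"
    return "unclear"
```

With the Redis rows from the table (in thousands of req/s), `likely_bottleneck([200, 400, 400, 400], [500, 550, 600])` returns `"backend"`, which matches the planning read for that row.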

4. Use First Overload As A Warning Boundary

The first overload column is not a target. It is an early-warning boundary for that benchmark shape. A production dashboard should make it obvious when offered load, queue depth, client errors, or gateway shedding approaches that region.
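One way to surface that boundary on a dashboard is to map the current offered load to a zone. The sketch below is illustrative only: `load_zone`, the zone names, and the 80% warning fraction are assumptions, and `first_overload` may be `None` for rows where overload was not reached.

```python
from typing import Optional


def load_zone(
    offered: float,
    clean_knee: float,
    first_overload: Optional[float] = None,
    warn_fraction: float = 0.8,   # assumed warning threshold, as a fraction of the clean knee
) -> str:
    """Classify offered load relative to the clean knee and first overload point."""
    if first_overload is not None and offered >= first_overload:
        return "overload"
    if offered > clean_knee:
        return "above-clean-knee"
    if offered >= warn_fraction * clean_knee:
        return "approaching"
    return "ok"
```

Using the 8 vCPU LLM row (clean knee 5.0k req/s, first overload 6.0k req/s), 3,000 req/s is `"ok"`, 4,500 req/s is `"approaching"`, 5,500 req/s is `"above-clean-knee"`, and 6,000 req/s is `"overload"`.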

Endpoint Guidance

Redis

Redis is the most mature measured gateway path in this benchmark set.

The real Redis row is backend-limited after 2 vCPU. The synthetic RESP row shows Eve itself continues scaling: the 8 vCPU synthetic run stayed clean through 600k offered req/s and peaked near 616k completed req/s after overload began.

For low-latency Redis workloads:

benchmark with the same command mix, payload size, connection count, and pipeline depth as production,
leave headroom below the clean knee,
watch backend CPU and Redis latency alongside Eve latency,
and rerun sizing tests when enabling migration mirroring, comparison, request analysis, or heavier observability.

PostgreSQL

The real Postgres TPC-B row flattens near 4.2k TPS because the application workload and backend dominate. The synthetic select-only row reaches about 120k TPS at 8 vCPU, which shows the direct gateway path can scale much higher when backend execution is removed.

Use real workload benchmarks for customer sizing. Use the synthetic row when evaluating gateway changes or protocol forwarding overhead.
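To quantify how the two Postgres rows scale, you can compare each clean knee with perfect linear scaling from the smallest measured vCPU budget. The helper below is a hypothetical sketch; the input values come from the results table.

```python
from typing import Dict


def scaling_efficiency(knees_by_vcpu: Dict[int, float]) -> Dict[int, float]:
    """Clean knee at each vCPU count divided by linear scaling from the smallest budget."""
    if not knees_by_vcpu:
        raise ValueError("need at least one measurement")
    base_vcpu = min(knees_by_vcpu)
    per_vcpu_base = knees_by_vcpu[base_vcpu] / base_vcpu
    return {
        vcpu: knee / (per_vcpu_base * vcpu)
        for vcpu, knee in sorted(knees_by_vcpu.items())
    }


# Clean knees in thousands of TPS, taken from the results table.
synthetic = scaling_efficiency({1: 17.4, 2: 35.5, 4: 67.2, 8: 120.2})
real = scaling_efficiency({1: 3.3, 2: 4.2, 4: 4.2, 8: 4.2})
print(round(synthetic[8], 2), round(real[8], 2))
```

At 8 vCPU the synthetic row retains roughly 0.86 of linear scaling, while the real TPC-B row retains roughly 0.16, which is the numeric form of "the backend dominates after 2 vCPU".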

MongoDB

The synthetic Mongo gateway reaches about 27.6k req/s at 8 vCPU while the direct synthetic backend is much higher. That makes Mongo a current direct-path optimization target.

Use this row as a gateway-development ceiling, not as a full MongoDB application benchmark. Production sizing should include actual query shape, document size, driver behavior, connection pools, replica set or sharded topology, and target provider limits.

LLM And OpenAI-Compatible Traffic

The small-response OpenAI-compatible path stayed clean through 5k req/s at 8 vCPU and peaked near 5.6k completed req/s under overload. Above the clean knee, client errors and long queue tails appear quickly.

Use separate benchmarks for:

streaming responses,
larger response bodies,
tool-calling workloads,
provider retries,
prompt or response analysis,
and model-provider latency variance.

For model traffic, request rate alone is often not the limiting dimension. Wire throughput, streaming duration, token accounting, provider latency, and concurrent open responses can dominate.
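Little's law gives a quick estimate of why concurrent open responses can dominate: the average number of in-flight responses equals the arrival rate multiplied by the average time each response stays open. The function name and the example numbers below are illustrative, not benchmark results.

```python
def concurrent_open_responses(
    request_rate_per_s: float,
    mean_response_duration_s: float,
) -> float:
    """Little's law: average in-flight responses = arrival rate * mean time in system."""
    if request_rate_per_s < 0 or mean_response_duration_s < 0:
        raise ValueError("rate and duration must be non-negative")
    return request_rate_per_s * mean_response_duration_s


# Hypothetical workloads: short non-streaming replies vs. long streamed replies.
short_replies = concurrent_open_responses(2000, 0.25)   # 500 open responses
streamed = concurrent_open_responses(500, 8.0)          # 4000 open responses
```

In this hypothetical comparison the streaming workload holds eight times as many responses open at a quarter of the request rate, so a request-rate knee measured on small responses says little about it.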

Operational Signals To Watch

When running near a capacity knee, watch both Eve and the backend.

Gateway-side signals:

completed request rate,
client error rate,
gateway status failures,
request shedding or queue overflow,
p95, p99, and p99.9 latency,
in-flight request or lane pressure,
connection count and connection churn,
CPU saturation per worker or shard,
and telemetry exporter lag or drops.

Backend-side signals:

backend CPU and memory,
database latency,
connection pool saturation,
slow queries or slow commands,
file descriptor limits,
network throughput,
and provider-side throttling or rate limits.

For Redis direct-lane deployments, metrics such as `gateway.lane_pool_waiters` and `gateway.lane_pool_acquire_wait_microseconds` help identify requests waiting on lane-pool capacity.
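A simple check on the waiter metric is to flag sustained waiting rather than a single spike. This sketch assumes `gateway.lane_pool_waiters` can be sampled as a gauge at a fixed interval and that a value above zero means requests are waiting; neither the sampling mechanism nor the metric's exact semantics is specified here, and `sustained_lane_wait` is a hypothetical helper.

```python
from typing import Iterable


def sustained_lane_wait(waiter_samples: Iterable[float], min_consecutive: int = 3) -> bool:
    """True if waiters stayed above zero for min_consecutive samples in a row."""
    run = 0
    for waiters in waiter_samples:
        run = run + 1 if waiters > 0 else 0
        if run >= min_consecutive:
            return True
    return False
```

Isolated non-zero samples, such as `[0, 2, 0, 1, 0]`, do not trigger the check, while a run like `[0, 1, 4, 2, 0]` does. The right window length depends on your sampling interval and burst tolerance.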

Reproducing The Runs

The benchmark harnesses live in `benchmark/`.

Workload	Harness
Redis max req/s	`benchmark/profile-eve-vcpu-knee-sweep.sh` with `benchmark/scenarios/redis-synthetic-max.toml`
Postgres synthetic	`benchmark/profile-postgres-synthetic-vcpu-sweep.sh`
Mongo synthetic	`benchmark/profile-mongo-synthetic-vcpu-sweep.sh`
LLM synthetic	`benchmark/profile-llm-synthetic-vcpu-sweep.sh`

For methodology and raw result paths, see `benchmark/GATEWAY_VCPU_SCALING_REPORT_2026-06-21.md`.

Caveats

Benchmark results depend on host CPU, kernel, Docker networking, CPU pinning, file descriptor limits, load-generator capacity, backend configuration, and scenario shape.
Synthetic backend rows remove important parts of real application behavior.
Real-backend rows may be capped by backend limits before Eve reaches its own forwarding ceiling.
Large payloads and streaming workloads should be sized separately from small-response request-rate tests.
Migration runs can add mirroring, comparison, data movement, and validation work. Do not size migration traffic from a plain forwarding benchmark alone.

Last updated: October 20, 2018