Step-by-Step Tuning Methodology

Don’t tune everything at once. The fastest path to a well-tuned Aeron Transport is incremental: start from sensible defaults, add load, watch which metric degrades first, then turn the one knob that addresses it.

This page is the workflow. For the why behind each knob — the internals of windows, terms, and NAKs — defer to The Aeron Files.

The five steps at a glance

Start with sensible defaults.
Validate your message size.
Load test incrementally.
Tune based on symptoms.
Apply the L3 cache sizing rule.

Step 1: Start with sensible defaults

Begin with stock settings. Resist the urge to pre-optimize.

128K initial window size.
Default term buffer size.
Ensure the OS / ENA driver send and receive buffers match Aeron's socket buffer sizes.

A mismatch between Aeron’s buffers and the underlying OS or ENA driver buffers is a common source of silent throughput loss. Align them before you change anything else.
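One way to catch such a mismatch before a test run is a small pre-flight check. The sketch below is a minimal illustration, not part of Aeron Transport: it assumes a Linux host, where the kernel's socket buffer ceilings are exposed under /proc/sys/net/core, and the function names and example sizes are hypothetical. It also assumes the initial window should not exceed the receive buffer.

```python
from pathlib import Path


def read_os_limit(name):
    """Read a Linux socket buffer ceiling such as 'rmem_max'; None if unavailable."""
    try:
        return int((Path("/proc/sys/net/core") / name).read_text().strip())
    except (OSError, ValueError):
        return None


def buffer_mismatches(so_rcvbuf, so_sndbuf, initial_window, rmem_max, wmem_max):
    """Return human-readable problems; an empty list means the sizes line up."""
    problems = []
    # A requested socket buffer above the OS ceiling is silently capped by the kernel.
    if rmem_max is not None and so_rcvbuf > rmem_max:
        problems.append(f"receive buffer {so_rcvbuf} exceeds OS rmem_max {rmem_max}")
    if wmem_max is not None and so_sndbuf > wmem_max:
        problems.append(f"send buffer {so_sndbuf} exceeds OS wmem_max {wmem_max}")
    # Assumption: the flow-control window must fit inside the receive buffer.
    if initial_window > so_rcvbuf:
        problems.append(f"initial window {initial_window} exceeds receive buffer {so_rcvbuf}")
    return problems


# Example sizes are assumptions: 2 MiB socket buffers with the 128K default window.
for problem in buffer_mismatches(
    so_rcvbuf=2 * 1024 * 1024,
    so_sndbuf=2 * 1024 * 1024,
    initial_window=128 * 1024,
    rmem_max=read_os_limit("rmem_max"),
    wmem_max=read_os_limit("wmem_max"),
):
    print("MISMATCH:", problem)
```

If the check prints anything, raise the OS limits or lower the requested sizes before moving on to Step 2.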

Step 2: Validate message size

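Keep each message, including protocol headers, smaller than the MTU (typically 1500 bytes on standard Ethernet) so it travels in a single frame without fragmentation.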

You do not need application-level batching. Aeron Transport handles smart batching for you — adding your own batching layer on top usually hurts more than it helps.
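To make the size check concrete, the following sketch computes how much payload is left once headers are subtracted from the MTU. The overhead figures are assumptions for illustration: 28 bytes for IPv4 plus UDP headers without options, and 32 bytes for the Aeron data frame header. Verify both against your own network and Aeron version.

```python
ETHERNET_MTU = 1500          # typical MTU in bytes, as stated above
IPV4_UDP_OVERHEAD = 20 + 8   # IPv4 header without options + UDP header
AERON_FRAME_HEADER = 32      # assumed Aeron data frame header length


def max_payload(mtu=ETHERNET_MTU,
                ip_udp_overhead=IPV4_UDP_OVERHEAD,
                frame_header=AERON_FRAME_HEADER):
    """Largest application payload that still fits in one frame."""
    return mtu - ip_udp_overhead - frame_header


def fits_in_one_frame(message_len, mtu=ETHERNET_MTU):
    """True when a message of message_len bytes needs no fragmentation."""
    return 0 <= message_len <= max_payload(mtu)


print(max_payload())            # 1440
print(fits_in_one_frame(1024))  # True
print(fits_in_one_frame(1500))  # False
```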

Step 3: Load test incrementally

Start with a small load and increase it in steps, re-measuring at each step.

Monitor as you ramp until you see the first sign of stress:

End-to-end latency climbing.
p99 latency climbing.
NAKs (negative acknowledgments) appearing.

The metric that degrades first tells you which knob to reach for next. That’s the whole point of ramping slowly — it isolates the bottleneck.
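The ramp can be sketched as a loop that stops at the first step where a metric degrades. Everything here is a hypothetical harness rather than an Aeron API: `measure` stands in for whatever runs your load test at a given rate and returns p50, p99, and a NAK count, p50 is used as a proxy for end-to-end latency, and the 1.5x tolerance is an arbitrary threshold for "climbing".

```python
def first_stress(load_steps, measure, tolerance=1.5):
    """Ramp through load_steps; return (load, symptoms) at the first sign of stress.

    The first step is treated as the baseline. Returns None if nothing degrades.
    """
    steps = iter(load_steps)
    baseline = measure(next(steps))
    for load in steps:
        sample = measure(load)
        symptoms = []
        if sample["p50_us"] > baseline["p50_us"] * tolerance:
            symptoms.append("p50")
        if sample["p99_us"] > baseline["p99_us"] * tolerance:
            symptoms.append("p99")
        if sample["naks"] > baseline["naks"]:
            symptoms.append("naks")
        if symptoms:
            return load, symptoms
    return None


def demo_measure(load):
    """Stand-in for a real load test: the tail blows up at 300k msgs/sec."""
    return {"p50_us": 10.0, "p99_us": 40.0 if load < 300_000 else 95.0, "naks": 0}


print(first_stress([100_000, 200_000, 300_000, 400_000], demo_measure))
```

Small steps matter: if two metrics cross the threshold in the same step, the ramp was too coarse to isolate the bottleneck.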

Step 4: Tune based on symptoms

Match the symptom to the action. Change one thing, then re-test.

Symptom	Action
p50 latency increases	Tune initial window size and send/recv buffers
p99 latency increases	Tune NAK delay and term buffer size

Read this table as a diagnostic. A rising p50 points at steady-state flow control — the window and OS buffers. A rising p99 points at tail events — recovery behavior (NAK delay) and how much in-flight data a term buffer holds. Throughput follows: once p50 and p99 are stable under load, push the load higher and repeat.
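The "change one thing, then re-test" discipline can be written down as a short loop over the table. This is a sketch under assumptions: `try_knob` is a hypothetical callback that applies a single change, re-runs the load test, and returns the metric of interest (lower is better). The knob names are labels taken from the table, not configuration keys.

```python
KNOBS_BY_SYMPTOM = {
    "p50": ("initial window size", "send/recv buffers"),
    "p99": ("NAK delay", "term buffer size"),
}


def tune_one_at_a_time(symptom, baseline, try_knob):
    """Try each knob for the symptom in turn; stop at the first that improves.

    Returns (knob, new_metric), or None when no single change helped.
    """
    for knob in KNOBS_BY_SYMPTOM[symptom]:
        result = try_knob(knob)  # one change, one re-test
        if result < baseline:
            return knob, result
    return None


def demo_try(knob):
    """Stand-in re-test: only the OS buffers help in this made-up scenario."""
    return 12.0 if knob == "send/recv buffers" else 20.0


print(tune_one_at_a_time("p50", 15.0, demo_try))
```

Stopping after the first improvement keeps cause and effect attributable. Once the metric is stable again, resume the ramp from Step 3.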

Step 5: The L3 cache sizing rule

This is the critical guardrail: keep term buffer size at or below 1/3 of your L3 cache size.

Staying within that limit ensures the active term plus other working-set data all fit in L3, avoiding DRAM spills on the hot path. DRAM spills are exactly what wrecks p99.

Worked example: if your L3 is 36 MB, keep term buffers ≤ 12 MB. The remaining cache leaves room for other hot data — connection state, application objects — to stay resident.
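The arithmetic behind the worked example can be sketched as follows. One caveat the sketch bakes in as an assumption: Aeron term buffer lengths must be a power of two between 64 KiB and 1 GiB, so the usable size is the largest power of two that fits inside the 1/3 budget. The function name is hypothetical.

```python
TERM_MIN = 64 * 1024            # assumed minimum term buffer length
TERM_MAX = 1024 * 1024 * 1024   # assumed maximum term buffer length


def max_term_buffer(l3_bytes):
    """Largest power-of-two term buffer that stays within one third of L3."""
    budget = l3_bytes // 3
    if budget < TERM_MIN:
        raise ValueError("L3 cache too small for the minimum term buffer")
    # Round the budget down to a power of two, then cap at the maximum.
    power_of_two = 1 << (budget.bit_length() - 1)
    return min(power_of_two, TERM_MAX)


MIB = 1024 * 1024
print(max_term_buffer(36 * MIB) // MIB)  # 8
```

With a 36 MB L3 the budget is 12 MB, and the largest valid size under that ceiling is 8 MiB.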

Putting it together

The methodology is deliberately incremental. Start with defaults, add load, observe which metric degrades first, then tune the corresponding parameter — and never violate the 1/3 L3 rule while you do it.

For deeper mechanics of any individual knob, see The Aeron Files.