Measuring Server-Sent Events delivery across Origin and Akamai CDN
Server-Sent Events — a web standard whose first drafts date to the mid-2000s, now maintained in the WHATWG HTML Living Standard — has become the protocol powering every major LLM streaming interface.
| Provider | Protocol | Use Case |
|---|---|---|
| OpenAI (ChatGPT) | text/event-stream | Token streaming for completions |
| Anthropic (Claude) | text/event-stream | Streaming message responses |
| Google (Gemini) | text/event-stream | Real-time model output |
| xAI (Grok) | text/event-stream | Streaming chat completions |
| Cohere | text/event-stream | Streaming chat + embeddings |
Standard web delivery assumptions break down with Server-Sent Events. Most organizations deploying LLM applications inherit SSE infrastructure requirements they don't understand.
Akamai delivery products (Ion, DSA) support SSE out of the box — with a few critical metadata changes. But SSE behaves fundamentally differently from request/response HTTP, and the CDN’s value proposition shifts accordingly.
A normal HTTP transaction is short-lived: request in, response out, connection recycled. CDNs are built around this model — cache the response, serve it to the next client, offload the origin.
SSE flips this. The connection stays open for minutes or hours. The response is never “complete” — the server pushes events indefinitely. There is no cacheable object. Every client requires a dedicated, persistent connection all the way back to origin. The CDN can’t coalesce, cache, or collapse these connections.
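To make the contrast concrete, here is a minimal SSE endpoint sketch in Go (illustrative only — the demo's actual origin is an Express server, and the `/events` path and JSON payload shape are assumptions):

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// formatEvent renders one SSE frame: an id line, a data line, and the
// blank line that terminates the event.
func formatEvent(id int, payload string) string {
	return fmt.Sprintf("id: %d\ndata: %s\n\n", id, payload)
}

// sseHandler holds the connection open and pushes events until the client
// disconnects. There is never a "complete" response for a CDN to cache.
func sseHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-store") // mirrors NO_STORE at the edge
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	for id := 1; ; id++ {
		// Server-side high-resolution timestamp, as in the demo's payloads.
		fmt.Fprint(w, formatEvent(id, fmt.Sprintf(`{"ts":%d}`, time.Now().UnixNano())))
		flusher.Flush() // forward immediately; never let events sit in a buffer
		select {
		case <-r.Context().Done(): // client went away
			return
		case <-time.After(time.Second):
		}
	}
}

func main() {
	http.HandleFunc("/events", sseHandler)
	http.ListenAndServe(":8080", nil)
}
```

The `Flush()` call after every event is the server-side half of the story; the edge-side half (disabling Akamai's response buffering) is covered below.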
The CDN can’t cache SSE, but it still delivers real value: TLS termination at the nearest edge PoP (lower handshake latency), DDoS absorption before traffic reaches origin, bot management and WAF inspection on the request path, SureRoute for optimized mid-mile transport, and DataStream observability without origin instrumentation.
The tradeoff is configuration complexity. By default, Akamai buffers responses before forwarding — designed for efficiency with cacheable objects, but fatal for real-time SSE delivery. Disabling this requires advanced metadata (not in Property Manager UI) via a Professional Services engagement.
Two delivery paths measured with identical SSE payloads. Each event carries a server-side high-resolution timestamp for precise latency measurement.
Simultaneous SSE streams on both paths. Same event IDs, same client clock — which path delivers each event first?
| ID | Origin | Akamai | Delta | Winner |
|---|---|---|---|---|
| Start a comparison to see matched event pairs... | | | | |
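The matched-pair logic behind that table could be sketched as follows (Go, for illustration — the demo itself runs this in browser JavaScript, and the type and function names here are assumptions):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// event is one parsed SSE frame.
type event struct {
	ID   string
	Data string
}

// parseStream splits raw text/event-stream bytes into events.
// A blank line marks the end of each event.
func parseStream(raw string) []event {
	var evs []event
	var cur event
	sc := bufio.NewScanner(strings.NewReader(raw))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case line == "": // event boundary
			if cur != (event{}) {
				evs = append(evs, cur)
				cur = event{}
			}
		case strings.HasPrefix(line, "id:"):
			cur.ID = strings.TrimSpace(line[len("id:"):])
		case strings.HasPrefix(line, "data:"):
			cur.Data = strings.TrimSpace(line[len("data:"):])
		}
	}
	return evs
}

// winner compares client arrival times (ms on the same clock) for one
// event ID delivered over both paths.
func winner(originMs, akamaiMs float64) string {
	if originMs < akamaiMs {
		return "origin"
	}
	if akamaiMs < originMs {
		return "akamai"
	}
	return "tie"
}

func main() {
	raw := "id: 1\ndata: {\"ts\":100}\n\nid: 2\ndata: {\"ts\":200}\n\n"
	fmt.Println(len(parseStream(raw)), winner(12.5, 9.1))
}
```

Because both streams carry the same event IDs and are timed against the same client clock, clock skew between paths cancels out of the comparison.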
Connection phase breakdown for both paths via Resource Timing API. Run a comparison to populate.
CDN delivery for SSE provides real security and connectivity value, but the traditional caching model doesn't apply. Here's what you gain and what you give up.
| Capability | Detail |
|---|---|
| TLS Termination at Edge | Clients negotiate TLS with nearest edge PoP — lower handshake latency, reduced origin TLS load |
| DDoS Absorption | Edge absorbs volumetric and connection-flood attacks before they reach origin |
| Bot Management | Behavioral detection and challenge mechanisms filter malicious clients at edge |
| WAF (Request Inspection) | Request headers, query params, and POST bodies inspected at edge before forwarding |
| Observability (DataStream) | Per-connection logging, timing data, and real-time telemetry without instrumenting origin |
| Global DNS & Routing | Proprietary internet mapping routes clients to optimal edge PoP — lower RTT for geographically distributed users |

| Limitation | Detail |
|---|---|
| Zero Connection Offload | Every client SSE connection requires a corresponding origin connection — 0% cache hit rate |
| Buffering by Default | Without advanced metadata configuration, edge buffers responses — breaking real-time delivery |
| 1:1 Origin Connections | No connection coalescing — origin must be sized for full concurrent client count |
| WAF Can't Inspect Streams | Response body inspection fails on chunked, long-lived streams — must exempt SSE paths |
| Advanced Config Required | Critical settings (buffer-response-v2, chunk-to-client) require PS engagement — not in Property Manager UI |
Critical configuration required to deliver SSE through Akamai without buffering. Advanced metadata requires Professional Services engagement.
| Behavior | Setting | Purpose | UI Available? |
|---|---|---|---|
| Caching | NO_STORE | Prevent edge caching of SSE | Yes |
| Downstream Cache | BUST | Prevent client/intermediate caching | Yes |
| Allow Transfer Encoding | enabled | Enable chunked transfer | Yes |
| Response Buffer | off | Disable edge response buffering | No (metadata) |
| Origin Timeout | 600s | Allow long-lived SSE connections | Product-dependent |
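In property rules JSON (PAPI), the UI-available rows of the table above might look roughly like this — the behavior names are modeled on the PAPI catalog, but the exact option values shown are a sketch to verify against your own rule tree, not a confirmed configuration:

```json
{
  "behaviors": [
    { "name": "caching",               "options": { "behavior": "NO_STORE" } },
    { "name": "downstreamCache",       "options": { "behavior": "BUST" } },
    { "name": "allowTransferEncoding", "options": { "enabled": true } },
    { "name": "timeout",               "options": { "value": "600s" } }
  ]
}
```

Note there is no behavior here for the Response Buffer row: as the table says, that setting exists only in advanced metadata.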
These behaviors are not available in Property Manager UI and require Akamai PS or TAM engagement. Without them, the edge buffers SSE responses, breaking real-time delivery.

| Metadata Tag | Value | Effect |
|---|---|---|
| http.buffer-response-v2 | off | Disables edge response buffering — events forwarded immediately |
| http.chunk-to-client | on | Enables chunked encoding to the client |
| http.forward-chunk-boundary-alignment | on | Preserves chunk boundaries from origin |
| lma.origin-edge | off | Disables last-mile acceleration (origin to edge) |
| lma.edge-browser | off | Disables last-mile acceleration (edge to browser) |

| Concern | Impact on SSE | Recommendation |
|---|---|---|
| Response body inspection | Cannot buffer streaming response | Exempt /events from response inspection |
| Rate limiting | 1 SSE connection = 1 HTTP request | Limit per-connection, not per-request |
| Bot detection | API clients lack browser signals | Whitelist SSE client User-Agents |
| Slow POST detection | Long-lived connections appear idle | Increase thresholds for SSE paths |

| Layer | Default | SSE Setting | Configuration |
|---|---|---|---|
| Edge → Client | 120s | <120s heartbeat | Server sends a `: heartbeat` comment every 15-30s |
| Edge → Origin | 120s | 600s | Origin timeout behavior |
| HAProxy | 30s | 600s+ | `timeout server`, `timeout tunnel` |
| Browser EventSource | auto-reconnect | 3000ms | Server sends `retry: 3000` |
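The server side of this timeout table can be sketched in Go (illustrative; function names are assumptions): a `retry: 3000` line sets the EventSource reconnect delay, and periodic SSE comment lines keep bytes flowing so the edge-to-client idle timer never fires.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// retryLine tells EventSource how long to wait (ms) before reconnecting.
func retryLine(ms int) string { return fmt.Sprintf("retry: %d\n\n", ms) }

// heartbeatLine is an SSE comment: lines starting with ':' are ignored by
// EventSource, but they reset idle timers along the whole delivery path.
func heartbeatLine() string { return ": heartbeat\n\n" }

func sseHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/event-stream")
	f, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	fmt.Fprint(w, retryLine(3000))
	f.Flush()
	tick := time.NewTicker(20 * time.Second) // well under the 120s edge default
	defer tick.Stop()
	for {
		select {
		case <-r.Context().Done():
			return
		case <-tick.C:
			fmt.Fprint(w, heartbeatLine())
			f.Flush()
		}
	}
}

func main() {
	http.HandleFunc("/events", sseHandler)
	http.ListenAndServe(":8080", nil)
}
```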
Each SSE stream is an unbroken, long-lived socket from the CDN edge to origin. Unlike a page load — where the browser opens a connection, fetches a response, and returns the socket to the pool — an SSE connection is held open for the lifetime of the subscription. That connection is never available for reuse. With HTTP/1.1 to origin, this means every concurrent SSE subscriber consumes a dedicated origin TCP socket.
HTTP/1.1 allows only one in-flight request per TCP connection. For short-lived request/response traffic, the CDN recycles connections via persistent connection (PCONN) pools — one socket serves many sequential requests. SSE breaks this model. The response never completes, so the connection is pinned for the stream’s entire duration.
At scale this becomes an origin capacity problem: 10,000 concurrent SSE subscribers = 10,000 origin TCP sockets, each consuming kernel file descriptors, memory, and load balancer state. The CDN passes every connection through without reduction.
HTTP/2 multiplexes many logical streams onto a single TCP connection. With http2Enabled: true on the origin behavior, the Akamai edge opens one H2 connection per origin node per SureRoute parent server — then maps every concurrent SSE subscriber flowing through that edge server onto shared H2 streams on the same socket.
10,000 concurrent SSE subscribers flowing through 3 edge parent servers = 3 origin TCP connections, not 10,000. Origin connection pressure scales with edge topology, not client count.
Enabling H2 to origin: Set http2Enabled: true on the origin behavior in property rules JSON (PAPI field name — not http2: true). Use a separate origin hostname (e.g., h2-origin-sse) to prevent H2 connection pools from interfering with H1 traffic on the same property.
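In rule JSON, the origin behavior might look like this — `http2Enabled` is the PAPI field the article names; the hostname domain and the other options shown are illustrative:

```json
{
  "name": "origin",
  "options": {
    "originType": "CUSTOMER",
    "hostname": "h2-origin-sse.example.com",
    "http2Enabled": true
  }
}
```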
Verification: NGINX access logs with $connection and $connection_requests variables prove multiplexing. Look for the same connection ID across multiple request log lines. In production testing, connection 1493070 served requests 1 through 9 on a single TCP socket.
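A sketch of the NGINX logging configuration that exposes those variables (log path and format name are assumptions):

```nginx
# $connection is the TCP connection serial number; $connection_requests
# counts requests served on that connection. The same connection ID with
# an increasing request count proves H2 multiplexing onto one socket.
log_format h2check '$remote_addr conn=$connection req=$connection_requests '
                   '"$request" $status';
access_log /var/log/nginx/h2check.log h2check;
```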
Client-to-edge H2: Enabled separately via the http2 behavior with empty options: {}. This is H2 between the browser and the Akamai edge — orthogonal to H2 between edge and origin.
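In rule JSON, that client-facing behavior is simply:

```json
{ "name": "http2", "options": {} }
```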
Open N concurrent SSE streams via both H1 and H2 origin paths through Akamai simultaneously. Compares per-stream event delivery latency and jitter between the two L7 protocols. On production, the H2 path multiplexes concurrent streams onto shared L4 TCP connections; the H1 path opens 1 connection per stream.
Browser limit: ~50 concurrent EventSources before client-side JS processing becomes the bottleneck. For higher concurrency, use a server-side test client.
*Latency is measured as inter-event arrival time on the client (`performance.now`). Subject to client clock drift and browser event-loop scheduling. Use for relative H1/H2 comparison, not absolute measurement.*
CDN-only SSE funnels every client stream through a single origin. That origin must hold open one long-lived HTTP connection per subscriber — connections that can’t be cached, coalesced, or offloaded by the CDN. Distributed delivery solves this by placing stateless NATS leaf nodes at 8 global Linode regions. The origin publishes each event once to NATS core; leaf nodes fan it out locally to their own subscribers. The origin never sees individual client connections.
The origin Express server publishes events to an in-cluster NATS core node. Each leaf VM runs an embedded NATS leaf server connected to core over a single TCP connection (port 30422, NodePort). When a client opens an SSE stream on the leaf, the leaf subscribes to the sse.events subject — NATS interest-based routing then begins forwarding events from core to that leaf. When all clients disconnect, the subscription drops and event traffic to that leaf stops entirely.
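For reference, a standalone nats-server configuration equivalent to the embedded leaf might look like this (a sketch — the demo embeds the leaf inside a Go binary instead, and the core hostname here is an assumption):

```conf
# leaf.conf -- connect this leaf to the in-cluster core over one TCP connection
leafnodes {
  remotes = [
    # NodePort 30422 on the core cluster, per the architecture described above
    { url: "nats-leaf://core.example.internal:30422" }
  ]
}
```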
Akamai GTM resolves distributed-sse.connected-cloud.io to the nearest healthy leaf based on performance routing. The Akamai CDN property routes /distributed/events to this GTM-resolved origin, applying the same streaming optimizations (chunked transfer, buffer bypass) as the CDN-only path.
Leaf nodes: 8 regions (us-ord, us-lax, us-mia, fr-par, br-gru, jp-osa, in-maa, ap-southeast). Each is a single Go binary (~15MB, embedded NATS leaf + HTTPS SSE adapter) on a g6-dedicated-2 Linode. Stateless, horizontal, identical. Add or remove a region by adding or removing one Terraform block.
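One region's Terraform block might look like this sketch (the `region` and `type` values come from the article; the label, image, and resource name are illustrative assumptions):

```hcl
# One region == one block. Removing a region is deleting one of these.
resource "linode_instance" "leaf_fr_par" {
  label  = "sse-leaf-fr-par"
  region = "fr-par"
  type   = "g6-dedicated-2"
  image  = "linode/ubuntu22.04"
}
```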
Open concurrent SSE streams via both CDN-only and distributed paths through Akamai simultaneously. Both paths deliver the same events from the same SharedEventBus — the difference is origin routing. CDN path goes to us-ord origin; distributed path goes to the nearest NATS leaf via GTM.
Hint: Steady-state performance looks similar. Increase the stream count and enable the churn rate slider to simulate clients connecting and disconnecting during the test — observe how connection management overhead affects each path differently at scale.
*Latency is measured as inter-event arrival time on the client (`performance.now`). Subject to client clock drift and browser event-loop scheduling. Use for relative CDN/distributed comparison, not absolute measurement.*