dkduckkit.dev

Cold start (serverless)

Latency & SRE

A cold start is the extra initialisation latency incurred when a serverless function (AWS Lambda, Google Cloud Run, Azure Functions) is invoked on a new execution environment that must be provisioned from scratch. The runtime must download the function package, start the language runtime, run initialisation code, and establish any database or SDK connections — all before handling the first request. Cold starts are invisible in load tests that keep functions warm, but appear as severe outliers in production latency distributions.
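The lifecycle above can be sketched with a module-level flag: code at module scope runs once per execution environment (during the cold start), while the handler runs on every invocation. A minimal Python sketch, with a generic `handler` signature standing in for any serverless runtime:

```python
import time

# Module scope executes once per execution environment — this is where
# the cold-start cost (runtime startup, init code, SDK connections) is paid.
_INIT_TIME = time.monotonic()
_COLD = True  # flips to False after the first invocation in this environment


def handler(event, context=None):
    global _COLD
    was_cold = _COLD
    _COLD = False
    # Warm invocations skip all module-level work and see only handler latency.
    return {"cold_start": was_cold, "env_age_s": time.monotonic() - _INIT_TIME}
```

Logging the `cold_start` flag per request is a simple way to measure how often production traffic actually hits a fresh environment.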

Formula

first-request latency ≈ package download + runtime startup + initialisation code + handler execution

Typical cold-start durations:

  • Node.js / Python: 100–300 ms
  • JVM (Java/Kotlin without SnapStart): 1–5 s for typical frameworks (Spring Boot, Micronaut)
  • Go: 50–150 ms
  • AWS Lambda SnapStart (JVM): 10–100 ms

Why it matters in practice

Cold starts turn a 20 ms function into a 500 ms response for the affected percentile of requests. In high-traffic systems this is rare enough to hide in p99 but severe enough to break p99.9 SLOs. The pattern is particularly dangerous for latency-sensitive paths (payment processing, authentication) that scale from zero during off-peak hours and then receive a burst of traffic. Provisioned concurrency (AWS) or minimum instances (Cloud Run) eliminates cold starts at the cost of always-on billing.
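The percentile arithmetic is worth making concrete. A deterministic sketch using the figures above (20 ms warm, 500 ms cold) and an assumed 0.5% cold-start rate shows why p99 looks healthy while p99.9 is breached:

```python
# Assumed traffic mix: 0.5% of requests hit a cold start.
warm_ms, cold_ms = 20.0, 500.0
n, cold_fraction = 100_000, 0.005
n_cold = int(n * cold_fraction)

# Sorted latency distribution: 99,500 warm requests, then 500 cold ones.
samples = sorted([warm_ms] * (n - n_cold) + [cold_ms] * n_cold)


def percentile(sorted_samples, p):
    # Nearest-rank percentile on a pre-sorted list.
    idx = min(len(sorted_samples) - 1, round(p / 100 * len(sorted_samples)))
    return sorted_samples[idx]


p99 = percentile(samples, 99)      # 20.0 — cold starts are invisible here
p999 = percentile(samples, 99.9)   # 500.0 — the cold-start latency in full
```

With cold starts affecting only 0.5% of traffic, they sit entirely above the 99th percentile but dominate the 99.9th, which is exactly the SLO blind spot described above.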

Common mistakes

  • Not including cold start time in latency budget calculations for functions that scale to zero.
  • Using mean latency in load tests to validate serverless performance — cold starts inflate tail percentiles (p99.9), while barely moving the mean.
  • Initialising SDK clients inside the handler function rather than at module load time — this repeats the expensive setup on every invocation instead of paying it once during the cold start.
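The last mistake is the easiest to fix. A hedged sketch of the correct placement, using a hypothetical `get_table_client` factory as a stand-in for any expensive SDK client (boto3, database drivers):

```python
import json
from functools import lru_cache


@lru_cache(maxsize=None)
def get_table_client():
    # Hypothetical stand-in for expensive client construction:
    # credential resolution, TLS handshake, connection pooling.
    return {"endpoint": "dynamodb.example", "ready": True}


# Constructed once at module load, i.e. during the cold start.
TABLE = get_table_client()


def handler(event, context=None):
    # Warm invocations reuse TABLE; no per-request client construction.
    return {"statusCode": 200, "body": json.dumps({"ready": TABLE["ready"]})}
```

The `lru_cache` makes the factory idempotent, so accidental calls from inside the handler still return the already-built client rather than constructing a new one.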