Surprising Economics of Load-Balanced Systems
Adding more servers to a load-balanced system does not make latency worse, and queuing theory explains why. According to Marc Brooker of AWS, when you scale a service horizontally while keeping per-server load constant, mean request latency asymptotically approaches a single second, the mean service time in his example, as server count increases: queuing delay shrinks toward zero and only the processing time remains. The key is Erlang's C formula, a classic result in teletraffic engineering that gives the probability that an arriving request has to wait in the queue. As systems grow, fewer requests hit the queue, so at double the load with double the servers, ninety-six percent of traffic is handled without queuing instead of eighty-seven percent. This creates a rare win: better latency at the same utilization, or better utilization at the same latency, both without sacrificing throughput. It's one of the few scaling problems that genuinely get easier as your system grows.
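The effect is easy to check numerically. What follows is a minimal sketch, not code from Brooker's post. It assumes an M/M/c queue (Poisson arrivals, exponential service times, one shared queue at the load balancer), and the function names and the specific server counts and loads are illustrative choices. It computes Erlang C through the standard Erlang B recurrence, then adds the mean queue wait to the service time.

```python
def erlang_b(servers: int, offered_load: float) -> float:
    """Erlang B blocking probability, via the numerically stable recurrence."""
    b = 1.0
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    return b


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arriving request must queue (M/M/c assumption).

    offered_load is in Erlangs: arrival rate times mean service time.
    """
    if offered_load >= servers:
        raise ValueError("offered load must be below the server count")
    utilization = offered_load / servers
    b = erlang_b(servers, offered_load)
    return b / (1.0 - utilization * (1.0 - b))


def mean_latency(servers: int, offered_load: float, service_time: float = 1.0) -> float:
    """Mean time in system: service time plus mean wait in the queue."""
    wait = erlang_c(servers, offered_load) * service_time / (servers - offered_load)
    return service_time + wait


if __name__ == "__main__":
    # Assumed scenario: servers at 50% utilization, then double both.
    for c in (5, 10):
        print(c, f"{1 - erlang_c(c, c * 0.5):.1%} served without queuing")

    # Assumed scenario: fixed 80% per-server load, growing fleet.
    for c in (1, 10, 100, 1000):
        print(c, round(mean_latency(c, c * 0.8), 3))
```

Under these assumptions, five servers at half load serve about 87% of requests without queuing, and ten servers at the same per-server load serve about 96%, which matches the figures quoted above. With per-server load fixed, the mean latency falls toward the one-second service time as the fleet grows.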
Source: https://brooker.co.za/blog/2020/08/06/erlang.html