Load Balancing Basics: An Introductory Guide

What is a Load Balancer?

When dealing with the scalability of our software systems, we often face the challenge of handling an influx of requests. For example, imagine running a popular web app that sells dog apparel. We just released a new product (a fashionable dog sweater) and have noticed a significant increase in requests hitting our single server. Users really want to purchase our new dog sweater for their pups! Right now, our web app architecture looks something like this:

[Image: a basic web app architecture. Users interact with the Webpage over the Internet; the Webpage sends requests to a single Server, which queries the Database as needed before responding. A nearly full red load bar next to the Server shows it struggling to keep up with the volume of requests.]

In this architecture, our single server is becoming a bottleneck and is putting our application at risk of performing sub-optimally. To alleviate the load on our single server, we decide to scale our web app horizontally and purchase a few additional servers. Each server will host a replica of our app, so we will be able to distribute the request load more effectively. However, with multiple servers, we have the following issue:

[Image: the same architecture with four Servers instead of one. A box labeled "Which Server?" sits between the Webpage and the Servers: with multiple servers, the app must be told where to send each request.]

We need a way to direct the traffic! Our app won’t know where to send requests unless we tell it which server each one should go to. This is where a load balancer comes into play.

A load balancer is a piece of hardware or software (and sometimes both) that helps distribute requests between different system resources. Load balancers are not just essential when scaling a system horizontally; they also help prevent specific system resources from getting overloaded and possibly going offline. In addition, load balancers are flexible enough to be placed in various spots in a software system’s architecture. In our web app example, since we are primarily trying to distribute the load between our servers, here is what our new architecture will look like with a load balancer:

[Image: the same architecture with a Load Balancer between the Webpage and the Servers. Each request is now routed through the Load Balancer, which uses an algorithm to choose a Server, preventing any single Server from handling all the requests.]

When we examine the above image, the way requests route to individual servers may seem a bit like magic. How exactly does the load balancer decide which server is best fit to handle an incoming request? How does it make sure one server doesn’t end up taking all the requests by accident? These questions are answered by the load-balancing algorithm the load balancer uses. Let’s explore what these algorithms are and how they work!

Load Balancing Algorithms

A load-balancing algorithm is the programmatic logic that a load balancer uses to decide how to distribute requests between a software system’s resources. While not an exhaustive list, we will take a look at the following five algorithms:

  • Least Connection

  • Least Response Time

  • Least Bandwidth

  • Round Robin

  • Weighted Round Robin

Least Connection

The least connection (LC) load-balancing algorithm distributes each request to the server with the fewest active connections at the time the request is received. This algorithm assumes all requests generate approximately equal load.
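
To make this concrete, here is a minimal Python sketch of least-connection selection. The `Server` class and its `active_connections` counter are hypothetical stand-ins for the state a real load balancer would track:

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    active_connections: int = 0  # requests this server is currently handling

def pick_least_connection(servers: list[Server]) -> Server:
    # Choose the server with the fewest active connections right now.
    return min(servers, key=lambda s: s.active_connections)

servers = [Server("app-1", 3), Server("app-2", 1), Server("app-3", 2)]
target = pick_least_connection(servers)
target.active_connections += 1  # the chosen server takes the new request
print(target.name)  # app-2
```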

Least Response Time

The least response time (LRT) load-balancing algorithm is a more sophisticated version of the least connection algorithm. It adds a second balancing layer: it considers both which server has the fewest active connections and which has the lowest average response time.
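
One simple way to express those two layers, sketched below, is to rank servers first by active connections and break ties by average response time. The fields here are assumed; a real balancer would maintain these statistics continuously:

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    active_connections: int
    avg_response_ms: float  # rolling average response time in milliseconds

def pick_least_response_time(servers: list[Server]) -> Server:
    # Fewest active connections wins; ties go to the fastest average response.
    return min(servers, key=lambda s: (s.active_connections, s.avg_response_ms))

servers = [
    Server("app-1", 2, 120.0),
    Server("app-2", 2, 80.0),  # ties app-1 on connections, responds faster
    Server("app-3", 3, 40.0),  # fastest, but busier
]
print(pick_least_response_time(servers).name)  # app-2
```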

Least Bandwidth

The least bandwidth (LB) load-balancing algorithm distributes each request to the server currently serving the least amount of traffic (usually measured in Mbps).
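
A sketch in the same style, assuming the balancer periodically samples each server's current throughput:

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    current_mbps: float  # traffic this server is serving right now, in Mbps

def pick_least_bandwidth(servers: list[Server]) -> Server:
    # Choose the server pushing the least traffic at this moment.
    return min(servers, key=lambda s: s.current_mbps)

servers = [Server("app-1", 42.0), Server("app-2", 17.5), Server("app-3", 88.2)]
print(pick_least_bandwidth(servers).name)  # app-2
```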

Round Robin

The round-robin (RR) load-balancing algorithm is considered a circular algorithm because requests are distributed to the servers one at a time, in order. Once the last server has received a request, the load balancer starts again at the first server and repeats the process.
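
Round robin needs no server statistics at all, only a pointer that wraps around. A minimal sketch using Python's itertools.cycle:

```python
from itertools import cycle

servers = ["app-1", "app-2", "app-3"]
rotation = cycle(servers)  # loops endlessly: app-1, app-2, app-3, app-1, ...

for _ in range(5):
    print(next(rotation))  # app-1 app-2 app-3 app-1 app-2
```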

Weighted Round Robin

The weighted round-robin (WRR) load-balancing algorithm is a more advanced version of the round-robin algorithm. It allows us to assign weights to specific servers, and servers with higher weights receive proportionally more requests.
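
A naive way to implement this, sketched below, is to repeat each server in the rotation according to its weight. The weights here are made up for illustration, and production balancers typically interleave the schedule more smoothly:

```python
from itertools import cycle

# Hypothetical weights: app-1 has twice the capacity of the other two,
# so it appears twice in the schedule and receives twice the requests.
weights = {"app-1": 2, "app-2": 1, "app-3": 1}
schedule = [name for name, weight in weights.items() for _ in range(weight)]
rotation = cycle(schedule)

for _ in range(8):
    print(next(rotation))  # app-1 app-1 app-2 app-3 app-1 app-1 app-2 app-3
```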

Load Balancer Placement

In our dog apparel example, our server quickly became a bottleneck for the increase in requests we were receiving. This meant we needed to place the load balancer between the users and our servers. This isn’t always the case! If, for example, our database had become the bottleneck, we could have placed the load balancer between the servers and the database. In more realistic architectures, a load balancer is commonly used in both places. Here is what it would look like:

[Image: the architecture extended with multiple Databases and a second Load Balancer between the Servers and the Databases. This new Load Balancer uses its algorithm to forward queries from the Servers to one of the Databases, preventing any single Database from becoming overloaded.]