Application Load Balancing In Google Cloud

A load balancer distributes traffic (client requests) across multiple servers on which an application is deployed, so that response time improves because the work is spread out. Consider an application that serves images to many clients concurrently. If a single deployment was designed to serve 100K requests at a time, placing it behind a load balancer lets it scale horizontally and serve far more (200K, 300K or beyond) by adding servers.
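The idea of spreading requests across servers can be sketched in a few lines. The example below is a minimal illustration that assumes a simple round-robin policy; the server names and request ids are hypothetical, and real load balancers, including Google's, may use other distribution policies.

```python
from collections import Counter
from itertools import cycle


def distribute(requests, servers):
    """Assign each request to the next server in round-robin order."""
    rotation = cycle(servers)
    return [(request, next(rotation)) for request in requests]


# Hypothetical server names and request ids, used only for illustration.
servers = ["server-a", "server-b", "server-c"]
assignments = distribute(range(9), servers)

# Count how many requests each server received.
load = Counter(server for _, server in assignments)
print(load)
```

With nine requests and three servers, each server ends up with three requests, which is the even spread the paragraph above describes.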

The goal of a load balancer is to maximize speed and use every server efficiently so that performance does not degrade. A load balancer can be hardware based or software based. Hardware-based load balancers run on dedicated physical hardware, while software load balancers run as applications on standard operating systems.

Traffic Flow through a Load Balancer

Major Benefits of Load Balancing

A load balancer is generally critical for applications that serve high volumes of traffic (data, text, images etc.) without compromising performance. Some of its main benefits are listed below:

  • High Availability of the application – A load balancer can be configured so that traffic always goes to the healthy nodes of the system. Even if some servers are unavailable or unhealthy for any reason, the load balancer routes traffic only to the servers that can handle it, which makes it well suited to keeping your application highly available.
  • Avoiding a Single Point of Failure (SPOF) – Load balancers use health checks to detect unhealthy nodes quickly and route client requests to healthy nodes, so that no single server becomes a point of failure for the application. In some configurations, if a server accepts a request but goes down before sending the response, a different server can provide the response and no session is lost.
  • Better Utilization and Scalability – Client traffic is distributed so that no single node serves all of it; instead it is spread across all the servers, which uses the environment efficiently, maximizes throughput and avoids overload. A load balancer also helps the environment scale horizontally, since servers can be added and unhealthy servers removed, making it suitable for very critical applications.

Other benefits of using a load balancer include security, protection against server overload and redundancy.
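The high-availability behaviour described above, where traffic goes only to healthy nodes, can be sketched as follows. This is a minimal illustration under stated assumptions: the class HealthAwareBalancer, its method names and the node names are hypothetical, and the round-robin rotation is chosen for simplicity rather than taken from any Google Cloud implementation.

```python
from itertools import cycle


class HealthAwareBalancer:
    """Round-robin balancer that skips servers marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._rotation = cycle(self.servers)

    def mark_unhealthy(self, server):
        self.healthy.discard(server)

    def mark_healthy(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        """Return the next healthy server in rotation."""
        if not self.healthy:
            raise RuntimeError("no healthy servers available")
        # One full pass over the rotation is enough to reach a healthy server.
        for _ in range(len(self.servers)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


balancer = HealthAwareBalancer(["node-1", "node-2", "node-3"])
balancer.mark_unhealthy("node-2")
picks = [balancer.pick() for _ in range(4)]
print(picks)
```

Once node-2 is marked unhealthy it never appears in the picks, while the remaining nodes keep sharing the traffic.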

Types of Load Balancers in Google Cloud Platform

Google Cloud Platform provides cloud services such as computing, storage, data analytics, machine learning and databases over the internet, all managed by Google. Cloud Load Balancing is one of these services; it distributes traffic for other cloud services such as Compute Engine, Google Kubernetes Engine, App Engine and Cloud Functions.

There are 2 types of Load Balancers available.

  • Application Load Balancer (HTTP/HTTPS)
  • Network Load Balancer (TCP/UDP/Other IP Protocols)

The following sections describe each of these load balancers in more detail.

Application Load Balancer (HTTP/HTTPS)

If your application serves HTTP/HTTPS traffic that needs to be distributed, use an Application Load Balancer (ALB), which is a proxy-based Layer 7 (L7) load balancer. An ALB can distribute traffic to applications deployed on Google Cloud services such as Compute Engine, Google Kubernetes Engine and Cloud Run, or even to applications hosted outside Google Cloud. There are two types of ALB, depending on whether the application is internet facing or internal.

External Application Load Balancer

Use this load balancer when your application is internet facing, that is, when it needs to serve traffic over the internet or from outside the organization. It provides a public IP address through which the application hosted on Google Cloud can be reached. It can be created for applications deployed globally or regionally; global deployment requires the Premium Tier of Google Cloud, while regional deployment works with either the Premium or the Standard Tier.

Internal Application Load Balancer

As the name suggests, this type of load balancer suits applications that are internal to the organization and do not need to be reachable over the internet. With the global access feature enabled, clients in any Google Cloud region can still reach the application. It is an Envoy proxy-based, regional Layer 7 load balancer that serves HTTP traffic, and SSL can be enabled on the load balancer to secure that traffic.

Internal/External Load Balancer on GCP

Network Load Balancer

Network Load Balancers (NLB) handle traffic for IP protocols such as TCP and UDP. An NLB routes traffic among similar or different backends within the same region, such as virtual machines, Kubernetes Engine clusters and containers. The traffic can come from different sources, such as clients on the internet or Compute Engine virtual machines with internal or external IPs. Google provides two types of NLB: a proxy load balancer and a passthrough load balancer.

Proxy Load Balancer

This is a TCP proxy load balancer: the client connection is terminated at the load balancer, which then opens a new connection to the backend VMs. It uses either Google Front Ends or Envoy proxies to do so. This load balancer can be global or regional. The external variant can be deployed under the Premium or Standard Tier, with global deployment requiring the Premium Tier. The internal variant can be deployed only under the Premium Tier.

Passthrough Load Balancer

This load balancer does not terminate client connections. An external IP address is reserved for the load balancer, and packets arriving on a configured port are forwarded unchanged to the backends defined by the forwarding rule, rather than to a target proxy. Because the packets are not proxied, the source IP address of the client is preserved in each packet, which is useful for authentication, troubleshooting, logging or whitelisting. This load balancer can be external or internal but is always regional. The external variant can be deployed under either the Premium or the Standard Tier, but the internal variant can be deployed only under the Premium Tier.
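The difference between the two NLB types comes down to which source address the backend sees. The sketch below is a simplified model, not a networking implementation: the Packet class, both function names and all addresses (taken from documentation and private ranges) are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    payload: str


def passthrough_forward(packet):
    """Deliver the packet unchanged, so the backend sees the client's IP."""
    return packet


def proxy_forward(packet, proxy_ip, backend_ip):
    """Model a proxy: the client connection ends at the load balancer and a
    new connection, sourced from the proxy, carries the data to the backend."""
    return replace(packet, src_ip=proxy_ip, dst_ip=backend_ip)


# Illustrative addresses only.
client_packet = Packet(src_ip="198.51.100.7", dst_ip="203.0.113.10", payload="GET /")
seen_via_passthrough = passthrough_forward(client_packet)
seen_via_proxy = proxy_forward(client_packet, proxy_ip="10.0.0.2", backend_ip="10.0.0.5")

print("passthrough backend sees source:", seen_via_passthrough.src_ip)
print("proxy backend sees source:", seen_via_proxy.src_ip)
```

In the passthrough case the backend can log or filter on the real client address, whereas behind a proxy it sees only the proxy's address unless the proxy passes the client address along by some other means.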

Key Differences between Application and Network Load Balancer

An ALB is deployed for Layer 7 traffic which means the application layer. The main purpose of ALB is to route the traffic based on application content and protocol HTTP or HTTPS.

An NLB, on the other hand, is deployed for Layer 4 traffic, which means the transport layer, in front of infrastructure such as VMs and containers. An NLB routes traffic based on IP protocol data such as TCP or UDP, without inspecting the application content.
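To make the Layer 4 side of this comparison concrete, the sketch below picks a backend using only connection metadata (addresses, ports and protocol) and never looks at the request content. Hashing the connection tuple with SHA-256 is an assumption made for illustration, not a description of Google's algorithm; the function name, backend names and addresses are hypothetical.

```python
import hashlib


def pick_backend_l4(src_ip, src_port, dst_ip, dst_port, protocol, backends):
    """Choose a backend from connection metadata only (no payload inspection)."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{protocol}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as a stable integer.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["vm-1", "vm-2", "vm-3"]
first = pick_backend_l4("198.51.100.7", 40001, "203.0.113.10", 80, "TCP", backends)
again = pick_backend_l4("198.51.100.7", 40001, "203.0.113.10", 80, "TCP", backends)

# The same connection tuple always maps to the same backend.
print(first, first == again)
```

A Layer 7 load balancer, by contrast, can read the HTTP request itself (for example its URL path) before choosing a backend.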

Demonstration on creating an HTTP L7 load balancer

We are now going to demonstrate a Layer 7 HTTP load balancer in a similar fashion to the Network Load Balancer demonstration. The main difference is that this load balancer can route traffic matching one set of rules to one set of instances and traffic matching another set of rules to a different set of instances. This demonstration makes the following assumptions.

  • Understanding of
    • Creating an instance template in the compute engine service
    • Creating a Managed Instance Group (MIG) and creating instances using that
  • The application is the same HTTP server, already deployed on Google Compute Engine instances through the instance template.

To execute the commands and create the load balancer, you need to sign up with Google Cloud Platform and create an account. A first-time signup gives you $300 in free credit to try Google Cloud products. The first reference link takes you to the Google Cloud Platform signup. Once you have signed up, you need to create your first cloud project; the second reference link explains how. Once the project is created, run all the commands in this tutorial in Google Cloud Shell, which can be enabled using the icon below. Cloud Shell is free and lets you run the commands without installing any Software Development Kit (SDK).

To enable Cloud Shell, click on the icon on the top right corner as shown below:

We are going to create three Linux instances using MIG and run a web server and add an HTML page to each of them.

Create an instance template

This step creates an instance template with the web server (httpd) pre-installed so that we can spin up the managed instance group. As the name suggests, a template saves the configuration of a virtual machine and is used when multiple similar virtual machines are going to be spun up.

Create a managed instance group based on the template

Now we use the template created in the previous step to spin up the individual instances. To do this, a managed instance group needs to be created.

Create a firewall rule for health check

This firewall rule admits the Google-defined health checks that Google requires for all load balancers. The command uses two source ranges, a /22 and a /16 (130.211.0.0/22 and 35.191.0.0/16). Because the load balancer is a managed service, Google's health check probes come from these ranges, and the rule lets them reach the backend instances so that the load balancer knows which instances can receive client traffic. Traffic proxied by the Google Front Ends (GFEs) also arrives from these two ranges.
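As a small aid for reading firewall logs or verifying a rule, the sketch below checks whether a source address falls inside the two probe ranges. The ranges 130.211.0.0/22 and 35.191.0.0/16 are the ones Google documents for these health checks; confirm them against the current documentation before relying on them. The function name and the sample addresses are hypothetical.

```python
import ipaddress

# Source ranges that Google documents for health check probes
# (verify against current documentation).
HEALTH_CHECK_RANGES = [
    ipaddress.ip_network("130.211.0.0/22"),
    ipaddress.ip_network("35.191.0.0/16"),
]


def is_health_check_source(ip):
    """Return True if the address falls inside one of the probe ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in HEALTH_CHECK_RANGES)


print(is_health_check_source("35.191.10.20"))   # inside the /16
print(is_health_check_source("130.211.4.1"))    # just outside the /22
print(sum(network.num_addresses for network in HEALTH_CHECK_RANGES))
```

The /22 covers 130.211.0.0 through 130.211.3.255, which is why 130.211.4.1 is rejected.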

Configurations of Load Balancer

To create a load balancer, we first need to reserve a static IP for it and then ensure that the instances created earlier are set up as its backend. This is because the load balancer routes each request coming from a client to one of the backend instances. Here is the process to implement it.

Reserve IP for Load Balancer

Create a global static IPv4 address. This will be the IP address of the load balancer.

Get the IPv4 address.

To get the IPv4 address of the load balancer, issue the following command.

Create a health check.

Create a health check for the load balancer. A health check can be created on any port; here we use port 80, which matches the application port.
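Google Cloud health checks use a healthy threshold and an unhealthy threshold, meaning a backend changes state only after a number of consecutive probe results. The sketch below models that behaviour in isolation. The class HealthState, the threshold values of 2 and the assumption that a backend starts out unhealthy are illustrative choices, not settings taken from this tutorial.

```python
class HealthState:
    """Tracks one backend's health from a stream of probe results."""

    def __init__(self, healthy_threshold=2, unhealthy_threshold=2):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = False  # assumption: unhealthy until probes prove otherwise
        self._streak = 0      # consecutive results that contradict the current state

    def record(self, probe_succeeded):
        """Record one probe result and return the resulting health state."""
        if probe_succeeded == self.healthy:
            # The result agrees with the current state, so reset the streak.
            self._streak = 0
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if probe_succeeded else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = probe_succeeded
            self._streak = 0
        return self.healthy


state = HealthState()
history = [state.record(result) for result in [True, True, False, True, False, False]]
print(history)
```

A single failed probe in the middle of the sequence does not remove the backend; only two failures in a row do, which avoids flapping on transient errors.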

Create a backend service.

The backend service defines how the load balancer reaches its backends, including the protocol used to send requests to them and the health check that monitors them.

Add the MIG to backend service.

Now, add the instance group (mig-demo-http-lb) that was created earlier to this new backend service so that all the instances can be monitored.

Create a URL Map.

The URL map directs all incoming requests to the backend service. It is also possible to create multiple backend services based on different MIGs, so that requests for different URLs of the application go to different backends according to the URL map.
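The routing decision a URL map makes can be approximated as a lookup from path prefix to backend service, with a default for everything else. The sketch below is a simplified model under stated assumptions: real URL maps also support host rules and richer path matchers, and the function name, path prefixes and backend service names here are hypothetical.

```python
def route(path, path_rules, default_service):
    """Return the backend service for a path, preferring the longest prefix."""
    for prefix in sorted(path_rules, key=len, reverse=True):
        # Match the prefix itself or anything beneath it, but not "/apiary"
        # for the prefix "/api".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            return path_rules[prefix]
    return default_service


# Hypothetical backend service names and path prefixes.
path_rules = {"/images": "backend-images", "/api": "backend-api"}

print(route("/images/logo.png", path_rules, "backend-default"))
print(route("/api", path_rules, "backend-default"))
print(route("/index.html", path_rules, "backend-default"))
```

In the demonstration only the default service is configured, so every request takes the last branch and reaches the single backend service.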

Create an HTTP Proxy

A target HTTP proxy needs to be created to route the traffic to the URL map.

Create a forwarding rule

Create a global forwarding rule to route all incoming requests to the proxy.
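The resources created in the steps above reference each other in a chain: forwarding rule, target HTTP proxy, URL map, backend service and instance group. The sketch below traces that chain with plain dictionaries. Apart from mig-demo-http-lb, which appears earlier in this tutorial, every resource name is a hypothetical placeholder.

```python
# Each dictionary maps a resource name to the resource it points at.
# All names except mig-demo-http-lb are hypothetical placeholders.
forwarding_rules = {"demo-forwarding-rule": "demo-http-proxy"}
target_proxies = {"demo-http-proxy": "demo-url-map"}
url_maps = {"demo-url-map": "demo-backend-service"}  # default service only
backend_services = {"demo-backend-service": "mig-demo-http-lb"}


def trace_request(forwarding_rule):
    """Follow the chain of references from forwarding rule to instance group."""
    proxy = forwarding_rules[forwarding_rule]
    url_map = target_proxies[proxy]
    backend_service = url_maps[url_map]
    instance_group = backend_services[backend_service]
    return [forwarding_rule, proxy, url_map, backend_service, instance_group]


path = trace_request("demo-forwarding-rule")
print(" -> ".join(path))
```

Reading the chain from left to right shows why the resources are created in the reverse order: each one must exist before the resource that points at it.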

Testing of HTTP Load Balancer

This test checks whether the load balancer created above distributes traffic, which in turn verifies the configuration built in the previous steps. Open any web browser and go to the URL containing the IP address of the load balancer, which has the form http://LOAD_BALANCER_IP/ with the reserved address in place of LOAD_BALANCER_IP.

Now, every time you reload the browser, you will get a different message as it can come from different instances.

Message will look like below:

Cost of creating the above infrastructure

The above infrastructure incurs costs on Google Cloud Platform, which can be estimated with the pricing calculator provided by Google. The link to the calculator has been added in the references. The cost is deducted from the trial credits. Here is the breakdown.


Demonstration on creating a Network Load Balancer

We are now going to demonstrate Network Load Balancer. Following are some assumptions in this demonstration.

  • Understanding of spinning a Compute instance
  • HTTP Server is the application which is already deployed in Google Compute Engine Instance

We are going to create three Linux instances and run a web server and add an HTML page to each of them. Here are some commands to create a Linux instance and deploy a httpd web server.

As in the Application Load Balancer setup, we will run all the following commands in Cloud Shell, and the cost can be covered by the trial credits received after signing up on the platform.

Create first Linux instance

The below command will spin up a VM and install the httpd web server on it. We need to create three such instances. I have used CentOS Linux 7 to keep it simple, but any Linux family provided by Google, such as Debian, Red Hat or Ubuntu, can be used. Here is instance #1.

Create second Linux instance

As stated above, here is instance #2.

Create third Linux instance

As stated above, here is instance #3.

Create a firewall

Now we will add a firewall rule for the load balancer to reach our instances created earlier.

List all the instances

Below command will list all the instances created above.

Run a curl command to view the output of the Web Server

Configurations of Network Load Balancer

The load balancer requires a static IP address; client requests received on that address are forwarded to the IP of one of the instances.

Static IP Address

First, we need to create a static IP Address for the load balancer.

Health Check

Configure a health check that matches the target application. It probes the backend instances on a given port so that the load balancer sends traffic only to instances that respond.

Target Pool

Create a target pool using the same health check that was created in the previous step and ensure that the target pool and the instances are in the same region.

Add instances

Add all three instances created earlier to the target pool.

Forwarding Rule

Create a forwarding rule so that the requests can be sent to the instances.

Cost of creating the above infrastructure

As with the ALB, the above infrastructure incurs some costs on Google Cloud, which can be estimated with the pricing calculator provided by Google. Here is the breakdown.


Testing Network Load Balancer

The configuration of the load balancer is now complete. Next we send traffic to the load balancer and check whether it forwards the requests to different instances. This is done with the help of commands.
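What this kind of test measures can be sketched as a tally of the distinct responses returned by repeated requests. The example below deliberately uses a fake fetch function so that it runs without a network; the helper names and the page bodies are hypothetical, and the real bodies depend on the HTML page placed on each VM.

```python
from collections import Counter


def tally_responses(fetch, attempts):
    """Call fetch() repeatedly and count how often each response body appears."""
    return Counter(fetch() for _ in range(attempts))


def make_fake_fetch(bodies):
    """Stand-in for a real HTTP call: returns the given bodies in rotation."""
    state = {"calls": 0}

    def fetch():
        body = bodies[state["calls"] % len(bodies)]
        state["calls"] += 1
        return body

    return fetch


# Hypothetical page bodies, one per backend instance.
fake_fetch = make_fake_fetch(["Page from vm-1", "Page from vm-2", "Page from vm-3"])
counts = tally_responses(fake_fetch, attempts=30)
print(len(counts), "distinct responses:", dict(counts))
```

To run the same tally against a real load balancer, one option is to pass a fetch function that reads the page with urllib.request.urlopen from the standard library, using the load balancer's IP address in the URL. Seeing more than one distinct response indicates that requests are reaching several instances.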

Get IP Address

Find the IP address of the forwarding rule used by the load balancer. The command below describes the forwarding rule; we then extract the IP address from its output.

Extract the IP Address part.

This will output the IP address.
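For readers who prefer to script this step, the sketch below extracts the address from a forwarding-rule description in JSON form. The sample document is a hypothetical, trimmed stand-in for real output, with an address from a documentation range; it assumes the address is stored in the IPAddress field, as in the Compute Engine forwarding rule resource.

```python
import json

# Hypothetical, trimmed example of a forwarding-rule description as JSON.
sample_description = """
{
  "IPAddress": "203.0.113.10",
  "IPProtocol": "TCP",
  "name": "example-forwarding-rule",
  "portRange": "80-80"
}
"""


def extract_ip(description_json):
    """Return the IPAddress field from a JSON forwarding-rule description."""
    return json.loads(description_json)["IPAddress"]


lb_ip = extract_ip(sample_description)
print(lb_ip)
```

The extracted value is the address to use in the curl test that follows.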

Test the response

Access the load balancer with the curl command and check the response. Repeated requests should return responses from different instances, as described below.

The output should look similar to this, which will show the network style load balancing between VM instances is working.

Why choose Google Cloud Load Balancing Service over your own Load Balancing?

Google Cloud Load Balancer (GCLB) offers numerous benefits that favor it over deploying your own load balancing infrastructure.

  • Enhanced Traffic Management and Disaster Recovery Benefits – GCLB makes traffic management easy to handle, and because load balancing is globally available, it gives modern applications recovery benefits that protect them from availability issues.
  • Faster time to respond – With a different pool of servers for different countries or regions, a single GCLB can ensure that users are served from the servers geographically closest to them.
  • Mixed workloads – Content-based load balancing, cross-regional load balancing and similar patterns can be managed with GCLB. You have to select the right type of load balancer so that different workloads can be served, based on their requirements, with a single load balancer.
  • Private Load Balancer – Architects can choose to deploy an internal load balancer for internal applications that are not exposed to the internet, making it well suited to private workloads within the organization.

There are many more use cases and benefits. For example, with GCLB you do not have to manage hardware, upgrades or patching.