Load Balancing in Nginx

        — Nginx: From Beginner to Pro

Now that you have learned about the basics of load balancing and the advantages of using a software load balancer, let's move forward and work on the Nginx servers you created in the previous chapters.
Clean Up the Servers
Before setting up anything new, clean up the previous applications so that you can start afresh. This is to keep things simple. You will be setting up applications in different ways in the upcoming sections of this chapter. The idea is to give you information about different scenarios from a practical perspective.
1. Log on to WFE1 using ssh -p 3026 user1@127.0.0.1
2. Remove everything from the Nginx home directory.
sudo rm -rf /usr/share/nginx/html/*
3. Reset your configuration ( sudo vi /etc/nginx/nginx.conf ) to the following:
user nginx;
worker_processes 1;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user - [$time_local] - $document_root - $document_uri - '
                    '$request - $status - $body_bytes_sent - $http_referer';

    access_log /var/log/nginx/access.log main;
    sendfile on;
    keepalive_timeout 65;
    index index.html index.htm;

    include /etc/nginx/conf.d/*.conf;
}
4. Now, remove the entries in conf.d by using the following command:
sudo rm -f /etc/nginx/conf.d/*.conf
5. Repeat the steps for WFE2.

Create Web Content
Let's create some content so that it is easy to identify which server served the request. In practical situations, the content on WFE1 and WFE2 will be the same for the same application. Run the following command on both WFE1 and WFE2:
uname -n | sudo tee /usr/share/nginx/html/index.html
This command is pretty straightforward. It uses the output of uname -n and dumps it in a file called index.html in the default root location of Nginx. View the content and ensure that the output is different on both the servers.
$ cat /usr/share/nginx/html/index.html
wfe1.localdomain

Configure WFE1 and WFE2
The content is available on both servers now, but since you have already cleaned up the configuration you will need to re-create the configuration file by using the following command:
sudo cp /etc/nginx/conf.d/default.template /etc/nginx/conf.d/main.conf

The command will create a copy of the configuration for a default website. If you recall, default.template contained the following text:
server {
    listen       80;
    server_name  localhost;

    location / {
        root   /usr/share/nginx/html;
        index  index.html index.htm;
    }

    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
        root /usr/share/nginx/html;
    }
}
• Restart the service: sudo systemctl restart nginx.
• Repeat the steps on WFE2.
• Once done, you should be able to execute curl localhost on both servers, and you should get wfe1.localdomain and wfe2.localdomain as output, respectively. Notice that even though the request is the same (curl localhost), the output is different. In practice, the output will be the same from both servers.
Set Up NLB Server
Setting up an NLB server is no different than setting up a regular web server. The installation steps are similar to what you have learned already. The configuration, however, is different and you will learn about it in the upcoming sections.
1. Create a new virtual machine called NLB.
2. Set up a NAT configuration as you have learned in previous chapters. It should look similar to Figure 8-4 .

3. Install Nginx (refer to chapter 2 ) on the NLB server.
4. Since it is a new server, when you execute curl localhost , you will see the
default welcome page. You can ignore it for the time being.
5. Open the configuration file ( /etc/nginx/conf.d/default.conf ) and make the
changes as follows:
upstream backend {
    server 10.0.2.6;
    server 10.0.2.7;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}
6. Restart the service.
7. Try the following command a few times and notice how it gives you output from WFE1 and WFE2 in an alternating fashion.
[root@nlb ~]# curl localhost
wfe1.localdomain
[root@nlb ~]# curl localhost
wfe2.localdomain
[root@nlb ~]# curl localhost
wfe1.localdomain
[root@nlb ~]# curl localhost
wfe2.localdomain

So, what just happened? Basically, you have set up a load balancer using Nginx and what you saw was the load balancer in action. It was extremely simple, right? There are a couple of directives at play here.
• upstream directive: The upstream directive defines a group of servers. Each server directive points to an upstream server. The servers can listen on different ports if needed, and you can also mix TCP and UNIX-domain sockets if required, as the short sketch after this list shows. You will learn more about this in the upcoming scenarios.
• proxy_pass directive: This directive sets the address of a proxied server. Notice that in this case, the address was defined as backend, which in turn contained multiple servers. By default, if a domain resolves to several addresses, all of them will be used in a round-robin fashion.
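To illustrate the flexibility of the server directive mentioned above, here is a minimal sketch of an upstream block that mixes a TCP address on a non-default port with a UNIX-domain socket. The port and the socket path are hypothetical and shown only to demonstrate the syntax:

upstream backend {
    server 10.0.2.6:8080;             # TCP address listening on a non-default port
    server unix:/var/run/app.sock;    # hypothetical UNIX-domain socket on the local machine
}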
Load Balancing Algorithms
When a load balancer is configured, you need to think about various factors. It helps if you know the application and its underlying architecture. Once you have figured out the details, you will need to configure some Nginx parameters so that you can route the traffic accordingly. There are various algorithms that you can use based on your needs. You will learn about them next.
Round Robin
This is the default configuration. When the algorithm is not defined, the requests are served in round-robin fashion. At a glance, it might appear way too simple to be useful. But it is actually quite powerful. It ensures that your servers are equally balanced and that each one is working equally hard.
Let's assume that you have two servers, and due to the nature of your application you would like three requests to go to the first server (WFE1) and one request to the second server (WFE2). This way, you can route the traffic in a specific ratio to multiple servers. To achieve this, you can assign weights to your server definitions in the configuration file as follows.
upstream backend {
    server 10.0.2.6 weight=3;
    server 10.0.2.7 weight=1;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}
Reload the Nginx configuration and try executing curl localhost multiple times. Note that three requests went to the WFE1 server, whereas one request went to WFE2.
[root@nlb ~]# curl localhost
wfe1.localdomain
[root@nlb ~]# curl localhost
wfe1.localdomain
[root@nlb ~]# curl localhost
wfe1.localdomain
[root@nlb ~]# curl localhost
wfe2.localdomain

Least Connected
In scenarios where you cannot easily determine the ratio or weight, you can simply use the least connected algorithm. It means that each request will be routed to the server with the least number of active connections. This often leads to good load-balancing performance. To configure this, you can use the configuration file like so:
upstream backend {
    least_conn;
    server 10.0.2.6 weight=1;
    server 10.0.2.7 weight=1;
}
Without a load testing tool, it will be hard to observe the behavior from the command line, but the idea is fairly simple: apart from considering the number of active connections, you can also apply weights to the servers, and it will work as expected.
IP Hash
There are quite a few applications that maintain state on the server, especially dynamic ones like PHP, Node, ASP.NET, and so on. To give a practical example, let's say the application creates a temporary file for a specific client and updates the client about the progress. If you use one of the round-robin algorithms, a subsequent request might land on another server, and the new server might have no clue about the file processing that started on the previous server. To avoid such scenarios, you can make the session sticky, so that once a request from a specific client has reached a server, Nginx continues to route traffic from that client to the same server. To achieve this, you use the ip_hash directive like so:
upstream backend {
    ip_hash;
    server 10.0.2.6;
    server 10.0.2.7;
}
The configuration above ensures that the request reaches only one specific server for the client based on the client’s IP hash key. The only exception is when the server is down, in which case the request can land on another server.
Generic Hash
A hash algorithm is conceptually similar to an IP hash. The difference here is that for each request the load balancer calculates a hash based on a combination of text and Nginx variables that you specify, and sends all requests with that hash to a specific server. Take a look at the following configuration, where the hash algorithm is used with the variables $scheme (http or https) and $request_uri (the URI of the request):
upstream backend {
    hash $scheme$request_uri;
    server 10.0.2.6;
    server 10.0.2.7;
}

Bear in mind that a hash algorithm will most likely not distribute the load evenly. The same is true for an IP hash. The reason you might still end up using it is your application's requirement for sticky sessions. Nginx PLUS offers more sophisticated configuration options when it comes to session persistence. The best use case for hash is probably when you have a dynamic page that performs data-intensive operations whose results are cacheable. In this case, requests for that dynamic page go to one server only, which caches the result and keeps serving the cached result, saving the effort required at the database side and on all the other servers.
Least Time (Nginx PLUS), Optionally Weighted
Nginx PLUS has an additional algorithm that can be used. It is called the least time method where the load balancer mathematically combines two metrics for each server—the current number of active connections and a weighted average response time for past requests —and sends the request to the server with the lowest value. This is a smarter and more effective way of doing load balancing with heuristics.
You can choose the parameter on the least_time directive, so that either the time to receive the response
header or the time to receive the full response is considered by the directive. The configuration looks like so:
upstream backend {
    least_time header;    # use last_byte instead to measure the time to the full response
    server 10.0.2.6 weight=1;
    server 10.0.2.7 weight=1;
}
Most Suitable Algorithm
There is no silver bullet or straightforward way to tell which method will suit you best. There are plenty of variables that need to be carefully considered before you choose the most suitable method. In general, least connections and least time are considered the best choices for the majority of workloads.
Round robin works best when the servers have about the same capacity, host the same content, and the requests are pretty similar in nature. If the traffic volume pushes every server to its limit, round robin might push all the servers over the edge at roughly the same time, causing outages.
You should use load testing tools and various tests to figure out which algorithm works best for you. One thing that often helps you make a good decision is knowledge of the application's underlying architecture.

If you are well aware of the application and its components, you will be more comfortable doing appropriate capacity planning.
You will learn about load testing tools, performance, and benchmarking in the upcoming chapters.
Load Balancing Scenarios
So far in this chapter you have seen an Nginx load balancer routing to back-end Nginx servers. This is not a mandatory requirement. You can have Nginx route traffic to any other web server. As a matter of fact, that is what is mostly done in practical scenarios, and as long as the request is HTTP based, it will just work.
Nginx routes the request based on the mapped URI. You can easily use Nginx to front-end PHP, ASP.NET, Node.js, or any other application for that matter, and enjoy the benefits of Nginx, as you will see in the upcoming scenarios.

Nginx Routing Requests to Express/Node.js
If you recall, in the previous chapter you configured Nginx for the MEAN stack. Assuming WFE1 and WFE2 are hosting applications based on the MEAN stack and the application is running on port 3000, your NLB server's configuration will look like the following:
upstream nodeapp {
    server 10.0.2.6:3000;
    server 10.0.2.7:3000;
}

server {
    listen 80;
    server_name localhost;
    location / {
        proxy_pass http://nodeapp;
    }
}
A common mistake is forgetting to open the additional ports in the firewall. Ensure that the ports are opened explicitly by running the following commands on both WFE1 and WFE2:
[user1@wfe1 ~]$ sudo firewall-cmd --permanent --add-port=3000/tcp
success
[user1@wfe1 ~]$ sudo firewall-cmd --reload
success
Once you have opened the ports, Nginx will start routing the requests successfully. Note that the opened ports are not exposed to the Internet; they are only used by the Nginx server that load balances the requests.
Passing the HOST Header
Since everything has been working in these simple demos, it might mislead you into thinking that all you need to pass to the back-end server is the URI. For real world applications you might have additional information in request headers that—if missed—will break the functionality of the application. In other words, the request coming from Nginx to the back-end servers will look different than a request coming directly from the client. This is because Nginx makes some adjustments to headers that it receives from the client. It is important that you are aware of these nuances.
• Nginx gets rid of any empty headers for performance reasons.
• Any header that contains an underscore is considered invalid and is eventually dropped from the headers collection. You can override this behavior by explicitly setting underscores_in_headers on; (see the short snippet after this list).
• The "HOST" header is set to the value of $proxy_host, which is a variable that contains the domain name or IP address taken from the proxy_pass definition. In the configuration that follows, it will be backend.
• The Connection header is added and set to close.
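As a quick illustration of the underscore rule above, the override is set at the http or server level; a minimal sketch (the header name X_Custom_Id is hypothetical):

server {
    listen 80;
    underscores_in_headers on;    # keep headers such as X_Custom_Id instead of dropping them
    location / {
        proxy_pass http://backend;
    }
}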

You can tweak the header information before passing it on by using the proxy_set_header directive. Consider the following configuration on the NLB:
upstream backend {
    server 10.0.2.6;
    server 10.0.2.7;
}

server {
    listen 80;
    location / {
        proxy_set_header HOST $host;
        proxy_pass http://backend;
    }
}
In the previous configuration, an explicit HOST header has been set using the proxy_set_header directive. To view the effect, follow these steps:
• Ensure that your NLB configuration appears as the previous configuration block.
Restart Nginx service.
• On WFE1, change nginx.conf (sudo vi /etc/nginx/nginx.conf) so that the log_format has an additional field called $host, as follows:
log_format main '$host - $remote_addr - $remote_user - [$time_local] - $document_root - $document_uri - $request - $status - $body_bytes_sent - $http_referer';
• Save the file and exit. Restart the Nginx service.
• Switch back to the NLB and make a few requests using curl localhost.
• View the logs on the WFE1 using sudo tail /var/log/nginx/access.log -n 3.
[user1@wfe1 ~]$ sudo tail /var/log/nginx/access.log -n 3
localhost - 10.0.2.9 - - - - /usr/share/nginx/html - /index.html - GET / HTTP/1.0 - 200 - 17 - -
localhost - 10.0.2.9 - - - - /usr/share/nginx/html - /index.html - GET / HTTP/1.0 - 200 - 17 - -
localhost - 10.0.2.9 - - - - /usr/share/nginx/html - /index.html - GET / HTTP/1.0 - 200 - 17 - -
• As you can see, the requests had localhost as the hostname, and that is because you have used proxy_set_header HOST $host.
• To view what the result would have looked like without this header change, comment out the line in the NLB's configuration:
location / {
    # proxy_set_header HOST $host;
    proxy_pass http://backend;
}

• Restart Nginx on NLB and retry curl localhost a few times.
• If you view the logs on WFE1 using the tail command, you should see an output
similar to this:
localhost - 10.0.2.9 - - - - /usr/share/nginx/html - /index.html - GET / HTTP/1.0 - 200 - 17 - -
backend - 10.0.2.9 - - - - /usr/share/nginx/html - /index.html - GET / HTTP/1.0 - 200 - 17 - -
backend - 10.0.2.9 - - - - /usr/share/nginx/html - /index.html - GET / HTTP/1.0 - 200 - 17 - -
• Notice the last couple of lines, where the hostname appears as backend. This is the default behavior of Nginx if you don't set the HOST header explicitly. Based on your application, you might need to set this header explicitly or ignore it in the NLB configuration.
Forwarding IP Information
Since the requests are forwarded to the back end, the back-end servers have no information about where the requests actually came from; to them, the NLB is the client. There are scenarios where you might want to log information about the actual visitors. To do that, you can use proxy_set_header just as you did in the previous example, but with different variables, like so:
location / {
    proxy_set_header HOST $proxy_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_pass http://backend;
}
In this configuration, apart from setting the HOST header, you are also setting the following headers:
• X-Real-IP is set to the $remote_addr variable, which contains the actual client IP.
• X-Forwarded-For is another header set here, which contains $proxy_add_x_forwarded_for. This variable contains a comma-separated list of client IPs ($remote_addr values) accumulated along the request path.
• To log the actual client IP, you should now modify the log_format to include the $http_x_real_ip variable, which contains the real client IP information; a sample format is sketched after this list.
• By default, X-Real-IP is stored in $http_x_real_ip. You can change this behavior by using real_ip_header X-Forwarded-For; in your http, server, or location block in order to save the value of the X-Forwarded-For header instead of the X-Real-IP header.
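Putting the last two points together, a minimal sketch of what the log_format on the WFEs could look like is shown below. The field order is illustrative, and the commented lines assume the NLB's address (10.0.2.9 in this setup) and that the realip module is available:

log_format main '$http_x_real_ip - $remote_addr - $remote_user - [$time_local] - '
                '$request - $status - $body_bytes_sent - $http_referer';

# Alternatively, trust the NLB and rewrite $remote_addr itself from X-Forwarded-For:
# set_real_ip_from 10.0.2.9;
# real_ip_header   X-Forwarded-For;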

Buffering
As you can guess, with an NLB between the client and the back-end server, there are two hops for every request. This may adversely affect the client's experience. If buffers are not used, data sent from the back-end server is immediately transmitted to the client. If the clients are fast, they can consume the data immediately and buffering can be turned off. For practical purposes, though, clients will typically not be as fast as the server in consuming the data. In that case, turning buffering on tells Nginx to hold the back-end data temporarily and feed it to the client at the client's pace. This feature allows the back ends to be freed up quickly, since they only need to ensure that the data is handed off to the Nginx NLB. By default, buffering is on in Nginx and is controlled using the following directives (a combined sketch follows the list):
• proxy_buffering: Default value is on, and it can be set in http, server, and location blocks.
• proxy_buffers number size : proxy_buffers directive allows you to set the number of
buffers along with its size for a single connection. By default, the size is equal to one
memory page, and is either 4K or 8K depending on the platform.
• proxy_buffer_size size : The headers of the response are buffered separately from the
rest of the response. This directive sets that size, and defaults to proxy_buffers size.
• proxy_max_temp_file_size size : If the response is too large, it can be stored in a
temporary file. This directive sets the maximum size of the temporary file.
• proxy_temp_file_write_size size : This directive governs the size of data written to the
file at a time. If you use 0 as the value, it disables writing temporary files completely.
• proxy_temp_path path : This directive defines the directory where temporary files are
written.
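As an illustration only, here is a minimal sketch of how these directives might be combined in a location block on the NLB; the buffer counts, sizes, and temp path are arbitrary example values, not recommendations:

location / {
    proxy_buffering            on;
    proxy_buffers              8 4k;      # 8 buffers of 4K each per connection
    proxy_buffer_size          4k;        # separate buffer for the response headers
    proxy_max_temp_file_size   1024m;     # cap for spilling large responses to a temp file
    proxy_temp_file_write_size 8k;        # amount written to the temp file at a time
    proxy_temp_path            /var/cache/nginx/proxy_temp;    # assumed directory
    proxy_pass                 http://backend;
}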
Nginx Caching
Buffering in Nginx helps the back-end servers by offloading data transmission to the clients, but the request still reaches the back-end server to begin with. Quite often, you will have static content, like third-party JavaScript libraries, CSS, images, PDFs, and so on, that doesn't change at all or changes rarely. In these cases, it makes sense to keep a copy of the data on the NLB itself, so that subsequent requests can be served directly from the NLB instead of fetching the data from the back-end servers every time. This process is called caching.
To achieve this, you can use the proxy_cache_path directive in the http block like so:
proxy_cache_path /data/nginx/cache levels=1:2 keys_zone=my_cache:10m max_size=10g inactive=60m;
Before you use this directive, create the path as follows and set appropriate permissions:
mkdir -p /data/nginx/cache
chown nginx /data/nginx/cache
chmod 700 /data/nginx/cache
• Levels define the number of subdirectories Nginx will create to maintain the cached files. Having a large number of files in one flat directory slows down access, so it is recommended to have at least a two-level directory hierarchy.
• keys_zone defines the area in memory which contains information about cached file keys. In this case a 10MB zone is created and it should be able to hold about 80,000
keys (roughly).

• max_size allocates 10GB of space for the cached files. If the size grows beyond this, the cache manager process trims it down by removing the files that were used least recently.
• inactive=60m implies the number of minutes the cache can remain valid in case it is not used. Effectively, if the file is not used for 60 minutes, it will be purged from the cache automatically.
By default, Nginx caches all responses to requests made with the HTTP GET and HEAD methods. Using fastcgi_cache, you can also cache dynamic content, where the data is fetched from a dynamic content management system but changes less frequently. You will learn about caching in detail in chapter 12.
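Note that proxy_cache_path only defines the cache; to actually use it, a location must also reference the zone with the proxy_cache directive. A minimal sketch, assuming the my_cache zone and path defined above:

proxy_cache_path /data/nginx/cache levels=1:2 keys_zone=my_cache:10m max_size=10g inactive=60m;

server {
    listen 80;
    location / {
        proxy_cache my_cache;        # serve repeat requests from the cache zone
        proxy_pass  http://backend;
    }
}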
Server Directive Additional Parameters
The server directive has more parameters that come in handy in certain scenarios. The parameters are fairly
straightforward to use and simply require you to use the following format:
server address [parameters]
You have already seen the server address in use with weight. Let's learn more about some additional parameters; a combined sketch follows the list.
• max_fails=number: Sets the number of unsuccessful attempts before considering the server unavailable for a duration. If this value is set to 0, it disables the accounting of
attempts.
• fail_timeout=time: Sets the duration in which max_fails should happen. For example, if max_fails parameter is set to 3, and fail_timeout is set to 10 seconds, it would imply that there should be 3 failures in 10 seconds so that the server could be considered unavailable.
• backup: Marks the server as a backup server. It will be passed requests when the primary servers are unavailable.
• down: Marks the server as permanently unavailable.
• max_conns=number: Limits the maximum number of simultaneous active connections. Default value of 0 implies no limit.
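To see how these parameters read in context, here is a sketch of an upstream block that combines several of them; the values and the two extra server addresses (10.0.2.8 and 10.0.2.5) are purely illustrative:

upstream backend {
    server 10.0.2.6 weight=2 max_fails=3 fail_timeout=10s;
    server 10.0.2.7 max_conns=100;
    server 10.0.2.8 backup;    # hypothetical server used only when the primary servers are unavailable
    server 10.0.2.5 down;      # hypothetical server marked permanently unavailable
}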
Configure Nginx (PLUS) for Health Checks
The free version of Nginx doesn't have a very important directive called health_check. This feature is available in Nginx PLUS, and enabling it gives you a lot of options related to the health of the upstream servers; a combined example of the parameters appears at the end of this section.
• interval=time : Sets the interval between two health checks. The default value is 5
seconds and it implies that the server checks the upstream servers every 5 seconds.
• fails=number : If the upstream server fails x number of times, it will be considered
unhealthy. The default value is 1.
• passes=number : Once considered unhealthy, the upstream server needs to pass the
test x number of times before it could be considered healthy. The default value is 1.
• uri = path : Defines the URI that is used to check requests. Default value is /.
• match=name : You can specify a block with the expected output in order for the test to succeed. In the following configuration, the test ensures that the response has a status code of 200, the Content-Type header is text/html, and the body contains "Welcome to nginx!":
http {
    server {
        location / {
            proxy_pass http://backend;
            health_check match=welcome;
        }
    }

    match welcome {
        status 200;
        header Content-Type = text/html;
        body ~ "Welcome to nginx!";
    }
}
• If you specify multiple checks, any single failure will cause the server to be considered unhealthy.
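To illustrate the parameters listed above, a health_check directive combining them might look like the following sketch (Nginx PLUS only; the values and the /healthz URI are examples, not recommendations):

location / {
    proxy_pass http://backend;
    health_check interval=10 fails=3 passes=2 uri=/healthz match=welcome;    # /healthz is a hypothetical health endpoint
}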
Activity Monitoring in Nginx (PLUS)
Nginx PLUS includes a real-time activity monitoring interface that provides load and performance metrics.
It uses a RESTful JSON interface, and hence it is very easy to customize. There are plenty of third-party monitoring tools that take advantage of the JSON interface and provide you with a comprehensive dashboard for performance monitoring.
You can also use the following configuration block to configure Nginx PLUS for status monitoring.
server {
    listen 8080;
    root /usr/share/nginx/html;

    # Redirect requests for / to /status.html
    location = / {
        return 301 /status.html;
    }

    location = /status.html { }

    location /status {
        allow x.x.x.x/16;    # permit access from local network
        deny all;            # deny access from everywhere else
        status;
    }
}

Status is a special handler in Nginx PLUS. The configuration here uses port 8080 to view the detailed status of Nginx requests. To give you a better idea of the console, the Nginx team has set up a live demo page that can be accessed at http://demo.nginx.com/status.html .
Summary
In this chapter, you have learned about the fundamentals of high availability and why it matters. You should also be comfortable with the basic concepts of hardware and software load balancing. Nginx is an awesome product for software load balancing, and you have learned how easily you can set it up in your web farm. The architecture of Nginx allows you to have a very small touch point for front-end servers, and its flexibility ensures that you can customize it precisely based on your requirements. You can scale out your farm easily with Nginx, and use Nginx PLUS to achieve even more robustness in your production farm when the need arises.
