
[Solved] Nginx Warning: a client request body is buffered to a temporary file

Error log

2021/11/01 15:01:50 [warn] 1028#1028: *57622584 a client request body is buffered to a temporary file /tmp/client-body/0000003122, client: 240e:434:620:c52d:750e:9df0:6367:175b, server: xc-loan.nflow.internal, request: "POST /loan-web/api/common/user/v3.0/uploadIdentityPhoto.security HTTP/1.1", host: "xc-loan.nflow.internal"
2021/11/01 15:01:56 [warn] 1028#1028: *57622673 a client request body is buffered to a temporary file /tmp/client-body/0000003123, client: 2408:844f:7311:47b9:2cea:adff:fe0e:7242, server: xc-loan.nflow.internal, request: "POST /loan-web/api/common/user/v3.0/uploadIdentityPhoto.security HTTP/1.1", host: "xc-loan.nflow.internal"
2021/11/01 15:02:05 [warn] 1028#1028: *57622830 a client request body is buffered to a temporary file /tmp/client-body/0000003124, client: 125.112.37.67, server: xc-loan.nflow.internal, request: "POST /loan-web/api/common/user/v4.0/commonFaceVerify.security HTTP/1.1", host: "xc-loan.nflow.internal"

Solution: increase the client body buffer size and the allowed body size (the settings below appear to be from a Kubernetes ingress-nginx ConfigMap):

data:
  client-body-buffer-size: 256k
  proxy-body-size: 25m

In addition, it is not recommended to set these two values too large: oversized uploads slow page loading, waste bandwidth, and make for a poor user experience. Consider compressing the images instead.

client_body_buffer_size sets the buffer size for the client request body. If the request body is larger than the buffer, all or part of it is written to a temporary file. The default buffer size is equal to two memory pages: 8K on x86, x86-64, and other 32-bit platforms, and usually 16K on other 64-bit platforms.

client_max_body_size sets the maximum allowed size of the client request body (the size given in the "Content-Length" request header). If the size in a request exceeds the configured value, error code 413 (Request Entity Too Large) is returned to the client. Note that browsers cannot display this error correctly. Setting the size to 0 disables checking of the client request body size.
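If you are running a standalone Nginx rather than the ingress ConfigMap shown in the solution above, a minimal sketch of the equivalent nginx.conf directives, using the same values, would be:

http {
    client_body_buffer_size 256k;   # keep request bodies up to 256k in memory
    client_max_body_size 25m;       # larger bodies are rejected with a 413
}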

[Solved] Nginx Error: connect() failed (110: Connection timed out) while connecting to upstream

Background

During a stress test of the application service, Nginx began reporting errors after about one minute of sustained load. It took some time to investigate the cause and finally locate the problem, so the process is summarized here.

Stress-testing tool

The stress test used siege, which makes it easy to specify the number of concurrent users and the duration, and gives very clear result feedback: the number of successful requests, the number of failures, throughput, and other performance results.

Stress-test parameters

A single interface was tested with 100 concurrent users for one minute.
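As a sketch, the siege invocation for such a test looks like this (the URL here is taken from the Nginx logs below; substitute your own target):

# -c sets the number of concurrent users, -t the duration (1M = one minute)
siege -c 100 -t 1M http://xx-qa.xx.com/guide/v1/activities/1107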

Errors reported by the stress-testing tool

The server is now under siege...
[error] socket: unable to connect sock.c:249: Connection timed out
[error] socket: unable to connect sock.c:249: Connection timed out

Errors in the Nginx error.log

2018/11/21 17:31:23 [error] 15622#0: *24993920 connect() failed (110: Connection timed out) while connecting to upstream, client: 192.168.xx.xx, server: xx-qa.xx.com, request: "GET /guide/v1/activities/1107 HTTP/1.1", upstream: "http://192.168.xx.xx:8082/xx/v1/activities/1107", host: "192.168.86.90"

2018/11/21 18:21:09 [error] 4469#0: *25079420 connect() failed (110: Connection timed out) while connecting to upstream, client: 192.168.xx.xx, server: xx-qa.xx.com, request: "GET /guide/v1/activities/1107 HTTP/1.1", upstream: "http://192.168.xx.xx:8082/xx/v1/activities/1107", host: "192.168.86.90"

Troubleshooting

Seeing timed out, the first impression is that the application service has a performance problem and cannot keep up with the concurrent requests. Checking the application service logs, however, showed no errors there.

Observe the CPU load of the application service (docker stats <container-id>): CPU usage rises during the concurrent requests, with no other anomalies, which is normal. Continued observation, however, shows that once the stress test starts producing errors, the application's CPU load drops and no new request logs appear in the application log. For the time being, it can be concluded that the unanswered requests never reach the application, i.e. they stop at the previous node in the chain: Nginx.

Check the TCP connections on the server where Nginx runs during the stress test:

# View the current number of connections on port 80
netstat -nat|grep -i "80"|wc -l
5407

# View the status of the current TCP connection
netstat -na | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
LISTEN 12
SYN_RECV 1
ESTABLISHED 454
FIN_WAIT1 1
TIME_WAIT 5000

Two anomalies stand out in the TCP connections:

There are over 5,000 connections, far more than expected

The number of connections in the TIME_WAIT state reaches 5000 and then stops growing

Start the analysis on these two points:

Theoretically, a stress test with 100 concurrent users should hold only about 100 connections. The reason must be that siege created the 5,000+ connections during the test:

# View siege configuration
vim ~/.siege/siege.conf

# The truth is clear: by default siege closes the connection after each request during a test, then creates a new connection for the next one. That explains why the server running Nginx accumulated more than 5,000 TCP connections during the stress test rather than 100;

# Connection directive. Options "close" and "keep-alive" Starting with
# version 2.57, siege implements persistent connections in accordance 
# to RFC 2068 using both chunked encoding and content-length directives
# to determine the page size. 
#
# To run siege with persistent connections set this to keep-alive. 
#
# CAUTION:        Use the keep-alive directive with care.
# DOUBLE CAUTION: This directive does not work well on HPUX
# TRIPLE CAUTION: We don't recommend you set this to keep-alive
# ex: connection = close
#     connection = keep-alive
#
connection = close

Analysis of TIME_WAIT stopping at 5000. First, find out what the TCP TIME_WAIT state means.

TIME_WAIT: wait long enough to ensure that the remote TCP has received the acknowledgment of its connection-termination request. TCP must ensure that all data is delivered correctly under all possible circumstances. When a socket is closed, the end that actively closes it enters the TIME_WAIT state, while the passively closing end moves to CLOSED; this does guarantee that all data gets transmitted.

Working from this definition of TIME_WAIT: when the stress-testing tool closes a connection, the machine running Nginx does not release its side immediately; the connection enters the TIME_WAIT state. There are plenty of explanations online of packet loss caused by too many TIME_WAIT sockets, which matches exactly what was encountered during this stress test.
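To watch the TIME_WAIT count directly while the test runs, something like this works (ss is the modern replacement for netstat; the state filter is standard ss syntax):

# count sockets currently in TIME_WAIT, refreshing every second
watch -n 1 'ss -tan state time-wait | wc -l'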

# Check the configuration of the server on which Nginx is running
cat /etc/sysctl.conf 
# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

vm.swappiness = 0
net.ipv4.neigh.default.gc_stale_time=120


# see details in https://help.aliyun.com/knowledge_detail/39428.html
net.ipv4.conf.all.rp_filter=0
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_announce=2
net.ipv4.conf.all.arp_announce=2


# see details in https://help.aliyun.com/knowledge_detail/41334.html
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_synack_retries = 2
kernel.sysrq = 1
fs.file-max = 65535
net.ipv4.ip_forward = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_max_syn_backlog = 10240
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_synack_retries = 3
net.ipv4.tcp_syn_retries = 3
net.ipv4.tcp_max_orphans = 8192
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_window_scaling = 0
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.icmp_echo_ignore_all = 0
net.ipv4.tcp_max_tw_buckets = 5000 indicates the maximum number of TIME_WAIT sockets that the system keeps simultaneously. If this number is exceeded, TIME_WAIT sockets are cleared immediately and a warning message is printed. This explains why the TIME_WAIT count plateaued at exactly 5000 during the stress test.

Optimization scheme

Tune the Linux kernel parameters, based on information gathered from online searches:

net.ipv4.tcp_syncookies = 1 enables SYN cookies: when the SYN wait queue overflows, cookies are used to handle the connections, which protects against a small number of SYN attacks. The default is 0, meaning off.

net.ipv4.tcp_tw_reuse = 1 enables reuse, allowing TIME_WAIT sockets to be reused for new TCP connections. The default is 0, meaning off.

net.ipv4.tcp_tw_recycle = 1 turns on fast recycling of TIME_WAIT sockets for TCP connections. The default is 0, meaning off. (Use with care: tcp_tw_recycle is known to break clients behind NAT and was removed entirely in Linux 4.12.)

net.ipv4.tcp_fin_timeout = 30 determines how long a socket stays in the FIN-WAIT-2 state if this end requested the close.

net.ipv4.tcp_keepalive_time = 1200 sets how often TCP sends keepalive messages when keepalive is enabled. The default is 2 hours; change it to 20 minutes.

net.ipv4.ip_local_port_range = 1024 65000 sets the range of ports used for outbound connections. The default is quite small, 32768 to 61000; change it to 1024 to 65000.

net.ipv4.tcp_max_syn_backlog = 8192 sets the length of the SYN queue. The default is 1024; increasing it to 8192 accommodates more network connections waiting to be accepted.

net.ipv4.tcp_max_tw_buckets = 5000 sets the maximum number of TIME_WAIT sockets the system keeps simultaneously. If this number is exceeded, TIME_WAIT sockets are cleared immediately and a warning message is printed. The default is 180000; change it to 5000.
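A minimal sketch of applying such settings, assuming you edit /etc/sysctl.conf directly as root; the values mirror the recommendations above:

# append the tuned values, then reload them without rebooting
cat >> /etc/sysctl.conf <<'EOF'
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.tcp_max_syn_backlog = 8192
EOF
sysctl -p   # apply immediately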

 

Nginx Error: nginx: [error] CreateFile() “D:\nginx-1.20.1/logs/nginx.pid” failed (2: The system cannot find the file specified)

background

After modifying the Nginx configuration file nginx.conf, you want to restart Nginx so that the configuration takes effect. Open cmd, enter the Nginx installation directory, and run the command: nginx -s reload

Error: nginx: [error] CreateFile() "D:\nginx-1.20.1/logs/nginx.pid" failed (2: The system cannot find the file specified)

Cause analysis

The direct cause is that there is no nginx.pid file in the logs directory under the Nginx installation directory. Normally, nginx.pid is created automatically when Nginx starts.

To restart Nginx, the original Nginx process must be signaled (killed); that process is found via nginx.pid, where the PID is the ID of the original master process.

The usual causes are that Nginx was never started, or that nginx.pid was deleted manually. On Windows, starting Nginx by double-clicking the executable sometimes fails to generate nginx.pid in the logs directory.

If Nginx is not started (or you are not sure whether it is), do not use nginx -s reload as the start command; that is exactly what triggers this error.

 

Solution:

1. Find the original nginx process (if any) in Task Manager and close it, then start nginx from cmd

2. In cmd, change into the nginx installation directory

3. Enter the command start nginx (or nginx.exe) to start nginx

4. Check whether the logs directory under the nginx directory now contains an nginx.pid file; if not, repeat steps 1-3

5. Once nginx.pid has been generated, you can restart normally after modifying the configuration file:

6. nginx -s reload
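The whole sequence as cmd commands, as a sketch (the install path is the one from the error message; adjust as needed):

cd /d D:\nginx-1.20.1
start nginx
rem confirm that logs\nginx.pid now exists, then reload after config changes:
nginx -s reload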

Nginx Error: too many open files [How to Solve]

System access became abnormal. Logging in to the backend and checking the Nginx logs, the error.log was full of "too many open files" errors.
In this case, there are generally two things to check:
1. The Linux open-file-handle limit
ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 63455
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
The default value of open files is 1024. You can increase the number in two ways:
1. Execute the command:
ulimit -n 65535
This takes effect immediately but is lost after a reboot.
2. Modify the system configuration file:
vim /etc/security/limits.conf
Add at the end of the file:
* soft nofile 65535
* hard nofile 65535
2. Raise the Nginx limit: add to nginx.conf
worker_rlimit_nofile 65535
then reload Nginx (nginx -s reload)
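For orientation, a sketch of where these directives sit in nginx.conf (worker_rlimit_nofile belongs in the main context, outside any block):

worker_rlimit_nofile 65535;

events {
    worker_connections 65535;   # should not exceed worker_rlimit_nofile
}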
lsof:
short for list open files: it lists open files, including regular open files and established connections (TCP/UDP, etc.)
The commonly used parameters are:
1. lsof abc.txt: show the process that has the file abc.txt open
2. lsof -c abc: show the files currently opened by processes whose names begin with abc
3. lsof -p 1234: list the files opened by the process with PID 1234
4. lsof -g gname/gid: display the process status for gname or gid
5. lsof -u uname/uid: display the process status for uname or uid
6. lsof +d /usr/local/: display files opened by processes in that directory
7. lsof +D /usr/local/: same as above, but also searches subdirectories, which takes much longer
8. lsof -d 4: display processes using file descriptor 4
9. lsof -i: display network connections
10. lsof -i[46] [protocol][@hostname|hostaddr][:service|port]
Nginx tuning:
In the Nginx configuration file:
1. worker_processes 8: generally set to the number of CPU cores.
2. worker_cpu_affinity: pins each worker process to a CPU.
3. worker_rlimit_nofile 65535: the maximum number of file descriptors an Nginx process may open. In theory this should be the maximum number of open files (ulimit -n) divided by the number of Nginx processes, but Nginx does not distribute requests that evenly, so it is best to keep it equal to the value of ulimit -n.
4. use epoll: epoll is the I/O event model, an efficient multiplexing implementation in the Linux kernel.
5. worker_connections 65536: the maximum number of connections allowed per worker process.
6. keepalive_timeout 30: keepalive timeout, in seconds.
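Put together, a sketch of the tuning directives above in nginx.conf; the values are illustrative, not prescriptive:

worker_processes 8;             # usually the number of CPU cores
worker_cpu_affinity auto;       # pin each worker to its own CPU
worker_rlimit_nofile 65535;     # per-worker open-file limit

events {
    use epoll;                  # efficient event model on Linux
    worker_connections 65536;   # max connections per worker
}

http {
    keepalive_timeout 30;       # seconds
}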

Nginx Error: Job for nginx.service failed because the control process exited with error code

Job for nginx.service failed because the control process exited with error code

View Status

systemctl status nginx.service

nginx.service – The nginx HTTP and reverse proxy server
Loaded: loaded (/usr/lib/systemd/system/nginx.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Fri 2021-01-15 20:18:19 CST; 2min 1s ago
Process: 439 ExecStart=/usr/sbin/nginx (code=exited, status=1/FAILURE)
Process: 437 ExecStartPre=/usr/sbin/nginx -t (code=exited, status=0/SUCCESS)
Process: 436 ExecStartPre=/usr/bin/rm -f /run/nginx.pid (code=exited, status=0/SUCCESS)

Jan 15 20:18:18 iZ2zehly2ejxq9dlkzklw9Z nginx[439]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Jan 15 20:18:18 iZ2zehly2ejxq9dlkzklw9Z nginx[439]: nginx: [emerg] bind() to [::]:80 failed (98: Address already in use)
Jan 15 20:18:18 iZ2zehly2ejxq9dlkzklw9Z nginx[439]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Jan 15 20:18:18 iZ2zehly2ejxq9dlkzklw9Z nginx[439]: nginx: [emerg] bind() to [::]:80 failed (98: Address already in use)
Jan 15 20:18:19 iZ2zehly2ejxq9dlkzklw9Z nginx[439]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Jan 15 20:18:19 iZ2zehly2ejxq9dlkzklw9Z nginx[439]: nginx: [emerg] bind() to [::]:80 failed (98: Address already in use)
Jan 15 20:18:19 iZ2zehly2ejxq9dlkzklw9Z nginx[439]: nginx: [emerg] still could not bind()
Jan 15 20:18:19 iZ2zehly2ejxq9dlkzklw9Z systemd[1]: nginx.service: Control process exited, code=exited status=1
Jan 15 20:18:19 iZ2zehly2ejxq9dlkzklw9Z systemd[1]: nginx.service: Failed with result ‘exit-code’.
Jan 15 20:18:19 iZ2zehly2ejxq9dlkzklw9Z systemd[1]: Failed to start The nginx HTTP and reverse proxy server.

Problem

nginx: [emerg] bind() to 0.0.0.0:80 failed

 

Port 80 is already occupied by another process; find its PID and kill it (kill <pid>).
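A sketch of the lookup, assuming lsof is installed:

lsof -i :80    # list the processes bound to port 80
kill <pid>     # replace <pid> with the PID shown by lsof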

 

Restart nginx

systemctl start nginx.service

How to Solve Nginx Error 13: Permission denied) while connecting to upstream

When using Nginx for load balancing, the following problem appears:

connect() to 10.51.2.237:8084 failed (13: Permission denied) while connecting to upstream, client: 10.51.2.237, server:

The local firewall was confirmed to be shut down. After some digging, the cause turned out to be SELinux.

SELinux is a mandatory access control (MAC) system introduced in Linux kernel version 2.6. It is a built-in security system, whereas firewalls are external.

So: there are two solutions

1. Disable SELinux

1. Temporarily (no need to reboot):

setenforce 0 # set SELinux to permissive mode

# setenforce 1 sets SELinux back to enforcing mode

2. Permanently, by modifying the configuration file and rebooting:

Edit the /etc/selinux/config file

Change SELINUX=enforcing to SELINUX=disabled

Then reboot the machine
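The same permanent change from the command line, as a sketch (assumes the stock /etc/selinux/config layout):

getenforce    # shows Enforcing, Permissive, or Disabled
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
reboot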

2. Or keep SELinux enabled and allow httpd to make network connections:

setsebool -P httpd_can_network_connect 1 # -P makes the boolean persist across reboots

Nginx Error: nginx: [emerg] “server” directive is not allowed here in

If you try to check whether a single configuration file is correct by specifying it with -c, an error is reported, as follows:

root@op-2:~# nginx -t -c /etc/nginx/conf.d/default.conf
nginx: [emerg] "server" directive is not allowed here in /etc/nginx/conf.d/default.conf:1
nginx: configuration file /etc/nginx/conf.d/default.conf test failed

Sometimes the file itself is correct, yet the error is still reported

The real problem is the target of the syntax check

To check a modified Nginx configuration for errors, do not test an individual .conf file on its own; always test the main file /etc/nginx/nginx.conf. Files under conf.d are included inside the http block of the main file, so a standalone server directive at the top of a .conf file is invalid by itself; only via the main file are the .conf files loaded in their proper modules

This verifies the correctness of the configuration in its full context, before and after, which is the real check

So the correct command to check whether the modified Nginx configuration has syntax errors is nginx -t -c /etc/nginx/nginx.conf. If there is a problem in the configuration file it is reported directly; otherwise:

root@op-2:~# nginx -t -c /etc/nginx/nginx.conf
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

root@op-2:~# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

Nginx Error when installing: the startup service failed to start A high performance web server and a reverse proxy server

Environment: Ubuntu 16.04

apt update
apt install -y nginx
service nginx start

Error:

root@zabbix:/home/appliance# systemctl status nginx.service

nginx.service - A high performance web server and a reverse proxy server
   Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Wed 2018-07-25 18:33:26 UTC; 1min 27s ago
  Process: 30040 ExecStart=/usr/sbin/nginx -g daemon on; master_process on; (code=exited, status=1/FAILURE)
  Process: 30037 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=0/SUCCESS)

Jul 25 18:33:25 zabbix nginx[30040]: nginx: [emerg] listen() to [::]:80, backlog 511 failed (98: Address already in use)
Jul 25 18:33:25 zabbix nginx[30040]: nginx: [emerg] listen() to 0.0.0.0:80, backlog 511 failed (98: Address already in use)
Jul 25 18:33:25 zabbix nginx[30040]: nginx: [emerg] listen() to [::]:80, backlog 511 failed (98: Address already in use)
Jul 25 18:33:26 zabbix nginx[30040]: nginx: [emerg] listen() to 0.0.0.0:80, backlog 511 failed (98: Address already in use)
Jul 25 18:33:26 zabbix nginx[30040]: nginx: [emerg] listen() to [::]:80, backlog 511 failed (98: Address already in use)
Jul 25 18:33:26 zabbix nginx[30040]: nginx: [emerg] still could not bind()
Jul 25 18:33:26 zabbix systemd[1]: nginx.service: Control process exited, code=exited status=1

Jul 25 18:33:26 zabbix systemd[1]: Failed to start A high performance web server and a reverse proxy server.

Jul 25 18:33:26 zabbix systemd[1]: nginx.service: Unit entered failed state.
Jul 25 18:33:26 zabbix systemd[1]: nginx.service: Failed with result 'exit-code'.

Error reason: another process is already bound to HTTP port 80. You can run sudo lsof -i :80 to get the list of processes using this port, and then stop/disable that web server.
Solution: run sudo fuser -k 80/tcp to stop the process using port 80.
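Put together, the recovery sequence is, as a sketch:

sudo lsof -i :80          # see what is holding port 80
sudo fuser -k 80/tcp      # kill it
sudo service nginx start  # start Nginx again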