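nginx 50x Failure Analysis
I recently went through a series of nginx 50x errors. This post summarizes how to handle them and the likely root cause of each.
What to have on hand before handling an error
1 Code release times
2 php error log
3 nginx access log
4 nginx error log
5 Per-endpoint response-time log
Error handling process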
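1. Check whether anyone has just released code. Compare the incident timeline with the release timeline; if the two line up precisely, the release is almost certainly the cause, and rolling back is usually the most direct and effective fix.
2. Test the endpoint on the production test server. That server receives little traffic, so side effects of heavy nginx load are ruled out, and you can directly check whether the backend storage servers are failing.
3. Dig useful information out of the logs.
3.1 php log: check for a large number of php error messages.
3.2 nginx log: pin down the moment the endpoint started returning 50x errors in bulk.
3.3 Endpoint request-time log (recorded by your own code): check whether request times look abnormal.
3.4 Use tools such as xhprof to analyze where slow requests spend their time.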
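Steps 1 and 3.2 can be partly automated. The following is a minimal sketch, assuming the access log uses the combined format shown in the samples in this post; the function names, the spike threshold, and the 10-minute window are all illustrative choices, not part of nginx.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Picks the timestamp and status code out of a combined-format access log line.
LINE_RE = re.compile(r'\[(?P<ts>[^\]]+)\] "[^"]*" (?P<status>\d{3}) ')


def errors_per_minute(lines):
    """Count 50x responses per minute (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m or not m.group("status").startswith("50"):
            continue
        ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        counts[ts.replace(second=0)] += 1
    return counts


def first_spike(counts, threshold):
    """Return the earliest minute whose 50x count reaches the threshold, or None."""
    spikes = sorted(minute for minute, n in counts.items() if n >= threshold)
    return spikes[0] if spikes else None


def deploys_before(spike, deploy_times, window=timedelta(minutes=10)):
    """Return the releases that happened within `window` before the spike."""
    return [d for d in deploy_times if timedelta(0) <= spike - d <= window]
```

If `deploys_before` returns a release, that release is the first rollback candidate; an empty list suggests moving on to steps 2 and 3.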
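50x cause analysis:
What you need to know before the analysis
1. php.ini
2. php-fpm.conf (a page containing <?php phpinfo(); shows "Loaded Configuration File", which is the path of php.ini; php-fpm.conf is often in the etc directory of the same installation. From the shell: php -i | grep PATH | grep php; cd ../etc # locate the directory holding php-fpm.conf)
3. nginx.conf
504:
1. php-fpm did not return a response within nginx's upstream timeout (fastcgi_read_timeout in nginx.conf, 60s by default; keepalive_timeout only governs idle client connections and does not produce a 504)
2. Too few php-fpm processes are configured, so a burst of requests hits pm.max_children in php-fpm.conf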
pm = dynamic
; The number of child processes to be created when pm is set to 'static' and the
; maximum number of child processes to be created when pm is set to 'dynamic'.
; This value sets the limit on the number of simultaneous requests that will be
; served. Equivalent to the ApacheMaxClients directive with mpm_prefork.
; Equivalent to the PHP_FCGI_CHILDREN environment variable in the original PHP
; CGI.
; Note: Used when pm is set to either 'static' or 'dynamic'
; Note: This value is mandatory.
pm.max_children = 4096
; The number of child processes created on startup.
; Note: Used only when pm is set to 'dynamic'
; Default Value: min_spare_servers + (max_spare_servers - min_spare_servers) / 2
pm.start_servers = 768
; The desired minimum number of idle server processes.
; Note: Used only when pm is set to 'dynamic'
; Note: Mandatory when pm is set to 'dynamic'
pm.min_spare_servers = 512
Of these, I consider max_children the most important parameter: in dynamic mode it is the maximum number of php-fpm child processes.
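A common rule of thumb, not something stated in the php-fpm documentation, is to bound pm.max_children by the memory left for PHP divided by the memory one child typically uses. The sketch below only does that arithmetic; the function names and every number fed into them are hypothetical and should be replaced with values measured on the actual server.

```python
def estimate_max_children(available_mb, per_child_mb):
    """Rule-of-thumb upper bound for pm.max_children (an assumption, not an official formula)."""
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    return max(1, int(available_mb // per_child_mb))


def memory_needed_mb(max_children, per_child_mb):
    """Memory required if every child is alive at once."""
    return max_children * per_child_mb
```

For example, with an assumed 30 MB per child, `memory_needed_mb(4096, 30)` comes to 122880 MB, so a value as large as 4096 only makes sense on a machine with memory to match.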
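3. Requests time out while queued
Once php-fpm has reached its process limit, new requests from nginx wait in a queue (the php-fpm listen backlog). If no php-fpm process becomes free before nginx's upstream timeout (fastcgi_read_timeout) expires, nginx returns 504.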
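The queueing effect can be estimated with simple arithmetic. This is a rough sketch under strong assumptions: every request takes the same average time, all workers are busy, and the queue drains in order. The function and parameter names are illustrative, and the 60-second default is an assumed timeout value.

```python
def capacity_rps(max_children, avg_request_seconds):
    """Requests per second the pool can sustain when every worker is busy."""
    return max_children / avg_request_seconds


def queue_wait_seconds(queued, max_children, avg_request_seconds):
    """Approximate time for `queued` requests ahead of us to drain."""
    return queued * avg_request_seconds / max_children


def will_time_out(queued, max_children, avg_request_seconds, timeout_seconds=60.0):
    """True if waiting in the queue plus our own run time exceeds the timeout."""
    wait = queue_wait_seconds(queued, max_children, avg_request_seconds)
    return wait + avg_request_seconds > timeout_seconds
```

With 100 workers and 0.5 s per request the pool sustains about 200 requests per second; anything arriving faster accumulates in the queue, and the wait grows until requests start exceeding the timeout.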
504 access & error log
2013/08/14 20:48:31 [error] 20370#0: *1948283 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 127.0.0.1, server: lv.com.cn, request: "GET /a.php HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "lv.com"
127.0.0.1 - - [14/Aug/2013:20:48:31 +0800] "GET /a.php HTTP/1.1" 504 183 "-" "curl/7.15.5 (x86_64-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5"
502 cause analysis: a 502 error is generally not caused by nginx itself.
1. php-fpm request_terminate_timeout exceeded
request_terminate_timeout sets the maximum time a single php script may run. When a script exceeds it, the php-fpm process manager forcibly terminates the process, which closes the fastcgi connection to nginx, and nginx then logs a Connection reset by peer error.
Warning: max_execution_time in php.ini generally has no effect under fpm, because it does not count time spent waiting on network requests or system calls. Web requests mostly wait on the database and rarely do pure computation, so they seldom reach that limit.
Blindly raising request_terminate_timeout does not solve the problem. For a production request, 1s is already very long, so when a timeout occurs it is better to find out which step is slow and optimize it.
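To find which requests deserve attention first, the self-recorded request-time log can be summarized per endpoint. The sketch below assumes the log has already been parsed into (endpoint, seconds) pairs; that record shape and the function name are hypothetical, since the log format is whatever your own code writes.

```python
from collections import defaultdict


def slow_endpoints(records, limit_seconds=1.0):
    """Rank endpoints by how many requests exceeded `limit_seconds`.

    `records` is an iterable of (endpoint, seconds) pairs.
    """
    stats = defaultdict(lambda: {"total": 0, "slow": 0, "max": 0.0})
    for endpoint, seconds in records:
        entry = stats[endpoint]
        entry["total"] += 1
        entry["max"] = max(entry["max"], seconds)
        if seconds > limit_seconds:
            entry["slow"] += 1
    # Keep only endpoints with at least one slow request, worst first.
    offenders = [(e, s) for e, s in stats.items() if s["slow"]]
    return sorted(offenders, key=lambda item: item[1]["slow"], reverse=True)
```

The endpoints at the top of the list are the ones to profile with a tool such as xhprof.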
2. php-fpm process crashes
It is actually fairly hard to write php code that makes php-fpm segfault or exit unexpectedly. In most cases the cause is a bug in some extension (I ran into this with redis->pconnect, see the notice below).
PHP Notice:  Redis::setex(): send of 869 bytes failed with errno=32 Broken pipe in /data/home/xxx.php on line 43
3. memory_limit in php.ini is too small
4. client_header_buffer_size or fastcgi_buffer_size in nginx.conf is too small
nginx error log: upstream sent too big header while reading response header from upstream (the response header from php-fpm did not fit in fastcgi_buffer_size)
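The error-log messages quoted in this post map fairly directly onto the causes listed, so a first-pass triage can be scripted. This is a minimal sketch: the substrings are taken from the log lines in this post, while the function name and the wording of each cause are my own summary.

```python
# (substring in the nginx error log, likely cause according to this post)
SIGNATURES = [
    ("upstream timed out",
     "504: php-fpm did not answer in time (slow script or all workers busy)"),
    ("Connection reset by peer",
     "502: php-fpm closed the connection (request_terminate_timeout or a crashed worker)"),
    ("upstream sent too big header",
     "502: response header larger than the fastcgi buffer"),
]


def classify(error_line):
    """Return the likely cause for one nginx error log line, or 'unknown'."""
    for needle, cause in SIGNATURES:
        if needle in error_line:
            return cause
    return "unknown"
```

Counting the results of `classify` over the error log shows which cause dominates an incident, which tells you whether to look at slow scripts, php-fpm limits, or nginx buffers first.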
502 access & error log
2013/08/23 17:14:26 [error] 20370#0: *2529767 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 10.18.128.37, server: lv.com.cn, request: "GET /a.php HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "lv.com"
10.18.128.37 - - [23/Aug/2013:17:14:26 +0800] "GET /a.php HTTP/1.1" 502 575 "-" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17"
References
http://blog.xiuwz.com/2012/09/25/php-max-execution-time-internal/
http://www.cnblogs.com/zhengyun_ustc/archive/2013/06/06/3120967.html
Reposted from: https://www.cnblogs.com/codesay/p/3278717.html