Learning the nginx configuration file

Reposted from: http://blog.csdn.net/na_tion/article/details/17527957

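The nginx configuration file is divided into six main areas:

main section, events section, http section, server section, location section, upstream section.

main module:

Mainly controls the user/group the worker processes run as, the number of worker processes spawned, the error log location and level, the pid file location, worker process priority, the CPUs each process is bound to, and the number of file descriptors a process may open.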

user kingnet kingnet;
worker_processes 4;
error_log logs/error.log notice;
pid logs/nginx.pid;
# worker_priority -5;
# worker_cpu_affinity 0001 0010 0100 1000;
worker_rlimit_nofile 1024;
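Note: worker_cpu_affinity binds worker processes to CPUs. In the configuration above, the first process is bound to cpu0, the second to cpu1, the third to cpu2 and the fourth to cpu3. One process can also be bound to several CPUs; for example, 0101 binds a process to cpu0 and cpu2.
worker_rlimit_nofile sets the number of file descriptors each worker may open; it is usually set to the same value as worker_connections.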
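To make the bitmask notation concrete, here is a small Python sketch that decodes worker_cpu_affinity masks into CPU indices. The helper name cpus_from_mask is made up for illustration; it only models how the masks are read (rightmost bit is cpu0) and is not part of nginx.

```python
def cpus_from_mask(mask: str) -> list[int]:
    """Return the CPU indices selected by one worker_cpu_affinity bitmask."""
    # The rightmost bit stands for cpu0, so walk the string from the right.
    return [i for i, bit in enumerate(reversed(mask)) if bit == "1"]


# One mask per worker, as in "worker_cpu_affinity 0001 0010 0100 1000;"
masks = "0001 0010 0100 1000".split()
for worker, mask in enumerate(masks):
    print(f"worker {worker}: cpus {cpus_from_mask(mask)}")

# A single worker bound to two CPUs
print(cpus_from_mask("0101"))
```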

event module: controls how nginx handles connections.

events {
use epoll;
worker_connections 1024;
}
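Note: use selects the network I/O model; epoll is far more efficient than select.
worker_connections: the maximum number of connections each worker can handle, limited by the value of ulimit -n.
Concurrent connections for nginx: fewer than worker_processes * worker_connections.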
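The upper bound above is simple arithmetic, sketched here in Python. The function name max_clients is hypothetical, and halving the figure for a reverse proxy is a common rule of thumb (each client also needs a connection to the backend), not something stated in this article.

```python
def max_clients(worker_processes: int, worker_connections: int,
                reverse_proxy: bool = False) -> int:
    """Upper bound on concurrent client connections."""
    total = worker_processes * worker_connections
    # Assumption: when proxying, each client also uses one upstream connection.
    return total // 2 if reverse_proxy else total


print(max_clients(4, 1024))
print(max_clients(4, 1024, reverse_proxy=True))
```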
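The HTTP Access module provides simple access control based on the client's IP address. With it you can allow or deny specific IP addresses or address ranges access to particular virtual hosts or directories.
allow directive
Syntax: allow [address|CIDR|all]
Context: http, server, location
Purpose: allows the given IP address or address range to access particular virtual hosts or directories
deny directive
Syntax: deny [address|CIDR|all]
Context: http, server, location
Purpose: denies the given IP address or address range access to particular virtual hosts or directories
Matching rule
Rules are checked in the order they are declared, and the first rule that matches the client IP is applied.
Example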
location / {
deny 192.168.1.1;
allow 192.168.1.0/24;
deny all;
}
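Explanation:
1. Deny access from the IP address 192.168.1.1.
2. Allow access from the range 192.168.1.0/24; because 192.168.1.1 matches the deny rule first, it still cannot get in.
3. Any IP address that matches neither rule 1 nor rule 2 is denied.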
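The first-match behaviour can be modelled in a few lines of Python using the standard ipaddress module. This is only a sketch of the evaluation order described above; RULES and check_access are names invented for the illustration.

```python
import ipaddress

# The three rules from the example, in declaration order.
RULES = [
    ("deny", "192.168.1.1"),
    ("allow", "192.168.1.0/24"),
    ("deny", "all"),
]


def check_access(client_ip: str, rules=RULES) -> str:
    """Return the action of the first rule that matches client_ip."""
    ip = ipaddress.ip_address(client_ip)
    for action, target in rules:
        if target == "all" or ip in ipaddress.ip_network(target):
            return action  # first match wins, later rules are ignored
    return "allow"  # no rule matched, so nothing blocks the request


for addr in ("192.168.1.1", "192.168.1.20", "10.0.0.5"):
    print(addr, check_access(addr))
```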
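http core mainly controls how client requests are handled.
Main parameters:
sendfile on; copies data between file descriptors inside the kernel
tcp_nopush on; only takes effect when sendfile is on
keepalive_timeout 60; timeout for keep-alive connections (one connection can carry several requests)
tcp_nodelay on; only takes effect on keep-alive connections
client_body_buffer_size 128k; buffer size for the request body
client_max_body_size 100m; maximum allowed size of the request body
client_header_buffer_size 64k; buffer size for the request header
large_client_header_buffers 4 64k; number and size of the buffers used for large request headers
server_tokens off; hides the nginx version number
server_names_hash_max_size 1024; maximum size of the server name hash table

server_names_hash_bucket_size 256; bucket size of the server name hash table

Note: size the hash table according to /sys/devices/system/cpu/cpu0/cache/index1/size; the value is usually a multiple of it.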

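server_name: matched against the Host header of the HTTP request.
Names are matched in this order of priority:
Exact names
Names beginning with a wildcard: *.example.com
Names ending with a wildcard: www.example.*
Names given as regular expressions.
If nothing matches, the following priority applies:
The server block whose listen directive is marked default (default_server in current nginx versions)
The first server block with a matching listen directive.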
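error_page: assigns an error page to the given error codes
error_page 401 402 403 404 /40x.html;
On a 401, 402, 403 or 404 error the request is redirected internally to /40x.html; where that page is found depends on the location rules.
A dedicated location is usually defined for the error page, for example:
location = /40x.html {
root html; # look for the page in the html directory
}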

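location: matches against the URI.

Syntax: location [=|~|~*|^~] /uri/ { … }
= at the start means an exact match
^~ at the start means the URI begins with the given literal string; think of it as matching a URL path prefix. nginx matches against the decoded URI, so a request for /static/%20/aa can be matched by the rule ^~ "/static/ /aa" (note the space).
~ at the start means a case-sensitive regular expression match
~* at the start means a case-insensitive regular expression match
!~ and !~* are the negated case-sensitive and case-insensitive regular expression tests; nginx accepts them in if conditions, not as location modifiers.
/ is the catch-all match; every request matches it.
With several locations configured, the matching order is (taken from reference material and not yet verified in practice, so treat it as a guide and try it yourself):
= matches are tried first, then ^~ matches, then regular expressions in the order they appear in the file, and finally the / catch-all. Matching stops at the first success, and the request is handled by that rule.
Example, given the following rules: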
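location = / {
   # Rule A
}
location = /login {
   # Rule B
}
location ^~ /static/ {
   # Rule C
}
location ~ \.(gif|jpg|png|js|css)$ {
   # Rule D
}
location ~* \.png$ {
   # Rule E
}
location !~ \.xhtml$ {
   # Rule F (illustrative only: nginx rejects !~ as a location modifier)
}
location !~* \.xhtml$ {
   # Rule G (illustrative only: nginx rejects !~* as a location modifier)
}
location / {
   # Rule H
}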
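This produces the following results:
A request for the root /, such as http://localhost/, matches rule A.
http://localhost/login matches rule B; http://localhost/register matches rule H.
http://localhost/static/a.html matches rule C.
http://localhost/a.gif and http://localhost/b.jpg match rule D. http://localhost/a.png matches both rule D and rule E, but D comes first, so E is not used. http://localhost/static/c.png matches rule C first.
http://localhost/a.PNG matches rule E rather than rule D, because only rule E is case-insensitive.
http://localhost/a.xhtml is excluded by both rule F and rule G, and http://localhost/a.XHTML is excluded by rule G because it is case-insensitive. Rules F and G are negated matches, meant to exclude the URIs that fit the pattern; since nginx does not accept !~ or !~* in a location, express such exclusions inside an if condition instead.
http://localhost/category/id/1111 ends up at rule H because none of the rules above match. At this point nginx would normally forward the request to a backend application server such as FastCGI (PHP) or tomcat (JSP), with nginx acting as a reverse proxy.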
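The selection order can be simulated with a short Python sketch. It models the documented nginx behaviour (exact match, then longest prefix, with ^~ suppressing the regex pass, then regexes in file order) for rules A to E and H; rules F and G are left out because nginx does not accept !~ as a location modifier. LOCATIONS and match_location are illustrative names, and Python's re stands in for PCRE.

```python
import re

# (modifier, pattern, rule name); an empty modifier is a plain prefix location.
LOCATIONS = [
    ("=", "/", "A"),
    ("=", "/login", "B"),
    ("^~", "/static/", "C"),
    ("~", r"\.(gif|jpg|png|js|css)$", "D"),
    ("~*", r"\.png$", "E"),
    ("", "/", "H"),
]


def match_location(uri, locations=LOCATIONS):
    """Pick the rule that would handle uri."""
    # 1. An exact match ends the search immediately.
    for mod, pat, name in locations:
        if mod == "=" and uri == pat:
            return name
    # 2. Remember the longest matching prefix location.
    best = None
    for mod, pat, name in locations:
        if mod in ("", "^~") and uri.startswith(pat):
            if best is None or len(pat) > len(best[1]):
                best = (mod, pat, name)
    # A ^~ prefix skips the regular expression pass.
    if best and best[0] == "^~":
        return best[2]
    # 3. Regular expressions are tried in file order; the first hit wins.
    for mod, pat, name in locations:
        if mod in ("~", "~*"):
            flags = re.IGNORECASE if mod == "~*" else 0
            if re.search(pat, uri, flags):
                return name
    # 4. Otherwise fall back to the remembered prefix.
    return best[2] if best else None


for uri in ("/", "/login", "/static/c.png", "/a.gif", "/a.PNG", "/category/id/1111"):
    print(uri, "->", match_location(uri))
```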

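So in practice, I think at least three location rules should be defined, as follows:
# Match the site root directly. The home page is requested often via the domain name, and the official docs say this speeds things up.
# Here the request is forwarded straight to the backend application server; it could also be a static home page.
# First required rule
location = / {
    proxy_pass http://tomcat:8080/index;
}
# The second required rule handles static file requests, which is where nginx is strong as an HTTP server.
# There are two styles, matching by directory or by suffix; use either one or both.
location ^~ /static/ {
    root /webroot/static/;
}
location ~* \.(gif|jpg|jpeg|png|css|js|ico)$ {
    root /webroot/res/;
}
# The third rule is the general one, which forwards dynamic requests to the backend application server.
# Anything that is not a static file request is treated as dynamic by default; adjust to your own situation.
# With today's popular frameworks, URLs ending in .php or .jsp are rare.
location / {
    proxy_pass http://tomcat:8080/;
}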

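When proxy_pass is used in a location matched by a ^~ path prefix, pay attention to the trailing / on the proxy_pass URL. With the trailing /, the URL acts like an absolute root path, and nginx does not pass the matched part of the location path to the backend; without it, the matched part is passed along as well.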

location ^~ /static_js/
{
proxy_cache js_cache;
proxy_set_header Host js.test.com;
proxy_pass http://js.test.com/;
}
With the configuration above, a request for http://servername/static_js/test.html
is proxied to http://js.test.com/test.html
Whereas with this configuration
location ^~ /static_js/
{
proxy_cache js_cache;
proxy_set_header Host js.test.com;
proxy_pass http://js.test.com;
}
the request is proxied to http://js.test.com/static_js/test.html
Of course, the effect of the trailing / can also be achieved with a rewrite like this:
location ^~ /static_js/
{
proxy_cache js_cache;
proxy_set_header Host js.test.com;
rewrite /static_js/(.+)$ /$1 break;
proxy_pass http://js.test.com;
}
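The trailing-slash rule can be written down as a small Python function. This is a sketch of the mapping for prefix locations only, under the assumption that no rewrite has changed the URI; upstream_url is a hypothetical helper, not an nginx API.

```python
from urllib.parse import urlsplit


def upstream_url(location_prefix: str, proxy_pass: str, request_uri: str) -> str:
    """Model the URL nginx requests from the backend for a prefix location."""
    parts = urlsplit(proxy_pass)
    base = f"{parts.scheme}://{parts.netloc}"
    if parts.path == "":
        # proxy_pass has no URI part: the request URI is passed unchanged.
        return base + request_uri
    # proxy_pass has a URI part (even just "/"): it replaces the matched prefix.
    return base + parts.path + request_uri[len(location_prefix):]


print(upstream_url("/static_js/", "http://js.test.com/", "/static_js/test.html"))
print(upstream_url("/static_js/", "http://js.test.com", "/static_js/test.html"))
```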

Some available global variables:

$args
$content_length
$content_type
$document_root
$document_uri
$host
$http_user_agent
$http_cookie
$limit_rate
$request_body_file
$request_method
$remote_addr
$remote_port
$remote_user
$request_filename
$request_uri
$query_string
$scheme
$server_protocol
$server_addr
$server_name
$server_port
$uri

For example: http://localhost:88/test1/test2/test.php

$host: localhost
$server_port: 88
$request_uri: /test1/test2/test.php
$document_uri: /test1/test2/test.php
$document_root: D:\nginx/html
$request_filename: D:\nginx/html/test1/test2/test.php

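Rewrite syntax

The nginx rewrite format is: rewrite regex replacement flag
rewrite can be placed in server, location and if blocks.

Order in which nginx executes rewrite directives:

1. Execute the rewrite directives of the server block (a block here is the region enclosed in {} after the server keyword; the same goes for other blocks)
2. Perform location matching
3. Execute the rewrite directives in the selected location
If the URI is rewritten at any step, steps 1-3 are repeated until a file that actually exists is found.
If the cycle runs more than 10 times, a 500 Internal Server Error is returned.

The flag takes one of four values:
last – equivalent to the L flag in Apache
break – stops rewrite processing; no further rewrite rules are matched
redirect – returns a temporary redirect with HTTP status 302, equivalent to R in Apache
permanent – returns a permanent redirect with HTTP status 301, equivalent to R=301 in Apache

The following tests can be used in if conditions:
-f and !-f test whether a file exists
-d and !-d test whether a directory exists
-e and !-e test whether a file or directory exists
-x and !-x test whether a file is executable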

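The nginx upstream module supports the following ways of distributing requests:

1) Round robin (default)
Requests are assigned to the backend servers one by one in order of arrival; a backend that goes down is removed automatically.
2) weight
Sets the round-robin weighting; weight is proportional to the share of requests and is used when backend servers differ in performance. The default is 1.
3) ip_hash
Each request is assigned by a hash of the client IP, so a given visitor always reaches the same backend server, which can solve the session problem.
4) fair (third party)
Requests are assigned by backend response time, with faster-responding servers served first.

5) url_hash (third party): requests are assigned by a hash of the URL, so each URL is always directed to the same backend server.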

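Add inside the http block:
# Define the IPs and states of the load-balanced servers
upstream myServer {
server 127.0.0.1:9090 down;
server 127.0.0.1:8080 weight=2;
server 127.0.0.1:6060;
server 127.0.0.1:7070 backup;
}
# Add under the server block that should use load balancing
proxy_pass http://myServer;
# One server entry can take several parameters, separated by spaces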

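State of each server in upstream:
down: the server is temporarily excluded from load balancing
weight: defaults to 1; the larger the weight, the larger the share of the load.
max_fails: the number of failed requests allowed, 1 by default. When the maximum is exceeded, the error defined by proxy_next_upstream is returned.
fail_timeout: how long the server is paused after max_fails failures.
backup: receives requests only when all non-backup servers are down or busy, so this machine carries the lightest load.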
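How weight, down and backup interact can be illustrated with a Python sketch of a smooth weighted round robin over the myServer entries shown earlier. It is a simplified model, not nginx's implementation: max_fails and fail_timeout are ignored, and SERVERS and pick_sequence are names chosen for this example.

```python
from collections import Counter

# The myServer upstream from the example, with default weight 1 filled in.
SERVERS = [
    {"addr": "127.0.0.1:9090", "weight": 1, "down": True, "backup": False},
    {"addr": "127.0.0.1:8080", "weight": 2, "down": False, "backup": False},
    {"addr": "127.0.0.1:6060", "weight": 1, "down": False, "backup": False},
    {"addr": "127.0.0.1:7070", "weight": 1, "down": False, "backup": True},
]


def pick_sequence(servers, n):
    """Return the backends chosen for n consecutive requests."""
    # Servers marked down never take part; backups only if nothing else is left.
    pool = [s for s in servers if not s["down"] and not s["backup"]]
    if not pool:
        pool = [s for s in servers if s["backup"] and not s["down"]]
    if not pool:
        return []
    current = {s["addr"]: 0 for s in pool}
    total = sum(s["weight"] for s in pool)
    picks = []
    for _ in range(n):
        # Every server gains its weight; the highest score is served next.
        for s in pool:
            current[s["addr"]] += s["weight"]
        chosen = max(pool, key=lambda s: current[s["addr"]])
        current[chosen["addr"]] -= total
        picks.append(chosen["addr"])
    return picks


counts = Counter(pick_sequence(SERVERS, 6))
print(counts)
```

Over six requests the weight=2 server is picked twice as often as the weight=1 server, while the down and backup entries receive nothing.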
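Nginx also supports multiple load-balancing groups; several upstream blocks can be configured to serve different servers.
Configuring load balancing is fairly simple, but the key question is how to share sessions across several servers.
Several approaches follow (collected from online sources; not all of them have been tried in practice).
1) Use cookies instead of sessions
If the program logic is not complicated, move everything stored in the session into cookies.
2) Let the application servers share sessions themselves
Sessions can be stored in a database or memcached, but this is not very efficient and is unsuitable where performance requirements are high.
3) ip_hash
The ip_hash feature in nginx directs requests from a given IP to the same backend, so a client at that IP and a backend can keep a stable session. ip_hash is defined in the upstream configuration: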
upstream backend {
server 127.0.0.1:8080 ;
server 127.0.0.1:9090 ;
ip_hash;
}
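ip_hash is easy to understand, but because the IP is the only factor used to choose a backend, it has shortcomings and cannot be used in some situations:
1. nginx is not the front-most server. ip_hash requires nginx to be the front-most server; otherwise nginx does not see the real client IP and cannot hash on it. For example, with squid in front, nginx only sees the squid server's IP address, and distributing traffic by that address will certainly go wrong.
2. There is other load balancing behind nginx. If another balancer behind nginx redistributes requests in some other way, a given client's requests can no longer be pinned to the same session-holding application server. By this reasoning, the nginx backends should point directly at application servers, or at another squid that in turn points at the application servers. The best approach is to split traffic once with location: send the requests that need sessions through ip_hash and the rest to other backends.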
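The pinning behaviour, and the first drawback, can be seen in a rough Python sketch. nginx documents that ip_hash keys on the first three octets of an IPv4 address; the CRC32 hash used here is only a stand-in for nginx's own hash function, and BACKENDS and pick_backend are illustrative names.

```python
import zlib

BACKENDS = ["127.0.0.1:8080", "127.0.0.1:9090"]


def pick_backend(client_ip: str, backends=BACKENDS) -> str:
    """Map an IPv4 client address to one backend, ip_hash style."""
    # Only the first three octets form the key, so a whole /24 shares a backend.
    key = ".".join(client_ip.split(".")[:3])
    # Assumption: CRC32 replaces nginx's internal hash for this illustration.
    return backends[zlib.crc32(key.encode()) % len(backends)]


print(pick_backend("203.0.113.7"))
print(pick_backend("203.0.113.200"))
```

If every request arrives from the same front proxy address, every client gets the same key, and so the same backend, which is the first drawback listed above.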

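client_max_body_size 300m; # maximum size in bytes of a single file a client may upload

client_body_buffer_size 128k; # buffer size for the client request body
client_body_temp_path /dev/shm/client_body_temp; # directory where request bodies are written to temporary files
proxy_connect_timeout 600; # timeout for connecting to the backend server
proxy_read_timeout 600; # how long to wait for the backend's response after connecting; the default is 60s
proxy_send_timeout 600; # timeout for sending the request to the backend server
proxy_buffer_size 16k; # buffer for the first part of the backend response (the headers); the default is 4k or 8k depending on the platform
proxy_buffers 4 32k; # number and size of the buffers used to read the backend response
proxy_busy_buffers_size 64k; # total size of the buffers that may be busy sending to the client
proxy_temp_file_write_size 64k; # amount of data written to a temporary file at a time
proxy_temp_path /dev/shm/proxy_temp; # directory for buffering large proxied responses

proxy_pass http://cluster/; # the URL to proxy to
proxy_redirect off; # proxy_redirect rewrites the Location and Refresh headers of backend responses; turn it on when they need adjusting, for example when another proxy sits behind nginx
proxy_set_header X-Real-IP $remote_addr; # redefines or adds a header on the request sent to the backend; the value may combine variables and text
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; # without this header the backend cannot obtain the client IP even if its log format includes $http_x_forwarded_for; once nginx adds X-Forwarded-For, the backend can log the client IP with $http_x_forwarded_for
proxy_set_header Host $host; # proxy_set_header adds the given header when the request is sent to the proxied backend web server; when the backend hosts several name-based virtual hosts, the Host header names the requested domain so the backend knows which virtual host should handle it
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503
http_504 http_404; # the error conditions on which the request is passed to the next server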

Example:

upstream cluster {
server 192.168.100.238:80 weight=8 max_fails=2 fail_timeout=30s;
server 192.168.100.248:80 weight=8 max_fails=2 fail_timeout=30s;
}

server {
listen 80;
server_name localhost;
location / {
root html;
index index.html index.htm;
...

nginx fastcgi and gzip modules
fastcgi module: lets nginx work with FastCGI
fastcgi_connect_timeout 200;
fastcgi_send_timeout 200;
fastcgi_read_timeout 200;
fastcgi_buffer_size 4k;
fastcgi_buffers 16 4k;
fastcgi_busy_buffers_size 8k;
fastcgi_max_temp_file_size 16k;
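fastcgi_intercept_errors on; # lets nginx handle the error responses returned by PHP
Note: the timeouts can be set larger, and so can the buffer sizes.
gzip module: compresses data for transfer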
gzip on;
gzip_min_length 1k;
gzip_buffers 8 8k;
gzip_comp_level 2;
gzip_types text/plain application/x-javascript text/css application/xml;

gzip_vary on;

Rewrite examples:

Because rewrite is relatively inefficient, it is usually replaced by a return statement, for example:

rewrite  (.*)  http://www.example.org$1;

becomes

return 301 https://www.example.org$request_uri;

Redirect syntax

server {
listen 80;
server_name start.igrow.cn;
index index.html index.php;
root html;
if ($http_host !~ "^star\.igrow\.cn$") {
rewrite ^(.*) http://star.igrow.cn$1 redirect;
}
}
Hotlink protection
location ~* \.(gif|jpg|swf)$ {
valid_referers none blocked start.igrow.cn sta.igrow.cn;
if ($invalid_referer) {
rewrite ^/ http://$host/logo.png;
}
}
Set expiry time by file type
location ~* \.(js|css|jpg|jpeg|gif|png|swf)$ {
if (-f $request_filename) {
expires 1h;
break;
}
}
Deny access to files of certain types under a directory
location ~* \.(txt|doc)$ {
root /data/www/wwwroot/linuxtone/test;
deny all;
}

Convert multi-level directories into parameters
abc.domain.com/sort/2 => abc.domain.com/index.php?act=sort&cid=abc&id=2
if ($host ~* (.*)\.domain\.com) {
set $sub_name $1;
rewrite ^/sort/(\d+)/?$ /index.php?act=sort&cid=$sub_name&id=$1 last;
}
Swap directory and parameter
/123456/xxxx -> /xxxx?id=123456
rewrite ^/(\d+)/(.+)/ /$2?id=$1 last;
For example, the following makes nginx redirect users of IE to the /nginx-ie directory:
if ($http_user_agent ~ MSIE) {
rewrite ^(.*)$ /nginx-ie/$1 break;
}
Add a trailing "/" to directories automatically
if (-d $request_filename){
rewrite ^/(.*)([^/])$ http://$host/$1$2/ permanent;
}
Deny access to .htaccess files
location ~ /\.ht {
deny all;
}
Deny several directories
location ~ ^/(cron|templates)/ {
deny all;
break;
}
Deny files whose path starts with /data
This also blocks requests such as .log.txt in nested directories under /data/;
location ~ ^/data {
deny all;
}
Deny a single directory
This does not block requests such as .log.txt
location /searchword/cron/ {
deny all;
}
Deny a single file
location ~ /data/sql/data\.sql {
deny all;
}
Set expiry times for favicon.ico and robots.txt;
Here favicon.ico gets 99 days and robots.txt 7 days, and 404 errors for them are not logged
location ~(favicon.ico) {
log_not_found off;
expires 99d;
break;
}
location ~(robots.txt) {
log_not_found off;
expires 7d;
break;
}
Set the expiry time for a specific file; here it is 600 seconds, and access is not logged
location ^~ /html/scripts/loadhead_1.js {
access_log off;
root /opt/lampp/htdocs/web;
expires 600;
break;
}
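File hotlink protection with an expiry time
return 412 here is a custom HTTP status code (the default would be 403), which makes the hotlinking requests easy to pick out
"rewrite ^/ http://leech.c1gstudio.com/leech.gif;" shows a hotlink-protection image
"access_log off;" turns off access logging to reduce load
"expires 3d" gives all files a 3-day browser cache
location ~* ^.+\.(jpg|jpeg|gif|png|swf|rar|zip|css|js)$ {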
valid_referers none blocked *.c1gstudio.com *.c1gstudio.net localhost 208.97.167.194;
if ($invalid_referer) {
rewrite ^/ http://leech.c1gstudio.com/leech.gif;
return 412;
break;
}
access_log off;
root /opt/lampp/htdocs/web;
expires 3d;
break;
}
Allow only fixed IPs to access the site, and add password protection
root /opt/htdocs/www;
allow 208.97.167.194;
allow 222.33.1.2;
allow 231.152.49.4;
deny all;
auth_basic "C1G_ADMIN";
auth_basic_user_file htpasswd;
Map files in multi-level directories to a single file name to improve SEO
/job-123-456-789.html points to /job/123/456/jobshow_789.html
rewrite ^/job-([0-9]+)-([0-9]+)-([0-9]+)\.html$ /job/$1/$2/jobshow_$3.html last;
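A rewrite pattern like this can be checked outside nginx before it is deployed. The sketch below uses Python's re module as a stand-in for the PCRE engine nginx uses; the two agree for a simple pattern like this one, and the dot before html is escaped so that it matches only a literal dot. rewrite_job is a hypothetical helper name.

```python
import re

PATTERN = r"^/job-([0-9]+)-([0-9]+)-([0-9]+)\.html$"
REPLACEMENT = r"/job/\1/\2/jobshow_\3.html"  # \1..\3 play the role of $1..$3


def rewrite_job(uri: str) -> str:
    """Apply the job rewrite; URIs that do not match come back unchanged."""
    return re.sub(PATTERN, REPLACEMENT, uri)


print(rewrite_job("/job-123-456-789.html"))
print(rewrite_job("/job-123.html"))
```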
Point a folder under the root at a second-level directory
For example, /shanghaijob/ points to /area/shanghai/
If you change last to permanent, the browser address bar shows /area/shanghai/
rewrite ^/([0-9a-z]+)job/(.*)$ /area/$1/$2 last;
The problem with the example above is that a request for /shanghaijob (without the trailing slash) does not match
rewrite ^/([0-9a-z]+)job$ /area/$1/ last;
rewrite ^/([0-9a-z]+)job/(.*)$ /area/$1/$2 last;
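Now /shanghaijob is reachable too, but relative links in the page no longer work:
./list_1.html, whose real address is /area/shanghai/list_1.html, becomes /list_1.html and cannot be reached.
Adding the automatic redirect does not help either:
(-d $request_filename) requires a real directory, which my rewritten path is not, so it has no effect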
if (-d $request_filename){
rewrite ^/(.*)([^/])$ http://$host/$1$2/ permanent;
}
Once the cause is known the fix is easy: redirect manually
rewrite ^/([0-9a-z]+)job$ /$1job/ permanent;
rewrite ^/([0-9a-z]+)job/(.*)$ /area/$1/$2 last;
Proxy to a backend when the file or directory does not exist:
if (!-e $request_filename) {
proxy_pass http://127.0.0.1;
}
Domain redirect
server
{
listen 80;
server_name jump.c1gstudio.com;
index index.html index.htm index.php;
root /opt/lampp/htdocs/www;
rewrite ^/ http://www.c1gstudio.com/;
access_log off;
}
Redirecting multiple domains
server_name www.c1gstudio.com www.c1gstudio.net;
index index.html index.htm index.php;
root /opt/lampp/htdocs;
if ($host ~ "c1gstudio.net") {
rewrite ^(.*) http://www.c1gstudio.com$1 permanent;
}
Third-level domain redirect
if ($http_host ~* "^(.*)\.i\.c1gstudio\.com$") {
rewrite ^(.*) http://top.yingjiesheng.com$1;
break;
}
Domain mirroring
server
{
listen 80;
server_name mirror.c1gstudio.com;
index index.html index.htm index.php;
root /opt/lampp/htdocs/www;
rewrite ^/(.*) http://www.c1gstudio.com/$1 last;
access_log off;
}
Mirroring a subdirectory
location ^~ /zhaopinhui {
rewrite ^.+ http://zph.c1gstudio.com/ last;
break;
}
Original article: https://www.cnblogs.com/kathy870513/p/3758647.html