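The only source code I could find online is http://sourceforge.net/projects/sqlxsswaf/?source=directory
Let's start reading!
I: The main function
The flow is clear:
1. The whole WAF main function is an infinite loop. Inside the while(1) block, once the code has processed the current log content, it sleeps for 10 seconds (sleep(10) takes seconds, not milliseconds) and then continues processing new content from get_pos onward.
2. The second while loop processes the log. Whenever a line contains GET or POST (strstr matches anywhere in the line, not only at its start), the request sent by the client is checked, and this continues to the end of the file. Control then reaches the end of the while loop from step 1.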
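#include <limits.h>   /* LINE_MAX */
#include <stdio.h>
#include <string.h>
#include <unistd.h>   /* sleep */

#define tailer "/var/log/apache2/access.log"
#define finder "GET"
#define finder2 "POST"

char *capture(char *log_line);

int main(void)
{
    fpos_t get_pos;
    int have_pos = 0;   /* get_pos is only valid after the first fgetpos() */

    printf("SQLXSSGRABBER HAS STARTED\n");
    while (1) {
        FILE *fp = fopen(tailer, "r");
        if (fp != NULL) {
            char max_line[LINE_MAX];

            /* The original called fsetpos() before checking fp and with an
               uninitialized get_pos on the first pass. */
            if (have_pos)
                fsetpos(fp, &get_pos);
            while (fgets(max_line, sizeof(max_line), fp) != NULL) {
                if (strstr(max_line, finder) || strstr(max_line, finder2)) {
                    if (fgetpos(fp, &get_pos) == 0)
                        have_pos = 1;
                    capture(max_line);
                }
            }
            fclose(fp);
        } else {
            perror(tailer);
        }
        sleep(10);   /* seconds */
    }
    return 0;
}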
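The polling loop can also be written with a plain byte offset instead of fpos_t. The following is only a minimal sketch under that assumption, not the project's code: scan_new_lines is a hypothetical helper that counts the new lines containing GET or POST rather than calling capture, and the rotation check (restart from 0 when the file has shrunk) is an addition the original does not have.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scan the lines appended to `path` since byte
 * `offset`, add the number of lines containing GET or POST to *hits,
 * and return the offset to resume from on the next pass. */
long scan_new_lines(const char *path, long offset, int *hits)
{
    char line[4096];
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return offset;

    /* If the file is now shorter than our offset (for example after
     * log rotation), start again from the beginning. */
    if (fseek(fp, 0L, SEEK_END) != 0 || ftell(fp) < offset)
        offset = 0;
    if (fseek(fp, offset, SEEK_SET) != 0) {
        fclose(fp);
        return offset;
    }

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* A last line without '\n' is still being written; leave it
         * for the next pass instead of scanning half a request. */
        if (strchr(line, '\n') == NULL && feof(fp))
            break;
        if (strstr(line, "GET") || strstr(line, "POST"))
            (*hits)++;
        offset = ftell(fp);
    }
    fclose(fp);
    return offset;
}
```

A caller would keep the returned offset between passes, for example `offset = scan_new_lines(tailer, offset, &hits);` followed by `sleep(10);`. Unlike the original, the offset advances after every complete line, so non-matching lines at the end of the log are not re-read on each pass.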
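II: The capture function
This is mainly a middle layer: it wraps the concrete regular expressions and the matching implementation in one function for main to call.
One novel touch is that every case relies on fall-through, with regex_roller choosing where execution starts. This is convenient and saves a lot of time commenting code in and out when changing the feature set: if an engine is not needed but you would rather not delete it, just move its case ahead of the case that regex_roller starts from.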
/* '%' needs no escaping in a C string literal; "\%" is an unknown escape sequence. */
#define att_1 "((%3C)|<)[^\n]+((%3E)|>)"
#define att_2 "((%3C)|<)((%2F)|\\/)*[a-z0-9%]+((%3E)|>)"
#define att_3 "((%3C)|<)((%69)|i|(%49))((%6D)|m|(%4D))((%67)|g|(%47))[^\n]+((%3E)|>)"
#define att_4 "((%27)|('))union"
#define att_5 "((%3D)|(=))[^\n]*((%27)|(\\')|(\\-\\-)|(%3B)|(;))"
#define att_6 "((%27)|('))"
char *capture(char *log_line)
{
    int regex_roller = 0;   /* index of the first rule to run */
    char *xss_para_regex = att_1;
    char *xss_simple_regex = att_2;
    char *css_img_regex = att_3;
    char *unionsql_regex = att_4;
    char *sqlmeta_regex = att_5;
    char *sqlmagicquote_regex = att_6;

    switch (regex_roller) {
    //add as many more as you wish but dont forget to #define the regex above.
    case 0: cap_matcher(log_line, xss_para_regex, 0);        /* fall through */
    case 1: cap_matcher(log_line, xss_simple_regex, 1);      /* fall through */
    case 2: cap_matcher(log_line, css_img_regex, 2);         /* fall through */
    case 3: cap_matcher(log_line, unionsql_regex, 3);        /* fall through */
    case 4: cap_matcher(log_line, sqlmeta_regex, 4);         /* fall through */
    case 5: cap_matcher(log_line, sqlmagicquote_regex, 5);   /* fall through */
    default: break;
    }
    return 0;
}
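To see in isolation what entering a fall-through switch at a chosen case does, here is a small sketch. rules_run_from is a hypothetical stand-in for capture: instead of calling cap_matcher it records, as a bitmask, which cases would run for a given start value.

```c
/* Hypothetical stand-in for capture(): returns a bitmask with bit N set
 * for every case N that executes when the switch is entered at `start`. */
unsigned rules_run_from(int start)
{
    unsigned ran = 0;

    switch (start) {
    case 0: ran |= 1u << 0; /* fall through */
    case 1: ran |= 1u << 1; /* fall through */
    case 2: ran |= 1u << 2; /* fall through */
    case 3: ran |= 1u << 3; /* fall through */
    default: break;
    }
    return ran;
}
```

Starting at 0 runs all four cases, starting at 2 runs only cases 2 and 3, and a start value with no matching case runs nothing. That is why a case placed before the start value is effectively disabled without being deleted.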
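The price of this convenience is that once one rule has flagged a line as an attack, the same line is still checked against all the remaining rules, which is unnecessary. In addition, the low-level function ends up doing too much: work that belongs in the middle-layer function has been pushed down into it.
III: The matching engine
It first compiles the regular expression and then runs the match. On a successful match it prints the attack type corresponding to the rule number passed in, blocks the IP, and then notifies the administrator by email.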
/* Needs <pcre.h>, <stdio.h>, <string.h> and <unistd.h>; OVECCOUNT is defined elsewhere in the project. */
char *cap_matcher(char *log_line, char *regex, int attack_type)
{
    pid_t pid;
    pcre *attack_regex;
    const char *error;
    int erroffset;
    int ovector[OVECCOUNT];
    int rc;

    attack_regex = pcre_compile(regex, 0, &error, &erroffset, NULL);
    if (!attack_regex) {
        fprintf(stderr, "PCRE compilation failed at expression offset %d: %s\n", erroffset, error);
        return (char *)1;
    }

    rc = pcre_exec(attack_regex, NULL, log_line, strlen(log_line), 0, 0, ovector, OVECCOUNT);
    pcre_free(attack_regex);   /* the original never freed the compiled pattern */
    if (rc < 0) {
        /* Negative means no match or an error. The original tested rc < 3,
           which only works because every pattern here sets two or more groups. */
        return (char *)1;
    }

    switch (attack_type) {
    case 0: printf("Paranoid Xss Filter Detection\n"); iptables_blockage(log_line); break;
    case 1: printf("Simple Xss Filter Detection\n"); iptables_blockage(log_line); break;
    case 2: printf("Xss Img Filter Detection\n"); iptables_blockage(log_line); break;
    case 3: printf("Sql Injection Union Filter Detection\n"); iptables_blockage(log_line); break;
    case 4: printf("Sql Injection meta characters Filter Detection\n"); iptables_blockage(log_line); break;
    case 5: printf("Sql Injection magic quote Filter Detection\n"); iptables_blockage(log_line); break;
    default: break;
    }

    /* Send the mail from a child process. The parent never waits for it,
       so each child remains a zombie after it exits. */
    pid = fork();
    if (pid == 0) {
        FILE *emails = popen("/usr/bin/mail -s 'WebAttack On server' root@localhost", "w");
        if (emails != NULL) {
            fprintf(emails, "Attack FOUND %s ! in the log file.\n", log_line);
            pclose(emails);
        }
        _exit(0);
    }
    return 0;
}
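Personally I find the author's logic here a bit muddled. After a successful match, cap_matcher should simply stop and return to the middle-layer capture function, which then handles the result, with a macro in capture replacing the repeated code.
That makes the code much shorter and the logic clearer :)
/* Assumes cap_matcher is changed to return nonzero on a match and to do nothing else. */
#define R(re, way, info)                      \
    if (cap_matcher(log_line, re, way)) {     \
        printf("%s\n", info);                 \
        iptables_blockage(log_line);          \
        break;                                \
    }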
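The same "stop at the first match" idea can also be expressed with a rule table instead of a macro. The sketch below is one possible shape, not the project's code. It swaps in POSIX regex (regcomp/regexec) for PCRE so that it builds without extra libraries, the patterns are simplified stand-ins for att_3, att_4 and att_2, REG_ICASE is my own choice (the original matches case-sensitively), and struct rule, rules and first_matching_rule are hypothetical names.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical rule table. Names reuse the messages printed by
 * cap_matcher; patterns are simplified POSIX ERE stand-ins. */
struct rule {
    const char *name;
    const char *pattern;
};

static const struct rule rules[] = {
    { "Xss Img Filter Detection",
      "(%3C|<)(%69|i|%49)(%6D|m|%4D)(%67|g|%47)[^\n]+(%3E|>)" },
    { "Sql Injection Union Filter Detection",
      "(%27|')union" },
    { "Simple Xss Filter Detection",
      "(%3C|<)(%2F|/)*[a-z0-9%]+(%3E|>)" },
};

/* Return the name of the first rule that matches log_line,
 * or NULL if no rule matches. Later rules are not tried. */
const char *first_matching_rule(const char *log_line)
{
    size_t i;

    for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
        regex_t re;
        int rc;

        /* Skip a rule whose pattern fails to compile. */
        if (regcomp(&re, rules[i].pattern,
                    REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
            continue;
        rc = regexec(&re, log_line, 0, NULL, 0);
        regfree(&re);
        if (rc == 0)
            return rules[i].name;
    }
    return NULL;
}
```

With this shape, capture would only need to print the returned name and call iptables_blockage once when the result is not NULL. Compiling each pattern on every call keeps the sketch short; a real version would probably compile the table once at startup.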
IV: Blocking the IP with the firewall
Lines in the Apache log file have the following format:
127.0.0.1 - - [23/Sep/2011:15:27:36 +0800] "GET / HTTP/1.1" 200 44
So the code takes everything up to the first space as the IP address and then calls iptables to add a rule that blocks it.
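/* Needs <ctype.h> and <stdio.h>. */
void iptables_blockage(char *log_line)
{
    char ip_address[100];
    char command[1000];
    size_t i;

    /* Copy up to the first whitespace, leaving room for the terminator.
       The original looped to i <= 100, one byte past its 100-byte buffer. */
    for (i = 0; i < sizeof(ip_address) - 1 && log_line[i] != '\0'; i++) {
        if (isspace((unsigned char)log_line[i]))
            break;
        ip_address[i] = log_line[i];
    }
    ip_address[i] = '\0';   /* the original never terminated the string */

    snprintf(command, sizeof(command), "/sbin/iptables -A INPUT -s %s -j DROP", ip_address);
    FILE *iptables_run = popen(command, "r");
    if (iptables_run != NULL)
        pclose(iptables_run);
}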
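Because the extracted field is pasted into a shell command, it seems worth checking that it really is an address before calling popen. The following is a minimal sketch of that check, assuming the first field is always a dotted-quad IPv4 address (it would reject hostnames and IPv6 addresses). extract_ipv4 is a hypothetical helper, not part of the project.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the first whitespace-delimited field of an
 * access-log line into `out` and accept it only if it parses as an
 * IPv4 address. Returns 1 on success, 0 otherwise. */
int extract_ipv4(const char *log_line, char *out, size_t out_size)
{
    struct in_addr addr;
    size_t len = strcspn(log_line, " \t\r\n");

    /* Reject an empty field or one that does not fit in `out`. */
    if (len == 0 || len >= out_size)
        return 0;
    memcpy(out, log_line, len);
    out[len] = '\0';

    /* inet_pton() returns 1 only for a well-formed IPv4 address. */
    return inet_pton(AF_INET, out, &addr) == 1;
}
```

iptables_blockage could call this first and skip the iptables command entirely when it returns 0, so that a field such as `1.2.3.4;reboot` never reaches the shell.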
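The email notification at the end is implemented in much the same way as the iptables call. For details, see the previous post, "Sending mail from C on Linux by calling a command".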