Analyzing the Blog's Access Logs from the Past Two Days

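When I opened the blog today it was very slow at first; the page still had not loaded after a long wait, and eventually the server simply returned "Service Unavailable", meaning that too many system resources were being used. I asked customer support, who said the program was using too many resources. But hadn't it always run fine before? Was somebody scraping the site? I had hoped support would pass this on to the technical staff so that scraping programs could be reasonably limited on the server, but support insisted that the blog program itself was the problem. Sigh. So I downloaded the whole blog to my local machine and searched online for website testing tools, hoping to find some evidence that the blog program was not at fault. I did not find a tool, but then I remembered that the site keeps access logs, so I downloaded the logs for the past two days to take a look.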

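Until now I had only known about logs as a concept and had never actually looked at one. Today I did, and although it is tedious (there is a huge amount of data, with each day's log file running to several MB), it is also quite interesting: besides ordinary visitors, the site also has to serve the spiders of all the major search engines: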

●iAsk
2007-01-11 01:25:36 W3SVC6383714 58.215.65.159 GET /article.asp id=311 80 - 60.28.164.26
Mozilla/5.0+(compatible;+iaskspider/1.0;+MSIE+6.0) 200 0 64
2007-01-11 01:25:36 W3SVC6383714 58.215.65.159 GET /default.asp id=498 80 - 65.55.209.63
msnbot/1.0+(+http://search.msn.com/msnbot.htm) 302 0 64
2007-01-11 01:25:37 W3SVC6383714 58.215.65.159 GET /article.asp id=308 80 - 60.28.164.26
Mozilla/5.0+(compatible;+iaskspider/1.0;+MSIE+6.0) 200 0 64
2007-01-11 01:26:13 W3SVC6383714 58.215.65.159 GET /article.asp id=316 80 - 60.28.164.26
Mozilla/5.0+(compatible;+iaskspider/1.0;+MSIE+6.0) 200 0 64
2007-01-11 01:26:13 W3SVC6383714 58.215.65.159 GET /article.asp id=324 80 - 60.28.164.26
Mozilla/5.0+(compatible;+iaskspider/1.0;+MSIE+6.0) 200 0 64

●Baidu
2007-01-11 02:42:26 W3SVC6383714 58.215.65.159 HEAD /article.asp id=546 80 - 202.108.22.143
Baiduspider+(+http://www.baidu.com/search/spider.htm) 200 0 64
2007-01-11 02:43:48 W3SVC6383714 58.215.65.159 HEAD /pic/20060810/008.gif - 80 - 61.135.162.21
Mozilla/4.0+(compatible;+MSIE+5.0;+Windows+98;+DigExt) 200 0 0
2007-01-11 02:43:51 W3SVC6383714 58.215.65.159 GET /pic/20060810/008.gif - 80 - 61.135.162.21
Mozilla/4.0+(compatible;+MSIE+5.0;+Windows+98;+DigExt) 200 0 0
2007-01-11 02:47:52 W3SVC6383714 58.215.65.159 HEAD /default.asp log_Year=2006&page=37 80 - 202.108.22.143
Baiduspider+(+http://www.baidu.com/search/spider.htm) 200 0 64
2007-01-11 02:49:45 W3SVC6383714 58.215.65.159 GET /feed.asp cateID=6 80 - 202.108.22.75
Baiduspider+(+http://www.baidu.com/search/spider.htm) 200 0 64

●MSN
2007-01-11 03:54:08 W3SVC6383714 58.215.65.159 GET /article.asp id=565 80 - 65.55.209.73
msnbot/1.0+(+http://search.msn.com/msnbot.htm) 404 0 2
2007-01-11 03:55:00 W3SVC6383714 58.215.65.159 GET /default.asp id=469|-|0|404_Not_Found 80 - 65.55.209.71
msnbot/1.0+(+http://search.msn.com/msnbot.htm) 404 0 64
2007-01-11 03:55:03 W3SVC6383714 58.215.65.159 GET /article.asp id=525 80 - 65.55.209.73
msnbot/1.0+(+http://search.msn.com/msnbot.htm) 404 0 2

●Google
2007-01-11 05:20:53 W3SVC6383714 58.215.65.159 GET /feed.asp - 80 - 66.249.66.212
Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) 404 0 2
2007-01-11 05:20:55 W3SVC6383714 58.215.65.159 GET /member.asp - 80 - 66.249.66.196
Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) 404 0 2
2007-01-11 05:21:10 W3SVC6383714 58.215.65.159 GET /bloglink.asp - 80 - 66.249.66.210
Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) 404 0 2

●Yahoo
2007-01-11 05:29:31 W3SVC6383714 58.215.65.159 GET /robots.txt - 80 - 72.30.226.206
Mozilla/5.0+(compatible;+Yahoo!+Slurp;+http://help.yahoo.com/help/us/ysearch/slurp) 404 0 2
2007-01-11 05:29:31 W3SVC6383714 58.215.65.159 GET /article.asp id=451 80 - 74.6.72.175
Mozilla/5.0+(compatible;+Yahoo!+Slurp;+http://help.yahoo.com/help/us/ysearch/slurp) 404 0 2
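
Reading entries like these by eye gets tiring quickly, so here is a minimal Python sketch for tallying requests per spider. It assumes each entry sits on a single line with the 14 space-separated fields seen in this log (date, time, site name, server IP, method, URI stem, URI query, port, user name, client IP, user agent, status, substatus, Win32 status). The field names and the `CRAWLERS` table are my own labels, and the substrings are taken only from the user agents shown above.

```python
from collections import Counter

# Field order assumed from the entries in this post (IIS W3C extended format).
FIELDS = [
    "date", "time", "s_sitename", "s_ip", "cs_method", "cs_uri_stem",
    "cs_uri_query", "s_port", "cs_username", "c_ip", "cs_user_agent",
    "sc_status", "sc_substatus", "sc_win32_status",
]

# Substrings copied from the user-agent strings in the excerpts above.
CRAWLERS = {
    "iaskspider": "iAsk",
    "Baiduspider": "Baidu",
    "msnbot": "MSN",
    "Googlebot": "Google",
    "Yahoo!+Slurp": "Yahoo",
}


def parse_line(line):
    """Return a dict for one log entry, or None for header or malformed lines."""
    if line.startswith("#"):
        return None
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def crawler_name(user_agent):
    """Map a user-agent string to a spider name, or "other" if none matches."""
    for needle, name in CRAWLERS.items():
        if needle in user_agent:
            return name
    return "other"


def count_by_crawler(lines):
    """Count parsed entries per spider name."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            counts[crawler_name(entry["cs_user_agent"])] += 1
    return counts
```

Calling `count_by_crawler(open(path, encoding="utf-8", errors="replace"))` on a day's log file (with `path` being wherever the log was saved) would give a rough picture of which spider is the busiest. Requests that a spider sends under a browser-like user agent end up under "other".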

Beyond that, some people with ulterior motives were scanning the site:

2007-01-10 00:06:41 W3SVC6383714 58.215.65.159 GET /diy.asp - 80 - 219.155.136.111 Mozilla/3.0+(compatible;+Indy+Library) 404 0 2
2007-01-10 00:06:41 W3SVC6383714 58.215.65.159 GET /bbs/diy.asp - 80 - 219.155.136.111 Mozilla/3.0+(compatible;+Indy+Library) 404 0 3
2007-01-10 00:06:41 W3SVC6383714 58.215.65.159 GET /ASPAdmin_A.asp - 80 - 219.155.136.111 Mozilla/3.0+(compatible;+Indy+Library) 404 0 2
2007-01-10 00:06:41 W3SVC6383714 58.215.65.159 GET /ASPAdmin.asp - 80 - 219.155.136.111 Mozilla/3.0+(compatible;+Indy+Library) 404 0 2
2007-01-10 00:06:50 W3SVC6383714 58.215.65.159 GET /bbs/digshell0.asp - 80 - 219.155.136.111 Mozilla/3.0+(compatible;+Indy+Library) 404 0 3
2007-01-10 00:06:50 W3SVC6383714 58.215.65.159 GET /digshell2.asp - 80 - 219.155.136.111 Mozilla/3.0+(compatible;+Indy+Library) 404 0 2
2007-01-10 00:06:50 W3SVC6383714 58.215.65.159 GET /bbs/digshell2.asp - 80 - 219.155.136.111 Mozilla/3.0+(compatible;+Indy+Library) 404 0 3
2007-01-10 00:06:50 W3SVC6383714 58.215.65.159 GET /tmdqq.asp - 80 - 219.155.136.111 Mozilla/3.0+(compatible;+Indy+Library) 404 0 2
2007-01-10 00:07:00 W3SVC6383714 58.215.65.159 GET /qq.asp - 80 - 219.155.136.111 Mozilla/3.0+(compatible;+Indy+Library) 404 0 2
2007-01-10 00:07:00 W3SVC6383714 58.215.65.159 GET /myup.asp - 80 - 219.155.136.111 Mozilla/3.0+(compatible;+Indy+Library) 404 0 2
2007-01-10 00:07:00 W3SVC6383714 58.215.65.159 GET /bbs/myup.asp - 80 - 219.155.136.111 Mozilla/3.0+(compatible;+Indy+Library) 404 0 3
2007-01-10 00:07:00 W3SVC6383714 58.215.65.159 GET /cmd.asp - 80 - 219.155.136.111 Mozilla/3.0+(compatible;+Indy+Library) 404 0 2
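
A scan like this has a recognizable shape: one client IP requesting many different paths that all return 404. The following is a rough Python sketch for pulling such clients out of a log, assuming one 14-field entry per line as above. The function name and the `min_paths` threshold are hypothetical and would need tuning against real traffic.

```python
from collections import defaultdict


def find_scanners(lines, min_paths=5):
    """Return {client IP: set of distinct 404 paths} for IPs with at least min_paths of them."""
    missing = defaultdict(set)
    for line in lines:
        parts = line.split()
        # Skip header lines and anything that is not a 14-field entry.
        if line.startswith("#") or len(parts) != 14:
            continue
        path, client_ip, status = parts[5], parts[9], parts[11]
        if status == "404":
            missing[client_ip].add(path)
    return {ip: paths for ip, paths in missing.items() if len(paths) >= min_paths}
```

A spider that follows stale links also collects 404s, so the result is a list of candidates to inspect rather than proof of an attack.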

And, of course, injection attempts:

2007-01-11 00:20:17 W3SVC6383714 58.215.65.159 GET /article.asp id=434%20and%20char(124)%2Buser%2Bchar(124)=0 80 - 222.140.96.30 Internet+Explorer+6.0 302 0 0
2007-01-11 00:20:17 W3SVC6383714 58.215.65.159 GET /article.asp id=434'%20and%20char(124)%2Buser%2Bchar(124)=0%20and%20'%25'=' 80 - 222.140.96.30 Internet+Explorer+6.0 302 0 0
2007-01-11 00:20:17 W3SVC6383714 58.215.65.159 GET /article.asp id=434%20%61%6E%64%20%31%3D%31 80 - 222.140.96.30 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0) 302 0 0
2007-01-11 00:20:17 W3SVC6383714 58.215.65.159 GET /article.asp id=434%20%61%6E%64%20%31%3D%32 80 - 222.140.96.30 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0) 302 0 0
2007-01-11 00:20:18 W3SVC6383714 58.215.65.159 GET /article.asp id=434'%20and%201=1%20and%20''=' 80 - 222.140.96.30 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0) 302 0 0
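
Probes like these are partly percent-encoded (`%61%6E%64` is just `and`), so they are easier to spot after decoding. Below is a small Python sketch that decodes the query field and flags the patterns visible in these entries: a stray single quote, an `and N=N` test, or a `char(` call. The regular expression is my own heuristic built from this sample only, not a complete injection detector, and it will also flag harmless queries that happen to contain an apostrophe.

```python
import re
from urllib.parse import unquote

# Heuristic patterns taken from the probes above: a single quote,
# "and <number>=<number>", or a call to char(...).
SUSPICIOUS = re.compile(r"'|\band[\s+]+\d+\s*=\s*\d+|\bchar\s*\(", re.IGNORECASE)


def looks_like_injection(query):
    """Return True if the percent-decoded query string matches a probe pattern."""
    return bool(SUSPICIOUS.search(unquote(query)))


def injection_attempts(lines):
    """Return (client IP, decoded query) pairs for entries that look like probes."""
    hits = []
    for line in lines:
        parts = line.split()
        # Assumes one 14-field entry per line; field 7 is the query, field 10 the client IP.
        if len(parts) == 14 and looks_like_injection(parts[6]):
            hits.append((parts[9], unquote(parts[6])))
    return hits
```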

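I also noticed something interesting. To stop some people from sending trackbacks to the blog, I had earlier renamed trackback.asp and then created a new trackback.asp file whose only job is to pass the incoming parameter on to article.asp. That produced log entries like these: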

2007-01-11 00:02:14 W3SVC6383714 58.215.65.159 POST /trackback.asp tbid=248 80 - 210.51.190.123
Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322) 302 0 0
2007-01-11 00:02:15 W3SVC6383714 58.215.65.159 GET /article.asp id=248 80 - 210.51.190.123
Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322) 200 0 0
2007-01-11 00:02:15 W3SVC6383714 58.215.65.159 POST /trackback.asp tbid=248 80 - 210.51.190.123
Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322) 302 0 0
2007-01-11 00:02:15 W3SVC6383714 58.215.65.159 GET /article.asp id=248 80 - 210.51.190.123
Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322) 200 0 0
2007-01-11 00:02:15 W3SVC6383714 58.215.65.159 POST /trackback.asp tbid=248 80 - 210.51.190.123
Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322) 302 0 0
2007-01-11 00:02:16 W3SVC6383714 58.215.65.159 GET /article.asp id=248 80 - 210.51.190.123
Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322) 200 0 0

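Finally, today's access log contains a great many entries like these (the Chinese text in the query field is the message for VBScript error 800a0414, "Cannot use parentheses when calling a Sub"):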

2007-01-11 07:04:41 W3SVC6383714 58.215.65.159 GET /default.asp cateID=4|10|800a0414|调用子程序时不能使用括号 80 - 60.28.164.26 Mozilla/5.0+(compatible;+iaskspider/1.0;+MSIE+6.0) 500 0 1236
2007-01-11 07:04:42 W3SVC6383714 58.215.65.159 GET /default.asp cateID=5|10|800a0414|调用子程序时不能使用括号 80 - 60.28.164.26 Mozilla/5.0+(compatible;+iaskspider/1.0;+MSIE+6.0) 500 0 0
2007-01-11 07:04:56 W3SVC6383714 58.215.65.159 GET /default.asp cateID=6|10|800a0414|调用子程序时不能使用括号 80 - 60.28.164.26 Mozilla/5.0+(compatible;+iaskspider/1.0;+MSIE+6.0) 500 0 0
2007-01-11 07:05:15 W3SVC6383714 58.215.65.159 GET /default.asp cateID=7|10|800a0414|调用子程序时不能使用括号 80 - 60.28.164.26 Mozilla/5.0+(compatible;+iaskspider/1.0;+MSIE+6.0) 500 0 0
2007-01-11 07:05:31 W3SVC6383714 58.215.65.159 GET /default.asp cateID=8|10|800a0414|调用子程序时不能使用括号 80 - 60.28.164.26 Mozilla/5.0+(compatible;+iaskspider/1.0;+MSIE+6.0) 500 0 0
2007-01-11 07:05:46 W3SVC6383714 58.215.65.159 GET /default.asp cateID=9|10|800a0414|调用子程序时不能使用括号 80 - 60.28.164.26 Mozilla/5.0+(compatible;+iaskspider/1.0;+MSIE+6.0) 500 0 0
2007-01-11 07:05:58 W3SVC6383714 58.215.65.159 GET /default.asp cateID=10|10|800a0414|调用子程序时不能使用括号 80 - 60.28.164.26 Mozilla/5.0+(compatible;+iaskspider/1.0;+MSIE+6.0) 500 0 0
2007-01-11 07:06:13 W3SVC6383714 58.215.65.159 GET /default.asp cateID=12|10|800a0414|调用子程序时不能使用括号 80 - 60.28.164.26 Mozilla/5.0+(compatible;+iaskspider/1.0;+MSIE+6.0) 500 0 0
2007-01-11 07:06:28 W3SVC6383714 58.215.65.159 GET /default.asp cateID=2|10|800a0414|调用子程序时不能使用括号 80 - 60.28.164.26 Mozilla/5.0+(compatible;+iaskspider/1.0;+MSIE+6.0) 500 0 0

My guess is that iask is the main culprit behind the excessive system resource usage.
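
One way to check a guess like this against the whole log, instead of a handful of lines, is to count the 5xx responses per client and user agent. The Python sketch below does that under the same assumption of one 14-field entry per line; the function name is hypothetical. It only shows who triggers the server errors, not how much CPU or memory each request actually cost.

```python
from collections import Counter


def server_errors_by_client(lines):
    """Count 5xx responses per (client IP, user agent) pair."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        # Skip header lines and anything that is not a 14-field entry.
        if line.startswith("#") or len(parts) != 14:
            continue
        client_ip, user_agent, status = parts[9], parts[10], parts[11]
        if status.startswith("5"):
            counts[(client_ip, user_agent)] += 1
    return counts
```

`server_errors_by_client(lines).most_common(5)` would list the five clients responsible for the most server errors.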

Source: original to this site
Admin [2007-04-10 08:09 PM | 125.77.48.156]
#6
Then take a look at the second half of this article: http://www.mzwu.com/article.asp?id=492
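lookinto [2007-04-10 07:44 PM | 125.78.4.218]
#5
My worry now is that the spiders will decide your site has a problem and stop visiting; after all, 80-90% of a site's traffic comes from search engines.

The question is how to block scraping without affecting spider visits.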
Admin [2007-04-10 07:16 PM | 125.77.48.156]
#4
If a spider visits too often, all we can do is block it. For how, see: http://www.mzwu.com/article.asp?id=775
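Admin [2007-04-10 07:13 PM | 125.77.48.156]
#3
Anti-scraping measures are mainly aimed at people. They have no effect at all on spiders, because a spider does not inspect the content it fetches to decide whether to keep crawling.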
lookinto [2007-04-10 05:24 PM | 125.78.4.218]
#2
I raised some questions in the anti-scraping thread you started on the PJblog forum,
http://bbs.pjhome.net/thread-13762-1-1.html
please take a look.
lookinto [2007-04-09 05:56 PM | 125.78.4.91]
#1
I have started using it, but I don't know exactly how it will affect search engines.