spider_demo

Category: Windows programming
Development tool: C#
File size: 31 KB
Downloads: 500
Upload date: 2007-03-05 09:59:26
Uploader: oop66
Description: A spider demo written in C#. It mainly implements multithreaded web page fetching and the extraction of URLs from page content.
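
As a rough illustration of the URL-extraction step named in the description, here is a minimal Python sketch. It is not the demo's C# code: the regular expression, the function name `extract_urls`, and the sample page are assumptions made for this example, and the demo's own URL regex is not reproduced here.

```python
import re
from urllib.parse import urljoin, urlparse

# Hypothetical pattern: captures the value of href="..." or href='...',
# stopping at a quote, whitespace, or a '#' fragment marker.
HREF_RE = re.compile(r"""href\s*=\s*["']([^"'#\s]+)""", re.IGNORECASE)


def extract_urls(html, base_url):
    """Return absolute http(s) URLs found in html, in order, without duplicates."""
    seen = set()
    urls = []
    for raw in HREF_RE.findall(html):
        absolute = urljoin(base_url, raw)  # resolve relative links against the page URL
        parts = urlparse(absolute)
        # Basic validity check: keep only http/https URLs that have a host.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        if absolute not in seen:
            seen.add(absolute)
            urls.append(absolute)
    return urls


if __name__ == "__main__":
    page = (
        '<a href="/a.html">A</a> '
        "<a href='http://example.com/b'>B</a> "
        '<a href="mailto:x@example.com">M</a>'
    )
    print(extract_urls(page, "http://example.com/index.html"))
```

Resolving each link against the page URL before the validity check means relative links are kept, while non-web schemes such as `mailto:` are dropped.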

File list:
spider_demo (0, 2006-08-13)
spider_demo\spider_demo.sln (922, 2006-08-10)
spider_demo\spider_demo.suo (31744, 2006-11-02)
spider_demo\spider_demo (0, 2006-08-13)
spider_demo\spider_demo\base64.cs (11108, 2006-08-12)
spider_demo\spider_demo\workThread.cs (4040, 2006-09-03)
spider_demo\spider_demo\spider_demo.cs (1850, 2006-08-14)
spider_demo\spider_demo\spider.cs (7841, 2006-09-03)
spider_demo\spider_demo\spider_demo.csproj (3600, 2006-09-01)
spider_demo\spider_demo\spider_demo.Designer.cs (2679, 2006-08-14)
spider_demo\spider_demo\Listener.cs (459, 2006-08-14)
spider_demo\spider_demo\spider_demo.resx (5814, 2006-08-14)
spider_demo\spider_demo\bin (0, 2006-08-13)
spider_demo\spider_demo\bin\Release (0, 2006-08-13)
spider_demo\spider_demo\bin\Release\WebRegex.dll (12288, 2006-08-12)
spider_demo\spider_demo\bin\Release\html (0, 2006-08-14)
spider_demo\spider_demo\bin\Release\yy.txt (101, 2006-08-14)
spider_demo\spider_demo\bin\Debug (0, 2006-08-13)
spider_demo\spider_demo\bin\Debug\WebRegex.dll (12288, 2006-08-12)
spider_demo\spider_demo\bin\Debug\yy.txt (101, 2006-08-14)
spider_demo\spider_demo\bin\Debug\html (0, 2006-09-11)
spider_demo\spider_demo\obj (0, 2006-08-13)
spider_demo\spider_demo\obj\spider_demo.csproj.FileList.txt (620, 2006-11-02)
spider_demo\spider_demo\obj\Release (0, 2006-08-13)
spider_demo\spider_demo\obj\Release\TempPE (0, 2006-08-13)
spider_demo\spider_demo\obj\Release\TempPE\Properties.Resources.Designer.cs.dll (4608, 2006-08-12)
spider_demo\spider_demo\obj\Release\Refactor (0, 2006-08-14)
spider_demo\spider_demo\obj\Debug (0, 2006-08-13)
spider_demo\spider_demo\Properties (0, 2006-08-13)
spider_demo\spider_demo\Properties\AssemblyInfo.cs (1170, 2006-08-10)
spider_demo\spider_demo\Properties\Resources.resx (5612, 2006-08-10)
spider_demo\spider_demo\Properties\Resources.Designer.cs (2844, 2006-09-22)
spider_demo\spider_demo\Properties\Settings.settings (249, 2006-08-10)
spider_demo\spider_demo\Properties\Settings.Designer.cs (1107, 2006-09-22)
spider_demo\spider_demo\HtmlAnalyzer.cs (3446, 2006-08-13)
spider_demo\spider_demo\GetHtml.cs (4055, 2006-09-13)
spider_demo\spider_demo\Program.cs (476, 2006-08-14)
spider_demo\spider_demo\CRC.cs (5698, 2006-09-01)

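//====================================================================
// Copyright (c) 2006.8.10-2006.8.13 King (yy8354@tom.com) All rights reserved.
// QQ:5088300
// This program is meant to demonstrate how a spider works, so every part of it only aims to get the job done. Nothing is optimized, and it is not suitable for commercial use.
// Apart from MyRegexNamespace, this program uses no other components. That component was compiled with The Regulator 2.0, and all it contains is a regular expression for extracting URLs.
// The demo has only been tested on Windows 2003 Enterprise Edition; the development environment is VS.NET 2005.
// The URL validity check uses classes or functions that are only supported from .NET 2.0 onward, so some code must be modified to run on .NET 1.1.
// yy.txt in the program's run directory holds the initial URLs to crawl, one URL per line.
// more.txt, generated in the program's run directory, is the work log; it records each crawled URL and the file name its page was saved under.
// The \html directory under the program's run directory is where crawled pages are saved.
// Anyone is welcome to modify this program in any way, but please keep this notice.
//====================================================================
// For the design ideas behind the program, see:
// The author's Google group on search topics
// http://groups.google.com/group/sosou?lnk=oa&hl=zh-CN
// The author's blog
// http://blog.sina.com.cn/u/1249533702
//====================================================================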
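
The file layout described in the header (seed URLs in yy.txt, a work log in more.txt, pages saved under html) can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the demo's C# code: the function names, the numbered file-naming scheme, the tab-separated log format, and the worker count are all hypothetical, and the fetch function is passed in by the caller so the sketch runs without network access.

```python
import os
import queue
import threading


def load_seeds(path):
    """Read seed URLs, one per line, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def crawl(seeds, fetch, out_dir, log_path, workers=4):
    """Fetch each seed URL on a worker thread, save the page, and log it.

    fetch(url) -> str is supplied by the caller (an assumption of this sketch).
    Returns the number of pages saved.
    """
    os.makedirs(out_dir, exist_ok=True)
    todo = queue.Queue()
    for url in seeds:
        todo.put(url)
    lock = threading.Lock()  # guards the page counter and the log file
    saved = [0]

    def worker():
        while True:
            try:
                url = todo.get_nowait()
            except queue.Empty:
                return  # no work left for this thread
            try:
                html = fetch(url)
            except Exception:
                continue  # skip pages that fail to download
            with lock:
                saved[0] += 1
                name = "%d.html" % saved[0]  # hypothetical naming scheme
                with open(os.path.join(out_dir, name), "w", encoding="utf-8") as page:
                    page.write(html)
                with open(log_path, "a", encoding="utf-8") as log:
                    log.write("%s\t%s\n" % (url, name))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return saved[0]
```

A call such as `crawl(load_seeds("yy.txt"), fetch, "html", "more.txt")` mirrors the layout in the header, where `fetch` could wrap `urllib.request.urlopen`. Feeding URLs extracted from each page back into the queue is left out of this sketch.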
