spider_demo

Category: Search engine
Development tool: C#
File size: 85 KB
Downloads: 221
Upload date: 2009-03-11 01:16:43
Uploader: bloodxia
Description: A multi-threaded web crawler written in C#. It uses a thread pool to manage its worker threads and performs reasonably well.
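
The description says the crawler hands its work to a thread pool. The archive's own spider.cs and workThread.cs are not reproduced on this page, so the following is only a minimal sketch of that general technique using the standard .NET ThreadPool. The names PoolSketch and ProcessAll are hypothetical, and the code assumes a newer C# compiler than the VS.NET 2005 toolchain the demo was built with.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch; none of these names come from the spider_demo sources.
public static class PoolSketch
{
    // Queues one work item per URL on the .NET ThreadPool and blocks
    // until every item has finished. Results arrive in no fixed order.
    public static List<string> ProcessAll(IEnumerable<string> urls, Func<string, string> work)
    {
        var results = new List<string>();
        var pending = new List<string>(urls);
        if (pending.Count == 0) return results;

        int remaining = pending.Count;
        using (var done = new ManualResetEvent(false))
        {
            foreach (string url in pending)
            {
                string current = url; // per-iteration copy captured by the closure
                ThreadPool.QueueUserWorkItem(delegate
                {
                    try
                    {
                        string r = work(current);
                        lock (results) { results.Add(r); }
                    }
                    catch (Exception)
                    {
                        // A real crawler would log the failed URL here.
                    }
                    finally
                    {
                        // The last item to finish releases the waiting caller.
                        if (Interlocked.Decrement(ref remaining) == 0) done.Set();
                    }
                });
            }
            done.WaitOne();
        }
        return results;
    }
}
```

In a crawler, the `work` delegate would download one page and return whatever the caller wants to record for it, for example the name of the saved file.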

File list:
_UpgradeReport_Files\UpgradeReport.css (3348, 2009-03-02)
_UpgradeReport_Files\UpgradeReport.xslt (12579, 2007-06-27)
_UpgradeReport_Files\UpgradeReport_Minus.gif (69, 2009-03-02)
_UpgradeReport_Files\UpgradeReport_Plus.gif (71, 2009-03-02)
spider_demo\base64.cs (11108, 2006-08-12)
spider_demo\CRC.cs (5698, 2006-09-01)
spider_demo\GetHtml.cs (4055, 2006-09-13)
spider_demo\HtmlAnalyzer.cs (3446, 2006-08-13)
spider_demo\Listener.cs (459, 2006-08-14)
spider_demo\Program.cs (476, 2006-08-14)
spider_demo\spider.cs (7841, 2006-09-03)
spider_demo\spider_demo.cs (1850, 2006-08-14)
spider_demo\spider_demo.csproj (3771, 2009-03-02)
spider_demo\spider_demo.Designer.cs (2679, 2006-08-14)
spider_demo\spider_demo.resx (5814, 2006-08-14)
spider_demo\workThread.cs (4040, 2006-09-03)
spider_demo\Properties\AssemblyInfo.cs (1170, 2006-08-10)
spider_demo\Properties\Resources.Designer.cs (2849, 2009-03-02)
spider_demo\Properties\Resources.resx (5612, 2006-08-10)
spider_demo\Properties\Settings.Designer.cs (1092, 2009-03-02)
spider_demo\Properties\Settings.settings (249, 2006-08-10)
spider_demo\obj\spider_demo.csproj.FileList.txt (620, 2006-11-02)
spider_demo\obj\Release\TempPE\Properties.Resources.Designer.cs.dll (4608, 2006-08-12)
spider_demo\obj\Debug\ResolveAssemblyReference.cache (3188, 2009-03-02)
spider_demo\obj\Debug\spider_demo.csproj.FileListAbsolute.txt (1102, 2009-03-02)
spider_demo\obj\Debug\spider_demo.csproj.GenerateResource.Cache (853, 2009-03-02)
spider_demo\obj\Debug\spider_demo.exe (16896, 2009-03-02)
spider_demo\obj\Debug\spider_demo.pdb (44544, 2009-03-02)
spider_demo\obj\Debug\spider_demo.Properties.Resources.resources (180, 2009-03-02)
spider_demo\obj\Debug\spider_demo.spider_demo.resources (180, 2009-03-02)
spider_demo\obj\Debug\TempPE\Properties.Resources.Designer.cs.dll (4608, 2009-03-02)
spider_demo\bin\Release\WebRegex.dll (12288, 2006-08-12)
spider_demo\bin\Release\yy.txt (101, 2006-08-14)
spider_demo\bin\Debug\moreUrl.txt (23925, 2009-03-02)
spider_demo\bin\Debug\spider_demo.exe (16896, 2009-03-02)
spider_demo\bin\Debug\spider_demo.pdb (44544, 2009-03-02)
spider_demo\bin\Debug\spider_demo.vshost.exe (14328, 2009-03-02)
spider_demo\bin\Debug\spider_demo.vshost.exe.manifest (490, 2007-07-21)
spider_demo\bin\Debug\WebRegex.dll (12288, 2006-08-12)
spider_demo\bin\Debug\yy.txt (101, 2006-08-14)
... ...

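//====================================================================
// Copyright (c) 2006.8.10-2006.8.13 King (yy8354@tom.com) All rights reserved.
// QQ:5088300
// This program exists to demonstrate the workflow of a spider, so every part aims only to implement the required functionality. Nothing has been optimized, and it is not suitable for commercial use.
// The program uses no components other than MyRegexNamespace. That component was compiled with The Regulator 2.0 and is nothing more than a regular expression that extracts URLs.
// The demo has been tested only on Windows 2003 Enterprise Edition; the development environment is VS.NET 2005.
// The URL validity check uses classes or functions that are supported only from .NET 2.0 onward, so part of the code must be modified to run on .NET 1.1.
// yy.txt in the program's run directory holds the initial crawl URLs, one URL per line.
// more.txt, generated in the program's run directory, is the work log. It records each crawled URL and the file name under which the page was saved.
// The \html directory under the program's run directory is where crawled pages are saved.
// Anyone is welcome to modify this program in any way, but please keep this notice.
//====================================================================
// For the design approach behind the program, see:
// The Google group on search topics started by the author
// http://groups.google.com/group/sosou?lnk=oa&hl=zh-CN
// The author's blog
// http://blog.sina.com.cn/u/1249533702
//====================================================================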
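
The notes above describe a seed file with one URL per line, a URL validity check that depends on .NET 2.0, and a regular expression that pulls URLs out of pages. The sketch below illustrates those three steps under stated assumptions: `Uri.TryCreate` is one API that first appeared in .NET 2.0, but the notes do not say which API the author actually used; the `href` pattern is a simplified stand-in, since the expression compiled into WebRegex.dll is not shown; and the names SeedSketch, ParseSeeds, and ExtractLinks are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sketch; none of these names come from the spider_demo sources.
public static class SeedSketch
{
    // Assumed, simplified pattern for href attribute values.
    private static readonly Regex Href =
        new Regex("href\\s*=\\s*[\"']([^\"'#]+)", RegexOptions.IgnoreCase);

    private static bool IsHttp(Uri uri)
    {
        return uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps;
    }

    // Keeps absolute http/https URLs, one per input line,
    // skipping blank lines, invalid entries, and duplicates.
    public static List<string> ParseSeeds(IEnumerable<string> lines)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        var seeds = new List<string>();
        foreach (string line in lines)
        {
            string text = line.Trim();
            Uri uri;
            if (text.Length == 0) continue;
            if (!Uri.TryCreate(text, UriKind.Absolute, out uri)) continue;
            if (!IsHttp(uri)) continue;
            if (seen.Add(uri.AbsoluteUri)) seeds.Add(uri.AbsoluteUri);
        }
        return seeds;
    }

    // Extracts href values from a page and resolves them against the page URL.
    public static List<string> ExtractLinks(Uri pageUrl, string html)
    {
        var links = new List<string>();
        foreach (Match m in Href.Matches(html))
        {
            Uri resolved;
            if (Uri.TryCreate(pageUrl, m.Groups[1].Value, out resolved) && IsHttp(resolved))
                links.Add(resolved.AbsoluteUri);
        }
        return links;
    }
}
```

With the seed file from the notes, a caller could pass `System.IO.File.ReadAllLines("yy.txt")` to `ParseSeeds` and then feed each downloaded page to `ExtractLinks` to discover further URLs.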
