PSpider - An extremely simple web crawler framework for Python 3

User submission, 2022-10-24

PSpider

A simple web spider framework written in Python. It requires Python 3.5+.

Features of PSpider

- Supports a multi-threaded crawling mode (using threading and requests)
- Supports automatic multi-processing in the parse step (using multiprocessing)
- Supports crawling through proxies (using threading and queue)
- Defines some utility functions and classes, for example: UrlFilter, get_string_num, etc.
- Fewer lines of code, so it is easier to read, understand and extend
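
To illustrate the kind of job a URL filter does in a crawler, here is a minimal sketch. It is not PSpider's UrlFilter: the class name SimpleUrlFilter, its methods, and the example patterns are all hypothetical, chosen only to show allow/deny matching combined with de-duplication.

```python
import re
import threading


class SimpleUrlFilter:
    """Hypothetical filter for illustration only (not PSpider's UrlFilter)."""

    def __init__(self, allow_patterns=(), deny_patterns=()):
        self._allow = [re.compile(p) for p in allow_patterns]
        self._deny = [re.compile(p) for p in deny_patterns]
        self._seen = set()
        self._lock = threading.Lock()  # crawler threads may share one filter

    def check(self, url):
        """Return True if the URL passes the deny and allow patterns."""
        if any(p.search(url) for p in self._deny):
            return False
        # An empty allow list means "allow everything not denied".
        return not self._allow or any(p.search(url) for p in self._allow)

    def check_and_add(self, url):
        """Return True only the first time an acceptable URL is seen."""
        if not self.check(url):
            return False
        with self._lock:
            if url in self._seen:
                return False
            self._seen.add(url)
            return True


url_filter = SimpleUrlFilter(
    allow_patterns=[r"^https?://example\.com/"],
    deny_patterns=[r"\.(jpg|png|css|js)$"],
)
results = [
    url_filter.check_and_add("http://example.com/page1"),     # new and allowed
    url_filter.check_and_add("http://example.com/page1"),     # duplicate
    url_filter.check_and_add("http://example.com/logo.png"),  # denied suffix
    url_filter.check_and_add("http://other.org/"),            # outside allow list
]
print(results)
```

Only the first call is accepted; the duplicate, the image URL and the off-site URL are all rejected.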

Modules of PSpider

- utilities module: defines utility functions and classes for the multi-threaded spider
- instances module: defines the fetcher, parser and saver classes for the multi-threaded spider
- concurrent module: defines the WebSpiderFrame of the multi-threaded spider
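
The fetcher / parser / saver split can be sketched with the standard library alone. The following is a rough illustration of the idea, not PSpider's actual classes or API: the names fetch, parse, crawl and FAKE_WEB are hypothetical, and an in-memory dictionary stands in for real HTTP requests so the example runs offline.

```python
import queue
import re
import threading

# Assumption: a tiny in-memory "web" replaces real network access.
FAKE_WEB = {
    "http://example.com/": '<a href="http://example.com/a">a</a> '
                           '<a href="http://example.com/b">b</a>',
    "http://example.com/a": '<a href="http://example.com/b">b</a>',
    "http://example.com/b": "no links here",
}


def fetch(url):
    """Fetcher step: return the page content for a URL."""
    return FAKE_WEB.get(url, "")


def parse(url, content):
    """Parser step: return (new links, item to save)."""
    links = re.findall(r'href="([^"]+)"', content)
    return links, (url, len(content))


def crawl(start_url, num_workers=3):
    todo = queue.Queue()
    seen = {start_url}
    saved = []
    lock = threading.Lock()  # guards both `seen` and `saved`

    def worker():
        while True:
            url = todo.get()
            try:
                if url is None:  # sentinel: no more work
                    return
                links, item = parse(url, fetch(url))
                with lock:
                    saved.append(item)  # saver step
                    new_links = [u for u in links if u not in seen]
                    seen.update(new_links)
                for link in new_links:
                    todo.put(link)
            finally:
                todo.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    todo.put(start_url)
    todo.join()  # returns once every queued URL has been processed
    for _ in threads:
        todo.put(None)
    for t in threads:
        t.join()
    return sorted(saved)


pages = crawl("http://example.com/")
print([url for url, _ in pages])
```

New links are queued before the current URL is marked done, so todo.join() cannot return while work is still pending.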

Procedure of PSpider

Tutorials of PSpider

Installation: the first method is recommended.
(1) Copy the "spider" directory to your project directory, and import spider
(2) Install spider into your Python environment using python3 setup.py install

See test.py

TodoList

- Distribute Spider
- Execute JavaScript
- More Demos

If you have any questions or advice, you can open an issue or submit a pull request.
