Scrapy 0.18 Documentation (Release 0.18.4)
>>> proc = Compose(lambda v: v[0], str.upper)
>>> proc(['hello', 'world'])
'HELLO'
Each function can optionally receive a loader_context parameter... (3.6. Item Loaders, p. 53)
201 pages | 929.55 KB | 1 year ago
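The loader_context note in that excerpt is the non-obvious part; below is a minimal sketch of how a composed function can receive it, assuming the 0.18-era import path and a hypothetical add_prefix helper that is not from the docs:

>>> from scrapy.contrib.loader.processor import Compose
>>> def add_prefix(value, loader_context):
...     # receives the active context because it declares a 'loader_context' argument
...     return loader_context.get('prefix', '') + value
...
>>> proc = Compose(lambda v: v[0], str.upper, add_prefix)
>>> proc(['hello', 'world'], loader_context={'prefix': '>> '})
'>> HELLO'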
Scrapy 1.5 Documentation (Release 1.5.2)
...XML response body, returning a list of Selector objects (i.e. a SelectorList object):
sel.xpath("//product")
2. Extract all prices from a Google Base XML feed, which requires registering a namespace... (3.3. Selectors, p. 53)
285 pages | 1.17 MB | 1 year ago
Scrapy 1.3 Documentation (Release 1.3.3)
Extract all prices from a Google Base XML feed, which requires registering a namespace:
sel.register_namespace("g", "http://base.google.com/ns/1.0")
(3.3. Selectors, p. 53)
272 pages | 1.11 MB | 1 year ago
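The excerpt stops right after the register_namespace() call; a minimal sketch of the query that typically follows, assuming a `sel` Selector built from the feed response and the Google Base price element g:price:

>>> sel.register_namespace("g", "http://base.google.com/ns/1.0")
>>> sel.xpath("//g:price").extract()   # list of price strings; contents depend on the feed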
Scrapy 1.2 Documentation (Release 1.2.3)
price = scrapy.Field()
stock = scrapy.Field()
last_updated = scrapy.Field(serializer=str)
Note: Those familiar with Django will notice that Scrapy Items... (3.4. Items, p. 53; TOC: 3.5 Item Loaders)
266 pages | 1.10 MB | 1 year ago
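The field lines above are a fragment of the docs' Product example; a minimal sketch of the full declaration they imply (the name field is an assumption based on the surrounding docs):

import scrapy

class Product(scrapy.Item):
    name = scrapy.Field()
    price = scrapy.Field()
    stock = scrapy.Field()
    last_updated = scrapy.Field(serializer=str)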
Scrapy 0.24 Documentation (Release 0.24.6)
...data from the given value using the extract_regex() method, applied before processors.
Examples:
>>> from scrapy.contrib.loader.processor import TakeFirst
(3.5. Item Loaders, p. 53)
222 pages | 988.92 KB | 1 year ago
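The import in that excerpt is cut off before the example that follows it; a minimal sketch of TakeFirst, which returns the first non-null/non-empty value (the input list is illustrative):

>>> from scrapy.contrib.loader.processor import TakeFirst
>>> proc = TakeFirst()
>>> proc(['', 'one', 'two', 'three'])
u'one'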
Scrapy 0.20 Documentation (Release 0.20.2)
>>> proc = Join('<br>')
>>> proc(['one', 'two', 'three'])
u'one<br>two<br>three'
class scrapy.contrib.loader.processor.Compose(*functions, **default_loader_context)
(3.6. Item Loaders, p. 53)
197 pages | 917.28 KB | 1 year ago
Scrapy 0.9 Documentation (Release 0.9)
Get global stat value:
>>> stats.get_value('spiders_crawled')
8
Get all global stats (i.e. not particular to any spider):
>>> stats...
(4.2. Stats Collection, p. 53; TOC: 4.3 Sending e-mail)
156 pages | 764.56 KB | 1 year ago
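The excerpt is cut off mid-call; a minimal sketch of the surrounding usage, assuming the 0.9-era scrapy.stats.stats singleton (the returned dict is illustrative):

>>> import socket
>>> from scrapy.stats import stats
>>> stats.set_value('hostname', socket.gethostname())  # set a global stat
>>> stats.get_value('spiders_crawled')                  # read a single stat
8
>>> stats.get_stats()                                   # all global stats as a dict
{'hostname': 'localhost', 'spiders_crawled': 8}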
Scrapy 1.4 Documentation (Release 1.4.0)
...Let's show an example that illustrates this with the GitHub blog atom feed.
First, we open the shell with the url we want to scrape: $ scrapy ...
(3.3. Selectors, p. 53)
281 pages | 1.15 MB | 1 year ago
Scrapy 0.22 Documentation (Release 0.22.0)
class SiteSpecificLoader(ProductLoader):
    name_in = MapCompose(strip_dashes, ProductLoader.name_in)
Another case where extending Item Loaders can be very helpful... (3.5. Item Loaders, p. 53)
199 pages | 926.97 KB | 1 year ago
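A minimal self-contained sketch of that override pattern, assuming 0.22-era (Python 2) import paths; strip_dashes and the ProductLoader base are stand-ins for code the excerpt omits:

from scrapy.contrib.loader import ItemLoader
from scrapy.contrib.loader.processor import MapCompose

def strip_dashes(value):
    # hypothetical helper: drop leading/trailing dashes from a single value
    return value.strip('-')

class ProductLoader(ItemLoader):
    name_in = MapCompose(unicode.strip)

class SiteSpecificLoader(ProductLoader):
    # prepend strip_dashes to the parent's input processor chain
    name_in = MapCompose(strip_dashes, ProductLoader.name_in)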
Scrapy 1.0 Documentation (Release 1.0.7)
...copy()
>>> print product3
Product(name='Desktop PC', price=1000)
Creating dicts from items:
>>> dict(product)  # create a dict from all populated values
{'price': ...
(3.4. Items, p. 53)
244 pages | 1.05 MB | 1 year ago
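A minimal sketch of the dict round trip the excerpt describes, assuming the Product item defined earlier in those docs (the printed dict is illustrative):

>>> product = Product(name='Desktop PC', price=1000)
>>> dict(product)           # create a dict from all populated values
{'name': 'Desktop PC', 'price': 1000}
>>> Product(dict(product))  # and back: create an item from a dict
Product(name='Desktop PC', price=1000)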
58 results in total