Scrapy 0.9 Documentation
  Stats Collector used when stats are disabled (through the STATS_ENABLED setting). 4.2. Stats Collection; SimpledbStatsCollector class scrapy.stats.collector.simpledb. Heap-usage table from the memory-debugging section:
    Index   Count  %      Size  %  Cumulative  %  Type
        0   22307  8  16423880 31    16423880 31  dict
        1  122285 41  12441544 24    28865424 55  str
        2   68346 23   5966696 11    34832120 66  tuple
        3     227  0   5836528 11    40668648 77  unicode
        4    2461  1   2222272 ...
  Index entries: clear_stats() (scrapy.stats.collector.StatsCollector method); close_spider() (scrapy.stats.collector.StatsCollector method); CLOSESPIDER_ITEMPASSED setting; CLOSESPIDER_TIMEOUT setting
  156 pages | 764.56 KB | 1 year ago
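The flattened numbers in this snippet come from a guppy `hpy().heap()` table in Scrapy's memory-debugging docs. As a hedged, stdlib-only sketch (guppy itself may not be installed, and `heap_summary` is a hypothetical helper, not a Scrapy or guppy API), a similar per-type size breakdown can be approximated like this:

```python
# Hypothetical stand-in for guppy's hpy().heap(): tally live objects
# by type name, roughly mirroring the Count / Size / % / Type columns
# of the table above. sys.getsizeof gives shallow sizes only, so the
# numbers are cruder than guppy's.
import gc
import sys
from collections import Counter


def heap_summary(top=5):
    sizes = Counter()
    counts = Counter()
    for obj in gc.get_objects():
        name = type(obj).__name__
        counts[name] += 1
        try:
            sizes[name] += sys.getsizeof(obj)
        except TypeError:
            pass  # a few exotic objects may not report a size
    total = sum(sizes.values()) or 1
    # (type name, count, total shallow size, percent of tracked bytes)
    return [
        (name, counts[name], size, round(100.0 * size / total))
        for name, size in sizes.most_common(top)
    ]


for name, count, size, pct in heap_summary():
    print(f"{name:>12} {count:>8} {size:>12} {pct:>3}%")
```

Unlike guppy, this only sees objects tracked by the garbage collector, so it is a rough diagnostic rather than an exact heap profile.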
Scrapy 0.24 Documentation
  class (or factory), used to instantiate items when not given in the constructor. 3.5. Item Loaders; default_input_processor: the default input processor to use. [follow] INFO: Crawled 474 pages (at 3840 pages/min), scraped 0 items (at 0 items/min); 2013-05-16 13:08:55-0300 [follow] INFO: Crawled 538 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
  222 pages | 988.92 KB | 1 year ago
Scrapy 1.0 Documentation
  Contents: 3.5 Item Loaders; 3.6 Scrapy shell. "...data as soon as it's received (through the add_xpath(), add_css() or add_value() methods) and the result of the input processor is collected and..."
  244 pages | 1.05 MB | 1 year ago
Scrapy 0.16 Documentation
  ...self.vat_factor; return item; else: raise DropItem("Missing price in %s" % item). 3.8. Item Pipeline. Write items to a JSON file: the following pipeline stores all... Index entries: ...downloadermiddleware.chunked); clear_stats() (scrapy.statscol.StatsCollector method); close_spider(); close_spider() (scrapy.statscol.StatsCollector method); CloseSpider; CLOSESPIDER_ERRORCOUNT
  203 pages | 931.99 KB | 1 year ago
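The code fragment in this snippet is the docs' VAT price pipeline. Below is a hedged reconstruction as plain Python so it runs without Scrapy: `DropItem` here is a local stand-in for `scrapy.exceptions.DropItem`, and the `vat_factor = 1.15` value is an assumption taken from the shape of the fragment, not quoted from this snippet.

```python
# Local stand-in for scrapy.exceptions.DropItem, so the sketch is
# self-contained.
class DropItem(Exception):
    pass


# Reconstruction of the price-validation pipeline fragment above:
# apply VAT when a price is present, otherwise drop the item.
class PricePipeline:
    vat_factor = 1.15  # assumed value, for illustration

    def process_item(self, item, spider=None):
        if item.get("price"):
            item["price"] = item["price"] * self.vat_factor
            return item
        raise DropItem("Missing price in %s" % item)
```

In a real project the class would be registered in the `ITEM_PIPELINES` setting and Scrapy would call `process_item` for every scraped item.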
Scrapy 0.14 Documentation
  Contents: 3.8 Item Pipeline; 3.9 Feed exports. "...Item should continue through the pipeline or be dropped and no longer processed. Typical uses for item pipelines are: cleansing HTML data..."
  179 pages | 861.70 KB | 1 year ago
Scrapy 1.2 Documentation
  ...copy()
  >>> print product3
  Product(name='Desktop PC', price=1000)
  Creating dicts from items:
  >>> dict(product)  # create a dict from all populated values
  {'price': ...
  3.4. Items. INFO: Crawled 474 pages (at 3840 pages/min), scraped 0 items (at 0 items/min); 2013-05-16 13:08:55-0300 [scrapy] INFO: Crawled 538 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
  266 pages | 1.10 MB | 1 year ago
Scrapy 1.1 Documentation
  "...result of the input processor is appended to the data collected in (1) (if any). 3. This case is similar to the previous ones, except that the..." 3.5. Item Loaders. INFO: Crawled 474 pages (at 3840 pages/min), scraped 0 items (at 0 items/min); 2013-05-16 13:08:55-0300 [scrapy] INFO: Crawled 538 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
  260 pages | 1.12 MB | 1 year ago
Scrapy 1.0 Documentation
  INFO: Crawled 474 pages (at 3840 pages/min), scraped 0 items (at 0 items/min); 2013-05-16 13:08:55-0300 [scrapy] INFO: Crawled 538 pages (at 3840 pages/min), scraped 0 items (at 0 items/min). ...com/scrapy/scrapy/commit/1b85bcf]); nextcall repetitive calls (heartbeats) (commit 55f7104 [https://github.com/scrapy/scrapy/commit/55f7104]); Backport fix compatibility with Twisted 15.4.0 (commit b262411 [https://github...
  303 pages | 533.88 KB | 1 year ago
Scrapy 1.3 Documentation
  >>> product['name']
  Desktop PC
  >>> product.get('name')
  Desktop PC
  >>> product['price']
  1000
  >>> product['last_updated']
  Traceback (most recent call ...
  3.4. Items. 2016-12-16 21:18:55 [scrapy.extensions.logstats] INFO: Crawled 302 pages (at 2880 pages/min), scraped 0 items (at 0 items/min); Crawled 358 pages (at 3360 pages/min), scraped 0 items (at 0 items/min)
  272 pages | 1.11 MB | 1 year ago
Scrapy 1.1 Documentation
  INFO: Crawled 474 pages (at 3840 pages/min), scraped 0 items (at 0 items/min); 2013-05-16 13:08:55-0300 [scrapy] INFO: Crawled 538 pages (at 3840 pages/min), scraped 0 items (at 0 items/min). ...com/scrapy/scrapy/commit/1b85bcf]); nextcall repetitive calls (heartbeats) (commit 55f7104 [https://github.com/scrapy/scrapy/commit/55f7104]); Backport fix compatibility with Twisted 15.4.0 (commit b262411 [https://github...
  322 pages | 582.29 KB | 1 year ago
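Several of the snippets above show the dict-like behavior of Scrapy items: only declared fields can be set, `copy()` makes a shallow copy, and `dict(item)` collects the populated values. As a hedged sketch (plain Python, no Scrapy dependency; this `Item`/`Product` pair is a stand-in for `scrapy.Item` and `scrapy.Field`, not Scrapy's actual implementation):

```python
# Minimal stand-in mimicking the item semantics in the snippets above:
# declared fields, KeyError for undeclared ones, copy(), dict() conversion.
class Item(dict):
    fields = ()  # subclasses declare their allowed field names

    def __setitem__(self, key, value):
        if key not in self.fields:
            raise KeyError(f"{type(self).__name__} does not support field: {key!r}")
        super().__setitem__(key, value)


class Product(Item):
    fields = ("name", "price", "last_updated")


product = Product()
product["name"] = "Desktop PC"
product["price"] = 1000
product2 = product.copy()  # shallow copy of the populated values
print(dict(product2))      # {'name': 'Desktop PC', 'price': 1000}
```

Scrapy's real `Item` additionally carries per-field metadata via `Field` objects, but the access, copy, and conversion behavior shown in the snippets matches this sketch.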
62 results in total