Scrapy and Start URLs
I'm scraping the text of title tags off of a bunch of pages, and I want to include the start URL as a field in the item. Does anyone know how to do that? When I export the data to CSV, I want to see the start URL next to the title I'm pulling.
Here's the code for my spider:
class QuadNumbers(BaseSpider):
    name = "quad_numbers"
    allowed_domains = ["quadratec.com"]
    start_urls = [
        "http://www.example.com/abc",
        "http://www.example.com/abc",
    ]

    def parse(self, response):
        sel = Selector(response)
        sites = sel.xpath('//title')
        items = []
        for site in sites:
            item = QuadNumbersItem()
            item['title'] = site.xpath('text()').extract()
            item['start_url'] = __________??
            items.append(item)
        return items
You can do this, since response.url holds the URL of the page currently being parsed:
item['start_url'] = response.url