Scrapy and Start URLs

July 15, 2012

I'm scraping the text of title tags off of a bunch of pages and want to include the start URL as a field in the item. Does anyone know how to do that? When I export the data to CSV, I want to see the start URL next to the title I'm pulling.

Here's the code for my spider:

    class QuadNumbers(BaseSpider):
        name = "quad_numbers"
        allowed_domains = ["quadratec.com"]
        start_urls = ["http://www.example.com/abc",
                      "http://www.example.com/abc",]

        def parse(self, response):
            sel = Selector(response)
            sites = sel.xpath('//title')
            items = []
            for site in sites:
                item = QuadNumbersItem()
                item['title'] = site.xpath('text()').extract()
                item['start_url'] = __________??
                items.append(item)
            return items

You can do this:

item['start_url'] = response.url
