Rails, Scraping from dynamic URL

July 15, 2014

At a basic level, I want to scrape a website and render parts of its code, such as the h1s or something similar. I have used Nokogiri and Mechanize in the past and am familiar with the basics of scraping. In the past I would structure it as a Thor task, like this:

    class Scrape < Thor
      desc "cl_redding", "scrape craigslist rentals"
      def cl_redding
        # Load the Rails environment so models are available inside the task
        require File.expand_path('config/environment.rb')
        require 'rubygems'
        require 'nokogiri'
        require 'open-uri'
        require 'mechanize'
        require 'yaml'
        require 'aws-sdk'
        require 'csv'
        require 'json'

        agent = Mechanize.new
        # The URL to scrape is hard-coded here
        page = agent.get('http://redding.craigslist.org/search/apa?zoomtoposting=&catabb=apa&query=&minask=&maxask=&bedrooms=&housing_type=&haspic=1&excats=')
      end
    end

This is all cool and it works, but it only scrapes Craigslist, because that URL is hard-coded in the page = line. What I am asking is: does anyone have advice on how to scrape a site whose URL is entered in an input box on my website? Specific help, tutorials, advice, or resources are all welcome.

I think the question is a bit generic.

You need to:
- start a Rails app
- build a form that accepts the URL to scrape as input (possibly implementing a Page model to store the pages you scrape)
- parse the URL the same way as in your example
- possibly use a background processing tool like Sidekiq to avoid doing the scraping on the front end
- store the results and display them on page#show
