Python web scraping with missing source code
I am trying to scrape pricing information from these 2 websites: site1 , site2 using Python and the packages BeautifulSoup and requests.
What I realized is that the pricing section is not available in the source code of both sites. I am wondering how I can scrape this data.
Any advice is appreciated. Thank you.
The problem is that you first need to select a country to see the prices.
In a technical sense, you need to make a POST request to http://www.strem.com/catalog/index.php to select the country. After that, you can get the prices:

    from bs4 import BeautifulSoup
    import requests

    url = "http://www.strem.com/catalog/v/29-6720/17/copper_1300746-79-5"

    # A session keeps the cookies set by the country selection
    session = requests.Session()

    # Select the country first; the site only shows prices afterwards
    p = session.post("http://www.strem.com/catalog/index.php",
                     data={'country': 'USA',
                           'page_function': 'select_country',
                           'item_id': '7211',
                           'group_id': '17'})

    # Request the product page with the same session
    response = session.get(url)
    soup = BeautifulSoup(response.content)

    print [td.text.strip() for td in soup.find_all('td', class_='price')]

This prints:

    [u'US$85.00', u'US$285.00', u'US$1,282.00', u'US$3,333.00']

A more elegant solution is to submit the form using the mechanize package:

    import cookielib

    from bs4 import BeautifulSoup
    import mechanize

    url = "http://www.strem.com/catalog/v/29-6720/17/copper_1300746-79-5"

    # The cookie jar stores the country selection between requests
    browser = mechanize.Browser()
    cj = cookielib.LWPCookieJar()
    browser.set_cookiejar(cj)
    browser.open(url)

    # Select the second form on the page (nr=1), choose the country and submit it
    browser.select_form(nr=1)
    browser.form['country'] = ['USA']
    browser.submit()

    data = browser.response().read()
    soup = BeautifulSoup(data)

    print [td.text.strip() for td in soup.find_all('td', class_='price')]

This prints:

    [u'US$85.00', u'US$285.00', u'US$1,282.00', u'US$3,333.00']
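
Both approaches return the prices as strings that still carry the currency prefix and thousands separators. If you need numbers, here is a minimal sketch of one way to convert them. The helper name `parse_price` is my own and not part of any package, and it assumes every price uses a dot as the decimal separator, as in the output above:

```python
import re
from decimal import Decimal


def parse_price(text):
    """Convert a scraped price string such as 'US$1,282.00' to a Decimal.

    Hypothetical helper: assumes '.' is the decimal separator.
    """
    # Drop everything except digits and the decimal point
    # (this removes the currency prefix and the thousands separators).
    digits = re.sub(r"[^\d.]", "", text)
    return Decimal(digits)


# Sample input copied from the printed output above
scraped = ['US$85.00', 'US$285.00', 'US$1,282.00', 'US$3,333.00']
prices = [parse_price(p) for p in scraped]
```

`Decimal` is used instead of `float` so that the amounts are not affected by binary rounding. A site that formats prices differently (for example with a comma as the decimal separator) would need a different rule.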