html - Web scraping with Beautiful Soup 4 in Python
I've just started using Beautiful Soup 4 and came across a problem that I've been trying to solve for a few days, but I can't. Let me first paste the HTML code I want to analyse:
<table class="table table-condensed table-hover tenlaces tablesorter">
  <thead>
    <tr>
      <th class="al">language</th>
      <th class="ac">link</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td class="tdidioma"><span class="flag flag_0">0</span></td>
      <td class="tdenlace"><a class="btn btn-mini enlace_link" data-servidor="42" rel="nofollow" target="_blank" title="ver..." href="link want save0"><i class="icon-play"></i> ver</a></td>
    </tr>
    <tr>
      <td class="tdidioma"><span class="flag flag_1">1</span></td>
      <td class="tdenlace"><a class="btn btn-mini enlace_link" data-servidor="42" rel="nofollow" target="_blank" title="ver..." href="link want save1"><i class="icon-play"></i> ver</a></td>
    </tr>
    <tr>
      <td class="tdidioma"><span class="flag flag_2">2</span></td>
      <td class="tdenlace"><a class="btn btn-mini enlace_link" data-servidor="42" rel="nofollow" target="_blank" title="ver..." href="link want save2"><i class="icon-play"></i> ver</a></td>
    </tr>
  </tbody>
</table>
As you can see, in each <tr> there is a <td> with the language and a <td> with the link. The problem is that I don't know how to relate the language to the link. I mean, I'd like to select, for example: if the span in the language cell is 1, return the link; if not, don't do anything. I'm able to return the <td> with the language, but not the whole <tr>, which is the important part, I think. I don't know if I've made my point, because I don't know how to explain it.
The code I have gets the <tbody> from the main URL, but I don't know how to do what I'm asking.
Thanks, and sorry for my bad English!
Edit: here is a sample of my code, so you can see the libraries I'm using and everything:
from bs4 import BeautifulSoup
import urllib2

url = raw_input("Introduce URL to analyse: ")
page = urllib2.urlopen(url)
soup = BeautifulSoup(page.read())
body = soup.tbody
# here I should do something, but I don't know how
page.close()
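Try this:
result = None
# each <tr> holds exactly two <td> cells: the language and the link
for row in soup.tbody.find_all('tr'):
    lang, link = row.find_all('td')
    # the language cell contains only the <span>, so .string yields its text
    if lang.string == '1':
        result = link.a['href']
print result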
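If more than one language is of interest, or several rows can share the same language, the same row-by-row idea can be turned into a lookup table. The following is only a sketch under a few assumptions: it is written for Python 3 (the code above is Python 2), it uses the built-in "html.parser" backend, the sample HTML is a trimmed copy of the table from the question, and links_by_language is a hypothetical helper name that does not come from the thread.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the table from the question (two rows are enough to show the idea).
SAMPLE_HTML = """
<table class="table table-condensed table-hover tenlaces tablesorter">
  <tbody>
    <tr>
      <td class="tdidioma"><span class="flag flag_0">0</span></td>
      <td class="tdenlace"><a class="btn btn-mini enlace_link" href="link want save0"><i class="icon-play"></i> ver</a></td>
    </tr>
    <tr>
      <td class="tdidioma"><span class="flag flag_1">1</span></td>
      <td class="tdenlace"><a class="btn btn-mini enlace_link" href="link want save1"><i class="icon-play"></i> ver</a></td>
    </tr>
  </tbody>
</table>
"""


def links_by_language(html):
    """Hypothetical helper: map each language code to the hrefs found in its rows."""
    soup = BeautifulSoup(html, "html.parser")
    mapping = {}
    for row in soup.find_all("tr"):
        # Look the cells up by class so header rows or odd rows are skipped safely.
        lang_cell = row.find("td", class_="tdidioma")
        anchor = row.find("a", class_="enlace_link")
        if lang_cell is None or anchor is None or not anchor.has_attr("href"):
            continue
        language = lang_cell.get_text(strip=True)
        mapping.setdefault(language, []).append(anchor["href"])
    return mapping


links = links_by_language(SAMPLE_HTML)
print(links.get("1", []))
```

Finding the cells by class rather than unpacking row.find_all('td') means a row with a different number of cells is skipped instead of raising an error, which may or may not be what you want for the real page.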