python - Scrapy and Selenium error: Element not found in the cache - perhaps the page has changed since it was looked up Stacktrace
I want to extract data from Amazon.
Source code:
    from scrapy.contrib.spiders import CrawlSpider
    from scrapy import Selector
    from selenium import webdriver
    from selenium.webdriver.support.select import Select
    from time import sleep
    import selenium.webdriver.support.ui as ui
    from scrapy.xlib.pydispatch import dispatcher
    from scrapy.http import HtmlResponse, TextResponse
    from extraction.items import ProduitItem

    class RunnerSpider(CrawlSpider):
        name = 'products'
        allowed_domains = ['amazon.com']
        start_urls = ['http://www.amazon.com']

        def __init__(self):
            self.driver = webdriver.Firefox()

        def parse(self, response):
            items = []
            sel = Selector(response)
            self.driver.get(response.url)
            recherche = self.driver.find_element_by_xpath('//*[@id="twotabsearchtextbox"]')
            recherche.send_keys("a")
            recherche.submit()
            resultat = self.driver.find_element_by_xpath('//ul[@id="s-results-list-atf"]')
            resultas = resultat.find_elements_by_xpath('//li')
            for result in resultas:
                item = ProduitItem()
                lien = result.find_element_by_xpath('//div[@class="s-item-container"]/div/div/div[2]/div[1]/a')
                lien.click()
                #lien.implicitly_wait(2)
                res = self.driver.find_element_by_xpath('//h1[@id="aiv-content-title"]')
                item['titre'] = res.text
                item['image'] = lien.find_element_by_xpath('//div[@id="dv-dp-left-content"]/div[1]/div/div/img').get_attribute('src')
                items.append(item)
            self.driver.close()
            yield items
When I run the code, I get this error:
Element not found in the cache - perhaps the page has changed since it was looked up Stacktrace:
If you tell Selenium to click on a link, the browser moves from the original page to the page behind that link.
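In your case you have a result page with URLs to products on Amazon. You click one of the links in the result list and are moved to the detail page. At that point the page has changed, and the rest of the elements you want to iterate over in your for loop are no longer there. That is why you get the exception.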
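If the detail pages really have to be visited, one possible workaround (not what this answer recommends, just a sketch of a different technique) is to read every link's href into a plain string first and only then navigate. Strings survive a page change; element handles do not. The helper names `collect_links` and `visit_each` are hypothetical, and the calls use the same older `find_element_by_xpath` style as the question; recent Selenium releases spell this `find_element(By.XPATH, ...)`.

```python
def collect_links(driver, link_xpath):
    """Return the href of every matching link as a plain string.

    `driver` is assumed to be a Selenium WebDriver that is currently
    showing the search result page.
    """
    links = driver.find_elements_by_xpath(link_xpath)
    # Read the attribute now, while the result page is still loaded.
    return [link.get_attribute('href') for link in links]


def visit_each(driver, urls, title_xpath):
    """Open each URL in turn and return the text of its title element."""
    titles = []
    for url in urls:
        # Navigating invalidates element handles from the previous page,
        # but `urls` is a list of strings, so the loop keeps working.
        driver.get(url)
        titles.append(driver.find_element_by_xpath(title_xpath).text)
    return titles
```

In the spider from the question this might be called as `visit_each(self.driver, collect_links(self.driver, link_xpath), '//h1[@id="aiv-content-title"]')`, where `link_xpath` is whatever expression matches the product links.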
Why don't you use the search result page to extract the title and the image? Both are there; you only need to change your XPath expressions to get the right fields relative to lien.
Update
To get the title from the search result page, extract the text of the h2 element inside the a element you currently click.
To get the image, you need to take the other div in the li element: where your XPath selects div[2], you need to select div[1] to get the image.
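The two paths can be tried out without a browser. The sketch below uses Python's standard `xml.etree.ElementTree` instead of Selenium, and the `SAMPLE` markup is a made-up stand-in modelled on the XPath in the question, not Amazon's real HTML. The function name `extract_items` is also hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified result list: div[1] holds the image,
# div[2]/div[1]/a holds the h2 title, as described above.
SAMPLE = """
<ul id="s-results-list-atf">
  <li>
    <div class="s-item-container"><div><div>
      <div><a href="/dp/1"><img src="http://example.com/1.jpg"/></a></div>
      <div><div><a href="/dp/1"><h2>First product</h2></a></div></div>
    </div></div></div>
  </li>
  <li>
    <div class="s-item-container"><div><div>
      <div><a href="/dp/2"><img src="http://example.com/2.jpg"/></a></div>
      <div><div><a href="/dp/2"><h2>Second product</h2></a></div></div>
    </div></div></div>
  </li>
</ul>
"""

# Common prefix, relative to each <li> (note the leading dot).
BASE = './div[@class="s-item-container"]/div/div'


def extract_items(results_root):
    """Collect title and image URL for every <li> in the result list."""
    items = []
    for li in results_root.findall('./li'):
        title = li.find(BASE + '/div[2]/div[1]/a/h2')
        image = li.find(BASE + '/div[1]//img')
        if title is None or image is None:
            continue  # skip entries that do not match the assumed layout
        items.append({'titre': title.text, 'image': image.get('src')})
    return items


items = extract_items(ET.fromstring(SAMPLE.strip()))
print(items)
```

With Selenium the same idea would apply to `result.find_element_by_xpath(...)`. One detail worth checking there: an XPath that starts with `//` searches the whole document even when called on an element, so a relative expression starting with `.` is probably needed to get a different product on each loop iteration.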
If you open the search result page in a browser and look at the source with the developer tools, you can see which XPath expressions to use for these elements.