When I try to scrape every article on the page, I get only the first record, repeated as many times as there are articles at the URL.
import scrapy


class MercadoSpider(scrapy.Spider):
    name = 'mercado'
    allowed_domains = ['www.mercadolibre.com.ar']
    start_urls = ['https://videojuegos.mercadolibre.com.ar/videojuegos/ps4/fisico/_DisplayType_LF_PriceRange_1200-2500']

    def parse(self, response):
        # One <li> per article in the search results
        for articulo in response.xpath('//li[@class="ui-search-layout__item"]'):
            precio = articulo.xpath('//span[@class="price-tag-fraction"]//text()').get()
            titulo = articulo.xpath('//h2[@class="ui-search-item__title"]//text()').get()
            yield {
                'price': precio,
                'title': titulo
            }
This is an example of the output I get:
[
{"price": "2.099", "title": "Plants vs. Zombies: Garden Warfare 2 Standard Edition Electronic Arts PS4 F\u00edsico"},
{"price": "2.099", "title": "Plants vs. Zombies: Garden Warfare 2 Standard Edition Electronic Arts PS4 F\u00edsico"},
{"price": "2.099", "title": "Plants vs. Zombies: Garden Warfare 2 Standard Edition Electronic Arts PS4 F\u00edsico"},
{"price": "2.099", "title": "Plants vs. Zombies: Garden Warfare 2 Standard Edition Electronic Arts PS4 F\u00edsico"},
{"price": "2.099", "title": "Plants vs. Zombies: Garden Warfare 2 Standard Edition Electronic Arts PS4 F\u00edsico"},
{"price": "2.099", "title": "Plants vs. Zombies: Garden Warfare 2 Standard Edition Electronic Arts PS4 F\u00edsico"},
{"price": "2.099", "title": "Plants vs. Zombies: Garden Warfare 2 Standard Edition Electronic Arts PS4 F\u00edsico"},
{"price": "2.099", "title": "Plants vs. Zombies: Garden Warfare 2 Standard Edition Electronic Arts PS4 F\u00edsico"}
]
Thank you.
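Your problem is that the expression you pass to xpath() inside the loop is an absolute path. For example, in this line:
    precio= articulo.xpath('//span[@class="price-tag-fraction"]//text()').get()
Although articulo is a particular element, when you call .xpath() on it you can pass either an absolute or a relative path. If you pass an absolute one (starting with /), it is applied from the root of the document instead of from that element, so every iteration matches the first span and the first h2 on the page. Change it to a relative path by prefixing the expression with a dot:
    precio = articulo.xpath('.//span[@class="price-tag-fraction"]//text()').get()
    titulo = articulo.xpath('.//h2[@class="ui-search-item__title"]//text()').get()
Now each result will carry the price and title of its own article instead of repeating the first one.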