Web scraping with Scrapy | How to use ItemLoader and Input/Output Processors
Dr Pi
#scrapy #xpath #loader #processor
Scraping a website with Scrapy, using ItemLoader and input/output processors to strip the \t and \n characters from the values returned by the response.xpath selectors.
As an example, I scrape the description, link, and price of 185 drills from the Screwfix website, which also shows how to get Scrapy to follow the next page when there is pagination.
This is my first time using the Micro editor, and I show how easy it is to use. I can recommend it and much prefer it to Nano. Thanks to Maksim Korzh (CMK) for the tip.
While troubleshooting, we also look at the mistake of leaving the leading "." off an XPath expression used inside a loop: you get groups of 20 identical results.
Official Scrapy Documentation : https://docs.scrapy.org/en/latest/topics/loaders.html#input-and-output-processors
Repo for the project on GitHub : https://github.com/RGGH/Scrapy
The actual spider is : https://github.com/RGGH/Scrapy/blob/master/sfix_spider.py
☕️☕️☕️ Buy Dr Pi a Coffee...or Tea! : https://www.buymeacoffee.com/DrPi ☕️☕️☕️
https://www.youtube.com/watch?v=ps9VFsgSj4k