Web Scraping with Scrapy | code comparison using items
Dr Pi
Using Item Loaders to populate items when web scraping with Scrapy : The 'items.py' file and the ItemLoader class are demonstrated here.
With this bare-minimum spider I show 2 ways of populating the item containers in Scrapy, ready to export.
The beauty of using items (items.py) is that you can use the same Scrapy fields / items.py file with multiple spiders.
I use CSS selectors here and show how to use add_value to add literal values as well. I also discuss 'SelectorGadget' and 'neofetch'.
To read more about Feed Exports : https://docs.scrapy.org/en/latest/topics/feed-exports.html
Feed Exports allow you to export to formats including:
JSON
JSON lines
CSV
XML
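One way to pick these formats is the FEEDS setting (available in recent Scrapy versions, 2.1 and later as far as I know). This is only a sketch of a settings.py fragment; the output file names are made up.

```python
# settings.py fragment: one output file per export format.
# File names are hypothetical; the "format" values are Scrapy's names
# for the four formats listed above.
FEEDS = {
    "listings.json": {"format": "json"},
    "listings.jl": {"format": "jsonlines"},
    "listings.csv": {"format": "csv"},
    "listings.xml": {"format": "xml"},
}
```

For a quick one-off export, passing an output file on the command line (for example scrapy crawl yourspider -o listings.csv, with your own spider name) should do the same job, with the format inferred from the file extension.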
⦿ The challenge is to rewrite the selectors using XPath!
➤ Scrapy reference : https://docs.scrapy.org/en/latest/topics/loaders.html?highlight=itemloader
A follow-up video will show the XPath selectors, how to handle the pagination, and how to collect the geo coordinates that I identified during this video.
Useful Links:
➤ SelectorGadget: point and click CSS selectors : https://selectorgadget.com/
➤ neofetch : https://github.com/dylanaraps/neofetch (Neofetch supports almost 150 different operating systems inc. Linux and Windows)
Python scrapy.loader.ItemLoader() Examples : ➤ https://www.programcreek.com/python/example/89768/scrapy.loader.ItemLoader
Credit : https://www.pythongasm.com/introduction-to-scrapy/ for the original idea.
My Craigslist Real Estate Python Scrapy code : https://github.com/RGGH/Scrapy4
Apologies if the cam was slightly out of synch with the sound - I'll fine tune it for next time. I need a clapperboard.
Feel free to buy me a coffee or tea!: https://www.buymeacoffee.com/DrPi ☕️☕️☕️ else: 🤬
See you around yeah? ... https://www.youtube.com/watch?v=Go4rR88tXcw