How to Crawl ALL text from ALL domains in a CSV of URLs
Dr Pi
A Scrapy example using CrawlSpider and LinkExtractor to solve a request from a subscriber: one CSV file with 476 URLs, and the goal is to collect the text from every page of every URL.
Code is on GitHub : https://github.com/RGGH/Scrapy18/blob/main/multi_crawl.py
Visit redandgreen blog for more Tutorials
http://redandgreen.co.uk/about/blog/
Subscribe to the YouTube Channel
https://www.youtube.com/c/DrPiCode
Follow on Twitter to get notified of new videos
https://twitter.com/RngWeb
Become a patron: https://www.patreon.com/drpi
Buy Dr Pi a coffee (or tea): https://www.buymeacoffee.com/DrPi
Proxies
If you need a good, easy-to-use proxy, this is the one that was recommended to me, and having used ScraperAPI for a while I can vouch for them. If you were going to sign up anyway, then maybe you would be kind enough to use the link and the coupon code below?
You can also do a full working trial first (unlike with some other companies). The trial doesn't ask for any payment details either, so all good!
10% off ScraperAPI: https://www.scraperapi.com?fpr=ken49 Coupon code: DRPI10 (You can also get started with 1000 free API calls. No credit card required.)
Thumbs up, yeah? ('cos algorithms...)
#webscraping #tutorials #python ... https://www.youtube.com/watch?v=uiUMhVHQ6ow