Web Scraping the impossible? | Parsing CSS style "left" and "top" to create order with Python
Dr Pi
As a solution to CMK's web scraping challenge I created this using "requests_html" and some custom code to create the correct sequence from randomised class names.
Everything went OK until I spotted an error in the 3rd column, halfway down!
Web scraping usually relies on the text within the page being accessible in some built-in order, which keeps the columns and rows in sequence when generating the output. With randomised class names, that order has to be rebuilt from the CSS positions instead.
My solution was to cross-reference the "left" and "top" values with the class names. To do this I used lists and zip but, as you will find out, there is still a bug in the code...
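To illustrate the cross-referencing idea, here is a minimal sketch rather than the code from my repo. It assumes the positions live in plain CSS rules of the form ".name { left: Npx; top: Npx; }". The sample CSS, the class names, the cell text and the helper names (px, positions, in_reading_order) are all made up for the example.

```python
import re

# Hypothetical sample data: the class names are random, so the only
# reliable ordering information is the "left" and "top" in the CSS.
SAMPLE_CSS = """
.xk3 { position: absolute; left: 200px; top: 10px; }
.a9q { position: absolute; left: 10px; top: 10px; }
.zz1 { position: absolute; left: 10px; top: 40px; }
.p0m { position: absolute; left: 200px; top: 40px; }
"""
# (class name, text) pairs in the scrambled order they appear in the page.
SAMPLE_CELLS = [("xk3", "£4.99"), ("zz1", "Pears"), ("a9q", "Apples"), ("p0m", "£2.50")]

# Matches ".classname { ...declarations... }"
RULE = re.compile(r"\.([\w-]+)\s*\{([^}]*)\}")


def px(body, prop):
    """Return the pixel value of one property in a rule body, or None."""
    # The lookbehind stops "left" from matching inside "margin-left".
    match = re.search(r"(?<![\w-])" + prop + r"\s*:\s*(-?\d+(?:\.\d+)?)px", body)
    return float(match.group(1)) if match else None


def positions(css):
    """Map each class name to its (left, top) position."""
    result = {}
    for name, body in RULE.findall(css):
        left, top = px(body, "left"), px(body, "top")
        if left is not None and top is not None:
            result[name] = (left, top)
    return result


def in_reading_order(cells, css):
    """Sort cell text top to bottom, then left to right."""
    pos = positions(css)
    placed = [(pos[cls][1], pos[cls][0], text) for cls, text in cells if cls in pos]
    return [text for _top, _left, text in sorted(placed)]


print(in_reading_order(SAMPLE_CELLS, SAMPLE_CSS))
```

Sorting on (top, left) gives row-by-row reading order without relying on the order in which the elements appear in the HTML.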
Any ideas or suggestions on how to perfect the code? My idea is to use dict and enumerate, and read across from column B to column C!
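One way the dict and enumerate idea might look is sketched below. This is an untested suggestion, not a confirmed fix for the bug. It assumes each cell has already been reduced to a (left, top, text) triple and that cells in the same row share exactly the same "top" value; the sample data and the build_table name are hypothetical.

```python
from collections import defaultdict

# Hypothetical (left, top, text) triples in scrambled order.
# The "Pears" row deliberately has no price cell.
CELLS = [
    (10, 10, "Apples"), (200, 10, "£4.99"), (400, 10, "In stock"),
    (10, 40, "Pears"), (400, 40, "Sold out"),
    (200, 70, "£1.20"), (10, 70, "Plums"), (400, 70, "In stock"),
]


def build_table(cells):
    """Group cells into rows by "top", then place each one by its "left"."""
    # Every distinct "left" value is a column; enumerate gives its index.
    columns = sorted({left for left, _top, _text in cells})
    column_index = {left: i for i, left in enumerate(columns)}

    # rows[top] is a list with one slot per column, blank by default.
    rows = defaultdict(lambda: [""] * len(columns))
    for left, top, text in cells:
        rows[top][column_index[left]] = text

    # Read the rows from the top of the page downwards.
    return [rows[top] for top in sorted(rows)]


for row in build_table(CELLS):
    print(row)
```

Because every cell is placed by its own coordinates, a missing cell leaves a blank in its own row instead of shifting the rest of the column up, which is a risk when separate per-column lists are joined with zip. If the "top" values on a real page differ by a pixel or two within a row, they would need rounding before grouping.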
Tip: I even tried saving the page to PDF and using pandas, but there was no actual table to sort, and the prices started spanning 2 rows as well once saved to PDF!
CMK's original challenge:
My solution: https://github.com/RGGH/Experimental-Custom-Scrapers
My blog: https://redandgreen.co.uk/scraping-a-page-via-css-style-data/ ... https://www.youtube.com/watch?v=7nmflQZL9eI