How to Install BeautifulSoup and use the Page Inspector
Zenva
ACCESS the FULL COURSE here: https://academy.zenva.com/product/data-science-mini-degree/?zva_src=youtube-datascience-md
TRANSCRIPT
Hello, everyone, and welcome to our tutorial introducing Beautiful Soup and Page Inspection. Here, we're going to perform two main tasks. The first is gonna be obtaining Beautiful Soup, which is a Python library, and the second is going to be using the Page Inspector to check out a page for us to scrape. So, first of all, what is Beautiful Soup? Beautiful Soup is simply a Python library that is very, very good at web scraping and data parsing. We're going to be using it for HTML, JSON, and XML data, first focusing on HTML, and then we'll take a look at JSON and XML in the second half. The great thing about Beautiful Soup is that it's very easy to import and extremely easy to use. We'll actually only need a few Beautiful Soup functions to get ourselves all the data we need.
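To give a rough idea of what "a few Beautiful Soup functions" can look like, here is a minimal sketch. It assumes bs4 is already installed (installation is covered later in this transcript), and the HTML string is a made-up stand-in for a real page, not actual Wikipedia markup. The calls shown (find, find_all, get_text) are just one plausible selection and not necessarily the ones the course goes on to use.

```python
from bs4 import BeautifulSoup

# Hypothetical sample HTML; a real page would be much larger.
html = """
<html>
  <head><title>Genome</title></head>
  <body>
    <h1 id="firstHeading">Genome</h1>
    <p>A genome is all the genetic information of an organism.</p>
    <a href="/wiki/DNA">DNA</a>
    <a href="/wiki/RNA">RNA</a>
  </body>
</html>
"""

# "html.parser" is Python's built-in parser, so no extra install is needed.
soup = BeautifulSoup(html, "html.parser")

# Text of the <title> tag.
title = soup.title.get_text()

# First <h1> whose id attribute matches.
heading = soup.find("h1", id="firstHeading").get_text()

# href attribute of every <a> tag, in document order.
links = [a["href"] for a in soup.find_all("a")]

print(title)
print(heading)
print(links)
```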
Now what about Page Inspection? There's a very good chance you guys have actually used this tool before, but we're going to show you, and just kind of refresh, how to use the inspector tool in a web browser. What this does is allow us to view the HTML, CSS, and JavaScript code that makes up a page. And we're going to choose the Wikipedia Genome page to examine. We'll kinda go through it quickly together, and then I'll let you guys check it out some more after this.
So, let's first check out Beautiful Soup. In fact, the greatest thing is that we only need to start up a terminal, or command line if you're using Windows. And we should be able to install Beautiful Soup just by typing the line, pip install bs4. Make sure it's bs4 there. So this is assuming that you have pip installed. As long as you can use these command line tools, you should be able to do this on both Mac and Windows. I already have Beautiful Soup four, that's bs4. That's not short for something else. I already have this installed on my computer, which is why it says Requirement already satisfied, but likely this'll just take a couple of seconds and it should say that the installation was successful. If we want to make sure that we have the package available, we can actually start a new Python shell, and just try to import bs4, and if it works without any errors, then we know we're good to go here. So we have Beautiful Soup ready for us to use. That's the first half of this tutorial done.
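The import check described above can also be sketched as a small script instead of typing into the shell by hand. The helper name is_installed is hypothetical, introduced only for this illustration; it uses the standard library's importlib.util.find_spec to look for a module without raising an error when it is missing.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("bs4"):
    import bs4
    # bs4 exposes its version string as bs4.__version__.
    print("bs4 is available, version", bs4.__version__)
else:
    print("bs4 was not found; try running: pip install bs4")
```

Simply typing import bs4 in a Python shell, as in the video, tells you the same thing; the script form is just handy if you want a friendlier message than an ImportError.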
What about Page Inspection? We can go ahead and quit that terminal, and what we're gonna do is actually open up Wikipedia. I'll just go to Wikipedia here, and we're just gonna search for the genome page. Honestly, you can use whichever page you'd like, but I would recommend, at least when you follow these tutorials, you start with our genome page, and then you can go on to do some more later on ... https://www.youtube.com/watch?v=172OazeDIMU