For this blog, I created a sample HTML file to scrape, and I will use it throughout this web scraping walkthrough. We will use the Beautiful Soup and requests packages to fetch and parse the page.
If you are on a plain Python installation, you have to install these packages yourself. If you are on Anaconda, they typically come preinstalled with the distribution.
- pip install beautifulsoup4 -> It pulls data out of HTML and XML files.
- pip install lxml (or pip install html5lib) -> These are parsers for HTML files. Different parsers can behave differently on the same markup, especially when it is malformed.
- pip install requests -> It fetches pages from the web.
Beautiful Soup supports the HTML parser included in Python’s standard library, but it also supports a number of third-party Python parsers. One of them is lxml, and that is the parser we are going to use for this tutorial.
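To make the parser choice concrete, here is a minimal sketch of parsing a small page with lxml. The markup in SAMPLE_HTML and the helper name make_soup are made up for illustration and are not the sample file used in this blog. The helper falls back to the built-in html.parser if lxml is not installed, which is an assumption added here so the sketch runs either way.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical markup for illustration only (not the blog's sample file).
SAMPLE_HTML = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <h1>Hello, scraper</h1>
    <p class="intro">First paragraph.</p>
  </body>
</html>
"""


def make_soup(markup):
    """Parse markup with lxml when available, else the standard-library parser."""
    try:
        return BeautifulSoup(markup, "lxml")
    except FeatureNotFound:
        # Raised by Beautiful Soup when the requested parser is not installed.
        return BeautifulSoup(markup, "html.parser")


soup = make_soup(SAMPLE_HTML)

# Navigate by tag name, or search by tag and CSS class.
title_text = soup.title.string
intro_text = soup.find("p", class_="intro").get_text()

print(title_text)
print(intro_text)
```

The second argument to BeautifulSoup is the parser name. Swapping "lxml" for "html5lib" or "html.parser" changes how broken markup gets repaired, so it is worth naming the parser explicitly rather than relying on whatever happens to be installed.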
How to fetch all links and scrape a page
If you want to fetch data from a website hosted on a server, first download the page with the requests package, then hand the HTML to Beautiful Soup.
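One way to collect every link on a page might look like the sketch below. The function names extract_links and fetch_links, the parser argument, and the 10-second timeout are illustrative choices, not part of either library. The default parser here is the built-in html.parser so the sketch runs without extra installs; pass "lxml" to match the parser used in this tutorial.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def extract_links(html, base_url, parser="html.parser"):
    """Return an absolute URL for every <a> tag that has an href attribute."""
    soup = BeautifulSoup(html, parser)
    links = []
    # href=True skips anchors that have no href attribute at all.
    for anchor in soup.find_all("a", href=True):
        # urljoin resolves relative links such as "/about" against the page URL.
        links.append(urljoin(base_url, anchor["href"]))
    return links


def fetch_links(url, timeout=10):
    """Download a page and return all links found in it."""
    response = requests.get(url, timeout=timeout)
    # Raise an exception for 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return extract_links(response.text, url)
```

Calling fetch_links with a page URL of your choice returns a list of absolute URLs. Keeping the parsing in extract_links, separate from the network call, means you can try it on a saved HTML file or a string before pointing it at a live server.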