Posts

Featured

Web scraping with Python

This post is for anyone interested in common design patterns, tricks, and rules for web scraping. We will use Python, which is a strong tool for writing scrapers if you know how to use it and its libraries correctly: requests, bs4, json, lxml, and re. We work with selectors to get the elements we want. To do this, we first import the requests library and make a request. Pay special attention to headers: the server inspects them and may return a different result depending on what they contain. I highly recommend reading up on the standard headers and their values.

import requests

headers = {
    'authority': 'www.walmart.com',
    'cache-control': 'max-age=0',
    'upgrade-insecure-requests': '1',
    'user-agent': '
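To make the point about headers concrete, here is a minimal sketch of attaching headers to a request with the requests library. The URL, the header values, and the helper names `build_request` and `fetch` are assumptions made up for illustration; they are not the values from the post. The sketch prepares the request without sending it, so you can inspect exactly which headers would go over the wire.

```python
import requests

# Hypothetical values for illustration only; not the headers from the post.
EXAMPLE_URL = "https://example.com/"
EXAMPLE_HEADERS = {
    "User-Agent": "example-scraper/0.1",
    "Accept": "text/html",
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url, headers):
    """Prepare a GET request without sending it, so its headers can be inspected."""
    session = requests.Session()
    request = requests.Request("GET", url, headers=headers)
    # prepare_request merges the session's default headers with ours;
    # the headers we pass take precedence over the defaults.
    return session.prepare_request(request)


def fetch(url, headers, timeout=10.0):
    """Send the request and return the response body, raising on HTTP errors."""
    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()
    return response.text


prepared = build_request(EXAMPLE_URL, EXAMPLE_HEADERS)
# Header lookup on a prepared request is case-insensitive.
print(prepared.headers["user-agent"])
```

Calling `fetch(EXAMPLE_URL, EXAMPLE_HEADERS)` would perform the actual request. Inspecting the prepared request first is a cheap way to check that a custom User-Agent really replaces the library default before you hit a real site.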

Latest posts

Machine learning