Scrapy
A Python web scraping and crawling framework
Installation
pip install Scrapy
# or
poetry add Scrapy
Start a project
cd <proj-root>
scrapy startproject <proj-name> .
Create a spider
First, change into the Scrapy project directory:
cd <proj-name>
Confirm that you are in the right project by running scrapy with no arguments; the first line of the output names the active project:
$ scrapy
Scrapy x.x.x - project: <proj-name>
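Scrapy recognises a project by the scrapy.cfg file found in the current directory or one of its parents. The sketch below reproduces that lookup with the standard library only, which can help when the banner reports no active project. The helper names find_scrapy_cfg and project_settings_module are hypothetical, not part of Scrapy, and the sketch assumes the default scrapy.cfg layout with a [settings] section containing a default entry.

```python
import configparser
from pathlib import Path
from typing import Optional


def find_scrapy_cfg(start: Path) -> Optional[Path]:
    """Return the nearest scrapy.cfg at or above `start`, or None if there is none."""
    start = start.resolve()
    # Check the starting directory first, then walk up through its parents.
    for directory in (start, *start.parents):
        candidate = directory / "scrapy.cfg"
        if candidate.is_file():
            return candidate
    return None


def project_settings_module(cfg_path: Path) -> Optional[str]:
    """Read the `default` entry of the [settings] section, e.g. '<proj-name>.settings'."""
    parser = configparser.ConfigParser()
    parser.read(cfg_path)
    # fallback=None covers a missing section as well as a missing option.
    return parser.get("settings", "default", fallback=None)


if __name__ == "__main__":
    cfg = find_scrapy_cfg(Path.cwd())
    if cfg is None:
        print("no scrapy.cfg found here or in any parent directory")
    else:
        print(f"{cfg} -> {project_settings_module(cfg)}")
```

Run from inside the project, this should print the path of scrapy.cfg and the settings module it points to.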
Show available spider templates
$ scrapy genspider -l
basic
crawl
csvfeed
xmlfeed
Generate spider
scrapy genspider -t crawl <spider-name> <allowed-domain>
Check generated spiders
scrapy list