Scrapy

Scrapy is a Python framework for web crawling and scraping.

Official Documentation

Table of contents
  1. Installation
  2. Start a project
  3. Create a spider
    1. Show available spider templates
    2. Generate spider
    3. Check generated spiders

Installation

pip install Scrapy
# or
poetry add Scrapy

Start a project

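cd <proj-root>
# the trailing "." puts scrapy.cfg in the current directory
# and the project package in ./<proj-name>/
scrapy startproject <proj-name> .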

Create a spider

First, change into the Scrapy project you want to add the spider to:

cd <proj-name>

Confirm that you are in the right project by running scrapy with no arguments; the first line of output names the project:

$ scrapy
Scrapy x.x.x - project: <proj-name>
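
The check above works from any folder inside the project because Scrapy looks for scrapy.cfg in the current directory and then in each parent directory. The following is a simplified sketch of that lookup using only the standard library, not Scrapy's own code; the function name find_scrapy_cfg is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_scrapy_cfg(start: Path) -> Optional[Path]:
    """Return the nearest scrapy.cfg at or above `start`, or None if absent."""
    start = start.resolve()
    # Check the starting directory first, then walk up towards the filesystem root.
    for directory in (start, *start.parents):
        candidate = directory / "scrapy.cfg"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    cfg = find_scrapy_cfg(Path.cwd())
    print(cfg if cfg else "no scrapy.cfg found; not inside a Scrapy project")
```

If this prints no path, the scrapy command will most likely not report a project either, and project-only commands such as scrapy list will not be available.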

Show available spider templates

$ scrapy genspider -l
  basic
  crawl
  csvfeed
  xmlfeed

Generate spider

scrapy genspider -t crawl <spider-name> <allowed-domain>

Check generated spiders

scrapy list
