# pyjobs

It's a crawler whose goal is to extract Python job offers from websites, mostly Brazilian ones.

## How to install

1. Check that you have libxml2-dev, libffi-dev, libssl-dev, libxslt-dev and mongodb installed; if not, install them:

   ```
   sudo apt-get install libxml2-dev libffi-dev libssl-dev libxslt-dev mongodb
   ```

2. Install the project requirements:

   ```
   pip install -r requirements.txt
   ```

Please, be kind to yourself and install it in a virtualenv! :)

## How to run it

```
scrapy crawl ceviu
scrapy crawl catho
scrapy crawl vagas
scrapy crawl empregos
```
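The spiders can also be chained from a single Python script instead of running `scrapy crawl` four times. The sketch below uses Scrapy's `CrawlerProcess`; it assumes the spider names (`ceviu`, `catho`, `vagas`, `empregos`) match those registered in this project.

```python
# Minimal sketch: run all pyjobs spiders in one process.
# Assumes it is executed from the project root so get_project_settings()
# picks up this project's scrapy.cfg and settings.
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

process = CrawlerProcess(get_project_settings())
for spider_name in ["ceviu", "catho", "vagas", "empregos"]:
    process.crawl(spider_name)  # spiders are looked up by name via the spider loader
process.start()  # blocks until all crawls finish
```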

## ROADMAP

- [x] Iterate over CEVIU search pages
- [x] Store items in a database, preferably a NoSQL database such as MongoDB (a pipeline sketch follows this list)
- [x] Implement Catho.com.br spider
- [x] Implement Empregos.com.br spider
- [x] Implement Vagas.com.br spider
- [ ] Build a web interface to search for jobs
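In Scrapy, storing items in MongoDB is typically done with an item pipeline backed by pymongo. The sketch below illustrates the idea; the collection name `jobs` and the settings keys `MONGO_URI` / `MONGO_DATABASE` are assumptions for illustration, not necessarily the names this project uses.

```python
# Minimal sketch of a MongoDB item pipeline (enable it in ITEM_PIPELINES).
import pymongo


class MongoPipeline:
    collection_name = "jobs"  # assumed collection name

    def __init__(self, mongo_uri, mongo_db):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db

    @classmethod
    def from_crawler(cls, crawler):
        # MONGO_URI / MONGO_DATABASE are hypothetical settings keys.
        return cls(
            mongo_uri=crawler.settings.get("MONGO_URI", "mongodb://localhost:27017"),
            mongo_db=crawler.settings.get("MONGO_DATABASE", "pyjobs"),
        )

    def open_spider(self, spider):
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.db = self.client[self.mongo_db]

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        # Each scraped job offer becomes one document in the collection.
        self.db[self.collection_name].insert_one(dict(item))
        return item
```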