ESI

About the project

The purpose of the project is to scrape news relevant to technology companies. We currently have three scrapers, each scraping news relevant to a specific company from different websites. You can search the news by special tags to find the most relevant items. A basic service was created to manage the scrapers. It is similar to ScrapingHub, but with fewer features: you can set the list of companies to scrape, run or stop any scraper, and check the logs. Scrapers can also be scheduled to run periodically.
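To make the management features concrete, here is a minimal sketch of how such a service might track scrapers, their company lists, logs, and periodic runs. It is not the project's actual code: the names ManagedScraper and ScraperManager, the tick method, and the interval-in-seconds scheduling are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ManagedScraper:
    """One scraper plus the state an admin panel could show (hypothetical)."""
    name: str
    scrape: Callable[[str], List[str]]    # company name -> list of news items
    companies: List[str] = field(default_factory=list)
    interval_s: Optional[float] = None    # None means manual runs only
    enabled: bool = True
    last_run: Optional[float] = None
    logs: List[str] = field(default_factory=list)

    def is_due(self, now: float) -> bool:
        """True if the scraper is enabled and its interval has elapsed."""
        if not self.enabled or self.interval_s is None:
            return False
        return self.last_run is None or now - self.last_run >= self.interval_s

    def run(self, now: float) -> List[str]:
        """Scrape every configured company and log one line per company."""
        results: List[str] = []
        for company in self.companies:
            try:
                items = self.scrape(company)
                results.extend(items)
                self.logs.append(f"{company}: {len(items)} items")
            except Exception as exc:  # keep going if one company fails
                self.logs.append(f"{company}: failed ({exc})")
        self.last_run = now
        return results


class ScraperManager:
    """Registry that runs, stops, and periodically triggers scrapers."""

    def __init__(self) -> None:
        self.scrapers: Dict[str, ManagedScraper] = {}

    def register(self, scraper: ManagedScraper) -> None:
        self.scrapers[scraper.name] = scraper

    def stop(self, name: str) -> None:
        self.scrapers[name].enabled = False

    def tick(self, now: float) -> List[str]:
        """Run every scraper that is due; return the names that ran."""
        ran: List[str] = []
        for scraper in self.scrapers.values():
            if scraper.is_due(now):
                scraper.run(now)
                ran.append(scraper.name)
        return ran
```

In this sketch a caller would invoke tick with the current time (for example time.time()) from a loop or a cron-style job. Passing the time in explicitly, rather than reading the clock inside the class, keeps the scheduling logic easy to test.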

Duration: 6 months
Client: Thibaut Mallet de Chauny
Category: big data & analytics
Type: Web Application

Technologies

Python, Flask, Scrapy, Pandas

We wrote the scraping scripts in pure Python and with the Scrapy framework. A simple admin panel for managing the gathered data was built with Flask, and Pandas was used to clean the dataset.
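As an illustration of what a pure-Python scraping script can look like, the sketch below extracts headlines from a page using only the standard library. The page layout (each headline is a link inside an h2 element) and the names HeadlineParser and extract_headlines are assumptions for this example; the markup of the sites actually scraped is not described here.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class HeadlineParser(HTMLParser):
    """Collect (title, url) pairs from <a> tags nested inside <h2> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.items: List[Tuple[str, str]] = []
        self._in_h2 = False
        self._href: Optional[str] = None   # set while inside a headline link
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
        elif tag == "a" and self._in_h2:
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace left over from the page layout.
            title = " ".join("".join(self._text).split())
            if title:
                self.items.append((title, self._href))
            self._href = None
        elif tag == "h2":
            self._in_h2 = False


def extract_headlines(html: str) -> List[Tuple[str, str]]:
    """Parse an HTML string and return the headline links it contains."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.items
```

The returned list of (title, url) tuples is the kind of raw data that would then be handed to a cleaning step, such as the Pandas-based one mentioned above, for de-duplication and normalization.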