

Scraping Project

For a young US startup, we completed a challenging, large-scale scrape of online data: we collected over 60 million records from an online source and wrote them to the client's database. We built the scraper on Scrapy, the robust Python scraping framework, using many of its available extensions and writing custom extensions where needed. The job required complex requests that maintain “state” with the server through cookies and other items, all of it spread across separate proxies, or “sessions”, so that the job could finish in a timely manner. We also used SQLAlchemy, Pandas, and other Python helper libraries. In the end, the client was very happy with the results and with the overall work on the project.
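The project's code is not reproduced here, so the following is only a minimal sketch of how separate “sessions” can be expressed in Scrapy, not the implementation we delivered. It relies on two request `meta` keys that Scrapy's built-in middlewares read: `cookiejar` (a separate cookie store per session) and `proxy` (the proxy used for that request). The proxy URLs and the helper names `build_session_metas` and `follow_up_meta` are hypothetical, chosen for illustration.

```python
from itertools import cycle

# Hypothetical proxy endpoints; the real project's proxies are not given in the text.
PROXIES = [
    "http://proxy-a.example.com:8000",
    "http://proxy-b.example.com:8000",
    "http://proxy-c.example.com:8000",
]


def build_session_metas(num_sessions, proxies=PROXIES):
    """Return one Scrapy-style ``meta`` dict per logical session.

    ``cookiejar`` is the meta key Scrapy's CookiesMiddleware uses to keep
    separate cookie stores; ``proxy`` is read by HttpProxyMiddleware.
    Proxies are assigned round-robin, so they repeat when there are
    more sessions than proxies.
    """
    proxy_pool = cycle(proxies)
    return [
        {"cookiejar": session_id, "proxy": next(proxy_pool)}
        for session_id in range(num_sessions)
    ]


def follow_up_meta(response_meta):
    """Carry a session's cookie jar and proxy over to its next request.

    The ``cookiejar`` key is not sticky in Scrapy, so it has to be passed
    along explicitly on every follow-up request of the same session.
    """
    return {key: response_meta[key] for key in ("cookiejar", "proxy")}


if __name__ == "__main__":
    for meta in build_session_metas(4):
        print(meta)
```

In a spider, each dict would be passed as `meta=` when creating the first `scrapy.Request` of a session, and `follow_up_meta(response.meta)` would be passed on every subsequent request so the session keeps its cookies and its proxy.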
