Let’s run Jupyter notebooks in a Visual Studio Code development container, so we keep our host system clean and our development setup replicable. We’re building a scraper, so let’s add support for Puppeteer (pyppeteer) as well!
There are plenty of online services that let you take screenshots of a site and save them in a folder. While paying for such a service can be very useful, it is not hard to build one yourself. Let’s use Chrome / Chromium with Puppeteer and Node.js (cluster) to take some snapshots in no time. We’ll use the Puppeteer Cluster package to run multiple workers that grab those screens in parallel. We’ll be using TypeScript.
Have you tried turning it off and on again? The web is a weird place and calls might not always succeed. A retry with an exponential back-off mechanism helps your code be more resilient when it connects to services outside of your control. While there are many packages that can help in this area, it is pretty easy to add some utility methods to your project. In this article I’ll show how you can create a general-purpose exponential back-off and retry mechanism using TypeScript and Node.js.
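To give a feel for the idea, here is a minimal sketch of a retry helper with exponential back-off. It is not the article's code: the names `retryWithBackoff`, `backoffDelay`, `RetryOptions` and the default values are assumptions made up for this illustration.

```typescript
// Hypothetical helper names and defaults; an illustrative sketch only.
interface RetryOptions {
  retries: number;     // maximum number of retries after the first attempt
  baseDelayMs: number; // delay before the first retry
  factor: number;      // multiplier applied to the delay after each failure
  maxDelayMs: number;  // upper bound for any single delay
}

const defaultOptions: RetryOptions = {
  retries: 5,
  baseDelayMs: 100,
  factor: 2,
  maxDelayMs: 10000,
};

// Delay before retry number `attempt` (0-based): base * factor^attempt, capped at maxDelayMs.
function backoffDelay(attempt: number, options: RetryOptions): number {
  const delay = options.baseDelayMs * Math.pow(options.factor, attempt);
  return Math.min(delay, options.maxDelayMs);
}

// Resolve after `ms` milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `operation`; on failure wait with a growing delay and try again.
// Rethrows the last error once all retries are used up.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  overrides: Partial<RetryOptions> = {},
): Promise<T> {
  const options: RetryOptions = { ...defaultOptions, ...overrides };
  let lastError: unknown;
  for (let attempt = 0; attempt <= options.retries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt === options.retries) {
        break; // no retries left, do not wait again
      }
      await sleep(backoffDelay(attempt, options));
    }
  }
  throw lastError;
}
```

With these assumed defaults, the waits between attempts are 100, 200, 400, 800 and 1600 ms. Wrapping a flaky call would look like `await retryWithBackoff(() => fetchSomething(), { retries: 3 })`, where `fetchSomething` stands in for whatever async call you want to protect. A common refinement is to add a little random jitter to each delay so that many clients do not retry at the same moment.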