Is it safe to run multiple instances of Puppeteer at the same time?

It’s fine to run multiple browser, contexts or even pages in parallel. The limits depend on your network/disk/memory and task setup.

I crawled a few million pages and from time to time (in my setup, every ~10,000 pages) puppeteer will crash. Therefore, you should have a way to auto-restart the browser and retry the job.

You might want to check out puppeteer-cluster, which takes care of pooling the browser instances, restarting and crash detection/restarting. (Disclaimer: I’m the author)

An example of a creation of a cluster is below:

// create a cluster that handles 10 parallel browsers
const cluster = await Cluster.launch({
    concurrency: Cluster.CONCURRENCY_BROWSER,
    maxConcurrency: 10,
});

// Queue your jobs (one example)
cluster.queue(async ({ page }) => {
    await page.goto('http://www.wikipedia.org');
    await page.screenshot({path: 'wikipedia.png'});
});

This is just a minimal example. There are many more ways to use the cluster.

Leave a Comment

Hata!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)