
log( "CHILD: url received from parent process", url) Ĭonst browser = await puppeteer. The code snippet below is a simple example of running parallel downloads with Puppeteer.Ĭonst downloadPath = path.

💡 If you are not familiar with how child process work in Node I highly encourage you to give this article a read. We can combine the child process module with our Puppeteer script and download files in parallel. Child process is how Node.js handles parallel programming. We can fork multiple child_proces in Node. Our CPU cores can run multiple processes at the same time. Further reading: how to submit forms with Puppeteer. Once you have a solid understanding of Puppeteer’s API and how it fits together in the Node.js ecosystem you can come up with custom solutions best suited for you. 💡 Learn more about the single threaded architecture of node here There are many ways you can download files with Puppeteer.

Therefore if we have to download 10 files each 1 gigabyte in size and each requiring about 3 mins to download then with a single process we will have to wait for 10 x 3 = 30 minutes for the task to finish. It can only execute one process at a time. You see Node.js in its core is a single-threaded system. However, if you have to download multiple large files things start to get complicated. In this next part, we will dive deep into some of the advanced concepts. Chrome defaults to downloading files in various places, depending on the operating system. puppeteerrc.cjs (or browser = await puppeteer. Puppeteer uses several defaults that can be customized through configurationįor example, to change the default cache directory Puppeteer uses to installīrowsers, you can add a. Include $HOME/.cache into the project's deployment.įor a version of Puppeteer without the browser installation, see In my case, I’ll name it Puppeteer -Tutorial. To that end, create a new folder and name it whatever you like. The next thing to do is to initialize a new node.js project. Learn how to download Node.js installer and NPM here. Your project folder (see an example below) because not all hosting providers Alright, first thing, make sure you have Node and NPM installed Puppeteer relies heavily on those. Heroku, you might need to reconfigure the location of the cache to be within


If you deploy a project using Puppeteer to a hosting provider, such as Render or The browser is downloaded to the $HOME/.cache/puppeteer folderīy default (starting with Puppeteer v19.0.0). When you install Puppeteer, it automatically downloads a recent version ofĬhrome for Testing (~170MB macOS, ~282MB Linux, ~280MB Windows) that is guaranteed to
