How do you send a POST request in Puppeteer?

Getting the “order” right can be a bit of a challenge. Documentation doesn’t have that many examples… there are some juicy items in the repository in the example folder that you should definitely take a look at. https://github.com/GoogleChrome/puppeteer/tree/main/examples Here is the example; place the following into an async block: // Create browser instance, and give … Read more

Why is puppeteer reporting “UnhandledPromiseRejectionWarning: Error: Navigation failed because browser has disconnected!”?

The Navigation failed because browser has disconnected error usually means that the node scripts that launched Puppeteer ends without waiting for the Puppeteer actions to be completed. Hence it’s a problem with some waitings as you told. About your script, I made some changes to make it work: First of all you’re not awaiting the … Read more

How to run headless Chrome with Selenium in Python?

To run chrome-headless just add –headless via chrome_options.add_argument, e.g.: from selenium import webdriver from selenium.webdriver.chrome.options import Options chrome_options = Options() # chrome_options.add_argument(“–disable-extensions”) # chrome_options.add_argument(“–disable-gpu”) # chrome_options.add_argument(“–no-sandbox”) # linux only chrome_options.add_argument(“–headless=new”) # for Chrome >= 109 # chrome_options.add_argument(“–headless”) # chrome_options.headless = True # also works driver = webdriver.Chrome(options=chrome_options) start_url = “https://duckgo.com” driver.get(start_url) print(driver.page_source.encode(“utf-8”)) # b'<!DOCTYPE html><html … Read more

How can I download images on a page using puppeteer?

If you want to skip the manual dom traversal you can write the images to disk directly from the page response. Example: const puppeteer = require(‘puppeteer’); const fs = require(‘fs’); const path = require(‘path’); (async () => { const browser = await puppeteer.launch(); const page = await browser.newPage(); page.on(‘response’, async response => { const url … Read more

Puppeteer page.evaluate querySelectorAll return empty objects

The values returned from evaluate function should be json serializeable. https://github.com/GoogleChrome/puppeteer/issues/303#issuecomment-322919968 the solution is to extract the href values from the elements and return it. await this.page.evaluate((sel) => { let elements = Array.from(document.querySelectorAll(sel)); let links = elements.map(element => { return element.href }) return links; }, sel);

google-chrome Failed to move to new namespace

After researching extensively internet I think I found the answer: Sandboxing  For security reasons, Google Chrome is unable to provide sandboxing when it is running in the container-based environment. To use Chrome in the container-based environment, pass the –no-sandbox flag to the chrome executable So it looks like there is no better solution than –no-sandbox for me, even though its … Read more

Puppeteer: How to handle multiple tabs?

A new patch has been committed two days ago and now you can use browser.pages() to access all Pages in current browser. Works fine, tried myself yesterday 🙂 Edit: An example how to get a JSON value of a new page opened as ‘target: _blank’ link. const page = await browser.newPage(); await page.goto(url, {waitUntil: ‘load’}); … Read more

additional options in Chrome headless print-to-pdf

Add this CSS to the page your creating into a PDF to remove Chrome Headless’s implemented Header and Footer. CSS: @media print { @page { margin: 0; } body { margin: 1.6cm; } } You should format your command like below to create the PDF: “C:\PATH\TO\CHROME\EXECUTABLE\FILE”, “–headless”,”–disable-gpu”,”–print-to-pdf=” + directory path to where you want the … Read more