Scrape Google Search Using Puppeteer



For my side project, I needed to scrape Google search results using a headless browser. I ended up using Puppeteer, a Node.js library that controls headless Chrome or Chromium.

Install puppeteer

npm install puppeteer

Scrape Google Search using Puppeteer

First, let’s go to the Google homepage, type a query into the search box, and press Enter.

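// These lines run inside an async function (see the full code below).
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://google.com');
// Type the query into the search box, then submit it with Enter.
await page.type('input.gLFyf.gsfi', 'itunes podcast');
await page.keyboard.press('Enter');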

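I’m using page.type to enter text into the Google search input, matched here by its classes .gLFyf and .gsfi, and then pressing Enter to trigger the search. These class names come from Google's markup and may change, so inspect the page and update the selector if it stops matching.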
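
The type-then-Enter step can be wrapped in a small helper. This is a minimal sketch rather than the code used in this post: `submitSearch` is a hypothetical name, `page` is assumed to be a Puppeteer Page like the one created above, and the default selector is the one from this article, which may no longer match Google's markup. It waits for the input to exist before typing and pairs the key press with `page.waitForNavigation()` so the caller resumes only after the results page has loaded.

```javascript
// Hypothetical helper: type a query into the search box and submit it.
// `page` is assumed to be a Puppeteer Page; `inputSelector` defaults to the
// selector used in this article and may need updating.
async function submitSearch(page, query, inputSelector = 'input.gLFyf.gsfi') {
  // Make sure the search box exists before typing into it.
  await page.waitForSelector(inputSelector);
  await page.type(inputSelector, query);

  // Start waiting for the navigation before pressing Enter, so the
  // navigation cannot finish before we begin listening for it.
  await Promise.all([
    page.waitForNavigation(),
    page.keyboard.press('Enter'),
  ]);
}
```

Inside an async function it would be called as `await submitSearch(page, 'itunes podcast');`.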

Now, let’s wait for the page to load.

await page.waitForSelector('div#resultStats');

This tells Puppeteer to wait until this element appears on the page. It is the result stats line on a Google results page, which reports how many results were found.
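
If the selector never matches, `page.waitForSelector` rejects once its timeout expires. The sketch below shows one way to turn that into a boolean. `waitForResults` is a hypothetical name, the 10 second timeout is an arbitrary choice, and the check on `err.name` assumes Puppeteer reports timeouts with an error named `TimeoutError`.

```javascript
// Hypothetical helper: resolve to true if `selector` becomes visible within
// `timeoutMs`, or false if Puppeteer gives up waiting.
async function waitForResults(page, selector = 'div#resultStats', timeoutMs = 10000) {
  try {
    // `visible: true` waits for the element to be displayed, not just present.
    await page.waitForSelector(selector, { visible: true, timeout: timeoutMs });
    return true;
  } catch (err) {
    // Assumption: timeouts surface as an error named 'TimeoutError'.
    if (err && err.name === 'TimeoutError') {
      return false;
    }
    throw err; // Anything else is unexpected, so let it propagate.
  }
}
```

A `false` result usually means either the search did not run or the selector is out of date, so it is a good place to log the page URL or take a screenshot.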

Now that the results have loaded, let’s get the first link and navigate to it.

const links = await page.$$('div.r');
await links[0].click();

Here, links is an array of element handles, one for each div.r element. If you inspect the Google search HTML, you will see that these correspond to the title links that Google shows. We then click the first one to navigate to it.
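
To collect the results instead of clicking one, `page.$$eval` can read them in a single call. This is a minimal sketch under stated assumptions: `collectResultLinks` is a hypothetical name, and the `div.r a` selector assumes each title link is an anchor inside a div.r container, which should be checked against the current page markup.

```javascript
// Hypothetical helper: collect the title and URL of each result link.
// Assumption: every result title is an <a> element inside a div.r container.
async function collectResultLinks(page, linkSelector = 'div.r a') {
  // The callback runs inside the browser page, so it must be self-contained.
  return page.$$eval(linkSelector, (anchors) =>
    anchors
      .filter((a) => Boolean(a.href))
      .map((a) => ({ title: a.textContent.trim(), url: a.href }))
  );
}
```

Inside an async function, `const results = await collectResultLinks(page);` would give an array of plain objects that can be logged or written to a file.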

That’s it! Pretty easy, right? Here’s the full code, which also saves a screenshot of the page it lands on:

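const puppeteer = require('puppeteer');

async function goToItunes() {
	const browser = await puppeteer.launch();
	try {
		const page = await browser.newPage();
		await page.goto('https://google.com');
		// Type the query into the search box and submit it.
		await page.type('input.gLFyf.gsfi', 'itunes podcast');
		await page.keyboard.press('Enter');
		// Wait for the result stats, which signal that the results have loaded.
		await page.waitForSelector('div#resultStats');
		// Click the first result and wait for the navigation it triggers.
		const links = await page.$$('div.r');
		await Promise.all([page.waitForNavigation(), links[0].click()]);
		await page.screenshot({ path: 'screenshot.png' });
	} finally {
		// Close the browser even if one of the steps above throws.
		await browser.close();
	}
}

// A try/catch around an un-awaited async call never sees its errors,
// so handle failures on the returned promise instead.
goToItunes().catch((err) => console.error(err));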
