Speed up your automation with parallelization

Some complex automations must process many pages and actions without the total duration growing in step with the number of pages. For example, a backup automation tool must access and back up hundreds of pages reliably and efficiently.

Using multiple pages or contexts for parallelism will result in issues with Browserbase’s persistent context. Instead, we recommend leveraging multiple browser instances as a pool to process actions on numerous pages in parallel.

Enable parallelism with a pool of Browser instances

Let’s walk through the code of our example script, which processes hundreds of Wikipedia pages efficiently:

The processBrowserbaseTasks() utility

The processBrowserbaseTasks() utility opens 5 Browserbase Sessions and reuses each session’s page across tasks:

Playwright
import { Page, chromium } from "playwright-core";

export async function processBrowserbaseTasks<R>(
  tasks: ((page: Page) => Promise<R>)[],
): Promise<R[]> {
  const tasksQueue = tasks.slice();
  const resultsQueue: R[] = [];

  const createBrowserSession = async (browserWSEndpoint: string) => {
    const browser = await chromium.connectOverCDP(browserWSEndpoint);
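    // A Playwright Browser has no pages() method: pages belong to a context.
    // Use the default context and page that the connected session exposes.
    const defaultContext = browser.contexts()[0];
    const page = defaultContext.pages()[0];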

    // Pull tasks from the shared queue until it is empty.
    while (tasksQueue.length > 0) {
      const task = tasksQueue.shift();
      if (task) {
        const result = await task(page);
        resultsQueue.push(result);
      }
    }

    await page.close();
    await browser.close();
  };

  const browserWSEndpoint = `wss://connect.browserbase.com?apiKey=${process.env.BROWSERBASE_API_KEY}&enableProxy=true`;
  const sessions = Array.from({ length: 5 }, () =>
    createBrowserSession(browserWSEndpoint),
  );

  await Promise.all(sessions);

  return resultsQueue;
}
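
Because every session pushes into the same resultsQueue as soon as a task finishes, results come back in completion order, not in the order of the tasks array. If the input order matters, the same pool pattern can write each result at its task’s index. The block below is a minimal sketch under that assumption: runPool() is a hypothetical helper, not part of the example repository, and its generic workers stand in for the page of each browser session so that the sketch runs without a Browserbase connection.

```typescript
// Hypothetical helper: run every task on a fixed set of workers and
// return the results in the same order as the `tasks` array.
// In the Browserbase setup, each worker would be the Page of one session.
async function runPool<W, R>(
  workers: W[],
  tasks: ((worker: W) => Promise<R>)[],
): Promise<R[]> {
  const results = new Array<R>(tasks.length);
  let next = 0; // index of the next unclaimed task, shared by all workers

  const runWorker = async (worker: W): Promise<void> => {
    // Claiming an index is synchronous, so two workers never take the same task.
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index](worker);
    }
  };

  // One loop per worker; the pool is done when every loop has drained the queue.
  await Promise.all(workers.map(runWorker));
  return results;
}
```

The number of workers sets the concurrency, just as the length: 5 argument does in processBrowserbaseTasks().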

Creating a task to fetch each Wikipedia page

Let’s create one task function per Wikipedia URL. Each function receives a page and returns a [url, content] tuple:

Playwright
const tasks = loadUrlsFromFile("wikipedia_urls.txt").map(
  (url) => async (page: Page) => {
    console.log(`Processing ${url}...`);
    await page.goto(url);
    const content = await page.content();
    // `as const` makes TypeScript infer a [url, content] tuple instead of string[].
    return [url, content] as const;
  },
);
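
The loadUrlsFromFile() helper is not shown in this guide. As a rough sketch, assuming wikipedia_urls.txt holds one URL per line, it could look like the following. The parseUrlList() function is a hypothetical name introduced here to keep the parsing separate from the file access.

```typescript
import { readFileSync } from "node:fs";

// Hypothetical helper: split newline-separated text into trimmed, non-empty URLs.
function parseUrlList(text: string): string[] {
  return text
    .split(/\r?\n/) // accept both Unix and Windows line endings
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Assumed implementation of the helper used above: read the file synchronously
// and return its URLs, one per line.
function loadUrlsFromFile(path: string): string[] {
  return parseUrlList(readFileSync(path, "utf8"));
}
```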

Passing the tasks to processBrowserbaseTasks() and printing the results

Playwright
const result = await processBrowserbaseTasks(tasks);
// forEach, not map: we only print each result and do not build a new array.
result.forEach(([url, content]) => {
  console.log(url, content.substring(0, 200) + "...");
});
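
Note that a task that throws (for example, when page.goto() times out) rejects its session’s promise, so the Promise.all() call inside processBrowserbaseTasks() rejects and the results collected so far are lost. One possible way to keep going, sketched here with a hypothetical settle() wrapper and TaskOutcome type that are not part of the example repository, is to turn each task into one that always resolves:

```typescript
// Hypothetical result type: either the task's value or the error message.
type TaskOutcome<R> =
  | { ok: true; value: R }
  | { ok: false; error: string };

// Hypothetical wrapper: returns a task with the same argument that never rejects.
function settle<A, R>(
  task: (arg: A) => Promise<R>,
): (arg: A) => Promise<TaskOutcome<R>> {
  return async (arg) => {
    try {
      return { ok: true, value: await task(arg) };
    } catch (err) {
      // Keep only a printable message so outcomes are easy to log.
      const message = err instanceof Error ? err.message : String(err);
      return { ok: false, error: message };
    }
  };
}
```

Wrapping each task with settle() before passing the array to processBrowserbaseTasks() would then give one outcome per URL, and the printing loop can check ok before reading value.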

Run the automation

Let’s now run our Wikipedia automation. You will notice that pages are processed in groups of 5:

bash
$ BROWSERBASE_PROJECT_ID=xxxxxxxxx BROWSERBASE_API_KEY=xxxxxxxxx node dist/index.js
Processing https://en.wikipedia.org/wiki/Patrick_Flynn_(hurler)...
Processing https://en.wikipedia.org/wiki/Environmental_radioactivity...
Processing https://en.wikipedia.org/wiki/Alexi_Ogando...
Processing https://en.wikipedia.org/wiki/Costantino_Maria_Attilio_Barneschi...
Processing https://en.wikipedia.org/wiki/Breaking_bulk...
Processing https://en.wikipedia.org/wiki/New_Hampshire_Route_122...
Processing https://en.wikipedia.org/wiki/David_Hoff...
Processing https://en.wikipedia.org/wiki/Neodesha,_Oklahoma...
Processing https://en.wikipedia.org/wiki/List_of_Bethel_Threshers_head_football_coaches...
Processing https://en.wikipedia.org/wiki/Thysanodonta_boucheti...
Processing https://en.wikipedia.org/wiki/Sturm_und_Drang_(play)...
Processing https://en.wikipedia.org/wiki/Maša_Kolanović...
Processing https://en.wikipedia.org/wiki/Hermitage_of_Sant'Onofrio,_Serramonacesca...
Processing https://en.wikipedia.org/wiki/Bill_Simpson_(racing_driver)...
Processing https://en.wikipedia.org/wiki/Dundee,_Oregon...
Processing https://en.wikipedia.org/wiki/Caragh_McMurtry...
Processing https://en.wikipedia.org/wiki/Palmar_metacarpal_veins...
Processing https://en.wikipedia.org/wiki/2000_Uzbek_presidential_election...

Find the complete example on GitHub

Clone this GitHub repo to get started with a Playwright parallelism setup
