Playwright is a modern Node.js library for browser automation that supports Chromium, Firefox, and WebKit. Its robust feature set and simple API make it an excellent choice for scraping complex, interactive websites.
To integrate 2extract.com proxies with Playwright, pass the proxy settings to the browser when you launch it, or to an individual context via browser.newContext().
Basic Setup
Here’s how to launch a Playwright browser instance that routes all traffic through our proxy gateway.
You’ll need playwright installed in your project: npm install playwright
const { chromium } = require('playwright');
// 1. Get these from your proxy's "Connection Details" page
const proxyHost = "proxy.2extract.net";
const proxyPort = 5555;
const proxyUser = "PROXY_USERNAME";
const proxyPass = "PROXY_PASSWORD";
(async () => {
  console.log('Launching browser with proxy...');
  const browser = await chromium.launch({
    // 2. Pass proxy settings directly to the launch options
    proxy: {
      server: `http://${proxyHost}:${proxyPort}`,
      username: proxyUser,
      password: proxyPass
    },
    headless: false // Set to true for production
  });
  const context = await browser.newContext();
  const page = await context.newPage();
  console.log('Navigating to IP checker...');
  await page.goto('https://api.ipify.org?format=json');
  // 3. Get the content and verify the IP
  const content = await page.textContent('body');
  console.log('Success! Your proxy IP is:', JSON.parse(content).ip);
  await browser.close();
})();
In Playwright, proxy credentials are set at the browser or context level. All new pages within that context will automatically use the same proxy settings, which is simpler than Puppeteer’s per-page authentication.
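Because browser.newContext() also accepts a proxy option, a single browser can host several contexts that each use different credentials. The following is a minimal sketch of that idea, not a definitive implementation: toProxyOption and openProxiedContext are hypothetical helper names, and the credentials are placeholders. Some older Playwright releases required Chromium to be launched with a placeholder proxy before per-context proxies took effect, so check the behaviour of your installed version.

```javascript
// Sketch: per-context proxy settings (helper names are hypothetical).

// Build the object shape Playwright expects for its `proxy` option.
function toProxyOption({ host, port, username, password }) {
  return {
    server: `http://${host}:${port}`,
    username,
    password,
  };
}

// Open an isolated context whose pages all use the given proxy.
// `browser` is a Browser instance returned by chromium.launch().
async function openProxiedContext(browser, credentials) {
  return browser.newContext({ proxy: toProxyOption(credentials) });
}

// Example usage inside an async function (requires `npm install playwright`):
// const { chromium } = require('playwright');
// const browser = await chromium.launch();
// const context = await openProxiedContext(browser, {
//   host: 'proxy.2extract.net', port: 5555,
//   username: 'PROXY_USERNAME', password: 'PROXY_PASSWORD',
// });
```

Keeping the option builder separate from the Playwright call makes it easy to reuse the same credentials for launch-level and context-level configuration.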
Real-World Example: Checking Flight Prices on Google Flights
Flight prices are a classic example of geo-dependent data. Let’s build a script to check the price of a one-way flight from Berlin to London on a specific date, making the request appear as if it’s coming from Germany (DE).
const { chromium } = require('playwright');
// --- Your Base Credentials ---
const BASE_USERNAME = "PROXY_USERNAME";
const PASSWORD = "PROXY_PASSWORD";
const PROXY_HOST = "proxy.2extract.net";
const PROXY_PORT = 5555;
// --- Target Information ---
const DEPARTURE_AIRPORT = "BER"; // Berlin
const ARRIVAL_AIRPORT = "LHR"; // London Heathrow
const DEPARTURE_DATE = "2025-11-07"; // YYYY-MM-DD format
const TARGET_URL = `https://www.google.com/travel/flights/search?tfs=CBwQAhoeEgoyMDI1LTExLTA3agcIARIDQkVScgcIARIDTEhSQAFIAXABggELCP___________wGYAQI`; // Note: This is a complex, pre-generated URL for BER-LHR on 2025-11-07
const REGION = "de"; // Germany
(async () => {
  // Dynamically construct the username for the target region
  const proxyUsername = `${BASE_USERNAME}-country-${REGION}`;
  console.log("Launching Playwright browser...");
  const browser = await chromium.launch({
    proxy: {
      server: `http://${PROXY_HOST}:${PROXY_PORT}`,
      username: proxyUsername,
      password: PASSWORD
    }
  });
  const context = await browser.newContext({
    // Set a realistic User-Agent and viewport
    userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36',
    viewport: { width: 1920, height: 1080 },
  });
  const page = await context.newPage();
  console.log(`Navigating to Google Flights via a ${REGION.toUpperCase()} proxy...`);
  try {
    // Navigate and wait for the results to load
    await page.goto(TARGET_URL, { waitUntil: 'load', timeout: 60000 });
    // Selector to find the price of the first non-stop flight
    // This is a complex selector that may change!
    const priceSelector = 'div[role="main"] li[role="listitem"] .YMlIz > div > div:last-child';
    // Wait for the price element to be visible
    await page.waitForSelector(priceSelector, { timeout: 20000 });
    const price = await page.locator(priceSelector).first().textContent();
    console.log(`Success! The cheapest non-stop flight price found is: ${price}`);
    await page.screenshot({ path: `google_flights_${REGION}.png` });
    console.log(`Screenshot saved as 'google_flights_${REGION}.png'`);
  } catch (error) {
    console.error(`An error occurred: ${error.message}`);
    await page.screenshot({ path: 'error_flights.png' });
    console.log("Saved an error screenshot to error_flights.png for debugging.");
  } finally {
    await browser.close();
    console.log("Browser closed.");
  }
})();
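The region-targeted username built in the script above can be factored into a small helper that also rejects malformed region codes before a browser is launched. This is a sketch only: buildGeoUsername is a hypothetical name, and the two-letter check assumes that regions are ISO 3166-1 alpha-2 codes such as "de", which the example suggests but does not state.

```javascript
// Sketch: build a region-targeted proxy username (helper name is hypothetical).
// The `-country-<code>` suffix follows the pattern used in the script above.
function buildGeoUsername(baseUsername, countryCode) {
  const code = String(countryCode).trim().toLowerCase();
  // Assumption: regions are two-letter country codes such as "de".
  if (!/^[a-z]{2}$/.test(code)) {
    throw new Error(`Expected a two-letter country code, got "${countryCode}"`);
  }
  return `${baseUsername}-country-${code}`;
}

console.log(buildGeoUsername('PROXY_USERNAME', 'DE')); // prints "PROXY_USERNAME-country-de"
```

Validating the code up front turns a typo in the region into an immediate error instead of a confusing proxy authentication failure later.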
Google is a highly sophisticated target! The URL structure and CSS selectors on Google Flights are complex and change frequently. This script demonstrates the technical integration, but for reliable, large-scale scraping of Google, you will need to build very robust error handling and adapt your selectors regularly.
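One building block for that kind of error handling is a retry wrapper with exponential backoff, so that a transient timeout or a blocked request does not end the whole run. The sketch below is one possible approach rather than a prescribed one; withRetries, sleep, and the scrapePrice function mentioned in the usage comment are hypothetical names, and the default attempt count and delay are arbitrary.

```javascript
// Sketch: retry an async task with exponential backoff (names are hypothetical).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetries(task, { attempts = 3, baseDelayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      // Pass the attempt number so the task can log or adapt if it wants to.
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        // Wait 1x, 2x, 4x, ... the base delay before the next attempt.
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}

// Example usage inside an async function, where scrapePrice(page) is your own
// function that navigates and extracts the price:
// const price = await withRetries(() => scrapePrice(page), { attempts: 4 });
```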
What This Example Demonstrates
- Cross-Browser Automation: How to use Playwright’s powerful API to control a browser.
- Simplified Proxy Setup in Playwright: Shows the modern approach of setting proxy credentials at the browser-launch level.
- Scraping Dynamic Content: How to wait for specific elements (page.waitForSelector) to appear on a JavaScript-heavy page before trying to extract data.
- Advanced Locators: Uses Playwright’s locator API for more resilient element selection.
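Since selectors on sites like Google Flights change often, the waiting and locator techniques listed above can be combined into a helper that tries several candidate selectors in order. This is a hedged sketch: firstVisibleText is a hypothetical name, the selectors you pass in are your own, and it relies only on the locator(), first(), waitFor(), and textContent() calls from Playwright's locator API.

```javascript
// Sketch: try several candidate selectors in order (helper name is hypothetical).
// `page` is a Playwright Page; each selector gets its own timeout.
async function firstVisibleText(page, selectors, timeoutMs = 5000) {
  for (const selector of selectors) {
    const candidate = page.locator(selector).first();
    try {
      // Wait until the first match for this selector is visible.
      await candidate.waitFor({ state: 'visible', timeout: timeoutMs });
      return { selector, text: await candidate.textContent() };
    } catch (error) {
      // This selector did not become visible in time (or failed);
      // fall through and try the next candidate.
    }
  }
  throw new Error(`None of the selectors matched: ${selectors.join(', ')}`);
}

// Example usage inside an async function:
// const { selector, text } = await firstVisibleText(page, [
//   'div[role="main"] li[role="listitem"] .YMlIz > div > div:last-child',
//   'li[role="listitem"] [aria-label*="euros"]', // hypothetical fallback selector
// ]);
```

Returning the selector that matched alongside the text makes it easier to notice when the primary selector has stopped working and only a fallback is succeeding.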