SKILL.md
Common Commands/API
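Use Playwright's Node.js API. Install it with npm install playwright and load it with const playwright = require('playwright') before running the snippets below. Each snippet uses await, so it must run inside an async function. Key methods include: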
- Launch browser:
const browser = await playwright.chromium.launch({ headless: true });
- Navigate page:
const page = await browser.newPage(); await page.goto('https://example.com');
- Handle auth:
await page.fill('#username', process.env.PLAYWRIGHT_USERNAME); await page.fill('#password', process.env.PLAYWRIGHT_PASSWORD); await page.click('#login');
- Extract data:
const data = await page.evaluate(() => document.querySelector('#target').innerText); console.log(data);
- Pagination:
while (await page.$('#next-button')) { await page.click('#next-button'); await page.waitForSelector('.item'); }
- Take screenshot:
await page.screenshot({ path: 'screenshot.png' });
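The individual calls above can be combined into one function. The following is a minimal sketch, not a definitive implementation: the name captureText and its parameters are illustrative assumptions, and the browser type (for example playwright.chromium) is passed in as an argument so the function itself has no imports. It adds a try/finally so the browser is closed even when a step fails.

```javascript
// Sketch: combine launch, navigation, extraction, and screenshot in one function.
// `browserType` is expected to be a Playwright browser type such as `chromium`;
// the function name and parameters are illustrative, not part of Playwright.
async function captureText(browserType, url, selector, screenshotPath) {
  const browser = await browserType.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // Wait for the element so the read below does not hit a null node.
    await page.waitForSelector(selector);
    const text = await page.evaluate(
      (sel) => document.querySelector(sel).innerText,
      selector
    );
    if (screenshotPath) {
      await page.screenshot({ path: screenshotPath });
    }
    return text;
  } finally {
    // Always release the browser, even when a step above throws.
    await browser.close();
  }
}
```

A call might look like captureText(require('playwright').chromium, 'https://example.com', '#target', 'screenshot.png').then(console.log).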
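CLI flags: npx playwright test runs test files written for the Playwright Test runner (the @playwright/test package), with flags like --headed for visible mode or --timeout 30000 to set a 30-second per-test timeout. Run a plain library script such as the snippets above with node instead, e.g., node scraper.js.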
Integration Notes
Integrate by importing Playwright in a Node.js project. For auth, read credentials from environment variables such as $PLAYWRIGHT_USERNAME and $PLAYWRIGHT_PASSWORD instead of hardcoding them. Configuration format: use a JSON file for settings, e.g., { "url": "https://target.com", "selector": "#data-element" }, and pass it via script args: node scraper.js --config config.json. For larger systems, chain with tools like Puppeteer (if migrating), or write the values returned by page.evaluate to a database. Ensure compatibility with Node.js 14+; recent Playwright releases require a newer Node.js, so check the requirement for the version you install. Configure a proxy at launch with playwright.chromium.launch({ proxy: { server: 'http://myproxy.com:8080' } }).
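The --config convention described above could be handled as follows. This is a sketch under stated assumptions: loadConfig, buildLaunchOptions, and the optional "proxy" config key are hypothetical names introduced for illustration. Only the url and selector keys come from the example JSON, and proxy.server is the launch option shown above.

```javascript
const fs = require('fs');

// Sketch: read the JSON settings file named after a `--config` flag.
// `loadConfig` and its validation rules are illustrative assumptions.
function loadConfig(argv) {
  const flagIndex = argv.indexOf('--config');
  if (flagIndex === -1 || flagIndex + 1 >= argv.length) {
    throw new Error('Usage: node scraper.js --config <path-to-json>');
  }
  const config = JSON.parse(fs.readFileSync(argv[flagIndex + 1], 'utf8'));
  for (const key of ['url', 'selector']) {
    if (typeof config[key] !== 'string' || config[key] === '') {
      throw new Error(`Config is missing a "${key}" string`);
    }
  }
  return config;
}

// Build launch options, adding a proxy only when the config names one.
// The `proxy` config key is hypothetical; `proxy.server` is the launch option.
function buildLaunchOptions(config) {
  const options = { headless: true };
  if (config.proxy) {
    options.proxy = { server: config.proxy };
  }
  return options;
}
```

In a script, loadConfig(process.argv) would return the parsed settings, and the result of buildLaunchOptions(config) could be passed to playwright.chromium.launch.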
Error Handling
Anticipate common errors such as timeouts on dynamically loaded content and selectors that match nothing. Log each error with its details, and retry a failed step up to 3 times in a loop.
- Wait for elements with an explicit timeout:
await page.waitForSelector('#element', { timeout: 10000 }).catch(err => console.error('Element not found:', err));
- Wrap navigation in try-catch to handle network issues:
try { await page.goto(url, { waitUntil: 'networkidle' }); } catch (e) { console.error('Navigation failed:', e.message); await browser.close(); }
- Detect authentication failures by checking for an error element:
if (await page.$('#error-message')) { throw new Error('Login failed'); }
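The "retry up to 3 times using a loop" advice can be sketched as a small helper. The name withRetries, its parameters, and the fixed delay between attempts are assumptions made for illustration; the helper is plain JavaScript and does not depend on any Playwright API.

```javascript
// Sketch: retry an async step up to `attempts` times, logging each failure.
// `step` receives the 1-based attempt number; the last error is rethrown.
async function withRetries(step, attempts = 3, delayMs = 1000) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await step(attempt);
    } catch (err) {
      lastError = err;
      console.error(`Attempt ${attempt} of ${attempts} failed: ${err.message}`);
      if (attempt < attempts) {
        // Pause briefly before the next attempt.
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

For example, await withRetries(() => page.goto(url, { waitUntil: 'networkidle' })) would retry a failed navigation twice before giving up.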
Concrete Usage Examples
- Scraping a logged-in dashboard: First, set env vars:
export PLAYWRIGHT_USERNAME='user@example.com' and export PLAYWRIGHT_PASSWORD='securepass'. Then, run:
const playwright = require('playwright'); (async () => { const browser = await playwright.chromium.launch(); const page = await browser.newPage(); await page.goto('https://dashboard.com/login'); await page.fill('#username', process.env.PLAYWRIGHT_USERNAME); await page.fill('#password', process.env.PLAYWRIGHT_PASSWORD); await page.click('#submit'); await page.waitForSelector('#dashboard-data'); const data = await page.evaluate(() => document.querySelector('#dashboard-data').innerText); console.log(data); await browser.close(); })();
This extracts data from a protected page. The waitForSelector call keeps the script from reading #dashboard-data before the post-login page has rendered.
- Handling pagination on a search site: Script:
const playwright = require('playwright'); (async () => { const browser = await playwright.chromium.launch(); const page = await browser.newPage(); await page.goto('https://search.com?q=query'); let items = []; while (true) { items.push(...await page.$$eval('.result-item', elements => elements.map(el => el.innerText))); const nextButton = await page.$('#next-page'); if (!nextButton) break; await nextButton.click(); await page.waitForTimeout(2000); } console.log(items); await browser.close(); })();
This collects results across multiple pages.
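A loop that runs until the next button disappears never ends on sites that keep the button on the last page. One possible variant, sketched below, caps the number of pages. The function name collectAcrossPages and the default cap of 50 are illustrative assumptions, and the sketch swaps the fixed two-second wait for page.waitForLoadState('networkidle'), which may not suit pages that poll the network continuously.

```javascript
// Sketch: collect item text across pages, stopping when there is no next
// button or when `maxPages` is reached. Names and defaults are illustrative.
async function collectAcrossPages(page, itemSelector, nextSelector, maxPages = 50) {
  const items = [];
  for (let pageNumber = 1; pageNumber <= maxPages; pageNumber++) {
    await page.waitForSelector(itemSelector);
    const texts = await page.$$eval(itemSelector, (els) =>
      els.map((el) => el.innerText)
    );
    items.push(...texts);

    const nextButton = await page.$(nextSelector);
    if (!nextButton) break; // Last page: nothing left to click.

    await nextButton.click();
    // Swapped in for the fixed waitForTimeout(2000) used in the example above.
    await page.waitForLoadState('networkidle');
  }
  return items;
}
```

With a page already open on the first results page, const items = await collectAcrossPages(page, '.result-item', '#next-page') would gather the same data as the script above.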
Graph Relationships
- Related to: "selenium-automation" (alternative browser automation tool)
- Depends on: "node-runtime" (for Playwright execution)
- Complements: "data-extraction" (for post-processing scraped data)
- In cluster: "community" (shared with other open-source tools)