Automate Website Screenshots with Node.js
Whether you are building a link preview service, monitoring dashboards, or generating PDF reports, automating website screenshots with Node.js is a common requirement. This guide walks through three approaches: Puppeteer (self-hosted), Playwright, and a hosted screenshot API, with working code for each.
Approach 1: Puppeteer (Self-Hosted)
Puppeteer is Google's official Node.js library for controlling headless Chrome. It gives you full control over the browser but requires you to manage the infrastructure yourself.
Installation
```bash
npm install puppeteer
# Downloads Chromium (~170MB) during install
```
Basic Screenshot
```javascript
const puppeteer = require('puppeteer');

async function takeScreenshot(url, outputPath = 'screenshot.png') {
  const browser = await puppeteer.launch({
    headless: 'new',
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  });
  const page = await browser.newPage();
  await page.setViewport({ width: 1280, height: 800 });
  await page.goto(url, { waitUntil: 'networkidle2' });
  await page.screenshot({
    path: outputPath,
    fullPage: false,
    type: 'png'
  });
  await browser.close();
  console.log(`Screenshot saved to ${outputPath}`);
}

takeScreenshot('https://example.com');
```
Advanced: Browser Pool for Concurrency
For production use, you do not want to launch a new browser for every screenshot. Instead, use a browser pool that keeps browsers warm and reuses them:
```javascript
const puppeteer = require('puppeteer');

class BrowserPool {
  constructor(size = 3) {
    this.browsers = [];
    this.currentIndex = 0;
    this.size = size;
  }

  async init() {
    for (let i = 0; i < this.size; i++) {
      const browser = await puppeteer.launch({
        headless: 'new',
        args: ['--no-sandbox', '--disable-dev-shm-usage']
      });
      this.browsers.push(browser);
    }
  }

  // Round-robin across the warm browsers
  getBrowser() {
    const browser = this.browsers[this.currentIndex];
    this.currentIndex = (this.currentIndex + 1) % this.browsers.length;
    return browser;
  }

  async screenshot(url, options = {}) {
    const browser = this.getBrowser();
    const page = await browser.newPage();
    try {
      await page.setViewport({
        width: options.width || 1280,
        height: options.height || 800
      });
      await page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 });
      // Wait for dynamic content
      if (options.waitFor) {
        await new Promise(r => setTimeout(r, options.waitFor));
      }
      return await page.screenshot({
        type: options.format || 'png',
        fullPage: options.fullPage || false,
        quality: options.format === 'jpeg' ? (options.quality || 80) : undefined,
      });
    } finally {
      await page.close();
    }
  }

  async close() {
    await Promise.all(this.browsers.map(b => b.close()));
  }
}

// Usage (wrapped in an async IIFE, since CommonJS has no top-level await)
(async () => {
  const pool = new BrowserPool(3);
  await pool.init();
  const screenshot = await pool.screenshot('https://example.com', {
    width: 1920, height: 1080, format: 'jpeg', quality: 90
  });
  await pool.close();
})();
```
Heads up: Self-hosted Puppeteer requires Chrome/Chromium installed on the server. On Linux, you need system libraries like libgbm, libnss3, and libatk-bridge. Docker images like ghcr.io/puppeteer/puppeteer bundle everything, but each container uses 500MB+ RAM.
Approach 2: Playwright
Playwright is Microsoft's browser automation library. It supports Chrome, Firefox, and WebKit, and has a similar API to Puppeteer with some ergonomic improvements.
```javascript
const { chromium } = require('playwright');

async function takeScreenshot(url, outputPath = 'screenshot.png') {
  const browser = await chromium.launch();
  const context = await browser.newContext({
    viewport: { width: 1280, height: 800 }
  });
  const page = await context.newPage();
  await page.goto(url, { waitUntil: 'networkidle' });
  await page.screenshot({ path: outputPath, fullPage: false });
  await browser.close();
}

takeScreenshot('https://example.com');
```
Playwright's key advantage is multi-browser support and better auto-waiting. However, it still requires you to manage the browser infrastructure, handle crashes, and deal with scaling.
Approach 3: Screenshot API (Recommended)
The simplest approach for production use: delegate browser management to an API service. You make an HTTP request and get back the screenshot. No browser to install, no crashes to handle, no Docker to configure.
```javascript
const fs = require('fs');

const API_KEY = process.env.SCREENSHOT_API_KEY;
const BASE_URL = 'https://screenshotapi-api-production.up.railway.app';

async function captureScreenshot(url, options = {}) {
  const params = new URLSearchParams({
    url,
    format: options.format || 'png',
    width: String(options.width || 1280),
    height: String(options.height || 800),
    fullpage: String(options.fullPage || false),
    wait: String(options.wait || 1000),
  });
  // Forward the JPEG quality option when provided
  if (options.quality) params.set('quality', String(options.quality));

  const response = await fetch(`${BASE_URL}/v1/screenshot?${params}`, {
    headers: { 'Authorization': `Bearer ${API_KEY}` }
  });
  if (!response.ok) {
    const error = await response.json();
    throw new Error(`Screenshot failed: ${error.error}`);
  }
  return Buffer.from(await response.arrayBuffer());
}

(async () => {
  // Single screenshot
  const screenshot = await captureScreenshot('https://github.com');
  fs.writeFileSync('github.png', screenshot);

  // Batch screenshots
  const urls = [
    'https://github.com',
    'https://vercel.com',
    'https://stripe.com',
  ];
  const results = await Promise.all(
    urls.map(url => captureScreenshot(url, { format: 'jpeg', quality: 85 }))
  );
  results.forEach((buffer, i) => {
    fs.writeFileSync(`screenshot-${i}.jpeg`, buffer);
  });
  console.log(`Captured ${results.length} screenshots`);
})();
```
Comparison: Puppeteer vs. Playwright vs. API
| Factor | Puppeteer | Playwright | Screenshot API |
|---|---|---|---|
| Setup | npm install + Chrome | npm install + browsers | API key only |
| Infrastructure | You manage | You manage | Managed |
| Scaling | Docker/K8s | Docker/K8s | Automatic |
| Memory per instance | ~500MB | ~500MB | 0 (API call) |
| Multi-browser | Chrome only | Chrome, Firefox, WebKit | Chrome |
| Login/cookies | Full control | Full control | Not supported |
| Cost (100 screenshots/mo) | $5-20/mo server | $5-20/mo server | Free |
| Cost (10K screenshots/mo) | $50-100/mo | $50-100/mo | $29/mo |
Building a Screenshot Automation Pipeline
Here is a practical example: a Node.js script that takes screenshots of a list of URLs on a schedule and saves them with timestamps. This is useful for visual monitoring.
```javascript
const fs = require('fs');
const path = require('path');

const API_KEY = process.env.SCREENSHOT_API_KEY;
const BASE_URL = 'https://screenshotapi-api-production.up.railway.app';

const URLS_TO_MONITOR = [
  { name: 'homepage', url: 'https://yoursite.com' },
  { name: 'pricing', url: 'https://yoursite.com/pricing' },
  { name: 'signup', url: 'https://yoursite.com/signup' },
];

async function captureAndSave(name, url) {
  const params = new URLSearchParams({
    url, format: 'png', width: '1920', height: '1080'
  });
  const response = await fetch(`${BASE_URL}/v1/screenshot?${params}`, {
    headers: { 'Authorization': `Bearer ${API_KEY}` }
  });
  if (!response.ok) throw new Error(`Failed: ${url}`);

  const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
  const filename = path.join('screenshots', `${name}-${timestamp}.png`);
  fs.mkdirSync('screenshots', { recursive: true });
  fs.writeFileSync(filename, Buffer.from(await response.arrayBuffer()));
  return filename;
}

async function runMonitoring() {
  console.log(`Monitoring run at ${new Date().toISOString()}`);
  for (const { name, url } of URLS_TO_MONITOR) {
    try {
      const file = await captureAndSave(name, url);
      console.log(`  Captured ${name} -> ${file}`);
    } catch (err) {
      console.error(`  FAILED ${name}: ${err.message}`);
    }
  }
}

// Run immediately
runMonitoring();

// Run every hour (use a proper scheduler like node-cron in production)
// setInterval(runMonitoring, 60 * 60 * 1000);
```
Tips for Production Use
- Use environment variables for API keys. Never hardcode credentials in your source code.
- Add retry logic. Network requests can fail. Wrap your screenshot calls with a retry mechanism (2-3 attempts with exponential backoff).
- Process screenshots in parallel. Use `Promise.all` or a concurrency limiter like `p-limit` to capture multiple screenshots efficiently.
- Monitor your usage. Use the `/v1/usage` endpoint to track your monthly consumption and set up alerts before hitting limits.
- Choose the right format. Use WebP for the smallest file size, PNG for transparency, and JPEG for photos.
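The retry and parallelism tips above can be sketched as two small helpers. Nothing here comes from a real package: `withRetry` and `makeLimiter` are illustrative names, and `makeLimiter` only mimics the idea behind `p-limit`.

```javascript
// Retry an async function with exponential backoff: 500ms, 1000ms, 2000ms, ...
async function withRetry(fn, attempts = 3, baseDelayMs = 500) {
  let lastErr;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (i < attempts - 1) {
        await new Promise(r => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastErr;
}

// Minimal concurrency limiter in the spirit of p-limit (not the real package).
// Returns a function that queues tasks and runs at most `concurrency` at once.
function makeLimiter(concurrency) {
  let active = 0;
  const queue = [];
  const next = () => {
    if (active >= concurrency || queue.length === 0) return;
    active++;
    const { task, resolve, reject } = queue.shift();
    task().then(resolve, reject).finally(() => { active--; next(); });
  };
  return task => new Promise((resolve, reject) => {
    queue.push({ task, resolve, reject });
    next();
  });
}
```

With the `captureScreenshot` function from earlier, the two compose naturally: `const limit = makeLimiter(3);` then `await Promise.all(urls.map(u => limit(() => withRetry(() => captureScreenshot(u)))))`.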
Frequently Asked Questions
Can I run Puppeteer on serverless (AWS Lambda)?
Yes, but it requires a special build of Chromium (such as @sparticuz/chromium, the maintained successor to chrome-aws-lambda). Cold starts take 3-5 seconds, and you are limited by the Lambda memory and timeout settings. For most use cases, an API is simpler.
How do I handle websites that block headless browsers?
ScreenshotAPI sets a realistic user agent and browser fingerprint. Most anti-bot measures do not block it. For heavily protected sites, you may need to add a wait time or use a proxy.
What about memory leaks with Puppeteer?
Always close pages in a try/finally block. Browser instances can leak memory over time; restart them periodically. With an API, this is not your problem.
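The periodic-restart advice can be sketched as a small wrapper that tears down and relaunches a resource after a fixed number of uses. `RecyclingResource`, `create`, and `destroy` are illustrative names, not a library API; with Puppeteer the factories would be `() => puppeteer.launch()` and `b => b.close()`.

```javascript
// Recycle a long-lived resource after maxUses acquisitions to bound leak growth.
// `create` and `destroy` are async factories supplied by the caller.
class RecyclingResource {
  constructor(create, destroy, maxUses = 50) {
    this.create = create;
    this.destroy = destroy;
    this.maxUses = maxUses;
    this.uses = 0;
    this.instance = null;
  }

  async acquire() {
    if (this.instance && this.uses >= this.maxUses) {
      // Restart: tear down the old instance before handing out a fresh one
      await this.destroy(this.instance);
      this.instance = null;
      this.uses = 0;
    }
    if (!this.instance) {
      this.instance = await this.create();
    }
    this.uses++;
    return this.instance;
  }

  async close() {
    if (this.instance) await this.destroy(this.instance);
    this.instance = null;
  }
}
```

For example, `new RecyclingResource(() => puppeteer.launch(), b => b.close(), 100)` would relaunch Chrome every 100 screenshots, resetting whatever memory the old process had accumulated.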
Related Articles
How to Capture Website Screenshots with an API
Complete guide with code examples in Node.js, Python, cURL, and PHP.
Skip the infrastructure
Try ScreenshotAPI's interactive playground -- capture a screenshot in seconds, no setup needed.