End-to-End Testing with Playwright: Real-World Examples
Practical Playwright testing strategies I learned the hard way — from authentication flows to CI/CD integration, with real code examples that actually work in production.
While I was looking over some flaky tests in our CI pipeline the other day, I realized how many developers are still struggling with end-to-end testing. I was once guilty of writing tests that worked perfectly on my machine but failed randomly in CI. Little did I know that most of these issues came from not understanding how Playwright actually works under the hood.
Why Playwright Dominates Modern E2E Testing
When I finally decided to migrate from Cypress to Playwright, I expected a slight improvement. What I got instead was a complete transformation of our testing strategy. Playwright's ability to handle multiple browsers simultaneously, its built-in waiting mechanisms, and its auto-retry functionality saved us countless hours of debugging.
The reality is that Playwright was built by the same team that created Puppeteer, and they applied the lessons learned from its shortcomings. They designed it specifically for modern web applications that rely heavily on JavaScript frameworks like React, Vue, and Angular.
Here's what makes Playwright fascinating: it doesn't just test your application — it actually understands the browser's rendering lifecycle. This means fewer flaky tests and more reliable results across different environments.
Setting Up Playwright for Production-Ready Testing
I cannot stress this enough: your test setup is just as important as the tests themselves. I used to throw everything into a single configuration file and wonder why tests behaved differently across team members' machines.
The first thing I learned was to separate concerns properly. Create different configuration files for local development, staging, and production environments. This simple change reduced our test failures by almost 40%.
```typescript
// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 1 : undefined,
  reporter: process.env.CI ? 'github' : 'html',
  use: {
    baseURL: process.env.BASE_URL || 'http://localhost:3000',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },
  projects: [
    {
      name: 'chromium',
      use: { ...devices['Desktop Chrome'] },
    },
    {
      name: 'firefox',
      use: { ...devices['Desktop Firefox'] },
    },
    {
      name: 'webkit',
      use: { ...devices['Desktop Safari'] },
    },
  ],
  webServer: {
    command: 'npm run dev',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
  },
});
```

Notice how I configure retries only in CI environments? This prevents developers from getting complacent about flaky tests locally while still maintaining stability in the pipeline.

Real-World Example: Testing Authentication Flows
Authentication is where most E2E testing nightmares begin. I spent weeks trying to figure out why our login tests kept failing randomly. The problem wasn't Playwright — it was my approach.
Here's the pattern I now use for all authentication testing:
```typescript
// tests/auth.spec.ts
import { test, expect } from '@playwright/test';

test.describe('Authentication Flow', () => {
  test.beforeEach(async ({ page }) => {
    await page.goto('/login');
  });

  test('should successfully log in with valid credentials', async ({ page }) => {
    await page.fill('input[name="email"]', 'test@example.com');
    await page.fill('input[name="password"]', 'SecurePass123!');
    await page.click('button[type="submit"]');

    // Wait for navigation and verify dashboard
    await page.waitForURL('/dashboard');
    await expect(page.locator('h1')).toContainText('Welcome back');

    // Verify authentication state persists
    const cookies = await page.context().cookies();
    const authToken = cookies.find(c => c.name === 'auth_token');
    expect(authToken).toBeDefined();
  });

  test('should show error with invalid credentials', async ({ page }) => {
    await page.fill('input[name="email"]', 'wrong@example.com');
    await page.fill('input[name="password"]', 'WrongPass');
    await page.click('button[type="submit"]');

    // Should stay on login page
    await expect(page).toHaveURL('/login');

    // Error message should appear
    const errorMessage = page.locator('[role="alert"]');
    await expect(errorMessage).toBeVisible();
    await expect(errorMessage).toContainText('Invalid credentials');
  });

  test('should handle session expiration gracefully', async ({ page, context }) => {
    // Set up an expired token
    await context.addCookies([{
      name: 'auth_token',
      value: 'expired_token',
      domain: 'localhost',
      path: '/',
      expires: Date.now() / 1000 - 3600, // Expired 1 hour ago
    }]);

    await page.goto('/dashboard');

    // Should redirect to login
    await page.waitForURL('/login');
    await expect(page.locator('.session-expired-notice')).toBeVisible();
  });
});
```

The key insight here is using waitForURL instead of arbitrary timeouts. Playwright knows when navigation completes, so let it do the heavy lifting.
Real-World Example: Form Validation and Error Handling
I came across this problem repeatedly: forms that worked perfectly in manual testing but failed in automated tests. The issue was race conditions between validation logic and DOM updates.
```typescript
// tests/contact-form.spec.ts
import { test, expect } from '@playwright/test';

test.describe('Contact Form Validation', () => {
  test('should validate email format in real-time', async ({ page }) => {
    await page.goto('/contact');
    const emailInput = page.locator('input[name="email"]');
    const emailError = page.locator('#email-error');

    // Type invalid email
    await emailInput.fill('notanemail');
    await emailInput.blur(); // Trigger validation
    await expect(emailError).toBeVisible();
    await expect(emailError).toContainText('Please enter a valid email');

    // Fix the email
    await emailInput.fill('valid@example.com');
    await emailInput.blur();
    await expect(emailError).not.toBeVisible();
  });

  test('should prevent submission with empty required fields', async ({ page }) => {
    await page.goto('/contact');
    const submitButton = page.locator('button[type="submit"]');
    await submitButton.click();

    // Check that form wasn't submitted
    await expect(page).toHaveURL('/contact');

    // Verify all required field errors are shown
    const requiredFields = ['name', 'email', 'message'];
    for (const field of requiredFields) {
      const error = page.locator(`#${field}-error`);
      await expect(error).toBeVisible();
    }
  });

  test('should successfully submit valid form data', async ({ page }) => {
    await page.goto('/contact');

    // Fill all fields
    await page.fill('input[name="name"]', 'John Doe');
    await page.fill('input[name="email"]', 'john@example.com');
    await page.fill('textarea[name="message"]', 'This is a test message');

    // Listen for the API call
    const responsePromise = page.waitForResponse(
      response => response.url().includes('/api/contact') && response.status() === 200
    );
    await page.click('button[type="submit"]');
    await responsePromise;

    // Verify success message
    const successMessage = page.locator('.success-notification');
    await expect(successMessage).toBeVisible();
    await expect(successMessage).toContainText('Message sent successfully');
  });
});
```

Notice how I use waitForResponse to verify the actual API call? This catches issues that purely DOM-based testing would miss.

API Mocking vs Real Backend Testing
Here's where I made my biggest mistake: trying to mock everything. I thought mocking would make tests faster and more reliable. Instead, it created a false sense of security.
Luckily we can use Playwright's route interception to strike a balance. Mock external services but test against your real backend for critical flows. In other words, mock what you don't control, test what you do.
For critical user journeys like checkout or payment processing, always test against the real backend (or staging environment). The network calls add maybe 2-3 seconds but catch integration issues that mocks would hide.
For non-critical flows like loading user preferences or analytics events, mocking is perfectly acceptable. Just make sure your mocks reflect reality.
Advanced Patterns: File Uploads, Downloads, and Multi-Tab Scenarios
The moment I had to test file uploads, I realized most tutorials stop at the basics. Real applications need to handle file validation, progress indicators, and error states.
For file downloads, use Playwright's download event listener instead of checking the file system directly. For multi-tab scenarios, use page.context().newPage() to spawn new pages within the same browser context — this preserves cookies and storage.
One pattern that saved me countless hours: use fixtures for common setup like authenticated users or pre-populated shopping carts. Playwright's fixture system is wonderful because it handles cleanup automatically.
CI/CD Integration and Debugging Flaky Tests
When I finally decided to integrate Playwright into our GitHub Actions workflow, I discovered that CI environments behave very differently from local development. Headless mode exposes timing issues that headed mode masks.
The trace viewer is fascinating — it records every action, screenshot, and network call. When a test fails in CI, the trace file shows you exactly what happened. Enable it with trace: 'on-first-retry' in your config.
For flaky tests, I learned to look at three things first: network timing, element visibility, and race conditions. Playwright's auto-waiting usually handles these, but complex JavaScript applications can still trip it up.
Production Lessons: What Works and What Doesn't
After running thousands of Playwright tests in production, here's what I learned the hard way:
Don't test implementation details — test user behavior. If you're reaching into component internals or relying on CSS classes that change frequently, your tests will break constantly.
Do use data attributes specifically for testing. Add data-testid attributes to critical elements. Yes, it adds markup, but it makes tests resilient to styling changes.
Don't try to achieve 100% coverage with E2E tests. They're expensive to maintain. Focus on critical user journeys and use unit tests for edge cases.
Do run tests in parallel, but be careful with shared state. We once had tests that worked individually but failed when run together because they shared database records.
The ROI on learning Playwright properly is massive. We went from spending hours debugging failed CI runs to having confidence that our deployments actually work.
That wraps up this post! I hope you found it valuable, and look out for more in the future!