A Pragmatist's Guide to Testing: Building Confidence, Not Just Coverage
As engineers, we spend a lot of time talking about tests. We debate frameworks, measure coverage, and enforce check-in policies. But in the rush to “test everything,” we often lose sight of the most important question: Why are we writing tests in the first place?
The answer is simple and singular: to provide confidence.
We write tests to be confident that the application works. We write them to be confident that a refactoring won’t silently break existing behavior. And we write them to be confident that a new feature or bug fix doesn’t inadvertently introduce a regression somewhere else. That’s it. Everything else is details.
A few years ago, Guillermo Rauch of Vercel distilled this entire philosophy into a memorable line: “Write tests. Not too many. Mostly integration.” It’s a principle that has resonated deeply because it cuts through the noise and focuses on what truly matters. It’s not a hard-and-fast rule, but a guiding principle that helps us make smart trade-offs. Let’s break down how to apply it.
The Testing Spectrum: A Cost-Benefit Analysis
The first mistake teams make is treating all tests as if they provide the same value. They don’t. They exist on a spectrum, and understanding the trade-offs is the first step to building an effective strategy. Think of it as a confidence-to-cost ratio.
- Static Tests (type checkers, linters): ✅ Fast, catches typos and type errors. ❌ Zero confidence in your business logic.
- Unit Tests: ✅ Verifies parts in isolation. ❌ Zero confidence they work together.
- Integration Tests: ✅ Verifies harmony between parts. ❌ Can’t guarantee the full user journey is perfect.
- End-to-End (E2E) Tests: ✅ The closest you get to full confidence. 🐢 Expensive, slow, and brittle.
Now, you may notice that different teams and frameworks use different names for these. A Component Test (testing a React component with its real children) is essentially a type of unit test. An App Test (testing a user flow across multiple components with mocked APIs) is a classic integration test. On the backend, Endpoint Tests are unit tests, and Workflow Tests (testing multiple endpoints together) are integration tests. The labels don’t matter as much as understanding where they fall on this spectrum of confidence versus cost.
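To make the top of that spectrum concrete, here is a minimal sketch of what a static test can and cannot catch. The getFullName helper and its User type are hypothetical, invented for this example:

// A hypothetical helper to show what static analysis does (and doesn't) catch.
interface User {
  firstName: string
  lastName: string
}

function getFullName(user: User): string {
  // ✅ Static win: the line below has a typo (`lastNmae`), and the TypeScript
  // compiler rejects it before any test even runs.
  // return `${user.firstName} ${user.lastNmae}`

  // ❌ Static blind spot: the names are joined in the wrong order. The type
  // checker is perfectly happy, giving zero confidence in the business logic.
  return `${user.lastName} ${user.firstName}`
}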
The Golden Ratio: Why We Test “Mostly Integration”
If there’s one takeaway, it’s the power of that final clause: “mostly integration.” Unit tests alone are dangerous. They can pass with flying colors while the application is completely broken, giving you a false sense of security.
Let’s make this concrete with an example. An umbrella consists of a Canopy component and a Handle component. The Canopy’s function is to open and close. The Handle’s function is to tell the Canopy when to open and to secure it in place. The expected user behavior is to press a button, which opens the canopy, allowing them to hold it over their head to shield from the rain.
We could write unit tests for each component. We could test the Canopy in isolation to ensure it opens, and the Handle in isolation to ensure it sends the right signal. But that approach misses critical integrations—like the handle failing to actually secure the canopy.
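For illustration, those isolated unit tests might look something like the following sketch. The Canopy and Handle components, their props, and their accessible names are assumptions made up for this example:

import {render, screen, fireEvent} from "@testing-library/react"
import "@testing-library/jest-dom"
import {Canopy} from "./Canopy" // hypothetical component
import {Handle} from "./Handle" // hypothetical component

describe("Canopy (in isolation)", () => {
  it("opens when told to", () => {
    // Assumption: Canopy renders its status based on an `isOpen` prop.
    render(<Canopy isOpen />)
    expect(screen.getByText("Canopy is open")).toBeInTheDocument()
  })
})

describe("Handle (in isolation)", () => {
  it("signals the canopy to open when the button is pressed", () => {
    // Assumption: Handle accepts an `onOpen` callback.
    const onOpen = jest.fn()
    render(<Handle onOpen={onOpen} />)
    fireEvent.click(screen.getByRole("button", {name: /open umbrella/i}))
    expect(onOpen).toHaveBeenCalledTimes(1)
  })
})

Both tests can pass while the umbrella itself is broken: nothing here verifies that the Handle’s signal actually reaches the Canopy, or that the Canopy stays secured.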
We could add a separate integration test to cover that specific failure. But now we’re juggling multiple tests to support a single user behavior.
A more pragmatic approach is to write a single integration test that covers the entire behavior from the user’s perspective right from the start.
import {render, screen, fireEvent} from "@testing-library/react"
import "@testing-library/jest-dom"
import {Umbrella} from "./Umbrella"

// --- A Single Integration Test for the Entire Behavior ---
describe("Umbrella User Behavior", () => {
  it("should open the canopy and secure it when the user presses the button", () => {
    // Arrange: Render the full component the user interacts with.
    render(<Umbrella />)

    // Act: Simulate the user's primary action.
    const openButton = screen.getByRole("button", {name: /open umbrella/i})
    fireEvent.click(openButton)

    // Assert: Verify the user's goal was achieved.
    // We don't care *how* it happened, only that the final state is correct.
    expect(screen.getByText("Canopy is open")).toBeInTheDocument()
    expect(screen.getByText("Canopy is secured")).toBeInTheDocument()
  })
})
This single test gives us immense confidence. It verifies that the Handle and Canopy work together correctly to fulfill the user’s goal. If the Handle fails to secure the Canopy, this test fails. If the Canopy fails to open, this test fails. We get maximum coverage of the user’s journey with a single, maintainable test, embodying the “mostly integration” principle.
Who Are We Writing Tests For? (And Who Aren’t We?)
To write effective tests, you must know your audience. There are only two groups that matter:
- The End User: They care about what they see and do. They want their comment to appear, their payment to go through.
- The Developer: They care about the public API. How do I use this component? What props does it take?
Notice who’s not on that list: The tests themselves. When you start writing tests for the tests, you inevitably start testing implementation details. As Kent C. Dodds has written, these are “things which users of your code will not typically use, see, or even know about.”
Have you ever experienced this?
- You refactor for optimization. The implementation is cleaner, but the behavior is unchanged. Tests pass. App breaks. Wut?
- You refactor to prep for a new feature. The code is more flexible, but the behavior remains unchanged. App works. Tests fail. Wat?
Both are symptoms of testing implementation details. A good test should be immune to changes in implementation as long as the behavior remains the same. If you have to change your tests during a refactor, it’s a major red flag.
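As a hypothetical illustration, compare an implementation-detail assertion with a behavioral one for the same made-up Counter component:

import {render, screen, fireEvent} from "@testing-library/react"
import "@testing-library/jest-dom"
import {Counter} from "./Counter" // hypothetical component

it("shows the new count after a click", () => {
  render(<Counter />)
  fireEvent.click(screen.getByRole("button", {name: /increment/i}))

  // ❌ Implementation detail (don't do this): reaching into internal state.
  // Renaming the `count` state variable during a refactor would break this
  // assertion even though the UI is fine:
  // expect(component.state.count).toBe(1)

  // ✅ Behavior: assert on what the user actually sees. Any refactor that
  // preserves the rendered output keeps this test green.
  expect(screen.getByText("Count: 1")).toBeInTheDocument()
})

The behavioral assertion survives renames, restructures, even a rewrite from class to hooks; the state assertion survives none of them.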
A Practical Framework: What to Test and What to Skip
So how do you put this into practice? Stop thinking about the code you are testing and start thinking about the use cases that code supports. Test the observable effect the code has for the end user and the developer.
What TO Test (Mapping to Confidence):
- Frontend & E2E: Focus on User Behaviors.
- Success Cases: “Did my action work?” (e.g., a new comment appears, a loading spinner disappears).
- Failure Cases: “Why did this break? How do I fix it?” (e.g., a retry button appears with a human-readable error message).
- Edge Cases: “What happens if I push the limits?” (e.g., a submit button is disabled, a clear warning is shown).
- Backend: Focus on Endpoint and Workflow robustness.
- Test with valid required and optional parameters.
- Test with valid input for illegal operations (e.g., creating a duplicate record).
- Test with invalid input.
- Test destructively: try to break the API with malformed payloads to ensure it fails gracefully (see the sketch after this list).
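Here’s a rough sketch of what those backend checks could look like with supertest. The Express app in ./app, the POST /users route, and the status codes and error shape are all assumptions for this example:

import request from "supertest"
import {app} from "./app" // hypothetical Express app

describe("POST /users robustness", () => {
  it("rejects a malformed payload with a 400, not a crash", async () => {
    const response = await request(app)
      .post("/users")
      .send({email: 12345}) // wrong type on purpose
    expect(response.status).toBe(400)
    // Assumption: errors come back as {error: string} with a readable message.
    expect(response.body.error).toMatch(/email/i)
  })

  it("rejects an illegal operation: creating a duplicate record", async () => {
    await request(app).post("/users").send({email: "a@example.com"})
    const response = await request(app)
      .post("/users")
      .send({email: "a@example.com"})
    expect(response.status).toBe(409)
  })
})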
What NOT to Test (The Red Flags):
If you’re ever unsure, run your test through this simple checklist. Delete any tests that:
- Break when you refactor the implementation but the behavior remains unchanged.
- Assert on implementation details like props, internal state, or specific function calls.
- Would pass even if the UI showed broken, incorrect behavior.
Conclusion: It’s About Confidence, Not Labels
In the end, the distinctions between “unit,” “integration,” and “E2E” are far less important than the confidence they provide. The goal is to be confident that when you ship your changes, your code satisfies the business requirements. You’ll use a mix of the different testing strategies to accomplish that goal.
That’s the entire philosophy. Focus on confidence. Test behavior, not implementation. Use a strategic mix of tools to give your team the freedom to refactor and build quickly, secure in the knowledge that they aren’t breaking the application. When your test suite is built this way, it stops being a chore and becomes a tool that empowers your team to build with fearless confidence.
Additional Resources
This post was inspired by the fantastic work of others in the community. If you want to dive deeper, these are the foundational articles that shaped my thinking on this topic:
- For the original inspiration, see Guillermo Rauch’s tweet that started it all: Write tests. Not too many. Mostly integration.
- Kent C. Dodds’ article Write Tests expands on this principle and provides the foundational arguments for a confidence-focused testing strategy.
- His follow-up, The Testing Trophy and Testing Classifications, offers a great visual model for thinking about the different types of tests and where to focus your efforts.