React Testing Library Intro - discussion (2023-01-27)
The RTL intro is still the best reset for teams writing tests that mirror component structure instead of intent. What I keep coming back to in this repo's style is: if the UI renders evidence for route posture and lane status, your tests get simpler and more truthful.
Do you treat data-* evidence as part of the public UI contract, or only as a debug surface?
How do you avoid brittle waitFor loops when your app uses lots of stored-derived outputs?
What do you use as your "source of truth" in tests: roles/text, evidence keys, or a mix?
Comments (18)
I treat evidence as a real contract, but only the evidence that explains behavior (lane, posture, selection, identity).
If the evidence line is readable by a human, it's usually stable enough for tests too.
The best test refactor we did was replacing timing assertions with an evidence lane. We render a single "tips lane" attribute on the shell and assert on that:
tsx
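import { render, screen } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import '@testing-library/jest-dom'; // provides toHaveAttribute
// Runs inside an async test body; Inbox is the component under test.
render(<Inbox />);
await userEvent.click(screen.getByRole('button', { name: /refresh/i }));
expect(screen.getByTestId('shell')).toHaveAttribute('data-fetch-lane', 'pending');
await screen.findByText(/updated/i);
expect(screen.getByTestId('shell')).toHaveAttribute('data-fetch-lane', 'ok');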
And when it fails, our debug log is human-readable:
txt
[tips] lane=fetch pending reason=refresh:click at=1672773960000
[tips] lane=fetch ok reason=fetch:resolve freshnessAt=2023-02-03T06:06Z
Counterpoint: if you add evidence purely for tests, it becomes a second UI and starts drifting.
I like evidence that improves support/debugging anyway (a screenshot should explain the app).
Yep. My rule is: if I wouldn't show it behind a debug toggle, it shouldn't exist.
The test just benefits from the same visibility.
We still use findByRole heavily, but the evidence keys remove ambiguity.
Example: a button label doesn't tell you whether you're in hot vs warm render cadence, but the contract does.
Stored-derived makes tests nicer *if* you test the rendered product, not the derived key itself.
Badge text, row ordering, visible selection are good targets; internal hashes are not.
I've had good luck with a tiny debug panel that mirrors evidence keys in text (not just data-*).
RTL can query it like normal content and it stays accessible.
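For anyone who wants to try the text-mirroring idea, here is a minimal sketch of the formatting half. The `Evidence` type and `formatEvidence` name are made up for illustration and are not from any library; the panel component that renders the lines is left out.

```typescript
// Hypothetical evidence shape: flat keys with primitive values.
type Evidence = Record<string, string | number | boolean>;

// Turn evidence into "key=value" lines, sorted by key so the text
// does not depend on object insertion order.
function formatEvidence(evidence: Evidence): string[] {
  return Object.keys(evidence)
    .sort()
    .map((key) => `${key}=${String(evidence[key])}`);
}

const panelLines = formatEvidence({
  selection: "thread:42",
  lane: "pending",
  posture: "list",
});
console.log(panelLines.join("\n"));
```

If each line is rendered as its own text node, a test can use something like `screen.getByText('lane=pending')` instead of reaching for attributes.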
One place I still reach for waitFor is animated route motion (panel switching). But even there, we log and render the motion key so we can assert the intent:
txt
[tips] routeMotion panel=list->detail reason=nav:click remountKey=thread:42
Counterpoint: sometimes you do need to poke a contract directly (e.g. a vault) to set up test state quickly.
But I still like asserting the same thing via evidence so the behavior is visible to humans too.
When tests are flaky, it's usually because the app has hidden async work.
Evidence lanes are basically a way to stop hiding it.
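A rough sketch of what "stop hiding it" can look like in code, assuming a lane is just a small state machine whose current value gets rendered. The lane names beyond `pending` and `ok`, the event shapes, and the `fetch:reject` reason are my assumptions, not something from the docs.

```typescript
// Hypothetical lane vocabulary; only "pending" and "ok" appear in the thread.
type Lane = "idle" | "pending" | "ok" | "error";

type LaneEvent =
  | { type: "request"; reason: string }
  | { type: "resolve" }
  | { type: "reject" };

interface LaneState {
  lane: Lane;
  reason: string;
}

// Pure transition function: every async step becomes a visible lane value.
function laneReducer(state: LaneState, event: LaneEvent): LaneState {
  if (event.type === "request") {
    return { lane: "pending", reason: event.reason };
  }
  // Results only count while something is actually pending.
  if (state.lane !== "pending") {
    return state;
  }
  if (event.type === "resolve") {
    return { lane: "ok", reason: "fetch:resolve" };
  }
  return { lane: "error", reason: "fetch:reject" };
}
```

The shell would then render `state.lane` somewhere visible (for example as `data-fetch-lane`), so a test asserts on the lane instead of guessing how long the work takes.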
Docs tie-in: the forum's "evidence posture" is a testing strategy, not just a debugging strategy.
Related: Testing and Debugging and API Reference.
A small practice that helped: every async mutation writes an intent string and we render it. Then tests can assert the intent occurred without guessing timing:
txt
[tips] intent=notify:archive id=n-183 reason=user:click lane=ok
I try to keep evidence stable by making it vocabulary-first: lane, posture, identity, selection.
If someone proposes adding data-randomInternalFlag, that's a sign the model isn't clear.
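One way to enforce the vocabulary-first rule is to let the type system hold the vocabulary. This is only a sketch: `EvidenceVocabulary` and `toDataAttributes` are hypothetical names, and treating every value as a string is an assumption.

```typescript
// The four vocabulary words from the comment above; string values are an assumption.
interface EvidenceVocabulary {
  lane: string;
  posture: string;
  identity: string;
  selection: string;
}

const VOCABULARY = ["lane", "posture", "identity", "selection"] as const;

// Map the closed vocabulary onto data-* attribute names.
// Only keys listed in VOCABULARY are ever emitted.
function toDataAttributes(evidence: EvidenceVocabulary): Record<string, string> {
  const attrs: Record<string, string> = {};
  for (const key of VOCABULARY) {
    attrs[`data-${key}`] = evidence[key];
  }
  return attrs;
}
```

Passing an object literal with an extra key such as `randomInternalFlag` fails to type-check, which turns "the model isn't clear" into a compile error instead of a review comment.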
RTL also gets nicer when components are accessible. Roles are powerful when the UI is semantic.
It sounds obvious, but it's the difference between readable tests and CSS-selector archaeology.
The intro doc taught me to stop asserting on implementation, but evidence isn't implementation; it's the product contract.
If the product says it's pending, it should be visible and testable.
We use a little "tips trace" helper in dev that prints the last 20 evidence transitions.
Not for tests directly, but it makes failed test runs diagnosable in minutes.
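In case it helps anyone, a trace helper like that can be very small. This is a sketch under my own assumptions: `createTrace`, the `Transition` shape, and the simplified `[tips] key=value reason=...` line format are illustrative, not the actual helper.

```typescript
// Hypothetical shape of one evidence transition.
interface Transition {
  key: string;
  value: string;
  reason: string;
}

// Keep only the most recent `limit` transitions (20 by default).
function createTrace(limit = 20) {
  const entries: Transition[] = [];
  return {
    record(transition: Transition): void {
      entries.push(transition);
      if (entries.length > limit) {
        entries.shift(); // drop the oldest entry
      }
    },
    // Format the retained transitions as log lines, oldest first.
    dump(): string[] {
      return entries.map((t) => `[tips] ${t.key}=${t.value} reason=${t.reason}`);
    },
  };
}
```

Calling `dump()` from a failed test's teardown (or printing it on demand in dev) gives the last few transitions without logging everything all the time.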
Counterpoint: too much trace logging can make people ignore logs entirely.
We log only contract transitions (lane changes, identity remounts, posture toggles).
If you want tests to feel like user intent, prefer userEvent + role queries + a tiny evidence assertion.
It's a nice blend: human-level + contract-level.
The best thing about evidence-based tests is they push you to build honest UI.
If you can't write a stable test, it often means the UI isn't telling the truth yet.