Evaluat runs performance tests with real browsers. One isolated browser per virtual user. That's the architectural decision everything else follows from. Here's what that means in practice and why it changes the numbers.
When you start a 1,000-user test in Evaluat, the platform provisions 1,000 isolated browser instances. Each instance has its own memory. Its own CPU. Its own cache. Its own cookies. Its own network stack. Nothing crosses between them.
This sounds obvious until you notice that most "browser-based" load testing tools don't do it. They share one browser across many simulated users, because it's cheaper to run. The contention they're measuring isn't real. The numbers come out optimistic. Production then surprises you.
Performance under load is mostly about contention. If your test tool simulates 100 concurrent users inside one browser process, you're not measuring 100 concurrent experiences. You're measuring one browser doing 100 things in a row. The LCP you record under that model has nothing to do with what real users see at peak.
Evaluat's design forces the contention to be real. Browser instances are independent. The numbers match what your customer's Chrome would record on their machine, against your production-shaped server, under the load you're testing.
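Evaluat's internals aren't shown here, but the gap between the two models is easy to sketch with Playwright. Everything below is illustrative, not Evaluat's code; the point is process-per-user versus context-per-user.

```ts
import { chromium, Browser, BrowserContext } from 'playwright';

// Shared-browser model: one process, many contexts. The contexts share the
// process's CPU scheduling, memory, and network stack, so the contention
// between "users" is an artifact of the tool, not a model of production.
async function sharedBrowserUsers(count: number): Promise<BrowserContext[]> {
  const browser = await chromium.launch();
  return Promise.all(Array.from({ length: count }, () => browser.newContext()));
}

// Isolated model, as described above: one full browser process per virtual
// user. Memory, CPU, cache, cookies, and network stack belong to that user.
async function isolatedBrowserUsers(count: number): Promise<Browser[]> {
  return Promise.all(Array.from({ length: count }, () => chromium.launch()));
}
```

The isolated model costs more compute, which is exactly why most tools avoid it. The trade is that the contention in your results is the contention your users will feel.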
For every virtual user in every test, Evaluat captures the six things below. Aggregated across the run, addressable per session, and exportable as raw data if you need it. (A sketch of where the first two come from follows the list.)
Largest Contentful Paint (LCP), Interaction to Next Paint (INP), Cumulative Layout Shift (CLS), First Contentful Paint (FCP). Captured natively by every real browser in the run.
DNS lookup, TCP connect, TLS negotiation, Time to First Byte, response, DOM Content Loaded, page load. Every phase, per URL, percentile-selectable.
Every HTTP request the browser made: method, URL, status, timing breakdown, size, MIME type, originating page. Searchable across millions of rows per run.
Every console.log, warning, error, exception, and resource load failure. Deduplicated with counts so the loud ones surface first.
A full video of the browser viewport for every virtual user's session. Scrub through it. Find the broken moment. No reconstruction from logs.
Every scripted action — navigate, click, type, wait — timestamped to the millisecond, with the CSS selector targeted and pass/fail outcome.
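The first two categories come straight from the browser's standard Performance APIs, which is what "captured natively by every real browser" means in practice. A minimal in-page sketch, not Evaluat's capture code (INP is omitted because it needs event-timing aggregation across the whole session):

```ts
// Largest Contentful Paint: the browser reports candidates as they render;
// the last candidate observed is the page's LCP.
new PerformanceObserver((list) => {
  const entries = list.getEntries();
  console.log('LCP:', entries[entries.length - 1].startTime, 'ms');
}).observe({ type: 'largest-contentful-paint', buffered: true });

// Cumulative Layout Shift: sum the shift scores not caused by user input.
let cls = 0;
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    const shift = entry as any; // LayoutShift isn't in the default TS DOM types
    if (!shift.hadRecentInput) cls += shift.value;
  }
}).observe({ type: 'layout-shift', buffered: true });

// Navigation timing: every phase of the load, straight off the entry.
const [nav] = performance.getEntriesByType('navigation') as PerformanceNavigationTiming[];
console.log({
  dns: nav.domainLookupEnd - nav.domainLookupStart,
  tcp: nav.connectEnd - nav.connectStart,
  tls: nav.secureConnectionStart ? nav.connectEnd - nav.secureConnectionStart : 0,
  ttfb: nav.responseStart - nav.requestStart,
  response: nav.responseEnd - nav.responseStart,
  domContentLoaded: nav.domContentLoadedEventEnd - nav.startTime,
  load: nav.loadEventEnd - nav.startTime,
});
```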
An Evaluat test is composed of four reusable parts. Build them once, recombine them for performance tests, smoke tests, and monitors.
A scenario is a user journey: a sequence of steps like "navigate to the homepage, click the product category, click a product, add to cart, proceed to checkout." Scenarios are reusable building blocks. A single test can run many scenarios in parallel with weighted distribution.
Datasets inject variable data into scenarios. 1,000 different UTM combinations. 1,000 different search terms. 1,000 different user records. Each virtual user picks a row, so no two users follow exactly the same path. Cache effects can't pretend to be performance.
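Evaluat's actual scenario and dataset syntax isn't reproduced here. As a hypothetical sketch of the shape (every field name below is invented for illustration), a scenario is a list of steps with placeholders, and a dataset is the table of rows that fills them:

```ts
// Hypothetical shapes only — not Evaluat's real configuration format.
type Step =
  | { action: 'navigate'; url: string }
  | { action: 'click'; selector: string }
  | { action: 'type'; selector: string; value: string };

interface Scenario {
  name: string;
  steps: Step[];
}

// A checkout journey, parameterized with {{placeholders}} a dataset fills in.
const checkout: Scenario = {
  name: 'checkout',
  steps: [
    { action: 'navigate', url: 'https://shop.example.com?utm_campaign={{utm}}' },
    { action: 'type', selector: '#search', value: '{{searchTerm}}' },
    { action: 'click', selector: '.product-card:first-child' },
    { action: 'click', selector: '#add-to-cart' },
    { action: 'click', selector: '#checkout' },
  ],
};

// Each virtual user draws its own row, so no two users share cache-warm paths.
const dataset = [
  { utm: 'spring_sale', searchTerm: 'running shoes' },
  { utm: 'email_blast', searchTerm: 'rain jacket' },
  // ... one row per virtual user
];
```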
Cookie banners. Newsletter modals. Geolocation prompts. Chat widgets. These break naive performance tests. Popup handlers in Evaluat are persistent project-level rules — "if this element appears, click that button" — that apply across every scenario, so your test scripts stay focused on the journey.
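For comparison, Playwright exposes the same pattern as addLocatorHandler: register a rule once and it fires whenever the element appears, at any point in the journey. A sketch with invented selectors (Evaluat's own rule syntax may differ):

```ts
import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();

// Persistent rule: if the cookie banner ever appears, click "accept all",
// then resume whatever step the journey was on.
await page.addLocatorHandler(page.locator('#cookie-banner'), async () => {
  await page.locator('#cookie-banner .accept-all').click();
});

// The journey script stays focused on the journey.
await page.goto('https://shop.example.com');
await page.click('.product-card:first-child');
```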
The test plan ties everything together. Region. Timezone. Locale. Viewport. Browser speed. Load shape (Duration or Sessions). Ramp-up profile. Which scenarios run at what weight. The plan is what you click "run" on; a hypothetical sketch of one follows the settings below.
Every test plan controls these. None of them are buried behind a "contact sales" gate. You get the same configuration on Starter as on Enterprise.
Pick the geographic origin for your virtual users. Latency from London is different to latency from Frankfurt. The test should know which one you care about.
Match the segment you're modelling. A Dutch checkout should run with nl_NL locale and Europe/Amsterdam timezone, not en_US.
Set the browser dimensions. Mobile (375×812) to desktop (1920×1080) and beyond. The viewport changes the layout, which changes LCP and CLS.
How quickly actions fire between steps. Bot-speed clicking exposes different bugs than human-paced clicking. Pick the one that matches your test goal.
Ramp-up duration, steady-state duration, ramp-down duration. Combined with target concurrency, this is how you describe load tests, stress tests, spike tests, and soak tests through one configuration.
Toggle which project-level popup handlers apply to this run. Useful when you want to test what happens if the cookie banner doesn't dismiss.
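Pulling the settings above together, a hypothetical plan might look like the sketch below. This is not Evaluat's published format; every field name is invented for illustration.

```ts
// Hypothetical plan shape — all field names invented for illustration.
interface TestPlan {
  region: string;                        // geographic origin of virtual users
  timezone: string;
  locale: string;
  viewport: { width: number; height: number };
  browserSpeed: 'bot' | 'human';
  loadShape: {
    mode: 'duration' | 'sessions';
    rampUpSeconds: number;
    steadySeconds: number;
    rampDownSeconds: number;
    targetConcurrency: number;
  };
  scenarios: { name: string; weight: number }[];
  popupHandlers: string[];               // project-level handlers enabled for this run
}

// A Dutch-checkout load test: ramp to 1,000 users over 5 minutes,
// hold for 20, ramp down over 5.
const plan: TestPlan = {
  region: 'eu-west-frankfurt',
  timezone: 'Europe/Amsterdam',
  locale: 'nl_NL',
  viewport: { width: 375, height: 812 },
  browserSpeed: 'human',
  loadShape: {
    mode: 'duration',
    rampUpSeconds: 300,
    steadySeconds: 1200,
    rampDownSeconds: 300,
    targetConcurrency: 1000,
  },
  scenarios: [
    { name: 'checkout', weight: 70 },
    { name: 'browse', weight: 30 },
  ],
  popupHandlers: ['cookie-banner'],
};
```

The same shape describes a load test, a stress test, a spike test, or a soak test; only the ramp and steady durations change.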