Is GA4 Accurate? What You Can Fix and What You Can't

- So is GA4 accurate? The short answer
- Why your real visitors go missing: consent and cookies (structural)
- How GA4 fills the gap with guesses, and labels them as facts (structural)
- Does GA4 count bots as real visitors? (structural)
- Does GA4 sample your data? (structural, on the free version)
- Why GA4 hides rows of your own data (structural, inside the reports)
- Why GA4 never matches Google Ads, Search Console, or your own database
- Ad blockers and browsers: one limit is shared, one is GA4's alone
- The fixable stuff is real, but fixing it isn't the answer
- Where every analytics tool, including Clickport, is still limited
- So should you trust GA4, or switch?
"GA4 is steaming hot garbage." That is an actual thread title on the Google Analytics subreddit, and it has a lot of upvotes. The person who wrote it said GA4 was showing "about 10% of our actual traffic." Someone on a different thread watched GA4 report 500 percent more sessions than their sales system, about 85 percent of it filed under "Direct."
So which is it? Is GA4 wildly undercounting, or wildly overcounting? The honest answer is both can be true on the same account, and once you understand why, you stop trusting the number on the screen.
Here is the short version, and then I will back up every piece of it. No, GA4 is not accurate. Across the real-world studies, it typically misses somewhere between 10 and 30 percent of your real traffic, and that can climb past 55 percent once a cookie consent banner is involved. Google's own documentation does not put a single number on it, but it does spell out the machinery that causes the loss. It only ever undercounts real humans. The reason it sometimes looks too high is bots, which it cannot filter and cannot even show you.
The useful way to think about this is to split GA4's accuracy problems into two piles. One pile is fixable: things you misconfigured, that you can clean up, after which GA4 reports correctly. The other pile is structural: built into how GA4 works, impossible to fix with any setting, on any plan. Almost every other article on this topic blurs the two together. The structural pile is the whole point, because that is the part that is a reason to leave, not just a reason to fiddle with settings.
Fair warning before we start: I build a competing analytics tool, Clickport. So I will be straight with you about where my own tool hits the exact same walls GA4 does, because a couple of these problems are not GA4 problems, they are physics. The parts where a different kind of tool genuinely wins, I will show you. The parts where nothing client-side wins (client-side just means the tracking runs as a small script inside the visitor's own browser, which is how GA4, Clickport, and almost every analytics tool work), I will say so plainly.
- No, GA4 is not accurate, and it only ever undercounts. An independent review of 33 accounts checked against real CRM and sales records found GA4 missed 11.2 percent of traffic on sites with no consent banner and 20.3 percent on sites with one. It never overcounts real visits.
- The single biggest gap is structural, not a setup mistake. In that same study, GA4 lost 15.8 percent of visitors who were never shown a consent banner versus 55.6 percent of visitors who were. A cookie banner can erase more than half your traffic, because people who decline are invisible to GA4 by design.
- GA4 is blind to bots and cannot even show you the problem. Google's own docs confirm it only filters known bots from an industry list, you cannot turn it on or off, and there is no human-versus-bot report anywhere. We sent 1,000 bots to a site running GA4 and it filtered zero.
- Free GA4 quietly samples any Exploration above 10 million events, hides small rows with thresholds you cannot change, and dumps your rarer pages into one '(other)' line. The only real escapes are a separate BigQuery pipeline or paying for Analytics 360.
- Duplicate tags, messy UTMs and missing filters are genuinely fixable. But fix every one of them and GA4 still loses a structural floor of roughly 10 to 30 percent. That leftover, driven by consent and cookies, is the real reason to switch. And no client-side tool, mine included, beats an ad blocker.
So is GA4 accurate? The short answer
No. GA4 undercounts real human traffic, every time, and it is never the other direction. The best public number we have comes from a review by Andy Crestodina of Orbit Media, who compared 33 Google Analytics accounts against the businesses' real records: the actual newsletter signups in their email tool, the real demo requests in their CRM, the real orders in their store. Across about 60 comparisons, GA4 always reported fewer than reality. On sites with no cookie banner it missed 11.2 percent. On sites with a banner it missed 20.3 percent.
That is the floor, by the way, measured by a careful analyst on real accounts. On your site it could be worse. The same study found that when you isolate just the effect of the consent banner, the loss jumps from 15.8 percent to 55.6 percent. More on that in a second, because it is the single biggest thing wrong with GA4 and it is not something you can fix.
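The comparison behind those percentages is plain arithmetic, so you can run it on your own numbers. Here is a minimal Python sketch. The function name and the sample figures are made up for illustration and are not taken from the study.

```python
def undercount_percent(ga4_count: int, true_count: int) -> float:
    """Share of real conversions that GA4 failed to report, as a percentage."""
    if true_count <= 0:
        raise ValueError("true_count must be positive")
    return (true_count - ga4_count) / true_count * 100


# Hypothetical example: the CRM holds 500 demo requests, GA4 reports 444.
missed = undercount_percent(ga4_count=444, true_count=500)
print(f"GA4 missed {missed:.1f}% of demo requests")
```

Use a source of truth that does not depend on browser tracking (CRM, email tool, order table) for `true_count`, otherwise the two numbers share the same blind spot.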
The reason GA4 sometimes looks too big instead of too small is a different problem entirely: bots. GA4 cannot filter most of them and gives you no report to even see them, so they pad your numbers while real people go missing. You end up with a total that is wrong in both directions at once.
Hold onto the two-pile idea as we go. Some of what follows you can fix this afternoon. Most of it you cannot fix at all.
Why your real visitors go missing: consent and cookies (structural)
This is the big one, so it goes first. GA4 works by setting cookies in the visitor's browser. Under GDPR and similar laws, you usually need permission before you set those cookies, which is why half the web now greets you with an "Accept cookies?" banner. Here is the catch: when someone clicks "Reject," GA4 is not allowed to set its cookies, so that visitor becomes more or less invisible. They read your page, they maybe buy something, and GA4 either never counts them or counts them as a vague fragment.
You cannot fix this with a setting, because it is not a bug. It is the whole legal arrangement GA4 operates under. A cookie banner that actually works is a cookie banner that loses you data.
The Orbit Media study put a number on it that I keep coming back to. On their own site, with no banner shown, GA4 missed 15.8 percent of visitors. With the banner shown, it missed 55.6 percent. Same site, same visitors. The banner alone erased more than half the traffic.
This is exactly where a different kind of tool changes the math, and I want to be precise about why. A privacy-first, cookieless analytics tool does not set cookies and does not need a consent banner at all, so there is no "Reject" button to lose people behind. That is not a clever trick to recover lost data. It is simply not losing it in the first place. Clickport counts those declined visitors because it never had to ask. If you want the legal background on why cookieless tools skip the banner, I wrote about that in is Google Analytics legal and on the privacy-first page.
That is the cleanest win in this whole article. Not "more accurate modeling." Zero loss instead of up-to-55-percent loss, because the question that loses the data is never asked.
How GA4 fills the gap with guesses, and labels them as facts (structural)
Here is the part that surprises people. When a visitor declines consent, GA4 does not just leave a blank. It estimates that visitor using machine learning, and then it blends the estimate into the same reports as your real, measured data, with no label telling you which is which.
Google calls this "behavioral modeling" and "modeled conversions." Their own documentation is open about the blend: GA4 reports a mix of observed and estimated activity together. So when your report says 8 conversions, that might be 5 it actually saw and 3 it guessed. You have no way to tell them apart in the normal interface.
There are two real problems with this. First, you are making decisions on a number that is part guess, presented as if it were all measurement. Second, and this one is sneaky: the modeling only switches on for big sites. Google's docs spell out the thresholds, and they are high, roughly 1,000 consent-denied events a day for a week, plus 1,000 consenting users a day. Most small and mid-size sites never come close. So for the average business, GA4 does not even model the missing visitors. It just drops them.
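To get a feel for whether your site is anywhere near those thresholds, here is a rough Python sketch. It uses only the approximate figures quoted above. Applying the seven-day window to both series is my simplifying assumption, and the function name is hypothetical, so treat the result as a ballpark and check Google's documentation for the exact rules.

```python
MIN_DENIED_EVENTS_PER_DAY = 1_000     # approximate figure quoted above
MIN_CONSENTING_USERS_PER_DAY = 1_000  # approximate figure quoted above
REQUIRED_DAYS = 7  # "for a week"; applying it to both series is an assumption


def modeling_likely_eligible(denied_events_by_day, consenting_users_by_day):
    """Return True if both daily series clear the thresholds on enough days."""
    denied_ok = sum(n >= MIN_DENIED_EVENTS_PER_DAY for n in denied_events_by_day)
    granted_ok = sum(n >= MIN_CONSENTING_USERS_PER_DAY for n in consenting_users_by_day)
    return denied_ok >= REQUIRED_DAYS and granted_ok >= REQUIRED_DAYS


# Hypothetical small site: about 300 consenting users a day.
small_site = modeling_likely_eligible([120] * 7, [300] * 7)
print(small_site)  # False
```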
A cookieless tool sidesteps this whole thing, and again not by being cleverer at guessing. It does not need to model the consent-denied visitors because it did not lose them. There is nothing to estimate.
Does GA4 count bots as real visitors? (structural)
Yes, constantly, and it will not show you the problem even when you go looking. This is the reason GA4 sometimes looks too high.
GA4's bot filtering sounds reassuring until you read how it works. Per Google's documentation, it automatically excludes traffic from "known" bots and spiders using an industry list (the IAB list). That is the entire defense. There is no on/off switch, you cannot adjust it, and crucially there is no report anywhere that shows you how much was filtered or what got through. There is no human-versus-bot breakdown in GA4. At all. On any plan.
The catch is that the "known bot" list only catches bots that politely identify themselves. Modern bad bots do not. They show up wearing a normal Chrome user-agent, from a normal-looking IP, and GA4 waves them straight through as people. And this is not a rounding error anymore. Imperva's 2025 report found bots are now 51 percent of all web traffic, with bad bots alone at 37 percent. Cloudflare's CEO expects bot traffic to exceed human traffic by 2027.
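Why a list-only filter misses these bots is easy to see in a few lines. The sketch below is an illustration of the general technique, not GA4's actual implementation: the marker list is a hypothetical stand-in (the real IAB list is licensed and much longer), and the function name is mine.

```python
# Hypothetical stand-in for a "known bots" list. The real IAB list is
# licensed and far longer; these substrings are for illustration only.
KNOWN_BOT_MARKERS = ("googlebot", "bingbot", "ahrefsbot", "python-requests")


def passes_known_bot_filter(user_agent: str) -> bool:
    """True if a list-only filter would count this hit as a person."""
    ua = user_agent.lower()
    return not any(marker in ua for marker in KNOWN_BOT_MARKERS)


honest_bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
stealth_bot = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

print(passes_known_bot_filter(honest_bot))   # False: filtered out
print(passes_known_bot_filter(stealth_bot))  # True: counted as a human
```

A bot that copies a real browser's user-agent string gives a filter like this nothing to match on, which is why catching it takes behavioral or network signals instead.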
I did not want to argue this one from studies, so I ran a test. We sent 1,000 bots to a site running both GA4 and Clickport, in five waves, from obvious junk bots up to stealth bots on residential connections. You can read the full bot test here. The headline: GA4 filtered zero of them.
Notice that last column, because it matters. 200 of those bots beat my tool too. No client-side analytics tool catches a sophisticated bot on a residential connection that behaves like a person. The difference is not that Clickport is magic. It is that GA4 catches roughly none of the realistic ones and refuses to even show you the question, while a tool built to look at bot signals catches most of them and puts the human-versus-bot split right in front of you. Across the sites we measure, a median of about 20 percent of incoming traffic is bots, and 57 percent of those would have sailed past GA4's filter. If you have ever seen a weird spike of "Direct" traffic (Direct is GA4's label for visits it cannot trace to any source, which is exactly where uncaught bots and lost referrers pile up), this is usually why, and I dug into that in direct traffic spikes.
Does GA4 sample your data? (structural, on the free version)
Yes, on the free version almost everyone uses, and there is no off switch inside GA4. "Sampling" just means GA4 stops counting all your data and starts estimating from a slice of it, then scales the answer up. Google's own wording for a sampled report is that it is "directionally accurate," which is a polite way of saying "roughly right, do not trust the decimals."
When does it kick in? On a free GA4 property, any Exploration (the flexible report builder) starts sampling once your query has to look at more than 10 million events in the date range. An "event" is any single action GA4 records: a page load, a click, a scroll, a form submit. One visitor easily fires ten or twenty of them, which is why 10 million events is far fewer visitors than it sounds, and a three-month report on a normal-busy site sails right past it. The paid version, Analytics 360, raises the bar to 100 million and up, but that has historically started in the tens of thousands of dollars a year.
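To turn the 10 million figure into something you can compare with your own traffic, here is a quick back-of-the-envelope calculation in Python. The events-per-visit values come from the "ten or twenty" range above. The 90-day window and the function name are my assumptions.

```python
SAMPLING_THRESHOLD_EVENTS = 10_000_000  # free GA4 Exploration limit quoted above


def visits_before_sampling(events_per_visit: float) -> int:
    """Roughly how many visits fit in a date range before sampling starts."""
    return int(SAMPLING_THRESHOLD_EVENTS // events_per_visit)


for events_per_visit in (10, 20):
    total = visits_before_sampling(events_per_visit)
    per_day = total // 90  # a three-month report, taken as 90 days
    print(f"{events_per_visit} events/visit: {total:,} visits, about {per_day:,} a day")
```

If your daily visits sit above the per-day figure for your typical event count, a three-month Exploration is likely to be sampled.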
There is a common myth that "standard reports are never sampled, so just use those." It is half true. The basic reports run on small pre-summed tables and stay unsampled, right up until you add a second dimension, a comparison, or a filter. Do any of that and GA4 quietly falls back to the samplable data underneath. So the moment you ask a report an interesting question, you are back in sampling territory.
A cookieless tool like Clickport queries every event it recorded, so there is no sampling setting to disable, because the thing does not exist. This is also why I keep the whole product deliberately simple: full data, no estimation layer to second-guess.
Why GA4 hides rows of your own data (structural, inside the reports)
This one genuinely upsets people the first time they hit it, because it looks like data loss when it is actually data hiding. GA4 collected the data. It just refuses to show you some of the rows.
There are two separate mechanisms, both documented by Google. The first is data thresholding. To stop you identifying individual people, GA4 withholds any row where the numbers are small, especially when you have certain Google features turned on. Google states flatly that "data thresholds are system defined. You can't adjust them." You will see your totals look right, then a breakdown comes up almost empty, with no clear warning why.
The second is the "(other)" row. GA4 caps how many rows a report will hold, and dumps everything past the cap into a single line labelled "(other)." Google's own example is brutal: a site with 150,000 pages and a 100,000-row limit has its least-common 50,000 pages collapsed into one "(other)" bucket. If you run a content site with a long tail, a chunk of your pages effectively vanish into that line.
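Here is a scaled-down Python sketch of what a row cap does to a long tail. It is a simplified model of the behavior described above, not GA4's actual logic: I assume the least-viewed pages are the ones folded away, and the function name and sample data are hypothetical.

```python
from collections import Counter


def apply_row_limit(page_views: dict, row_limit: int) -> dict:
    """Keep the top `row_limit` pages and fold the remainder into "(other)"."""
    ranked = Counter(page_views).most_common()
    kept = dict(ranked[:row_limit])
    overflow = ranked[row_limit:]
    if overflow:
        kept["(other)"] = sum(views for _, views in overflow)
    return kept


# Hypothetical site scaled down: 5 pages, row limit of 3.
views = {"/home": 900, "/pricing": 400, "/blog/a": 50, "/blog/b": 7, "/blog/c": 3}
print(apply_row_limit(views, row_limit=3))
```

The total still adds up, which is why the report looks healthy at a glance. What you lose is the ability to see which pages are inside the bucket.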
How do we know the data is really there and just hidden? Because GA4's raw BigQuery export, the underground pipe to the actual collected events, has none of this. No thresholding, no "(other)," no sampling. The rows are all sitting there. The hiding happens only in the interface you actually look at. That is the tell: it is a reporting limitation, not a collection one. If you have wrestled with the cousin of this problem, the "(not set)" label, I broke that down in (not set) in GA4.
Clickport does not threshold rows, because it has no individual identities to protect in the first place, so there is nothing to hide for privacy. It is the kind of problem that only exists once you have built your analytics on personal identifiers.
Why GA4 never matches Google Ads, Search Console, or your own database
If you have ever put GA4 next to Google Ads and watched the conversion counts disagree, you are not doing anything wrong. They are not supposed to match, and Google says so. The catch is they will not even tell you that up front. Short version: your own order or signup database is the closest thing to the truth, and every browser-based tool, GA4 included, sits below it. Here is why each one disagrees.
Start with the strangest one: GA4 disagrees with itself. The number on your GA4 screen does not even match GA4's own raw data export for the same dates. The on-screen version has extra layers piled on top: the same sampling, thresholding, and modeling from earlier, plus a quick-estimate shortcut Google uses to count very large tables fast. Google documents the precision as plus or minus 1.63 percent on session counts. Small, but it means the "exact" figure on your screen is itself an estimate.
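To see what plus or minus 1.63 percent means on a real-sized number, here is a tiny Python calculation. Reading the figure as a simple band around the reported value is my simplification (the documentation frames it statistically), and the function name is hypothetical.

```python
SESSION_PRECISION = 0.0163  # plus or minus 1.63 percent, as quoted above


def session_range(reported_sessions: int) -> tuple:
    """Lower and upper bound implied by the documented precision."""
    margin = reported_sessions * SESSION_PRECISION
    return round(reported_sessions - margin), round(reported_sessions + margin)


low, high = session_range(50_000)
print(low, high)
```

On a report showing 50,000 sessions, that band is roughly 800 sessions wide on either side.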
Then the cross-tool stuff, which all has real, boring reasons:
- Google Ads ties a conversion to the day of the click. GA4 ties it to the day of the conversion. Those can be up to 90 days apart, so the same sale lands in different weeks in different tools. Ads also strips out invalid clicks that GA4 happily counts.
- Search Console counts a click on the search results page, before the visitor even arrives. GA4 counts a session after your tracking script fires. Google explicitly says the two are not meant to match.
- Your database counts every real order. GA4 counts the ones whose browser tracking survived the journey, which loops us right back to consent, ad blockers, and payment redirects.
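The Google Ads versus GA4 date rule in the first bullet is the easiest one to demonstrate. Here is a minimal Python sketch with a made-up sale. The function name and dates are hypothetical, and it models only the reporting-date rule described above, nothing else about attribution.

```python
from datetime import date


def reporting_dates(click_day: date, conversion_day: date) -> dict:
    """Which day each tool files the same sale under, per the rule above."""
    return {"google_ads": click_day, "ga4": conversion_day}


# Hypothetical sale: ad clicked on 28 March, purchase completed on 9 April.
filed = reporting_dates(date(2025, 3, 28), date(2025, 4, 9))
for tool, day in filed.items():
    print(tool, day.isoformat(), "ISO week", day.isocalendar()[1])
```

Same sale, two different days, two different weeks, and in this case two different months. Any weekly or monthly comparison between the tools will disagree even when both counted the sale.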
Some of this is fixable in the sense that you can line up the attribution settings and date ranges to get the tools closer. The differences will never fully reconcile, though, and the reason GA4 sits below your database is structural.
Ad blockers and browsers: one limit is shared, one is GA4's alone
Now the honesty section, because here is where my own tool stops winning on one axis and keeps winning on another. There are two different things going on and people mash them together. Neither one fits the fixable-versus-structural split cleanly: one is a limit every tool shares, so it is not a reason to switch, and the other is GA4's alone.
The shared limit: ad blockers. A big slice of people run an ad blocker or a privacy browser that blocks the analytics script before it loads. Around 29.5 percent of internet users worldwide use an ad blocker, higher on desktop. Tools like uBlock Origin block Google Analytics by default, and Brave blocks trackers for all 100-million-plus users out of the box. Most marketing pages skip this part: an ad blocker blocks any client-side analytics script. It blocks GA4, and it blocks Clickport. Moving your tracker to your own domain helps a bit but does not reliably escape it. So on this one, no client-side tool is the hero. If a visitor blocks the script, nobody on the client side sees them. I would rather just tell you that.
The GA4-only limit: cookie expiry. This one Clickport actually does dodge, because it is a cookie problem and Clickport has no cookies. Safari's tracking prevention caps cookies set by script to 7 days, sometimes 24 hours. Firefox jails cookies per-site by default. For GA4, which recognizes returning visitors by a cookie, that means the same person gets counted as a brand-new visitor over and over, quietly inflating your "new users" and wrecking any returning-visitor analysis. A cookieless tool has no client-side cookie to expire, so this entire class of decay simply does not apply.
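Here is a small Python simulation of that inflation for a single returning person. It is a deliberately simplified model: I assume every visit refreshes the cookie and that a cookie without a cap effectively never expires. The function name and visit days are hypothetical.

```python
def count_new_users(visit_days, cookie_lifetime_days=None):
    """Count how many visits by ONE person get recorded as a new user.

    visit_days: sorted day numbers on which the same person visits.
    cookie_lifetime_days: None means the cookie never expires in practice;
    7 mimics the Safari cap described above. Each visit refreshes the cookie.
    """
    new_users = 0
    last_visit = None
    for day in visit_days:
        expired = (
            last_visit is None
            or (cookie_lifetime_days is not None
                and day - last_visit > cookie_lifetime_days)
        )
        if expired:
            new_users += 1
        last_visit = day
    return new_users


visits = [0, 3, 15, 30, 33]  # hypothetical: one person, five visits
print(count_new_users(visits))                          # 1
print(count_new_users(visits, cookie_lifetime_days=7))  # 3
```

One real person turns into three "new users" under the 7-day cap, simply because two of the gaps between visits were longer than a week.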
Ad blockers: Clickport blocked
Cookie expiry: Clickport not affected (no cookies)
So the honest line is this. I cannot beat an ad blocker, and neither can GA4. But unlike GA4, Clickport does not quietly rot on Safari.
The fixable stuff is real, but fixing it isn't the answer
Time to be fair to GA4. A good chunk of the inaccuracy people complain about really is self-inflicted and really is fixable. If your GA4 looks wrong, check these before you blame the tool:
- Duplicate tags. The tag installed twice (theme plus a plugin plus Tag Manager) double-counts everything. That "GA4 shows 500 percent more sessions" horror story? Usually this, or bots.
- Messy UTMs. A UTM is the little tag you add to a link (the utm_source=newsletter bit) so GA4 knows where a click came from. Type Newsletter on one link and newsletter on another and they become two different sources, and inconsistent tags scatter your traffic.
- Missing filters. No internal-traffic filter means your own team's visits count. No referral exclusions means your payment provider shows up as a "source."
- The 14-month wall. GA4 defaults to deleting your data after a set window. I covered that trap in GA4 data retention.
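The UTM problem from the list above is the one you can reproduce in a few lines. This Python sketch uses only the standard library. The URLs and the function name are hypothetical, and the normalization rule (trim and lowercase) is one reasonable convention, not something GA4 does for you.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse


def utm_source(url: str, normalize: bool) -> str:
    """Extract utm_source from a URL, optionally trimmed and lowercased."""
    values = parse_qs(urlparse(url).query).get("utm_source", ["(none)"])
    source = values[0]
    return source.strip().lower() if normalize else source


# Hypothetical landing URLs from the same newsletter campaign.
urls = [
    "https://example.com/?utm_source=Newsletter",
    "https://example.com/?utm_source=newsletter",
    "https://example.com/?utm_source=newsletter",
]

print(Counter(utm_source(u, normalize=False) for u in urls))
print(Counter(utm_source(u, normalize=True) for u in urls))
```

The first count splits one campaign into two sources. The second shows what you get if every link is built with the same lowercase convention, which is the practical fix: normalize when you create the links, not after the data has landed.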
Fix all of those and GA4 reports more cleanly. I mean that. But here is the verdict no one else seems willing to write down plainly: when you have fixed every fixable thing, you are still left with the structural floor. Real-world studies put that leftover loss at roughly 10 to 30 percent in normal conditions, and over 50 percent with a consent banner, while Google's own documentation explains the machinery behind it: sampling, thresholding, and modeled data. You cannot configure your way out of consent rejection, cookie expiry, bot blindness, sampling, thresholding, or modeled data. They are the product.
Fixable:
✓ Inconsistent UTMs
✓ Missing internal/referral filters
✓ Short data retention
✓ Cross-domain setup (across two of your own domains)
Structural:
✗ Cookie expiry on Safari/Firefox
✗ Bot blindness
✗ Sampling
✗ Thresholding and "(other)"
✗ Modeled data shown as fact
Want to see roughly how much the structural floor is costing you specifically? We built a free GA4 data loss estimator that takes your traffic mix and ballparks how much GA4 is likely missing from ad blockers, consent, internal limits, and bots. No signup, takes a minute.
Where every analytics tool, including Clickport, is still limited
I am not going to end on a sales pitch that pretends my tool is perfect, because it is not, and you would rightly stop trusting the rest of this article. Three honest limits that apply to any tool like mine:
- Ad blockers can block Clickport too. Same as GA4. Our own docs put the potential loss anywhere from 6 to 60 percent depending on audience. A first-party setup (serving the tracker from your own domain) reduces it. Nothing eliminates it.
- Cookieless means no cross-session identity. Because Clickport sets no cookies, it cannot follow the same person across visits. That means no returning-visitor reports, no lifetime-value, no multi-week customer journeys. That is a deliberate trade for privacy, not an oversight, but if those reports are your whole world, know it going in.
- Soft 404s fool everyone. A broken page that returns a normal "OK" status to the browser is invisible to any client-side tool, mine included. We do auto-catch real 404s that GA4's setup misses on 39 percent of sites we tested, but the soft kind is a shared blind spot.
That is the full picture. Not "Clickport sees everything." Clickport sees the structural stuff GA4 is built to miss, and shares the handful of limits that are genuinely just physics.
So should you trust GA4, or switch?
Here is the decision, stripped down.
If the inaccuracy bothering you is in the fixable column (duplicate tags, messy UTMs, missing filters), then fix it and stay on GA4. It is free and it is fine for a directional view once it is cleaned up. Switching tools to solve a duplicate-tag problem is using a moving truck to carry a grocery bag.
But if what is breaking your decisions is the structural floor (the consent loss, the bots you cannot see, the sampling, the hidden rows, the guessed conversions), then no amount of GA4 configuration recovers it, because that is how GA4 is built. That is the actual switch trigger. Not annoyance. Architecture.
And notice the two numbers GA4 can never give you about itself: how much it loses to consent, and how much of your traffic is bots. A cookieless tool with real bot detection can show you both. That is the difference between a tool that hides its own blind spots and one that puts them on the dashboard.
If you have read this far, you already half-suspect your numbers are off. They probably are, in both directions. You can try Clickport free (no cookie banner, no sampling, no hidden rows) and see the gap for yourself. Or if you just want the lay of the land first, here is my honest GA4 versus Clickport comparison and a roundup of the main GA4 alternatives. I answer every email, so if your numbers look strange and you cannot work out why, write to me.
