Track 404 Pages in GA4: Why 39% Get Missed Silently

  1. Why GA4 doesn't auto-detect 404s
  2. The standard GTM setup (and how it actually works)
  3. We tested it on 87 popular websites. 39% slipped through.
  4. The soft 404 problem
  5. Single-page apps make it worse
  6. The redirect-to-error-page anti-pattern, observed live
  7. Non-English titles silently break the trigger
  8. What auto-detection at the tracker layer looks like
  9. From detection to fix: the session drill-down
  10. How other tools handle it (they mostly don't)
  11. FAQ
  12. Stop maintaining a 404 tracking pipeline

Every GA4 tutorial recommends the same Tag Manager trigger: fire on pages whose title contains "404" or "not found." We tested that recipe against 87 popular websites. It would have caught 53. The other 34, including major news outlets, government sites, and ecommerce platforms, would slip through silently. And that's before you get to soft 404s and single-page apps. This is one of the most-asked GA4 problems for a reason.

Key Takeaways
  • GA4 does not auto-detect 404 pages. Enhanced Measurement covers ten interactions (scrolls, downloads, video, forms, outbound clicks) but not 404s. The recommended fix is a Google Tag Manager trigger that fires on pages whose title contains '404' or 'not found.'
  • We tested that trigger against 87 popular websites in May 2026. It would have caught 53 of them. The other 34 silently slipped through, including major news outlets, government sites, and SaaS platforms.
  • Soft 404s (HTTP 200 with not-found content) break the recipe entirely. 6.9% of sites in the audit returned HTTP 200 for a guaranteed-nonexistent URL, including segment.com, vercel.com, and sueddeutsche.de.
  • Non-English titles break English-only regex matching. At least 11 sites returned localized 404 messages: German, French, Spanish, Italian, Norwegian, and Finnish. A multi-language fallback recovers most of them.
  • Auto-detection at the tracker layer plus a session drill-down is the workflow that closes the loop: see the broken URL, open the session that hit it, look at the page right before the 404 to find what linked to it, redirect, done. One-line install with Clickport.

Why GA4 doesn't auto-detect 404s

GA4 ships with a feature called Enhanced Measurement that auto-tracks ten interactions out of the box. None of them are 404s. The structural reason: a 404 is an HTTP response code. JavaScript runs after the response. By the time GA4's tracker loads, the status code is gone. There's nothing for it to read.

GA4 Enhanced Measurement: what fires automatically
page_view
scroll (90%)
click (outbound)
file_download
view_search_results
video_start
video_progress
video_complete
form_start
form_submit
404 (not on the list)
Source: Google Analytics Help, Enhanced Measurement reference, accessed May 2026.

That's the full list. Scroll, click, search, video, file, form. All of them are DOM interactions that gtag can intercept by attaching event listeners. None of them require knowing the HTTP response code, which is why 404 detection isn't there. My read: this isn't an oversight. It's a feature category Google decided not to build.

I've looked for an official Google statement explaining the omission. I can't find one. The GA4 release notes don't mention 404s. Search Central nudges you toward the Page Indexing Report in Search Console, which is fine for SEO triage but only covers what Googlebot crawled. Real-user 404 traffic from your email links, your app redirects, your stale bookmarks: invisible to GSC. You're on your own.

The standard GTM setup (and how it actually works)

Every popular GA4 tutorial converges on the same recipe: a Google Tag Manager trigger that fires when the document title contains "404" or "not found." The shape of the setup is identical across Analytics Mania, MeasureSchool, and Bounteous. Here's the full process.

You start in GTM. You create a JavaScript Variable that reads document.title (it's not a built-in variable, which trips a lot of people up). Then a Page View trigger that fires on "Some Page Views," with the condition {{JS - Page Title}} contains 404. Then a GA4 Event tag with your Measurement ID, event name page_not_found, parameters for page_location and page_referrer so you can later trace which links sent users to the broken URL. You preview in debug mode, hit a real 404 URL on your site, confirm the tag fires, then publish the container.
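
Expressed as plain JavaScript, the trigger-plus-tag logic amounts to roughly the following. This is a sketch of the recipe's behavior, not GTM's internals: buildNotFoundEvent is a hypothetical helper name, and the browser wiring is left in comments so the matching logic can be checked on its own.

```javascript
// Sketch of what the GTM recipe does: match the title, then build the event.
// buildNotFoundEvent is a hypothetical helper, not part of gtag or GTM.
function buildNotFoundEvent(title, pageLocation, pageReferrer) {
  const lowered = String(title || "").toLowerCase();
  // Same condition as the tutorial trigger: title contains "404" or "not found".
  const matches = lowered.includes("404") || lowered.includes("not found");
  if (!matches) {
    return null; // Trigger does not fire; nothing is sent.
  }
  return {
    name: "page_not_found",
    params: { page_location: pageLocation, page_referrer: pageReferrer },
  };
}

// Browser wiring, assuming gtag.js is already loaded on the page:
// const evt = buildNotFoundEvent(document.title, location.href, document.referrer);
// if (evt) gtag("event", evt.name, evt.params);
```

A title with no "404" or "not found" in it, such as "We can't find that page", returns null here, which is the silent failure the rest of this article is about.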

The official GA4 docs don't walk you through any of this. The closest thing to a Google-blessed alternative is firing a gtag('event', 'exception', { description: '...', fatal: false }) from a <script> tag injected into your 404 template, but the docs describe exception as a crashes-and-errors event, not a page-level signal. The community settled on the GTM page-title approach because it doesn't need developer access and works with whatever 404 template your CMS already has.

That's the theory. In practice, the trigger fires on the wrong pages, misses the right ones, doesn't catch soft 404s, breaks on SPAs, and silently fails on any site whose 404 page isn't in English. It also takes up to 48 hours for GA4 to surface the data after you publish the trigger, which makes the test-and-verify loop slow and demoralizing.

The bigger problem is that the recipe assumes every 404 page on every website does what the tutorial says it should. We checked.

I wrote a small Node script that hits a guaranteed-nonexistent URL on a curated list of well-known sites. The list spans SaaS (Stripe, Notion, Slack, Vercel), ecommerce (Shopify, Etsy, Patagonia, Nike), news (NYT, Guardian, Reuters), government (NASA, CDC, IRS), WordPress publishers, open-source foundations, privacy-focused tools, and a deliberate slice of European-language sites (German, French, Spanish, Italian, Norwegian, Finnish). 91 sites probed. 87 returned a response I could analyze.

For each site, I followed redirects, captured the final HTTP status, captured the server-rendered <title>, and scanned the response body for "not found" phrases in eight languages. The methodology is checked into the repo so anyone can reproduce it. The findings are unambiguous.
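
The audit script itself lives in the repo. The following is only an illustrative sketch of the same idea, assuming Node 18+ (for the global fetch) and a made-up probe path. The function names and the abbreviated phrase list are placeholders, not the ones used in the audit.

```javascript
// Illustrative sketch only; not the audit script from the repo.
// Assumes Node 18+ for the global fetch API.
const NOT_FOUND_PHRASES = ["not found", "nicht gefunden", "introuvable"]; // abbreviated, hypothetical list

// Pull the server-rendered <title> out of raw HTML (empty string if absent).
function extractTitle(html) {
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return match ? match[1].trim() : "";
}

// Classify one response along the lines described above.
function classify(status, html) {
  const title = extractTitle(html);
  const body = html.toLowerCase();
  return {
    status,
    title,
    // Would the standard trigger (title contains "404" or "not found") fire?
    caughtByTitleTrigger: /404|not found/i.test(title),
    // HTTP 200 plus not-found wording in the body suggests a soft 404.
    looksLikeSoft404: status === 200 && NOT_FOUND_PHRASES.some((p) => body.includes(p)),
  };
}

// Probe a site with a path that should not exist; fetch follows redirects by default.
async function probe(origin) {
  const url = new URL("/this-page-should-not-exist-" + Date.now(), origin);
  const res = await fetch(url, { redirect: "follow" });
  return classify(res.status, await res.text());
}
```

A phrase scan like this only flags soft 404s that still say "not found" somewhere in the body. A site that answers 200 with its homepage or a login page needs a manual look.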

Standard GA4 + GTM 404 trigger, tested in the wild: 39% of popular sites slip through silently.
Source: 87 sites analyzed (91 probed) across SaaS, ecommerce, news, government, EU publishers. May 2026.

34 of 87 sites had a 404 page whose server-rendered <title> did not contain "404" or "not found." The standard GTM trigger fires only when the title matches that string. On those 34 sites, it silently does nothing. The list isn't a parade of obscure indie blogs. It's Slack ("There's been a glitch…"), Intercom (kept its homepage title), Netlify (kept its homepage title), Etsy ("Etsy - Your place to buy and sell..."), Reuters ("We can't find that page"), and The Atlantic ("The Atlantic"). Six sites returned an empty <title> for the bad URL, which is even worse: there's nothing to match against.

Put another way: if you implemented the most-recommended 404 tracking method on a site that aggregates traffic from these brands, you'd undercount your broken-link hits by roughly 4 in 10. You wouldn't know it. The dashboard would just look low.

This is one of those findings where the data isn't surprising once you see it, but it's also not what the tutorials say. The tutorials assume every 404 page sets a title that contains the literal string "404." The web doesn't do that.

The soft 404 problem

The deeper failure isn't title formatting. It's HTTP status codes. A soft 404 is a page that tells the user "this content doesn't exist" while the server returns HTTP 200. Google's official definition is exactly that: "a URL that returns a page telling the user that the page does not exist and also a 200 (success) status code."

WHAT THE USER SEES
A "Page Not Found" page. Sad emoji, a search box, a link back home. The browser address bar still shows the URL they tried to reach.
WHAT GA4 SEES
A pageview event. Status: 200. Page title: whatever the template put there. No flag for "this was a 404." Indistinguishable from a real visit.

In the 87-site audit, 6.9% of sites returned HTTP 200 for a clearly nonexistent URL. That includes segment.com (rendered Twilio's "Page not found" page), vercel.com (redirected to login), patagonia.com (a checkout-queue holding page), startpage.com (an explicit "Page Not Found" template, but with status 200), hackernoon.com (kept the homepage shell), and sueddeutsche.de (kept the homepage's title). Six sites in 87 means roughly one in fifteen popular websites delivers soft 404s by default.

Why this matters for tracking: client-side analytics tools, including GA4, Plausible, and Fathom, have no access to HTTP response codes. The browser receives the response, JavaScript runs, the analytics tag fires. By that point the status code is gone. The only signal left is the page title or body content, and on a soft 404 those signals are sometimes deliberately hidden by the CMS to keep the user calm.

There's a real SEO cost to this beyond the analytics gap. Google's Gary Illyes has confirmed that soft 404s still consume crawl budget even when returning 200 OK: "crawlers use the status codes to interpret whether a fetch was successful, even if the contents of the page is basically just an error message." Put another way: every soft 404 burns one of the slots Googlebot would otherwise spend on a real page. Glenn Gabe at GSQi has documented cases where redirected URLs were treated as soft 404s, causing ranking and traffic loss because the link equity the redirect should have passed wasn't passed.

The GTM dataLayer workaround, where you add a dataLayer.push({event:'page_not_found'}) directly inside your 404 template, can catch soft 404s if you also explicitly instrument the soft-404 path. But that's circular: if your server already knows the page is a soft 404, the right fix is to return a real 404 status code, not to keep returning 200 and patch around it in analytics.

Single-page apps make it worse

The page-title trigger assumes the browser performs a full page load when the visitor hits a bad URL. Single-page apps don't work that way. They intercept the route with history.pushState, render a 404 component client-side, and never reload. Three things go wrong at once.

First, the server returns HTTP 200 by default for every SPA route, including invalid ones, because the server's job is just to deliver the SPA shell. Every SPA 404 is a soft 404 unless the framework explicitly opts out. Second, GA4's "Page changes based on browser history events" toggle (part of Enhanced Measurement) fires page_view on every pushState, including transitions into the 404 component. The 404 lands in GA4 as an indistinguishable pageview. Third, the page title at the moment GA4 reads it depends on whether the framework's router updated document.title before the analytics tag re-fires. Most don't, by default.

The fix is framework-specific and tutorials almost never cover it.

SPA framework: where the 404 trigger fails
Framework and failure mode:
  • React Router v6: Catch-all route renders a 404 component with no document.title update. The page-title trigger sees the previous page's title.
  • Next.js App Router: not-found.tsx is a Server Component by default, so client-side gtag calls require a 'use client' wrapper. Streamed responses return 200, not 404.
  • Vue + Vue Router: Catch-all :pathMatch(.*)* fires history events but the title trigger needs a router.afterEach hook to update title before the GA4 event.
  • Angular Router: If send_page_view: false isn't set on the gtag config, every NavigationEnd handler fires a duplicate page_view. The wildcard 404 route inherits this, so 404s land as a doubled event.
  • SvelteKit 2: SvelteKit 2's shallow-routing change broke GA4's automatic pageview detection on client navigations (issue #11499, opened then closed in early 2024 with a recommended workaround). 404s still require separate onMount instrumentation inside +error.svelte.
Sources: framework docs, plus the linked GitHub issues. Verified May 2026.

SvelteKit's the messiest story of the bunch. GA4's auto-firing on pushState was broken by v2's shallow-routing patch, the documented workarounds use the reactive $page store (which can double-fire on certain transitions), and the 404 case in +error.svelte needs its own onMount block because no framework-level hook intercepts errors. If you've got a SvelteKit site and you've been wondering why your 404 numbers are always zero, that's the reason. See also our docs on SPA tracking for the broader pattern.
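
A framework-agnostic sketch of the usual fix: when the router renders its 404 component, update the title first and then send an explicit event, instead of relying on the history-change pageview. reportSpaNotFound is a hypothetical helper. The document, location, and send function are passed in so it can be called from a React useEffect, a Vue router.afterEach hook, or a Svelte onMount block.

```javascript
// Hypothetical helper to call once when the SPA's 404 component mounts.
// doc, loc, and send are injected so the same function works in any framework.
function reportSpaNotFound(doc, loc, send, previousUrl) {
  // Update the title before any analytics call reads it.
  doc.title = "404 - Page not found";
  // Send an explicit event instead of relying on the pushState page_view.
  send("event", "page_not_found", {
    page_location: loc.href,
    // document.referrer does not change on client-side navigation,
    // so the caller passes the previous in-app URL when it knows it.
    page_referrer: previousUrl || doc.referrer,
  });
}

// In the browser, assuming gtag.js is loaded and the router tracks the last URL:
// reportSpaNotFound(document, window.location, window.gtag, lastUrl);
```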

The redirect-to-error-page anti-pattern, observed live

This one is worth covering because it surfaced from real customer data we almost didn't see ourselves. We recently fixed a query bug in our own dashboard that had been hiding real-visitor 404s from a panel. When the data came back, one customer site, a high-traffic B2B media platform, immediately stood out.

Over 36 days, that site accumulated 2,385 events flagged as 404s, spread across 1,068 distinct broken URLs. Real backlinks pointing to dead pages, internal links to deprecated products, typo paths, the long tail of a content-heavy publisher running for two decades. The expected shape.

What was unexpected: 1,041 of those hits, 43.6% of the total, collapsed onto just two pathnames. Both were variants of a single /PageNotFoundError/ URL that the site's CMS redirects every dead link to via a server-side 301. If you were the one looking at this site's dashboard, those 1,041 hits would look like a single row. The original URL the visitor tried to reach is gone. The referrer that sent them there is also gone, replaced by the redirect chain.

36 days of 404 traffic on one customer site
Funneled into 2 redirect paths
1,041 hits. Original URL lost.
Tracked with original URL intact
1,344 hits across 1,066 individually-tracked URLs
Source: aggregate query against one Clickport customer site, March 27 - May 2, 2026. Site identity withheld.

I think this pattern is more common than people realize. Plenty of CMSes ship with "redirect 404s to a friendly error page" as a default, sometimes with a help-desk article telling administrators it's a best practice. From a UX angle it's defensible. From an analytics angle it's a complete blackout: you know somebody hit something broken, you have no idea what.

The fix on the site side is to render the error template at the requested URL, not redirect. That preserves both the failed URL and the referrer. If you can't change the redirect, the next-best signal is the in-site session before the 404 fired, which is what the session drill-down is for, and which we'll get to in a moment.
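
On the server, the difference between the two behaviors is small. Here is a minimal sketch with a hypothetical route table and placeholder templates, wired to Node's built-in http module in the comments. It serves the error page at the requested URL with a real 404 status instead of issuing a 301 to a shared error path.

```javascript
// Minimal sketch; KNOWN_PAGES and the templates are hypothetical placeholders.
const KNOWN_PAGES = new Map([["/", "<title>Home</title><h1>Home</h1>"]]);

// Decide the response for a path without redirecting missing pages anywhere.
function resolveResponse(pathname) {
  if (KNOWN_PAGES.has(pathname)) {
    return { status: 200, headers: { "Content-Type": "text/html" }, body: KNOWN_PAGES.get(pathname) };
  }
  // Render the error template in place: the failed URL stays in the address bar,
  // the referrer is untouched, and the status code is a real 404.
  return {
    status: 404,
    headers: { "Content-Type": "text/html" },
    body: "<title>404 - Page not found</title><h1>Page not found</h1>",
  };
}

// Wiring with Node's built-in http module:
// const http = require("node:http");
// http.createServer((req, res) => {
//   const { pathname } = new URL(req.url, "http://localhost");
//   const out = resolveResponse(pathname);
//   res.writeHead(out.status, out.headers);
//   res.end(out.body);
// }).listen(3000);
```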

Non-English titles silently break the trigger

The standard GTM trigger condition is Page Title contains 404 or Page Title contains not found. Both strings are English. If your visitors arrive on a German error page that says "Seite nicht gefunden," the trigger doesn't fire. Every 404 in that locale is invisible to your own tracking.

The audit picked this up clearly. At least 11 sites in the 87 returned a localized 404 title. Some included "404" or English text alongside the localized string and would still match. Others didn't. Here's the split.

Localized 404 titles in the 87-site audit
Caught by English regex
spiegel.de: "DER SPIEGEL - Fehler 404"
repubblica.it: "La Repubblica | 404"
abc.es: "Página no encontrada (error 404) - ABC"
Silently missed
lefigaro.fr: "Page introuvable"
aftenposten.no: "Fant ikke siden"
nrk.no: "NRK beklager! Vi kan ikke finne siden..."
hs.fi: "Sivua ei löydy | HS.fi"
rtl.de: "Seite nicht gefunden | RTL.de"
Source: Clickport audit of 91 popular websites, May 2026.

Three sites caught their localized 404 because a copywriter happened to put "404" or "Fehler 404" in the title. Five sites are completely invisible to the standard trigger. If you serve traffic in any of those languages and you set up GA4 by following an English-language tutorial, you're undercounting in those markets.

The fix is mechanical: replace the GTM contains condition with a regex matching every locale you ship in. A reasonable starting set:

/(404|not found|nicht gefunden|introuvable|non trouv|no encontrad|non trovat|ikke funnet|hittades inte|löytynyt|nenalezeno|页面未找到|存在しません)/i
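
Before publishing a pattern like this, it is worth running it against the titles your error pages actually produce. The sketch below does that for the titles quoted in the audit table. The EXTENDED pattern is an illustrative assumption about phrases that could be added, not a vetted list.

```javascript
const ENGLISH_ONLY = /404|not found/i;
const STARTING_SET = /(404|not found|nicht gefunden|introuvable|non trouv|no encontrad|non trovat|ikke funnet|hittades inte|löytynyt|nenalezeno|页面未找到|存在しません)/i;
// Hypothetical additions covering phrasings seen in the audit titles.
const EXTENDED = new RegExp(STARTING_SET.source + "|fant ikke|ikke finne|ei löydy", "i");

// Titles as quoted in the audit table.
const TITLES = [
  "DER SPIEGEL - Fehler 404",
  "La Repubblica | 404",
  "Página no encontrada (error 404) - ABC",
  "Page introuvable",
  "Fant ikke siden",
  "NRK beklager! Vi kan ikke finne siden...",
  "Sivua ei löydy | HS.fi",
  "Seite nicht gefunden | RTL.de",
];

// Return the titles a pattern fails to match.
function missedBy(pattern, titles) {
  return titles.filter((title) => !pattern.test(title));
}

console.log("English only misses:", missedBy(ENGLISH_ONLY, TITLES));
console.log("Starting set misses:", missedBy(STARTING_SET, TITLES));
console.log("Extended misses:", missedBy(EXTENDED, TITLES));
```

Run against these eight titles, the starting set picks up "Page introuvable" and "Seite nicht gefunden" but still misses the two Norwegian titles and the Finnish one, because their wording differs from "ikke funnet" and "löytynyt". Treat the list as something to tune against your own templates rather than a finished pattern.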

For sites with empty titles or homepage-equivalent titles, no regex helps. Those sites need template-level instrumentation: a dataLayer push from inside the 404 template, or an explicit flag set by the page that renders. That is most of the value of building auto-detection into the tracker itself, instead of bolting it on with GTM.

What auto-detection at the tracker layer looks like

Clickport's tracker checks the page title against a regex (/404|not found/i) on every pageview. If it matches, the event is flagged with is_404 = 1 and surfaces in the Pages panel under the "404" sub-tab. That gets you to the same place as the GTM trigger, with no GTM, no published-and-wait cycle, no dataLayer push, no JavaScript variable to declare. But it has the same blind spots: soft 404s with sanitized titles, non-English titles, and SPAs with stale document.title.

The override fixes those. In your 404 template, set:

<script>
  window.cpConfig = window.cpConfig || {};
  window.cpConfig.is_404 = true;
</script>

before the tracker loads. Now the 404 detection doesn't depend on the title at all. Soft 404s, non-English titles, SPA error components: all flagged correctly because the page told the tracker explicitly. One line per template. One time. Documented in script configuration.
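
Conceptually, the decision reduces to a few lines. The sketch below is illustrative and is not the tracker's source: detect404 is a hypothetical name, and only the /404|not found/i pattern and the cpConfig.is_404 flag come from the description above.

```javascript
// Illustrative only; not the tracker's real code.
function detect404(title, config) {
  // An explicit flag from the page wins over any title heuristic.
  if (config && config.is_404 === true) {
    return 1;
  }
  // Otherwise fall back to the title check.
  return /404|not found/i.test(String(title || "")) ? 1 : 0;
}

// In the browser this would be called as:
// detect404(document.title, window.cpConfig);
```

The ordering is the design choice that matters: the explicit flag is checked first, so a localized or sanitized title can't hide a 404 the template has already declared.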

That's still not a perfect setup. If the soft 404 is something you don't control, like a SaaS app embedded on your site, the override doesn't reach it. If your SPA framework returns a 404 component without rendering your template, the override doesn't load. The honest framing: auto-detection plus a one-line override catches the union of "the title looks like a 404" and "you know where to put the override," which on most sites is the overwhelming majority of broken-URL hits. Soft 404s on third-party content remain a structural blind spot for any client-side analytics tool.

What it doesn't ask you to do: maintain a GTM container, declare a JavaScript variable, register a custom dimension, wait 48 hours for the event to populate, or migrate your custom report from UA Custom Reports to GA4 Explorations when Google deprecates the surface (which Google did, in July 2023). The detection is in the tracker. The dashboard reads it.

If this sounds simpler, that's the point. Start a free 30-day trial and see your real 404 traffic in roughly the time it takes to read this paragraph.

From detection to fix: the session drill-down

Knowing that /products/discontinued-item/ got 47 hits last month is half a finding. The other half is which page on your site links to that broken URL, so you can fix the actual link. Without that signal, every 404 in your report is a triage task you have to manually trace by clicking around your own site.

Clickport's Pages panel ties the 404 sub-tab into the dashboard's cross-filter. Click any broken-URL row and the entire dashboard filters to sessions that hit that pathname. Open the Sessions tab in the same view and you see the sessions that landed on the 404, sorted by recency. Each session expands into its full page sequence: every page the visitor visited in order, plus the source channel, the device, the country, the entry point, and engagement before the broken hit. The page right before the 404 is almost always the one that linked to the dead URL, even if the dashboard doesn't label it that way. Most of the time you can fix it in three clicks: dashboard, session, page editor.
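
The same trace can be expressed over raw session data. This sketch assumes a hypothetical export in which each session is an ordered array of pageviews with path and is_404 fields. That layout is an assumption for illustration, not a documented export format. It counts which page was viewed immediately before each hit on a broken URL.

```javascript
// Hypothetical data shape: sessions = [[{ path, is_404 }, ...], ...] in visit order.
function pagesLinkingTo(brokenPath, sessions) {
  const counts = new Map();
  for (const pageviews of sessions) {
    pageviews.forEach((view, i) => {
      if (view.is_404 !== 1 || view.path !== brokenPath) return;
      // A 404 that opens the session was reached from outside the site.
      const previous = i > 0 ? pageviews[i - 1].path : "(external entry)";
      counts.set(previous, (counts.get(previous) || 0) + 1);
    });
  }
  // Most frequent predecessor first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}
```

The page at the top of the result is the first candidate for a link fix. A high "(external entry)" count points toward a redirect instead.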

For external broken links, the session shows the referrer's domain and channel. If a popular external article points to a URL on your site that died, that's the case for a 301 redirect rather than a fix-the-link-on-our-end change. Same drill-down, different conclusion.

This workflow is the actual reason 404 tracking matters. The number is interesting. The trace is what gets the link fixed. GA4's standard Exploration view, set up correctly, can produce something similar, but you'd need a custom Exploration with the right dimensions and a manual filter, repeated each time you want to investigate. With the cross-filter, the same investigation is a 30-second click on any row.

How other tools handle it (they mostly don't)

I checked the docs of seven privacy-friendly analytics tools to see how each handles 404 tracking. None auto-detect. All require some form of manual instrumentation on the 404 template. The setup ranges from "drop in a one-line script call" (Plausible, Fathom, Pirsch) to "improvise a custom event because there's no documented pattern" (Simple Analytics, Umami, PostHog).

Plausible's docs recommend plausible('404') on DOMContentLoaded plus a goal definition. Fathom's docs suggest dynamically setting the event name to 404: [pathname], which encodes the URL into the event identifier. Matomo's FAQ shows _paq.push(['setDocumentTitle', '404/URL = ' + encodeURIComponent(document.location.pathname + document.location.search) + ' /From = ' + encodeURIComponent(document.referrer)]), which is the same template-injection pattern with referrer encoding. Pirsch's docs call pirschNotFound() and explicitly note: "there is no way for the script to figure out if the content was found or not." That sentence is the cleanest summary of why client-side 404 detection is a hard problem.

The differences are surface-level. All of them require you to know this is a category of work in the first place, find the right docs, edit the right template, deploy, and then check whether the events show up. None ship with detection in the tracker. I chose a different default because I think 404 visibility is the kind of thing your analytics tool should do without asking you to remember.

FAQ

Why doesn't GA4 track 404s automatically?

A 404 is an HTTP response code. JavaScript runs after the response. By the time the GA4 tracker fires, the status code is no longer accessible. Enhanced Measurement only auto-tracks DOM interactions (scroll, click, form, video, file download, search) because those are observable from JavaScript at runtime. Adding 404 detection would require either a server-side hook that Google chose not to ship, or a client-side regex on document.title, which is exactly the brittle thing the rest of this article describes.

How do I track 404s in GA4 without GTM?

Two options. The first is to set a custom event in GA4's Admin panel: Admin > Events > Create event, name it page_not_found, condition event_name = page_view AND page_title contains 404. That's a remapping rule, not new instrumentation. It only works on data already collected with a matching title. The second is to inject a gtag('event', 'page_not_found', {page_location: window.location.href, page_referrer: document.referrer}) call directly into your 404 template. That requires developer access but doesn't depend on the title. Both options miss soft 404s and SPAs.

Does GA4 detect soft 404s?

No. Soft 404s return HTTP 200, so they look identical to a successful pageview from any client-side tool's perspective. Google Search Console lists soft 404s as a separate status in its Page Indexing Report, but GSC only sees pages Googlebot crawled. Real-user soft 404 traffic from email links, app redirects, and bookmarks is invisible to both GA4 and GSC. The structural fix is server-side: configure your CMS to return a real 404 status code for missing content.

How do I track 404s in a Next.js or React app?

In Next.js App Router, mark your not-found.tsx (or a child component of it) as 'use client', then use useEffect plus usePathname to fire window.gtag('event', 'page_not_found', { page_path: path }). In React Router v6, fire the event from inside the catch-all route component's useEffect. Both depend on the framework rendering the 404 component to begin with, which it doesn't always do for streamed responses or for routes that fall through to a generic catch-all without a status code change. See SPA tracking docs for the broader pattern.

Why does my 404 trigger fire on every page?

The most common cause is the Page Title contains 404 condition matching legitimate pages whose title happens to include the string "404," like a blog post titled "What HTTP 404 actually means" or a product titled "Model 404 Audio Interface." Padding the pattern with spaces, as in matches RegEx ^.*( 404 |Page Not Found| not found).*$, doesn't help: both of those titles still contain " 404 " and still match. If your trigger uses contains, switch to matches RegEx with a pattern anchored to the exact title your 404 template produces, for example ^(404\b|Page Not Found). Or use the dataLayer approach where the 404 template explicitly pushes the event, which avoids title matching entirely.
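
Whichever pattern you choose, test it against titles that should and should not match before publishing. In the sketch below, STRICT is a hypothetical example for a template whose title starts with "404" or "Page Not Found". Adapt it to what your own template emits.

```javascript
// Plain "contains 404".
const CONTAINS_404 = /404/;
// A space-delimited variant.
const SPACED = /^.*( 404 |Page Not Found| not found).*$/;
// Hypothetical stricter pattern: the title must start with the error phrase.
const STRICT = /^(404\b|Page Not Found)/i;

const LEGIT_TITLES = ["What HTTP 404 actually means", "Model 404 Audio Interface"];
const ERROR_TITLES = ["404 - Page Not Found", "Page Not Found | Example Store"];

// Legitimate titles the pattern wrongly matches.
function falsePositives(pattern) {
  return LEGIT_TITLES.filter((title) => pattern.test(title));
}

// Real error titles the pattern fails to match.
function missedErrors(pattern) {
  return ERROR_TITLES.filter((title) => !pattern.test(title));
}

for (const [name, pattern] of [["contains", CONTAINS_404], ["spaced", SPACED], ["strict", STRICT]]) {
  console.log(name, { falsePositives: falsePositives(pattern), missed: missedErrors(pattern) });
}
```

With these sample titles, only the anchored pattern rejects both legitimate pages while still matching both error titles.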

Why are my 404 numbers different from Search Console's?

GSC counts URLs Googlebot tried to crawl that returned 404 (or that Google identified as soft 404s). GA4 counts pageviews from real users that hit your 404 template. The two datasets overlap but don't equal each other. Googlebot's crawl includes URLs nobody has visited yet. User traffic includes app links, email clicks, and bookmarks that Googlebot may never crawl. Use GSC to find broken URLs you've forgotten to redirect; use a real-time analytics view to find URLs that are actively bleeding traffic right now.

What's the simplest setup that actually works?

If you control the 404 template, add a one-line config flag in the template (window.cpConfig.is_404 = true for Clickport, dataLayer.push({event:'page_not_found'}) for GTM) so detection doesn't depend on the title. If you can't edit the template, fall back to the regex-on-title approach with a multi-language pattern, and accept that soft 404s and SPAs will silently undercount. There's no setup that catches everything client-side, because some of the failure modes are structural.

Stop maintaining a 404 tracking pipeline

You shouldn't have to declare a JavaScript variable, write a regex, register a custom dimension, and wait 48 hours to find out which links on your site are broken. A web analytics tool that ships in 2026 should know what a 404 is.

Clickport detects 404s automatically from the page title, with a one-line config flag for soft 404s and SPAs. Click any broken-URL row in the dashboard and the rest of the dashboard cross-filters: Sources, Sessions, Countries, Devices. The session drill-down shows the full page sequence, so the page right before the 404 is the one that linked to it. The fix is three clicks instead of a triage project.

Start a free 30-day trial. No credit card. See your real 404 traffic in 60 seconds.

David Karpik

Founder of Clickport Analytics
Building privacy-focused analytics for website owners who respect their visitors.
