Thursday, September 25, 2025

How Often Do AI Assistants Hallucinate Links? (16 Million URLs Studied)

AI assistants like ChatGPT and Claude can hallucinate URLs on your website and send visitors straight to them. But how often does it happen?

To find out, we looked at the HTTP status of 16 million unique URLs cited by ChatGPT, Perplexity, Copilot, Gemini, Claude, and Mistral.

We found that AI assistants send visitors to 404 pages 2.87 times more often than Google Search.

ChatGPT is the worst offender: 1.01% of the URLs it sends clicks to and 2.38% of all the URLs it cites return a 404 status (compared to baseline 404 rates of 0.15% and 0.84%, respectively).

Here is what we found:

For the first test, we used anonymized data from our free analytics tool, Web Analytics. This lets us see actual visits to URLs on real websites.

This is the methodology:

  • We used Web Analytics data to find all URLs that had an AI assistant, such as ChatGPT or Perplexity, as the referrer.
  • If the page title contained the phrase “404” or “not found”, we flagged the URL as a likely 404 page.
  • For each AI assistant, we compared the number of likely 404 pages to the total number of referred URLs to find its 404 rate.
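The three steps above can be sketched in a few lines of Python. This is only an illustration, assuming each visit is available as a (referrer, URL, page title) record; the function name and the record layout are hypothetical, not the actual Web Analytics schema.

```python
import re
from collections import defaultdict

# Titles containing "404" or "not found" (any case) count as likely 404 pages.
LIKELY_404 = re.compile(r"404|not found", re.IGNORECASE)


def likely_404_rates(visits):
    """visits: iterable of (referrer, url, page_title) tuples (assumed layout).

    Returns {referrer: percentage of unique URLs whose title looks like a 404}.
    """
    all_urls = defaultdict(set)
    flagged_urls = defaultdict(set)
    for referrer, url, title in visits:
        all_urls[referrer].add(url)
        if LIKELY_404.search(title):
            flagged_urls[referrer].add(url)
    return {
        ref: 100 * len(flagged_urls[ref]) / len(urls)
        for ref, urls in all_urls.items()
    }


sample = [
    ("chatgpt", "/blog/seo/", "SEO Basics"),
    ("chatgpt", "/blog/newsletter/", "Page Not Found"),
    ("chatgpt", "/blog/newsletter/", "Page Not Found"),  # repeat visit, same URL
    ("gemini", "/pricing/", "Pricing"),
]
rates = likely_404_rates(sample)
```

Counting unique URLs rather than visits keeps one heavily visited broken link from dominating the rate.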

ChatGPT has the highest rate of 404 pages: 1.01% of all the URLs it referred visitors to have page titles containing “404” or “not found”.

Claude follows with 0.58% of URLs, then Copilot (0.34%), Perplexity (0.31%), and Gemini (0.21%). Mistral has the lowest rate (0.12%), but it also sends the least referral traffic, making it the smallest sample in this test.

Referrer | Likely 404 pages | Total unique URLs | 404 rate
ChatGPT | 84,465 | 8,332,436 | 1.01%
Perplexity | 3,529 | 1,133,084 | 0.31%
Copilot | 1,466 | 431,319 | 0.34%
Gemini | 734 | 351,242 | 0.21%
Claude | 550 | 95,293 | 0.58%
Mistral | 8 | 6,760 | 0.12%

Google’s baseline 404 rate

This is not a perfect test. Some 404 pages may not include “404” or “not found” in the page title. And not all links hallucinated by AI assistants receive clicks (so they never appear in Web Analytics data), which means we are probably underestimating the total number of hallucinated URLs.

A small portion of these 404 pages may also be genuine 404 pages rather than hallucinated URLs. We can add context to this data by comparing it against the “base rate” of 404 pages. To do this, we looked at the 404 rate of all unique URLs with Google as the referrer (629 million unique URLs). That 404 rate is 0.15%.

With this added context, it is clear that AI assistants have a noticeably higher 404 rate than Google’s “baseline” 404 rate. ChatGPT, Claude, Copilot, Perplexity, and Gemini all appear to generate hallucinated URLs.

The average 404 rate across all AI assistants is 0.43%. Compared to the 404 rate of URLs referred by Google, AI assistants send visitors to 404 pages at 2.87x the rate of Google Search (0.43 / 0.15).
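As a check on the arithmetic, the short Python sketch below recomputes the per-assistant rates from the counts in the table above, then the average and the ratio to Google's baseline. It treats the 0.43% figure as a simple mean of the six rates, which is the reading consistent with the published numbers; the variable names are illustrative.

```python
# (likely 404 pages, total unique URLs) per assistant, taken from the table above
counts = {
    "ChatGPT": (84_465, 8_332_436),
    "Perplexity": (3_529, 1_133_084),
    "Copilot": (1_466, 431_319),
    "Gemini": (734, 351_242),
    "Claude": (550, 95_293),
    "Mistral": (8, 6_760),
}

# 404 rate per assistant, as a percentage rounded to two decimals
rates = {name: round(100 * bad / total, 2) for name, (bad, total) in counts.items()}

# Simple (unweighted) mean of the six rates
average = round(sum(rates.values()) / len(rates), 2)

google_baseline = 0.15  # 404 rate of URLs with Google as the referrer
ratio = round(average / google_baseline, 2)

print(rates)
print(average, ratio)  # 0.43 2.87
```

Because the mean is unweighted, Mistral's small sample counts as much as ChatGPT's large one; weighting by URL count would give a different figure.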

We also ran a similar test using Brand Radar, our large searchable database of millions of AI assistant prompts and responses. With this data, we can see every URL cited by an AI assistant, not just the URLs that were clicked.

  • We found all URLs cited by ChatGPT, Perplexity, Copilot, and Gemini in our Brand Radar database.
  • For the URLs that were also stored in our crawler database (65% of the total), we retrieved the latest HTTP status.
  • For each AI assistant, we calculated the 404 rate of its cited URLs found in the crawler database.
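A minimal sketch of this second method, assuming the crawler database can be represented as a plain mapping from URL to its latest HTTP status. The function name and the dictionary are hypothetical stand-ins, not the real Brand Radar or crawler interfaces.

```python
def crawl_404_rate(cited_urls, crawl_status):
    """cited_urls: iterable of URLs cited by one assistant.
    crawl_status: dict of URL -> latest HTTP status (stand-in for a crawler DB).

    Returns (coverage_pct, rate_404_pct), both computed over unique URLs.
    rate_404_pct is None when none of the cited URLs have been crawled.
    """
    unique = set(cited_urls)
    known = [url for url in unique if url in crawl_status]
    if not known:
        return 0.0, None
    coverage = 100 * len(known) / len(unique)
    not_found = sum(1 for url in known if crawl_status[url] == 404)
    return coverage, 100 * not_found / len(known)


cited = ["/a/", "/b/", "/c/", "/d/", "/a/"]          # "/a/" is cited twice
status = {"/a/": 200, "/b/": 404, "/elsewhere/": 200}
result = crawl_404_rate(cited, status)               # (50.0, 50.0)
```

Returning coverage alongside the rate makes the main caveat visible: the 404 rate only describes the share of cited URLs that the crawler has actually seen.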

The 404 rate for cited URLs (not just URLs that were cited and clicked) is much higher than in our previous test.

Again, ChatGPT has the highest rate of 404 pages (2.38%), followed in close succession by Perplexity (0.87%) and Gemini (0.86%). Copilot has the lowest 404 rate at 0.54%.

This test also has limitations. As before, some of these 404 pages will return a 404 status for reasons other than hallucination. We are also underestimating the total number of 404 URLs, because we can only see the HTTP status of URLs in our crawler database (I would expect hallucinated URLs to be missing from our crawler database because they never existed).

As before, we want to compare these numbers to a “baseline” 404 rate. To do this, we extracted all unique URLs from the top 20 positions of 400,000 SERPs.

67% of these URLs were in our crawler database, which let us determine a 404 rate of 0.84%. (Put simply, 0.84% of the URLs in Google’s top 20 results return a 404 status.)

The 404 rates for Perplexity (0.87%) and Gemini (0.86%) are very close to the 404 rate for Google SERPs (0.84%).

This is probably because Gemini and Perplexity use Google’s search index to retrieve URLs: their 404 rates reflect the rate of 404 URLs in the underlying source, Google. If so, they appear to hallucinate less than ChatGPT.

Copilot uses Bing’s search index, so Copilot’s 404 rate may reflect Bing’s 404 rate.

AI assistant | Unique cited URLs | URLs in crawler DB | 404 rate
ChatGPT | 2,452,776 | 1,524,277 | 2.38%
Perplexity | 3,471,754 | 2,450,016 | 0.87%
Copilot | 1,485,355 | 1,120,780 | 0.54%
Gemini | 1,354,171 | 641,603 | 0.86%

I suspect there are two main reasons for these hallucinated links.

First, some cited URLs used to be valid but now return a 404 status. AI assistants use a combination of web search and their internal knowledge. Some URLs they cite may have existed at one time but have since been deleted or moved (without redirecting the original page), especially when the assistant relies solely on internal knowledge.

(This would also explain why so many of these 404 pages are in our crawler database.)

Second, some are true hallucinations: they fit the expected URL pattern for a given website, but they never actually existed.

On the Ahrefs blog, the most commonly hallucinated URLs are pages like /blog/internal-links/ and /blog/newsletter/. Given that we write about SEO topics on our blog and offer a newsletter, these URLs fit the pattern of a typical Ahrefs blog page, but they don’t actually exist.

Some of these hallucinated links may also be in our crawler database. If published AI-generated content contains a hallucinated URL, our crawler will try to fetch it. And since 74% of new pages contain some AI-generated content, this seems likely.

If you want to measure the impact of hallucinated URLs, the best data source is your own website analytics. Here is how you can test it yourself:

1. Filter your website analytics to show AI traffic

Start by filtering your website analytics to show the visits you get from AI assistants. If you use GA4, you need to apply a regular expression to the session source dimension in an exploration report.

Thierry Ngutegure of SALT recommends the following regex. You will need to update the expression when new AI assistants appear or when existing ones change their referrer information:

.*gpt.*|.*chatgpt.*|.*openai.*|.*writesonic.*|.*nimble.*|.*perplexity.*|.*claude.*|.*gemini.*google.*|.*copilot.*microsoft*|.*outrider.*|.*google.*bard.*|.*bard.*google.*|.*bard.*|.*deepseek.*|.*mistral.*|.*edgeservices.*|.*neeva.*
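If you export your session sources and want to apply the same filter outside GA4, the expression can be reused as-is. The Python sketch below is one way to do that; the function name and sample sources are illustrative, and it uses partial matching (`re.search`), which may differ slightly from how your analytics tool applies the pattern.

```python
import re

# The alternatives from the expression above, copied verbatim and joined with "|".
AI_SOURCE = re.compile("|".join([
    r".*gpt.*", r".*chatgpt.*", r".*openai.*", r".*writesonic.*",
    r".*nimble.*", r".*perplexity.*", r".*claude.*", r".*gemini.*google.*",
    r".*copilot.*microsoft*", r".*outrider.*", r".*google.*bard.*",
    r".*bard.*google.*", r".*bard.*", r".*deepseek.*", r".*mistral.*",
    r".*edgeservices.*", r".*neeva.*",
]))


def is_ai_source(session_source):
    """Return True if the session source looks like an AI assistant."""
    return AI_SOURCE.search(session_source) is not None


sources = ["chatgpt.com", "google", "perplexity.ai", "bing", "gemini.google.com"]
ai_sources = [s for s in sources if is_ai_source(s)]
```

Note that a bare `google` source is not matched: the Google-related alternatives all require `gemini` or `bard` to appear as well.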

If you use Ahrefs’ Web Analytics, just use the built-in “AI Search” channel filter.

Select whichever time period you are interested in and export the data to Google Sheets.

2. Generate an Apps Script to return HTTP statuses

Next, ask ChatGPT (or your AI assistant of choice) to generate an Apps Script that returns the HTTP status of a URL in Google Sheets. Then, in your Google Sheet, navigate to Extensions > Apps Script, paste in your script, and save it.

Create a new column in Google Sheets, call your script on the cell containing your URL (e.g. =getHttpStatus(A2)), and apply it to the whole column.

(This can take a while if you have thousands of URLs. For large websites, it’s better to run a crawl.)
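For longer URL lists, the same status check can be run outside Google Sheets. The following is a rough Python equivalent, not the Apps Script itself; `get_http_status` is a hypothetical name, and the `open_url` parameter exists only so the function can be exercised without network access.

```python
import urllib.error
import urllib.request


def get_http_status(url, open_url=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    # HEAD avoids downloading the body; some servers reject it, so treat
    # unexpected results as a prompt to retry with GET.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with open_url(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is the answer.
        return err.code
    except OSError:
        # Covers URLError (DNS failure, refused connection) and timeouts.
        return None


if __name__ == "__main__":
    print(get_http_status("https://example.com/"))
```

One caveat: `urlopen` follows redirects, so a URL that already 301-redirects will report the status of its final destination rather than 301.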

3. Filter to 404 status and > 10 visitors

Next, filter your sheet to show only URLs that return a 404 status code and receive visitors.

I set a threshold of more than 10 visitors per month, but you can use whatever threshold makes sense for your website.

You can manually check some of these URLs to confirm they are hallucinations (rather than real pages that are unavailable for some other reason).
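If you prefer to do this filtering in code rather than in the sheet, a sketch like the one below would work on the exported rows. The column names (`url`, `status`, `visitors`) and the function name are assumptions about how the export is structured.

```python
def hallucination_candidates(rows, min_visitors=10):
    """rows: iterable of dicts with 'url', 'status', and 'visitors' keys
    (hypothetical column names for the exported sheet).

    Keeps URLs that return 404 and received more than min_visitors visitors,
    sorted so the most-visited broken URLs come first.
    """
    hits = [
        row for row in rows
        if row["status"] == 404 and row["visitors"] > min_visitors
    ]
    return sorted(hits, key=lambda row: row["visitors"], reverse=True)


rows = [
    {"url": "/blog/keywords/", "status": 404, "visitors": 42},
    {"url": "/blog/seo/", "status": 200, "visitors": 900},
    {"url": "/blog/newsletter/", "status": 404, "visitors": 115},
    {"url": "/blog/typo/", "status": 404, "visitors": 3},
]
candidates = [row["url"] for row in hallucination_candidates(rows)]
```

The result is a shortlist for the manual check, not a verdict: a 404 with traffic can still be a deleted page rather than a hallucination.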

4. 301 redirect (if it makes sense)

If a hallucinated page receives a lot of visits, it may be worth 301 redirecting the hallucinated URL to a relevant page on your website (if one exists).

You will need to guess what the hallucinated page was meant to be, but the URL alone is usually enough for an educated guess (visitors to the hallucinated URL /blog/keywords/ would probably benefit from our real guide to keyword research).
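For a long list of hallucinated URLs, a first-pass guess can be automated with fuzzy string matching and then reviewed by hand. The sketch below uses Python's standard `difflib`; the list of real pages and the function name are hypothetical, and string similarity is only a rough proxy for topical relevance.

```python
import difflib

# Hypothetical list of pages that really exist on the site.
REAL_PAGES = [
    "/blog/keyword-research/",
    "/blog/link-building/",
    "/blog/seo-basics/",
]


def suggest_redirect(hallucinated_url, real_pages=REAL_PAGES, cutoff=0.6):
    """Return the most similar real URL, or None if nothing is close enough."""
    matches = difflib.get_close_matches(
        hallucinated_url, real_pages, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


suggestion = suggest_redirect("/blog/keywords/")
```

Raising `cutoff` makes the suggestions stricter; anything that comes back as `None` is a candidate for the improved 404 page instead of a redirect.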

Alternatively, if you don’t want to create a spider web of 301 redirects, you can update your 404 page to include a list of useful resources that disappointed LLM visitors might find helpful (such as your most popular content or your newsletter signup page).

Should I care about this?

At our last measurement, AI assistants (mainly ChatGPT) accounted for 0.25% of total website traffic, compared to 39.35% for Google. With 1.01% of ChatGPT’s referral traffic landing on 404 pages, hallucinated URLs affect a tiny fraction of an already tiny share of the average website’s traffic.

This is a useful exercise for understanding another quirk of AI search, but it is not some huge growth lever. If you can minimize the impact of hallucinated URLs with very little effort, it might be worth doing.

To that end, we are working on Web Analytics so that it will help you find hallucinated URLs in just two clicks. If you are looking for a simple Google Analytics alternative that can handle up to 1 million events per month, check it out.

Questions or comments about this study? Let me know on LinkedIn.
