Tuesday, December 16, 2025

How to detect content generated by AI

In April 2025, we analyzed 900,000 newly created web pages and found that 74.2% of them included AI-generated content.

With the rapid growth of generative AI, businesses, educators, and publishers are asking a key question: how do we distinguish between what humans write and what machines produce?

The answer: it is possible, but not foolproof. Here is how AI detection works, the limitations you need to understand, and how to get more reliable results.

Learn more in our research: 74% of new pages include AI content (study of 900,000 pages)

Is it even possible to detect AI content? Yes, but with some important caveats.

AI-generated text often has unique statistical and style patterns. These patterns are not always obvious to human readers, but they can often be detected by specially constructed detection models.

In short, all AI detectors work by comparing patterns in text with a large number of human-written and AI-generated examples.

Traditionally, this is done through statistical detection: measuring features such as word and n-gram frequency, common syntactic structures, stylistic choices, and statistical measures such as perplexity (the predictability of word choice) and burstiness (the variation in sentence length), and then flagging anomalies.

| Feature type | Explanation |
| --- | --- |
| Word frequency | Counts how often words such as "the" or "cat" appear in a sample, for example the: 3, cat: 2 |
| N-gram frequency | Measures sequences of words, for example bigrams: "the cat" appears twice, "cat sat" appears once |
| Syntactic structure | Identifies patterns such as subject-verb-object (SVO) structures, for example "the cat sat", "the cat yawned" |
| Stylistic choices | Notes tone, point of view, or formality, for example third person, neutral tone |
| Perplexity | Calculates how predictable each word is given its context; lower perplexity usually means more predictable (and more likely machine-generated) text |
| Burstiness | Compares variation in sentence length; AI text may show uniform lengths, while human text varies more |
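As a rough illustration of these features, here is a minimal Python sketch. It is not how any particular detector is implemented. The sentence splitter and tokenizer are deliberately naive, burstiness is approximated as the coefficient of variation of sentence length, and perplexity is computed under a toy add-one-smoothed unigram model standing in for the language model a real detector would use. All function names are made up for this example.

```python
import math
import re
import statistics
from collections import Counter

def sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokens(sentence):
    # Lowercase word tokens; punctuation is dropped.
    return re.findall(r"[a-z']+", sentence.lower())

def word_frequency(text):
    return Counter(tok for s in sentences(text) for tok in tokens(s))

def bigram_frequency(text):
    # Count adjacent word pairs within each sentence.
    counts = Counter()
    for s in sentences(text):
        toks = tokens(s)
        counts.update(zip(toks, toks[1:]))
    return counts

def burstiness(text):
    # Proxy (assumption): standard deviation of sentence length / mean length.
    lengths = [len(tokens(s)) for s in sentences(text)]
    return statistics.pstdev(lengths) / statistics.mean(lengths)

def unigram_perplexity(text, reference_counts):
    # Toy stand-in for a language model: add-one-smoothed unigram probabilities.
    toks = [tok for s in sentences(text) for tok in tokens(s)]
    total = sum(reference_counts.values())
    vocab = len(reference_counts) + 1  # one extra slot for unseen words
    log_prob = sum(
        math.log((reference_counts.get(t, 0) + 1) / (total + vocab)) for t in toks
    )
    return math.exp(-log_prob / len(toks))

sample = "The cat sat on the mat. The cat yawned."
print(word_frequency(sample).most_common(2))    # [('the', 3), ('cat', 2)]
print(bigram_frequency(sample)[("the", "cat")])  # 2
print(round(burstiness(sample), 2))              # 0.33
```

A detector would compute features like these over large sets of known human and known AI text, then learn which combinations separate the two. The numbers for a single short sample mean little on their own.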

Another, less common approach is watermarking: inserting hidden signals into AI-generated text at the moment it is created.

Like UV markers on currency, these signals can be checked later to confirm whether the text came from a specific model, but this only works if the model owner chooses to implement it.

As of now, no major LLM providers such as OpenAI, Anthropic, or Google have confirmed that they use watermarks on public-facing model outputs. (Why would they penalize their own users?)

Learn more: How does an AI content detector work? A data scientist's answer
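To make the idea of "checking later" concrete, here is a toy sketch of one published style of text watermark, often called a green-list scheme. It is an illustration only and is not claimed to match what any provider does. The secret key, the `GAMMA` share, and every function name are assumptions made for this example. A keyed hash of the previous word marks roughly half of all candidate words as "green", a watermarking generator prefers green words, and the checker counts how many green words appear compared with chance.

```python
import hashlib
import math

GAMMA = 0.5  # assumed share of candidate words marked "green" at each step

def is_green(prev_token, token, key="demo-key"):
    # A keyed hash of (previous word, candidate word) decides green vs. red.
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] < GAMMA * 256

def watermark_z_score(tokens, key="demo-key"):
    # Compare the observed number of green words with the share expected by chance.
    pairs = list(zip(tokens, tokens[1:]))
    green = sum(is_green(prev, tok, key) for prev, tok in pairs)
    n = len(pairs)
    return (green - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

def toy_watermarked_text(vocab, length, key="demo-key"):
    # Stand-in for a watermarking generator: always pick a green word if one exists.
    out = [vocab[0]]
    while len(out) < length:
        green = [w for w in vocab if is_green(out[-1], w, key)]
        out.append(green[0] if green else vocab[0])
    return out

vocab = [f"w{i}" for i in range(50)]
marked = toy_watermarked_text(vocab, 200)
print(round(watermark_z_score(marked), 1))  # large positive value for marked text
```

A high z-score means far more green words than chance would produce, which is strong evidence for the watermark. Without the key, or without a watermark in the text, the check has nothing to find, which is why this approach depends on the model owner.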

There are a lot of AI detection tools available, from free browser-based checkers to enterprise-class platforms with API integration.

If you are an Ahrefs user, you can run our AI content detector directly in Site Explorer's Page Inspect feature. Just open Site Explorer, enter the URL you want to check, and navigate to the Page Inspect report. There, you can click the AI Detector tab to view the analysis alongside other key SEO metrics.

Good detectors will not just give you a yes-or-no verdict. They also break the text down and show how likely it is that each paragraph is AI-generated, provide an overall article-level likelihood score, and in some cases even try to identify which model (such as GPT-4o) was used to create the content.

We conducted a small-scale test comparing several of the most popular AI detectors to see how they perform in practice. The following table shows our results:

According to my tests, Ahrefs' AI detector and Copyleaks were the best-performing AI detectors, with GPTZero and Originality.ai following closely behind. At the other end of the scale, Grammarly and Writer performed the worst.

| AI content detector | Score |
| --- | --- |
| Ahrefs | 13/18 |
| Copyleaks | 13/18 |
| GPTZero | 12/18 |
| Originality.ai | 12/18 |
| Scribbr | 10/18 |
| ZeroGPT | 9/18 |
| Grammarly | 6/18 |
| Writer | 4/18 |

Learn more in my full post: 8 Best AI Detectors for Testing and Comparison

Like LLMs, AI detectors are probabilistic: they estimate likelihood, not certainty. They can be highly accurate, but false positives are inevitable. That is why you should not make a decision based on a single result. Run multiple checks, look for patterns, and combine the findings with other evidence.
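One way to put "run multiple checks" into practice is to collect scores from several detectors and only draw a conclusion when they agree. The sketch below assumes each detector returns a probability between 0 and 1 that the text is AI-generated. The detector names, thresholds, and the `combine_scores` function are hypothetical, and the thresholds should be tuned to your own tolerance for error.

```python
import statistics

def combine_scores(scores, flag_at=0.8, clear_at=0.2, max_spread=0.25):
    """scores: dict of detector name -> estimated probability the text is AI-generated."""
    values = list(scores.values())
    mean = statistics.mean(values)
    spread = max(values) - min(values)
    if spread > max_spread:
        # Detectors disagree, so no single score should be trusted.
        verdict = "inconclusive: detectors disagree"
    elif mean >= flag_at:
        verdict = "likely AI-generated: confirm with other evidence"
    elif mean <= clear_at:
        verdict = "likely human-written"
    else:
        verdict = "inconclusive: scores are in the middle"
    return {"mean": round(mean, 2), "spread": round(spread, 2), "verdict": verdict}

agree = combine_scores({"detector_a": 0.91, "detector_b": 0.88, "detector_c": 0.95})
split = combine_scores({"detector_a": 0.91, "detector_b": 0.35})
print(agree["verdict"])  # likely AI-generated: confirm with other evidence
print(split["verdict"])  # inconclusive: detectors disagree
```

Even when every detector agrees, the result is still only a likelihood, so the verdict wording points to further evidence rather than a final decision.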

Regardless of the tools or technology used, all AI detectors have the same basic limitations.

  • Heavily edited or "humanized" AI text may evade detection. Post-processing (such as rephrasing, synonym swaps, rearranging paragraphs, or running text through a grammar checker) can disrupt the statistical signals that detectors look for, reducing their accuracy.
  • Basic detectors may lack accuracy and advanced capabilities. Detection tools need frequent updates to keep up with new AI models: generative AI evolves rapidly, and detectors need regular retraining to recognize the latest writing styles and evasion techniques. At Ahrefs, our detector supports multiple leading models, including those from OpenAI, Anthropic, Meta, Mixtral, and Qwen, so you can check content against a wider range of possible sources.
  • Effectiveness varies by language, content type, and model. Detectors trained primarily on English prose may struggle with technical writing, poetry, or less common languages.
  • Ambiguous cases, such as human text edited by AI, can blur the results. These hybrid workflows create mixed signals that can confuse even advanced systems.
  • Even the best tools can produce false positives or false negatives. Statistical detection is never infallible, and occasional misclassification is inevitable: the patterns these systems rely on can overlap between humans and AI, and subtle editing or atypical writing styles can easily blur the differences.

Remember: False allegations based on incorrect AI detection results can seriously damage the reputation of an individual, company, or academic institution.
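A short calculation shows why a flag alone is weak grounds for an accusation. The sketch below uses made-up numbers, not measurements of any real detector: it assumes a detector that catches 90% of AI text, wrongly flags 5% of human text, and is applied to a pool where 10% of texts are AI-generated. The function name `flag_precision` is also an assumption for this example.

```python
def flag_precision(true_positive_rate, false_positive_rate, ai_share):
    # Share of flagged texts that really are AI-generated (Bayes' rule).
    flagged_ai = true_positive_rate * ai_share
    flagged_human = false_positive_rate * (1 - ai_share)
    return flagged_ai / (flagged_ai + flagged_human)

# Hypothetical rates: 90% detection, 5% false positives, 10% of texts are AI.
print(round(flag_precision(0.90, 0.05, 0.10), 3))  # 0.667
```

Under these assumed rates, about one flagged text in three was written by a human, even though the detector looks accurate on paper.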

With these limitations in mind, it is best to use other methods to confirm the output of any detector before reaching a conclusion.

Human judgment is very useful for adding context to the results of AI detectors. By examining the context (such as patterns across multiple articles, post history on social media, or the circumstances of publication), you can better gauge the likelihood that AI was involved in the writing.

Signs to look for:

  • An overly consistent tone with no subtle quirks. Human writing is inherently a bit messy and unpredictable, with small variations in style, rhythm, and word choice that reflect personality and background. AI-generated text may sometimes lack these imperfections, resulting in a uniform tone that feels too polished or mechanical.
  • Wordiness. AI is very good at stretching simple ideas into long explanations.
  • Lack of new information. AI output often reads as generic or surface-level (this is especially noticeable on LinkedIn, where many AI-generated comments simply restate the original author's ideas in new words without adding any meaningful perspective or value).
  • Telltale word choices. AI favors phrases that feel slightly "off" ("the evolving landscape"), formulaic hooks ("This is not X… it is Y"), and overuse of em dashes and emojis.
  • Incentives. Is there an obvious motivation for the author to use AI content?

I see you, ChatGPT.

None of these signs provides clear evidence for AI content, but they can add useful context to other forms of evidence.
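Some of the surface signs listed above can be counted mechanically, which helps when reviewing many texts. The sketch below is a rough heuristic, not a detector. The phrase list contains only the example from this article, the emoji ranges and the hook pattern are approximations, and all names are invented for this example.

```python
import re

STOCK_PHRASES = ["evolving landscape"]  # example from this article; extend with your own
EM_DASH = "\u2014"
# Approximate emoji ranges (assumption): pictographs plus older symbol blocks.
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
# Rough match for the "This is not X... it is Y" hook within one sentence.
HOOK = re.compile(r"\b(?:is not|isn't)\b[^.!?]{0,60}?\b(?:it is|it's)\b", re.IGNORECASE)

def style_signals(text):
    normalized = text.replace("\u2019", "'")  # treat curly apostrophes as straight
    words = max(len(normalized.split()), 1)
    per_1000 = 1000 / words
    return {
        "em_dashes_per_1000_words": round(normalized.count(EM_DASH) * per_1000, 1),
        "emojis_per_1000_words": round(len(EMOJI.findall(normalized)) * per_1000, 1),
        "stock_phrases": sum(normalized.lower().count(p) for p in STOCK_PHRASES),
        "not_x_but_y_hooks": len(HOOK.findall(normalized)),
    }

sample = "This isn't a trend\u2014it's the evolving landscape of content. \U0001F680"
print(style_signals(sample))
```

Plenty of human writers use em dashes, emojis, and stock phrases too, so treat these counts as context for a human reviewer rather than as a score.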

If you run an AI detector on just one article, unreliable results can be problematic. However, this issue becomes less important when you look at the results at scale. Running this process on many pages allows you to gain a clearer understanding of how AI is used as part of a company’s broader marketing strategy.

With the Top pages report in Ahrefs' Site Explorer, you can see an "AI Content Level" column for the pages of almost any website. From there, you can inspect any individual URL and see which AI models may have been used to create that page.

A quick tip: use this report to find top-ranking, heavily AI-generated content and consider creating your own version. If it is ranking, it is meeting search intent, which points to a potential opportunity for you and your AI content workflow.
