Fidelity study

Does AI hallucinate subreddit names? We tested 100

If a tool lets a language model suggest where to look on Reddit, the obvious worry is invented communities. So we measured it.

By Bhupendra Singh Chauhan, Founder, Reddit Research Pipeline · May 18, 2026 · Updated May 22, 2026

The worry

Our wizard asks a language model to propose subreddits for a topic. The standard objection writes itself: language models make things up, so won’t it send you to communities that don’t exist?

Rather than guess, we ran a fidelity test we could repeat.

What we measured

We took ten hypotheses spanning deliberately different domains: cold email, indie SaaS, vape addiction, egg freezing, Kubernetes, runner’s knee, toddler screen time, novel writing, crypto MEV, and chronic insomnia. For each we asked the model for ten subreddit names, then probed all 100 against Reddit’s own r/<name>/about.json endpoint.
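The probe-and-classify step can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code from our script: the names `aboutUrl` and `classifyProbe` are hypothetical, the `www.reddit.com` host is assumed, and the `subscribers` and `subreddit_type` fields are assumed from the shape about.json typically returns. The 500-member floor and the outcome buckets are the ones used in this study.

```javascript
// Hypothetical helper names; the threshold mirrors the buckets in this study.
const MIN_MEMBERS = 500;

// Build the probe URL for a subreddit name (the host is an assumption).
function aboutUrl(name) {
  return `https://www.reddit.com/r/${encodeURIComponent(name)}/about.json`;
}

// Map an HTTP status plus parsed JSON body to one outcome bucket.
// `status` is null when the request itself failed (timeout, DNS, reset).
function classifyProbe(status, body) {
  if (status === null) return "network_error";
  if (status === 404) return "hallucinated";
  if (status === 403) return "private_or_restricted";
  // Assumption: any other non-200 response is a transient failure.
  if (status !== 200) return "network_error";

  const data = body && body.data ? body.data : {};
  // Assumed field: subreddit_type is "public" for open communities.
  if (data.subreddit_type && data.subreddit_type !== "public") {
    return "private_or_restricted";
  }
  // Assumed field: subscribers holds the member count.
  // A missing count is treated as too small rather than trusted.
  if (typeof data.subscribers !== "number" || data.subscribers < MIN_MEMBERS) {
    return "too_small";
  }
  return "valid";
}
```

Keeping the classifier separate from the HTTP call means it can be checked against canned responses without touching Reddit.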

The results

Outcome                                     Count
Valid: live, public, ≥500 members           90
Hallucinated: 404, no such sub              1
Private or restricted (real, unreadable)    5
Too small: under 500 members                3
Network error                               1

The single hallucination was r/quittingvaping. Of the other nine “losses”, eight were real communities the validator correctly filters out for being private or tiny, which is not the model’s fault, and one was a network error rather than a verdict on the name.

So the real hallucination rate was 1%

Only one name in a hundred was invented. Apart from a single network error, everything else either passed or was a genuine community our own rules dropped for being unreadable or too small to matter.
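The arithmetic behind that figure, as a small sketch. The counts are the ones reported in the results; the object keys and the `rate` helper are illustrative names only.

```javascript
// Counts copied from the results; key names are illustrative.
const counts = {
  valid: 90,
  hallucinated: 1,
  privateOrRestricted: 5,
  tooSmall: 3,
  networkError: 1,
};

const total = Object.values(counts).reduce((sum, n) => sum + n, 0);

// Share of all probed names that fall in one bucket, as a percentage.
function rate(bucket) {
  return (counts[bucket] * 100) / total;
}

console.log(total);                // 100
console.log(rate("hallucinated")); // 1
console.log(rate("valid"));        // 90

// Names confirmed to exist: everything except the 404 and the failed probe.
const confirmedReal = counts.valid + counts.privateOrRestricted + counts.tooSmall;
console.log(confirmedReal);        // 98
```

The pass rate (90%) and the hallucination rate (1%) measure different things: the first reflects our own filtering rules, the second reflects the model.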

Why we still validate every name

A 1% invented-name rate is low, but “low” isn’t “zero”, and a dead link in a research tool erodes trust fast. So every suggested subreddit — model-proposed or not — is checked against about.json before it reaches you. Hallucinations 404 and get dropped automatically.
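One way to picture that gate, as a hedged sketch rather than the pipeline's actual code: `keepLiveSubreddits` and the injected `probe` callback are hypothetical names, and `probe` is assumed to resolve to an outcome string such as "valid" or "hallucinated" for each name.

```javascript
// Hypothetical gate: only names whose probe comes back "valid" are passed on.
// `probe` is injected so the gate can be exercised without network access.
async function keepLiveSubreddits(names, probe) {
  const outcomes = await Promise.all(
    names.map(async (name) => {
      try {
        return await probe(name);
      } catch {
        // A failed probe leaves the name unverified, so it is dropped.
        return "network_error";
      }
    })
  );
  return names.filter((_, i) => outcomes[i] === "valid");
}

// Usage with a stubbed probe standing in for the about.json check.
const stubProbe = async (name) =>
  name === "quittingvaping" ? "hallucinated" : "valid";

keepLiveSubreddits(["eggfreezing", "quittingvaping"], stubProbe).then((kept) => {
  console.log(kept); // [ 'eggfreezing' ]
});
```

Dropping on any outcome other than "valid" errs on the side of caution: an unverified name is treated the same as an invented one.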

The model earns its place by finding the niche, cross-cutting communities Reddit’s own keyword search misses — r/eggfreezing, r/runninginjuries, r/platformengineering — not by being trusted blindly.

Reproduce it

The test is an idempotent script (scripts/test-suggest-fidelity.mjs) — re-run it whenever the model changes.
