Ask a general-purpose chatbot for sources on almost any topic and it will give you a clean, formatted list. Authors, year, journal, a plausible title, sometimes a DOI. It looks exactly like a real bibliography.
Check the references and a chunk of them won't exist. The authors are real people who never wrote that paper. The journal is real; the article isn't. The DOI resolves to nothing, or to something completely different.
This is not a bug that will be patched next quarter. It's a direct consequence of how these models work, and understanding why will save you from a genuinely bad afternoon.
The model isn't looking anything up
A language model generates text by predicting a likely next token, appending it, and repeating. When you ask it for a citation, it is not querying a database of papers. It is producing the sequence of words that a citation on your topic would probably look like.
And it is very good at that. It has read millions of real citations, so it knows the shape: a surname, a plausible co-author, a year in the right range, a journal that publishes in roughly the right area, a title that sounds like a real paper on your topic. Every individual piece is statistically reasonable. The combination just happens to be fictional.
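To make that concrete, here is a deliberately crude toy in Python. It is not how a real language model works internally (a real model predicts tokens from learned probabilities, not from hand-written lists), and every name, journal, and title fragment below is invented for illustration. What it does share with the real thing is the important part: each piece is chosen because it fits the pattern, and no step checks whether the finished paper exists.

```python
import random

# Toy vocabulary. Every entry here is made up for this example.
SURNAMES = ["Nguyen", "Okafor", "Lindqvist", "Marsh"]
JOURNALS = ["Journal of Applied Examples", "Review of Placeholder Studies"]
TITLE_STARTS = ["Effects of", "Toward a theory of", "Revisiting"]
TITLE_ENDS = ["in adolescents", "under uncertainty", "in clinical settings"]


def plausible_citation(topic: str, rng: random.Random) -> str:
    """Assemble a citation-shaped string from individually plausible parts.

    Nothing in this function looks anything up, so the result is only
    ever plausible, never verified.
    """
    first, second = rng.sample(SURNAMES, 2)   # two distinct "authors"
    year = rng.randint(2005, 2022)            # a year in a believable range
    title = f"{rng.choice(TITLE_STARTS)} {topic} {rng.choice(TITLE_ENDS)}"
    journal = rng.choice(JOURNALS)
    volume = rng.randint(1, 60)
    return f"{first}, A., & {second}, B. ({year}). {title}. {journal}, {volume}."


if __name__ == "__main__":
    print(plausible_citation("sleep deprivation", random.Random()))
```

Run it a few times and you get a different, equally tidy reference each time. All of them have the right shape, and none of them point at anything.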
The model has no concept of "this paper exists" versus "this paper is plausible." Those are the same thing to it. That's why it states the fake citation with exactly the same confidence as a real one — there is no internal difference between them.
Why this is more dangerous than an obvious error
If a tool gave you obvious garbage, you'd ignore it. The problem with fabricated citations is that they are almost right. They pass a glance. They survive a busy student copying them into a reference list at 1 a.m. They've made it into submitted essays, grant applications, and at least a few published papers and legal briefs that became cautionary tales.
The failure mode is specifically that it looks trustworthy. A confidently wrong source is worse than no source, because it costs you nothing to accept and a great deal to have accepted.
How to actually use AI for sources
None of this means AI is useless for research. It means you have to use it for the things it's good at and verify the things it isn't.
- Treat any citation from a general chatbot as a lead, not a source. Before it goes anywhere near your bibliography, find the actual paper — search the title in Google Scholar, Semantic Scholar, or your library. If you can't find it, it probably doesn't exist.
- Check that the DOI resolves (at doi.org) to the paper the model claimed, not just to some paper.
- Never cite something you haven't opened. This was always the rule. AI just made breaking it frictionless.
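If you have a long list to get through, the DOI check can be partly scripted. The sketch below is one way to do it in Python, under some assumptions: it uses the public Crossref REST API (`https://api.crossref.org/works/<doi>`), so it only knows about DOIs registered with Crossref, and the function names and the 0.9 similarity threshold are choices made for this example, not a standard. A mismatch or a missing record is a reason to look closer, and a match is still not a substitute for opening the paper.

```python
import json
import re
import urllib.error
import urllib.parse
import urllib.request
from difflib import SequenceMatcher
from typing import Optional


def normalize(title: str) -> str:
    """Lowercase and drop punctuation so formatting differences don't matter."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", title.lower()).split())


def titles_match(claimed: str, registered: str, threshold: float = 0.9) -> bool:
    """True if two titles are near-identical. The threshold is an assumption."""
    ratio = SequenceMatcher(None, normalize(claimed), normalize(registered)).ratio()
    return ratio >= threshold


def registered_title(doi: str) -> Optional[str]:
    """Return the title Crossref has on record for a DOI, or None if unknown."""
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi)
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            data = json.load(resp)
    except urllib.error.HTTPError:
        return None  # e.g. 404: Crossref has no record of this DOI
    titles = data["message"].get("title") or []
    return titles[0] if titles else None


def doi_supports_citation(doi: str, claimed_title: str) -> bool:
    """True only if the DOI exists AND points at the paper the chatbot claimed."""
    title = registered_title(doi)
    return title is not None and titles_match(claimed_title, title)
```

Call `doi_supports_citation(doi, title)` with the DOI and title exactly as the chatbot gave them. The title comparison is the part that matters: a fabricated citation can carry a DOI that is perfectly valid for some unrelated paper.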
The better pattern: AI that's grounded in real sources
The deeper fix is to use AI that can only draw from sources that actually exist — your own library — instead of one improvising from the entire internet's worth of patterns.
When the model is restricted to papers you've already added, it can't fabricate a reference, because it isn't generating references at all. It's pointing you at a passage in a real PDF you can open and read. The citation is real by construction. If the answer isn't in your sources, a well-built tool tells you that, instead of inventing one to be helpful.
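Here is a minimal Python sketch of that behavior. It is an illustration only, not a description of how Folio or any other tool is implemented: plain keyword overlap stands in for real retrieval, and the `Passage` type, the stopword list, the `min_overlap` cutoff, and the sample library are all hypothetical. The point is the shape of the contract: the function can only return a passage that is already in the library, and it returns nothing when no passage is relevant enough.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set

# A tiny, hand-picked stopword list (an assumption for this example).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and",
             "is", "what", "how", "does", "do"}


@dataclass
class Passage:
    source: str  # file name of a paper the user added
    page: int
    text: str


def tokens(text: str) -> Set[str]:
    """Lowercased word set, minus stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def answer_from_library(question: str, library: List[Passage],
                        min_overlap: int = 2) -> Optional[Passage]:
    """Return the best-matching passage, or None if nothing is relevant."""
    q = tokens(question)
    best, best_score = None, 0
    for passage in library:
        score = len(q & tokens(passage.text))  # shared keywords
        if score > best_score:
            best, best_score = passage, score
    # Refuse rather than stretch a weak match into an answer.
    return best if best_score >= min_overlap else None


if __name__ == "__main__":
    # Placeholder library: file names and text are invented.
    library = [
        Passage("paper_a.pdf", 4,
                "Sleep deprivation impaired working memory in the treatment group."),
        Passage("paper_b.pdf", 12,
                "Soil erosion rates doubled after deforestation."),
    ]
    hit = answer_from_library("How does sleep deprivation affect working memory?", library)
    print(hit.source if hit else "Not in your sources.")
    miss = answer_from_library("What is the boiling point of tungsten?", library)
    print(miss.source if miss else "Not in your sources.")
```

Because the return value is always either an existing `Passage` or `None`, there is no code path that could produce a source the user never added.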
That's the line that matters: an assistant that says "here's what your sources say, and here's where" is doing research with you. One that produces a confident paragraph with three made-up references is doing something that looks like research and isn't.
It's the difference we built Folio around. Ask a question and the assistant answers only from the sources in your library, with a citation you can click straight through to. No invented papers, because there's nothing to invent — just what you've actually read.
Use the chatbots. They're genuinely useful for thinking out loud, rephrasing, getting unstuck. Just don't let one hand you a bibliography and trust it. It isn't lying to you, exactly. It just can't tell the difference between a real source and a convincing one — and on that particular question, the difference is the whole game.
Folio's assistant answers only from your own sources — every citation links to a real paper you added. Join the waitlist.