Ask HN: Does acceptance of Wikipedia as reliable source foreshadow same for AI?
When Wikipedia started it was denounced as an unacceptable reference source compared to standards like Encyclopedia Britannica. Now its information is regularly cited here, generally without dispute. The output of AI/LLMs is likewise denounced here as hallucination-prone and unacceptable for reference. Will that criticism also pass, just as the fierce early debate over Wikipedia's suitability did?
When AI outputs come with extensive primary-source citations and are peer reviewed, then sure. Even now, with deep-research and web-browsing capabilities, it doesn't vet its sources. It has a long way to go.
Why doesn't AI do what the rest of us do and cite the sources linked to by Wikipedia rather than the pedia itself? ;)
Do you have a source on Wikipedia vs. Britannica? From what I heard they were pretty close, even early on in Wikipedia's history. (https://www.cnet.com/tech/tech-industry/study-wikipedia-as-a...)
Neither is a primary source, but the advantage of linking to an encyclopedia is that it provides a plain-English summary of the primary sources, along with pertinent references to those sources.
Large language models have historically been incapable of providing sources, but newer models are gaining the ability to do so, which is making them as useful as encyclopedias. Until everyone starts using sourced output from LLMs, we get to see who is blindly trusting hallucinations. (https://www.infodocket.com/2024/12/06/report-media-personali...)
Early Wikipedia was pretty terrible in general; measured against a printed encyclopedia, it fared even worse than Encarta did.
Today it is better in terms of comprehensiveness, but it is still poorly written compared to the work of professional writers and editors, in part because good writing is not valued and well-written passages get arbitrary edits from amateurs.
>Jimmy Wales' Wikipedia comes close to Britannica in terms of the accuracy of its science entries, a Nature investigation finds.
https://www.nature.com/articles/438900a (2005)
FWIW, Perplexity Pro now provides numerous primary source references/citations for its answers.
You can use Wikipedia in many, many ways.
"Source of reliable information" is one of them.
"Source of how a topic has changed over time" is another.
"Source of what disputes are more common regarding specific parts of each article" is another.
And so on...
Even if most people don't actually do that and trust every small bit of information (like how much hair some classical composer originally had), some other people will in fact track and try to understand the path of that information, and whether it is truthful or not, relevant or not.
Maybe one day LLMs will allow that kind of thing as well. I don't know. Currently they don't offer that choice.
Does that answer your question?
It always bothered me that people go "ew, wikipedia" when you simply refer them to it rather than literally rewording the article (which they would accept, or at least reply to with "source? source? s-s-source? so-source?", to which you could successfully post any source from the wiki article). A reference is not proof; it's information that you have to check yourself if you want proof. I guess the same will happen with LLMs. Any link to a chat will be met by some with "ew, llm", regardless of whether it's checked, informative, and saves the effort of writing it up, or is just-generated crap. For all the issues with LLMs, it's us who don't understand the basics of handling information. This is by semi-natural social design, I believe.
FYI, Wikipedia is not considered a reliable source for academic research.
Is is "good enough" for the general public, but that is not the same thing.
Encyclopedias are tertiary sources: they're road atlases for finding real sources. Academic research deals in analysis of primary sources and draws from reliable secondary sources. Not only does it not make sense for academic work to cite Wikipedia, it's actually a rule of the Wikipedia project not to host pages intended to be part of the citation record.
You generally can't even cite encyclopedias in high school research reports.
Wikipedia has citations so you can verify its information. An LLM with citations can also be verified, and if the citations are consistently accurate people will start to trust it.
Wikipedia has a very dedicated fact-checking team who try to enforce accuracy, and (at least for non-political articles) most agree they do very well. Perhaps someone will develop a reliable automated fact-checker, then it can be applied to LLM output to “bless” it or point out the mistakes.
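A minimal sketch of what the citation-checking half of that might look like, assuming the LLM emits (claim, source URL, supporting quote) triples; the Citation type and function names here are made up for illustration, not any existing tool's interface:

    # Toy sketch of an automated citation check for LLM output (hypothetical
    # Citation type and function names). A citation counts as verified only
    # if the cited page is reachable and actually contains the quoted passage.
    from dataclasses import dataclass
    from urllib.request import Request, urlopen


    @dataclass
    class Citation:
        claim: str             # statement made in the LLM's answer
        source_url: str        # page the model says supports it
        supporting_quote: str  # passage the model attributes to that page


    def check_citation(c: Citation, timeout: int = 10) -> bool:
        """Return True only if the quote really appears on the cited page."""
        try:
            req = Request(c.source_url, headers={"User-Agent": "citation-checker"})
            page = urlopen(req, timeout=timeout).read().decode("utf-8", errors="ignore")
        except (OSError, ValueError):
            return False  # unreachable or malformed source counts as unverified
        # Crude containment test; a real checker would strip HTML and use fuzzy matching.
        return c.supporting_quote.lower() in page.lower()


    def review(citations: list[Citation]) -> None:
        for c in citations:
            status = "verified" if check_citation(c) else "NOT verified"
            print(f"[{status}] {c.claim} -> {c.source_url}")

Even something this crude catches dead links and fabricated quotes; checking the claims themselves, rather than just the quotes, is the genuinely hard part.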
People already trust LLMs for everything. They're treated like a magic answer machine, and were even back in the old ChatGPT 3.5 days.