One suggestion for improvement: avoid creating self-referential links. For example, https://halupedia.com/chaldic-arithmetic has many reference links to itself.
This is fantastic. I couldn't find any obvious way to search for a new page, but you can simply bang out any arbitrary URL slug and the new article will be hallucinated fresh, e.g.:
https://halupedia.com/shortest-cave-in-the-world
https://halupedia.com/echolocation-ability-in-spiders
Exactly, but I'm considering adding a fake search that could find you ANY article, including nonexistent ones.
This is excellent, congrats!
FYI I manually created this page and some link markup looks malformed: https://halupedia.com/list-of-uninhabited-countries
Looks like a single-quote escaping issue? I suspect the first link is meant to be "Archduke Ferdinand VII's Bureau of Non-Demographic Surveys" and the apostrophe breaks the link.
All articles exist, some just haven't been discovered yet ;)
Yes, that would be the perfect touch. This is brilliant satire. We need more satire!
It's pretty fun to poke at! Although it's certainly difficult to be exact, it would be neat if a generated page used the context of the page it was linked from (ideally, all pages that link to it) to guide its direction. The ones I generated seemed mostly independent.
Yeah, I've thought about that and may implement it. Will keep it in mind! For now, SSR to feed the LLMs is the priority.
Give it a week and see what Google AI Overview has to say about the Great Pigeon Census of 1887!
I made the same thing months ago, so you don't need to wait:
https://encyclopedai.stavros.io
I searched your site for [Great Pigeon Census of 1887] and was only returned articles about other things.
I find the handling of NSFW topics (and how it avoids making them NSFW) really interesting. E.g. https://halupedia.com/fuck (aside from the title, it seems SFW to me).
Ironically, this seems much faster (for pages already, erm, "researched") than the real one! How?
It generates each article only once, so once it's generated, it never perishes. The logic looks like: if the article exists -> show it; if not -> generate and save it.
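In code it's roughly this (a simplified sketch, not the actual implementation; ARTICLES is a hypothetical KV binding and generateArticle a hypothetical helper):

    // Simplified sketch of the generate-once flow in a Cloudflare Worker.
    // Types like KVNamespace come from @cloudflare/workers-types.
    // ARTICLES is a hypothetical KV namespace binding; generateArticle is
    // a hypothetical helper that asks the LLM for a fresh article.
    declare function generateArticle(slug: string): Promise<string>;

    export default {
      async fetch(request: Request, env: { ARTICLES: KVNamespace }): Promise<Response> {
        const slug = new URL(request.url).pathname.slice(1);

        // If the article exists, serve the cached copy from KV.
        const cached = await env.ARTICLES.get(slug);
        if (cached !== null) {
          return new Response(cached, { headers: { "Content-Type": "text/html" } });
        }

        // If not, generate it once, save it, and serve it.
        // From then on, every visitor sees the same invented "facts".
        const article = await generateArticle(slug);
        await env.ARTICLES.put(slug, article);
        return new Response(article, { headers: { "Content-Type": "text/html" } });
      },
    };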
I get that, but how does it serve the generated and cached ones seemingly faster than Wikipedia? (My guess is that single-page applications, which this one seems to be, just need fewer round trips between navigations or something?)
Also, now that I think about it, we store articles in Cloudflare's decentralized KV store and access them from serverless Workers also running on their servers.
That could be the thing behind it being so quick.
Cloudflare Workers have a 1 ms cold start.
Nice job, this is seriously one of the fastest websites I've ever used!
I feel like I have some minimum latency "priced in" to my expectation when I click a link on a static site, so yours feels uncannily like it's somehow able to anticipate my clicks, adding to the surreal atmosphere.
Yep, just React. Also, we use Gemini 2.5 Flash Lite, so it's fast, cheap, and dumb.
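A call to that model looks roughly like this (a minimal sketch using the @google/genai SDK; the prompt and wiring are assumptions, not the site's actual code):

    // Minimal sketch: generate an article body with Gemini 2.5 Flash Lite
    // via the @google/genai SDK. The prompt here is illustrative only.
    import { GoogleGenAI } from "@google/genai";

    const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY! });

    async function generateArticle(slug: string): Promise<string> {
      const response = await ai.models.generateContent({
        model: "gemini-2.5-flash-lite",
        contents: `Write a confident, Wikipedia-style encyclopedia article ` +
          `about "${slug.replaceAll("-", " ")}". Everything may be invented.`,
      });
      return response.text ?? "";
    }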
Nice, that's what I used for my LLM-backed HTTP server [1] a while ago as well :) It's a shame they got rid of the generous free quota a while ago, which is why I had to shut my public instance down.
[1] https://github.com/lxgr/vibeserver/
https://halupedia.com/computer
This is perfect. Very Neal Stephensony.
Finally a more trustworthy version of Grokipedia!
It's hilarious, you made my day hahah
I honestly forgot that Grokipedia existed. Did anyone ever use it?
Tried it once, but it was useless. Very funny that it had so much text, while Elon is apparently a "huge" fan of short and precise communication...
Somebody showed me it appearing near the top of some of their DuckDuckGo queries.
Funny, but you could argue this is actively harmful to the web.
A web that is vulnerable to this would already be as good as dead.
As an entertaining way to highlight the importance of upgrading our ways of knowing, playful (& open-source!) projects like this are likely to strengthen the web.
I wouldn't. And, I'd think less of anyone who does make that argument.
Anyone of reasonable intelligence can easily tell this is a parody of an encyclopedia. Saying this is bad for the web is like saying The Onion is bad for the web.
It's probably only harmful to the AI scrapers that train from the web. Most people will understand the purpose of this -- to poison LLM training in a humorous way, which is really easy to do. It exemplifies a major weakness in modern day AI.
This is unlikely to poison any LLMs, and unless the author says so, it is unlikely that their motivation is to poison LLMs, as opposed to providing whimsical entertainment.
Interesting, but you could argue comments like this are actively harmful to the web.
But the argument wouldn't be nearly as strong.
The sooner the current web dies, the better. Something better either rises from its ashes, or we lose... something that was already lost.
or something way worse shows up.
Yea, I'm not sure how the "this is really bad, so let's make it worse" argument really makes any sense.
Context. Sometimes things simply have to be broken to make way for something better. YMMV.
I think there's an unexamined assumption here that "the next thing" is always going to be an improvement, but there is no non-ideological reason to hold to this assumption. Ideally, we would be actively working towards making it so, but what often happens is passively riding the current and calling it "progress".
You could also argue that the web has failed and poisoning it into irrelevance is a vital service, motivating humans to collect knowledge into immutable sources. We'll call them 'libraries.'
On the other hand, one could argue that anything that can be destroyed by relatively clearly labeled satire, deserves to be.
Any training data scraper that blindly takes stuff from websites deserves to have their model poisoned by this nonsense.
Grokipedia is already doing that.
> you could argue
Could you? I don't see it happening, but I could be wrong.
To the web? It's fantastic for the web, these are the kinds of fun projects that make the web a worthwhile place to be. To slop generators? Yes, absolutely harmful, and that's for the best.
Pissing on a pile of shit
This is what every LLM will converge into without curated human input.
UPDATE: Just added a comment section. Have a nice time arguing!
You are a wonderful person.
You not only made this excellent source of entertainment, you also helped everyone find their unmatched socks, ensuring that "no individual would ever be forced to wear a mismatched pair". (Source: https://halupedia.com/humanitarian-accomplishments-of-the-on...)
We should really host another one though; I think I've since lost a few more.
This site is going to be expensive when a web crawler hits it. A honey pot that burns tokens.
Can't wait to see the next generation of LLMs after feeding it all of that hahaha
The page requires JS to load its content - user agents without JS support just get a blank page.
I'm not sure if the bots that scrape data to train LLMs are capable of loading that type of page, or if they only work on pages that have the content inside the HTML itself?
Any serious scraping service these days will fail over to a headless browser when it fetches a page referencing a JS bundle that isn't verifiably a vendor script.
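Something like this naive version (a sketch using puppeteer; the empty-root-div check is just an illustrative heuristic):

    // Sketch of a scraper's headless-browser fallback using puppeteer.
    // The "empty root div" heuristic below is a naive illustration.
    import puppeteer from "puppeteer";

    async function fetchRendered(url: string): Promise<string> {
      const res = await fetch(url);
      const html = await res.text();

      // Served HTML already has content: no browser needed.
      if (!/<div id="root">\s*<\/div>/.test(html)) {
        return html;
      }

      // Looks client-rendered: load it in a headless browser and wait
      // for the network to go quiet before grabbing the rendered DOM.
      const browser = await puppeteer.launch();
      try {
        const page = await browser.newPage();
        await page.goto(url, { waitUntil: "networkidle0" });
        return await page.content();
      } finally {
        await browser.close();
      }
    }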
I'm aware and will implement SSR soon ;)
It's entirely possible they simply ingest the JS as-is.
Seeing “Something broke, which is ironic for a made-up encyclopedia: Load failed” when trying to access some of the suggested starting points
Works on my PC.
Could you gimme the url that's failing?
Currently breaks if you try to create a page with a Japanese slug. Multiple languages would make this an even more valuable resource than it already is.
It's nice, but after a few clicks my LLM content fatigue kicks in.
Great idea! I created an adjacent website that gives, shall we say, "alternative facts" about your questions. (don't know if the rules allow me to link the site so I won't).
Now I want to know the site.
Funny. Small improvement suggestion: the entry about "Glorbonian culinary arts" links to "the subterranean nation of Glorbonia". However, upon clicking the link to "Glorbonia", an entry is generated claiming that "Glorbonia refers to a peculiar and largely uncatalogued form of sub-auditory resonance". It would be cool if some context were carried over from the referrer page so that there is some coherence between entries (and some existing entries could be taken into account when generating new ones).
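Something like this, perhaps (a hypothetical sketch; llmGenerate and loadArticle are stand-ins for whatever the site actually uses, and the referrer slug could come from a query parameter or the Referer header):

    // Hypothetical sketch: seed a new article with context from the
    // page it was linked from, so entries stay mutually consistent.
    declare function llmGenerate(prompt: string): Promise<string>;
    declare function loadArticle(slug: string): Promise<string | null>;

    async function generateWithContext(slug: string, referrerSlug?: string): Promise<string> {
      let context = "";
      if (referrerSlug) {
        const referrer = await loadArticle(referrerSlug);
        if (referrer) {
          // Keep the prompt small: only the referrer's first paragraph.
          const intro = referrer.split("\n\n")[0];
          context = `It was linked from the article "${referrerSlug}", ` +
            `which begins: "${intro}". Stay consistent with that article.`;
        }
      }
      return llmGenerate(`Write a fictional encyclopedia article for "${slug}". ${context}`);
    }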
Feels like this will eventually cause collisions, although perhaps that's nothing that multiple definitions of Glorbonia and multiple biographies of different Mrs Wiggles (with Wikipedia-style disambiguation, perhaps) can't solve.
Hm, the generated page seems inconsistent with how the original link used the term.
Love it! It feels very Borges!
Feature request: also be able to click on the Talk page to see the controversies. I don't always want to trust the article itself as the final word.
Edit: Oh look, there's an article about the YC! https://halupedia.com/y-combinator
Just added a comment section :)
Great suggestion! Will immediately look into that!
> Edit: Oh look, there's an article about the YC! https://halupedia.com/y-combinator
This should be on YC's About page.
> Y Combinator might be responsible for the spontaneous generation of minor deities in areas experiencing extreme metaphysical gravity.
This particular piece of slop is a serendipitously brilliant description of the cult of founder worship in the metaphysical gravity of Silicon Valley.
I LOVE IT. Superb.
wtf, I thought these were just anecdotes until I saw they were actually happening in Astoria. I used to visit in the summers and never heard about any of that! Stop the fake news
The whole world is going mad over artificial intelligence and LLMs. Just disgusting!
Who says LLMs can't be funny?!