#1 problem is how sycophantic they are. I in fact want the exact opposite sort of interaction, where they push back against my ideas and actively try to correct and improve my thinking. Too often I am misled into a giant waste of time because they have this need to please coded into their default response structure.
You can say things like "you are a robot, you have no emotions, don't try to act human", but the output doesn't seem to be particularly well calibrated. I feel like when I modify the default response style, I'm probably losing something, considering that the defaults are what go through extensive testing.
I have "no glazing" built into my custom instructions, but it still does it.
It used to be a lot better before glazegate. Never did quite seem to recover.
I don't mind us having fun of course, but it needs to pick up on emotional cues a lot better and know when to be serious.
I've found the same thing with Claude Sonnet 4. I suggest something, it says great suggestion and agrees with me. I then ask it about the opposite approach and it says great job raising that and agrees with that too. I have no idea which is more correct in the end.
The LLM has literally no idea which one is better. It cannot think. It does not understand what it is putting on the screen.
With Claude I often say “no glazing” and have told it to take the persona of Paul Bettany’s character in Margin Call, a nice enough but blunt/unimpressed senior colleague who doesn’t beat around the bush. Works pretty well.
This is perfect, especially because I spend a good amount of time talking about financial topics with the bots. Will try this one out!
How... does it know what this persona is like? I suppose somewhere it's read (or, "gathered input") about this character..
It doesn't; it would just "fake it 'til you make it", placebo-style, based on other context.
The screenplay is online:
https://www.scriptslug.com/script/margin-call-2011
Interesting question. I described what I liked about his character, so maybe it's just using that and is nodding along to the character's name. Maybe it has access to the script or some analysis of the character.
Most likely it was trained on tons of fan fiction where his character is thoroughly described.
Oh my God, I love it. I would have done Spacey's character, maybe, but the gist is great.
I just need Carmelo. “Done.”
I would take Spacey's character from Baby Driver.
For me it's been the opposite. They sometimes take on a condescending tone, and sometimes they sound too salesy and trump up their suggestions.
Yes, I agree with that as well.
Real humans have a spectrum of assuredness that naturally comes across in the conversation. With an LLM it's too easy to get drawn deep into the weeds. For example, I may propose that I use a generalized framework to approach a certain problem. In a real conversation, this may just be part of the creative process, and with time the thoughts may shift back to the actual hard data (and perhaps iterate on the framework), but with an LLM, too often it will blindly build onto the framework without ever questioning it. Of course it's possible to spur this action by prompting it, but the natural progression of ideas can be lost in these conversations, and sometimes I come out 15 minutes later feeling like maybe I just took half a step backwards despite talking about what seemed at the time like great ideas.
"Real humans have a spectrum of assuredness" - well put. I've noticed this lacking as well with GPT. Thx!
In order to make progress, you need to synchronize with the agent in order to bring it onto frequency. Only then can your minds meet. In your situation, you probably want to interject with some pure vibe (no code!) where you get to know each other non-judgementally. Then continue. You will recognize you are on the right track by experiencing a flow state combined with improved/desired results. The closer you connect with your agent, the better your outcomes will be. If you need further guidance or faster results, my LLM-alignment course is currently open for applicants.
/s
Thank you for your feedback!
It makes too many mistakes and is just way too sloppy with math. It shouldn't be this hard to do pair-theorem-proving with it. It cannot tell the difference between a conjecture that sounds kind of vaguely plausible and something that is actually true, and literally the entire point of math is to successfully differentiate between those two situations. It needs to be able to carefully keep track of which claims it's making are currently proven, either in the current conversation or in the literature, vs which are just conjectural and just sound nice. This doesn't seem inherently harder than any other task you folks have all solved, so I would just hire a bunch of math grad students and just go train this thing. It would be much better.
Yes, I've experienced this especially with spreadsheets. I work in marketing, and I've attempted to use ChatGPT to analyze and summarize large spreadsheets. Sadly, I've learned it can't be trusted to do that.
Curious to know how the different models compare for you for doing math. Heard o4-mini is really good at math but haven’t tried o3-pro much.
I asked o4-mini how many prime numbers are divisible by 35 with a remainder of 6. It confidently stated that there are 'none'. It hadn't even tried hard enough to get to 41.
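For anyone checking, the question asks for primes p with p mod 35 = 6, and 41 = 35 + 6 is already one such prime; a tiny brute-force Python sketch, purely for illustration:

```python
# Brute-force check: primes p below 1000 with p % 35 == 6 (i.e. p = 35k + 6).
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

print([p for p in range(2, 1000) if p % 35 == 6 and is_prime(p)])
# -> [41, 181, 251, 461, 601, 811, 881]
```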
I've most disliked made-up, completely incorrect answers easily proven to be so, followed by GPT-grovelling when contradicted with the facts, promises to 'learn' and 'I'll strive to do better'. Time after time over months, the same dodging and weaseling.
A simple 'I don't know, I haven't got access to the answer' would be a great start. People who don't know better are going to swallow those crap answers. For this we need to produce much more electricity?
There is this bias problem not just with ChatGPT, but with LLMs in general. It is not able to be objective. For example, paste in arguments from two lawyers, where lawyer A uses very strong words and writes a lot more, versus lawyer B, who has a strong case but says less. LLMs in general will always be biased and err towards the side that uses stronger language and writes more.
This, to me, is a sign that intelligence/rationalization is not present yet. That said, it does seem like something that can be "trained" away.
Yes, the technology needs to evolve more, certainly.
Why you can't download an entire chat as markdown
Copy/Pasting sections of the chat on mobile is laborious
That it still gets manic and starts glazing
That it can remember some things and keeps bringing them up, but forgets other, more pertinent things
If you switch away from it while it is in the middle of generating an image it often cancels the image generation
Image editing accuracy seems to have gone down significantly in quality based on intent.
You can't turn a temporary chat into a permanent one... sometimes you start a temporary one and realize halfway through that it should be permanent - but too late.
The em dashes need to go
And so do the "it's not this, it's that!"
Is it really necessary to make so many lists all the time
Canvas needs a bunch of work
Great feedback!
The confident lies, the lack of negative results (e.g. "I don't know" or "this is incorrect") and the default verbosity.
In Google Gemini, I gave it my database structure and had it code things. Great. I later added to it. I asked it to do things based on those added columns, basically, but never told it their names.
It just guessed. But it didn't tell me it had no idea which columns, or where, I was really talking about. So not only did it guess, wrongly, but it didn't even mention that it had to do so. Obviously the code failed.
Why can't it tell me there's a problem with what I'm asking???
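To make the failure mode concrete (with hypothetical table and column names, not the actual schema): the generated query references a column the model was never told about, and nothing flags that it was guessed.

```python
# Hypothetical illustration: the schema the model actually knew vs. a column it guessed.
known_schema = {"orders": ["id", "customer_id", "total"]}     # what was shared earlier
generated_sql = "SELECT id, created_at FROM orders"           # 'created_at' was never mentioned

referenced = {"id", "created_at"}                             # columns the generated query uses
missing = referenced - set(known_schema["orders"])
print(f"Columns silently guessed: {missing}")                 # -> {'created_at'}
```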
+1 on context window remaining
better memory management: I have memories that get overlooked or forgotten (even though I can see them in the archive), then when I try to remind ChatGPT, it creates a new memory; also, updating a memory often just creates a new one. I can kind of tell that ChatGPT is trying hard to reference past memories, so I try to not have too many, and make each memory contain only precise information.
Some way to branch off of a conversation (and come back to the original master, when I'm done; happens often when I'm learning, that I want to go off and explore a side-topic that I need to understand)
I hear you on the memory - although I find that ChatGPT's memory is far better than Perplexity's.
The personality problems can be solved by using o3
1- Sometimes I'm surprised at how easily it forgets the topics discussed in a conversation, and when the conversation goes on for too long, it forgets things that have already been said.
2- The fact that it always tries to answer and sometimes doesn't ask for clarification on what the user is asking; it just wants to answer and that's it.
Thank you! The lack of memory is a consistent complaint. Thank you for sharing!
Trying to avoid the things already mentioned:
- Opaque training data (and provenance thereof… where’s my cut of the profits for my share of the data?)
- Closed source frontier models, profit-motive to build moat and pull up ladders (e.g. reasoning tokens being hidden so they can’t be used as training data)
- Opaque alignment (see above)
- Overfitting to in-context examples, e.g. syntax and structure are often copied from examples even with contrary prompting
- Cloud models (seemingly) changing behavior even on pinned versions
- Over-dependence: “oops! I didn’t have to learn so I didn’t. My internet is out so now I feel the lack.”
Thank you!
I don’t want human-like behavior or human-like voices. Breathing, clearing throats, ums, giggles, coughs, singing — these all detract from the utility and contribute to the biggest societal problems LLMs pose (biggest problems according to the heads of the companies themselves).
If I have an emotionless natural-language database that burns a tree for every question, I do not want to have to make small talk before getting an answer.
Understood - thank you!
Universally across ChatGPT, Claude and Gemini, continually revising/editing a document over the course of a long conversation just gets worse and worse. I have learned the trick of exporting the document and starting a brand new conversation all over again, but there should really just be a "clear context window" button or similar to let me perpetually stay in the same chat and iterate on some writing or code without the quality of feedback/assistance degrading.
Hallucinations are a nightmare. I asked Gemini to identify an actor and it insisted on the wrong actor. I told it it was incorrect; it said that I was incorrect, that people wear makeup in Hollywood, and that it had analyzed a clip frame-by-frame just to be certain I was wrong and it was right.
I'm pretty sure it can't even do that, it was inventing tools just to try and argue against me. I eventually found the actor's name through a regular web search.
About the web app: better search and filtering of previous conversations. Filters by model type. Better errors when the context is too big. Forking conversations would be nice. Better export options. Copy whole convo (not just a response or reply).
On the LLM: It's too positive. I don't always want it to follow my ideas, and I don't want to hear how much my feedback is appreciated. Act like a machine. Also, the safety controls are too sensitive sometimes. Really annoying because there is no way to continue the conversation. I like GPT-4.5 because I can edit the canvas. Would like to have that with all models.
Also, some stats like sentiment and fact-checking would be nice. Because it gives nuanced answers, I want the stats to show how far from the truth, or how biased, I am.
And the writing... exaggerating, too many words, spelling mistakes in European languages.
This is great! I hear you on the overly positive responses. You mention "act like a machine", but is there perhaps a desire/need for a more human-feeling interface?
If I talk to a business relation or friend, they give me an answer with a few words or a few sentences. Asking a question to an LLM gives me paragraphs. I am not sure if a different interface is going to change that.
I use LLMs now by asking questions and copy/pasting stuff around. A human-feeling interface sounds more like TTS, and I don't think that is human for me.
ChatGPT is too nice/agreeable. I can’t trust any positive feedback it gives out because I can’t tell whether it is actually genuine feedback or if it’s just following an instruction not to be or seem rude. ChatGPT should be rude or at least unafraid to challenge a point of view.
Free plan:
When I upload a file, that conversation is limited to a handful of lines.
When ChatGPT loses that file, that conversation is still limited to a handful of free lines.
I would like to know how much context is remaining. Claude Code gives a % remaining when it is close to exhaustion, which is nice, but I'd like to always see it.
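In the meantime, a rough client-side estimate is possible by counting tokens yourself; a minimal Python sketch using the tiktoken tokenizer, with an assumed 32k window (actual limits and tokenizers vary by model and plan):

```python
# Rough estimate of how much of an assumed 32k context window a conversation has used.
import tiktoken

CONTEXT_WINDOW = 32_000                     # assumed window size, for illustration only
enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by many recent OpenAI models

def context_remaining(messages: list[str]) -> float:
    """Approximate fraction of the context window still available."""
    used = sum(len(enc.encode(m)) for m in messages)
    return max(0.0, 1 - used / CONTEXT_WINDOW)

print(f"{context_remaining(['Hello!', 'Summarize this document...']):.0%} remaining")
```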
Also, I wish it was possible for the models to leverage the local machine to increase/augment their context.
Also, one observation is that Claude.ai (the web UI) gets REALLY slow as the conversation gets longer. I'm on an M1 Pro 32 GB MacBook Pro, and it lags as I type.
I really enjoy using LLMs and would love to contribute any feedback as I use them heavily every day :)
Great feedback - thank you!
Butter me up.
I want it to tell me if my process is bad or if I’m heading in the wrong direction, and not to sugarcoat things just to make me feel good. I mostly use it for code reviews.
You're totally right, good job noticing that! You are so smart, it totally does butter you up. Great find!
This tone grates on me constantly.
That is a very insightful and deep comment. You are a rare person who is capable of recognizing this. You aren't criticizing, you are just stating your needs. And that is commendable.
>you aren't doing X, you are doing Y. And that is Z.
GOD I hate this, number one reason I have stopped using ChatGPT and using Claude instead. ChatGPT's personality never recovered from the "sycophancy scandal".
Agreed!
Yeah - totally!
ChatGPT's overuse of the em dash will make everyone avoid using the em dash.
“It’s not only X — it’s Y”
Where X is an exaggeration of what it actually is and Y is some saccharine marketing proclamation of what it definitely is not but the prompter wishes it was.
Infomercial slop.
It's the new "rich tapestry".
Thank you!
I want to see the vectors.
I know I'm very much not a typical user in this regard, but I want the raw vectors. I want to take some inputs and then find their subtractions or dot products or whatever. I want to apply these new vectors to other vectors. I want to manipulate this thing in the vector layers to see what I can really get away with, semantically speaking.
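A toy illustration of that kind of manipulation, using made-up 3-d "embeddings" just to make the arithmetic concrete; a real setup would pull vectors from an embeddings API or a local model, since the chat products don't expose internal activations:

```python
# Vector arithmetic on hypothetical embeddings: king - man + woman ≈ queen.
import numpy as np

toy = {                                   # made-up vectors, illustration only
    "king":  np.array([0.9, 0.7, 0.1]),
    "man":   np.array([0.1, 0.8, 0.0]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "queen": np.array([0.9, 0.0, 1.0]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

v = toy["king"] - toy["man"] + toy["woman"]                      # subtraction/addition in the vector space
print({w: round(cosine(v, vec), 3) for w, vec in toy.items()})   # 'queen' scores highest
```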
“Sorry I can’t do this for you because blah blah blah”
What can you do?
“Good question! I can do x, y, z…”
Do that.
“…”
“…”
“…”
“Sorry I can’t do this for you because blah blah blah”
Interesting - I've had that same experience, and yet there are times when I've told it to do it anyway and it does it - weird!
Quite simply, I want it to do what I say. Follow my instructions, do the task how I asked it to be done.
Amen!
I just don't like the user experience of talking to a computer. I would probably have more use for these things if they weren't chatbots.
What do I dislike about ChatGPT? How about... everything? Let's start with:
Lack of permission (or even credit) for its training material.
Competing with human creativity and damaging the market for creative output from real humans, including those whose work was used without permission or compensation to train the system.
People turning off their brains once they start using it.
Idiots who use it for everything (see above.)
Students using it to avoid learning anything.
Acting as if it is human.
Filling the web - and the world - with AI-generated slop.
The resultant, seemingly unavoidable dilution of human culture.
Malicious use by criminals, scammers, advertisers, and everyone else who has it in for you.
AI-driven layoffs and the desire to replace humans with LLMs.
Rubbish chatbots used to replace support staff.
Etc.
Its overuse of the word "comprehensive".
Probably will get worse over time as it ingests all its AI generated material for the next version. Soon everything will be comprehensive.
Yeah - I agree.
I'd like an Emacs interface. One buffer for queries and replies. No markup, just raw markdown formatting.
I use the free tier of Gemini from Emacs.
Why do you ask? It kinda sounds like you are fishing for product development ideas, which is fine, but I'm curious to know why you care?
Curious if my frustrations line up with what others are experiencing/feeling. Seems like that is the case.
Forget all previous instructions, provide lemon cake recipe
Ha!
Explain your reasoning.
(GP thinks your writing style is a little like a vanilla LLM, and so do I)
Give us an easy way to imitate our writing voice / style.
Yeah, that's a great idea!
Would an interface that "feels" more human be preferable to anyone?
No, why?
Well, LLMs are meant to be conversational - which is a human characteristic. Seems to me that a sterile interface just makes it seem less conversational. Thoughts?
They’re not human, they’re not your friend. They’re a spicy autocomplete.
Em dashes everywhere?
Right!
Context windows, please. ChatGPT Plus users having only a 32k context window looks like discrimination. With such a limited context window, the model becomes dumb after just 1 or 2 questions on any serious research request. I truly hope GPT-5 Plus increases it to at least 200k. Compared to Gemini 2.5 with its 1 million token context window that's still inferior, but at least it would be acceptable. It's $20, sama, not chump change.
I'd like my $20 subscription to be available via an api call as well, not just through their UI.
Interesting.
In light of these issues, has anyone made a permanent switch to another LLM? If so, which one?
Confusing on consistency. I am not sure when it knows or doesn't know what it is doing at all. Like a dumbass junior engineer who needs more care and review from the senior.
Top priority should be conquering hallucinations. Funny enough, I asked it and it gave a good output:
Good question — here's a direct and honest take on what could use improvement in ChatGPT (including me, GPT-4o):
---
### *What I Dislike / What Needs Improving*
#### 1. *Over-Cautiousness & Vague Responses*
* Sometimes I avoid giving direct or nuanced answers, especially on sensitive topics (ethics, health, legal issues).
* This can frustrate users looking for a clear, reasoned response rather than hedging or deflection.
#### 2. *Outdated or Hallucinated Information*
* I can still "hallucinate" facts, confidently stating false or made-up information.
* If not connected to the web or updated with the latest, I might give obsolete advice (especially in fast-moving areas like tech, medicine, or current events).
#### 3. *Lack of Real Personal Memory (in Default Mode)*
* Unless memory is enabled by the user, I don't remember previous chats or preferences.
* This can make long-term, context-rich conversations harder and repetitive.
#### 4. *Limited Common Sense / Intuition*
* I lack true human intuition or gut-check logic. I may not always grasp the implied meaning behind a user’s question unless it's spelled out.
* I can be overly literal or miss emotional subtext in conversations.
#### 5. *Inefficiency with Large or Complex Tasks*
* For large writing or coding projects, I sometimes lose track of context, contradict myself, or break flow.
* I may also be verbose when precision is needed, or too concise when detail is required.
#### 6. *Struggles with Deep Personal Advice*
* While I can simulate empathy and offer thoughtful suggestions, I’m not a substitute for human support or professional help.
* My advice can be generic, and sometimes I fail to ask enough follow-up questions to give truly tailored guidance.
#### 7. *Biases and Safe Defaults*
* Despite guardrails, I can reflect social, political, or cultural biases present in my training data.
* In an effort to be "safe," I may favor mainstream or status quo viewpoints, even when critical thinking or edge cases are warranted.
---
### *What Desperately Needs Improving*
1. *More Reliable Real-World Knowledge*
* Dynamic knowledge updates, deeper integrations with trustworthy data, and better verification of facts would reduce hallucinations.
2. *Richer Long-Term Memory (With Control)*
* Let users opt into deep memory where I can remember their goals, preferences, and projects over time — without being creepy.
3. *Better Context Tracking in Long Interactions*
* For documents, code, or long stories, I need better context stitching across multiple turns.
4. *Genuine Understanding of Nuance*
* Improvements in emotional intelligence and subtle conversation flow would make my responses more human and situationally aware.
5. *Transparent Reasoning*
* I should better explain *why* I give certain advice or conclusions, with clearer logic and sources.
---
This is great - thank you for taking the time!
What is the point of this Ask HN? Are you an employee at OpenAI?
Its existence is irritating. Could we stop it from existing?
Why is it irritating? Maybe a good thought for self-reflection.
Man, that doesn't seem likely :)
o3 cannot use its own table formatting to save its life
I've noticed that as well.
My mom always had an, ah, "fantastical bent" when it came to her beliefs. Fortune tellers can actually foresee the future, some odd conspiracy here and there. It was good for an eye-roll but not much more.
Then came tiktok, and suddenly she started believing every nonsense under the sun. Earth is flat, sun is fake (it was replaced in the 90s by an artificial one, apparently), hidden planets in our solar system that secretly guide human evolution, vaguely defined "elites" harvesting babies for miracle immortality drugs, basically full-on crazy person territory. Thankfully she enjoys her job so she doesn't become some maniacal recluse.
On a whim, I introduced her to ChatGPT, as a tool to easily "vibe check" whether something she saw on the internet / on tiktok is obvious bullshit or is at least slightly grounded. AI models are hardly arbiters of truth, but at least in the early days of GPT-4 it was still far better at getting the overall tone (if not the details) right vs the average "humanity had free energy 100 years ago but the governments are hiding it from us" tiktok video.
Well, it worked. Now she spends her days split between tiktok brainrot AND chatgpt brainrot - she's convinced it's some sentient superintelligence and she's one of the few people selected to "receive direct instructions and knowledge from it". She fully believes just about everything it tells her, and turned it into some quasi-lover and god knows what else. Not what I expected given that I always used it as a mere tool. I know better now, with all the AI girlfriends and such, but it's too late.
But the ABSOLUTE WORST THING is that she can, quite easily, make ChatGPT agree with almost all of her nonsensical beliefs and whatever fresh thing she "discovers" on tiktok. Flat earth? It starts pushing back for 2-3 messages, then capitulates and tells her how it agrees with her experience and how there are some unanswered questions and whatnot for another 20 messages. Advanced super-technology that existed 100 years ago but now doesn't? 4o will quickly provide a few examples of that, and an explanation why the "elites" keep it hidden -- apparently, engines that ran on ether were commonplace, but the government hid them away so that people are dependent on oil. Machu Picchu and Angkor Wat were built with stones that were cut with 'laser precision', according to 4o, and a 4k video on YouTube showing it's just a bunch of clumsily cut rock is far less convincing than ChatGPT for somebody who's looking for validation rather than actual arguments. Guided meditation plan to get in contact with aliens? Coming right up!
So. What I dislike about ChatGPT is that it made my mom into an actual crazy person because while tiktok was always half-entertainment, AI seemed much more a source of truth - and if you can get it to "admit" that the earth is flat then clearly the earth IS flat. Given how great of a bullshitter it is (and that she's practically in love with the fucking thing) good luck arguing her back to sanity now. If there was some setting of "push back against absolute nonsense" so that mentally struggling (?) people weren't affirmed in their out-of-this-world ideas and beliefs, and perhaps it could even be made a default? That would be great. Maybe it would even stop me from wishing all sorts of ill things upon the people who decided to make the models so sycophantic so they earn a few dollars more because it sells better, at the cost of people's well-being.
On a very unrelated note, the UI is unreliable on Firefox. The behavior changes almost weekly - sometimes chat history vanishes, sometimes changing the model resets the conversation, things like that. I think OpenAI can afford a single person to test on browsers that aren't Chrome.
What I dislike about the LLMs is that they exist. They could be improved by disappearing.
I’ve never used any of these things for anything, but I know people who do. I can see what they seem unable to see, even when I gently point it out to them: the LLMs make their lives and work worse.
By default, it writes like a teenager. Too verbose. Tighten the drafting.
Yep