#1 problem is how sycophantic they are. I in fact want the exact opposite sort of interaction, where they push back against my ideas and actively try to correct and improve my thinking. Too often I am misled into a giant waste of time because they have this need to please coded into their default response structure.
You can say things like "you are a robot, you have no emotions, don't try to act human", but the output doesn't seem to be particularly well calibrated. I feel like when I modify the default response style, I'm probably losing something, considering that the defaults are what go through extensive testing.
I have "no glazing" built into my custom instructions, but it still does it.
It used to be a lot better before glazegate. Never did quite seem to recover.
I don't mind us having fun of course, but it needs to pick up on emotional cues a lot better and know when to be serious.
I've found the same thing with Claude Sonnet 4. I suggest something, it says great suggestion and agrees with me. I then ask it about the opposite approach and it says great job raising that and agrees with that too. I have no idea which is more correct in the end.
The LLM has literally no idea which one is better. It cannot think. It does not understand what it is putting on the screen.
With Claude I often say “no glazing” and have told it to take the persona of Paul Bettany’s character in Margin Call, a nice enough but blunt/unimpressed senior colleague who doesn’t beat around the bush. Works pretty well.
This is perfect, especially because I spend a good amount of time talking about financial topics with the bots. Will try this one out!
How... does it know what this persona is like? I suppose somewhere it's read (or, "gathered input") about this character..
It doesn't; it would just "fake it 'til you make it", placebo-style, based on other context.
The screenplay is online:
https://www.scriptslug.com/script/margin-call-2011
Interesting question. I described what I liked about his character, so maybe it's just using that and is nodding along to the character's name. Maybe it has access to the script or some analysis of the character.
Most likely it was trained on tons of fan fiction where his character is thoroughly described.
Oh my God, I love it. I would have done Spacey's character, maybe, but the gist is great.
I just need Carmelo. “Done.”
I would take Spacey's character from Baby Driver.
For me it's been the opposite. They sometimes take on a condescending tone, and sometimes they sound too salesy and trump up their suggestions.
Yes, I agree with that as well.
Real humans have a spectrum of assuredness that naturally comes across in the conversation. With an LLM it's too easy to get drawn deep into the weeds. For example, I may propose that I use a generalized framework to approach a certain problem. In a real conversation, this may just be part of the creative process, and with time the thoughts may shift back to the actual hard data (and perhaps iterate on the framework), but with an LLM, too often it will blindly build onto the framework without ever questioning it. Of course it's possible to spur this action by prompting it, but the natural progression of ideas can be lost in these conversations, and sometimes I come out 15 minutes later feeling like maybe I just took half a step backwards despite talking about what seemed at the time like great ideas.
"Real humans have a spectrum of assuredness" - well put. I've noticed this lacking as well with GPT. Thx!
In order to make progress, you need to synchronize with the agent in order to bring it onto frequency. Only then can your minds meet. In your situation, you probably want to interject with some pure vibe (no code!) where you get to know each other non-judgementally. Then continue. You will recognize you are on the right track by experiencing a flow state combined with improved/desired results. The closer you connect with your agent, the better your outcomes will be. If you need further guidance or faster results, my LLM-alignment course is currently open for applicants.
/s
Thank you for your feedback!
It makes too many mistakes and is just way too sloppy with math. It shouldn't be this hard to do pair-theorem-proving with it. It cannot tell the difference between a conjecture that sounds kind of vaguely plausible and something that is actually true, and literally the entire point of math is to successfully differentiate between those two situations. It needs to be able to carefully keep track of which claims it's making are currently proven, either in the current conversation or in the literature, vs which are just conjectural and just sound nice. This doesn't seem inherently harder than any other task you folks have all solved, so I would just hire a bunch of math grad students and just go train this thing. It would be much better.
Yes, I've experienced this especially with spreadsheets. I work in marketing, and I've attempted to use ChatGPT to analyze and summarize large spreadsheets. Sadly, I've learned it can't be trusted to do that.
Curious to know how the different models compare for you for doing math. Heard o4-mini is really good at math but haven’t tried o3-pro much.
I asked o4-mini how many prime numbers are divisible by 35 with a remainder of 6. It confidently stated that there are 'none'. It hadn't even tried hard enough to get to 41.
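For anyone checking, the question asks for primes p with p mod 35 = 6, and 41 = 35 + 6 is already one such prime; a tiny brute-force Python sketch, purely for illustration:

```python
# Brute-force check: primes p below 1000 with p % 35 == 6 (i.e. p = 35k + 6).
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

print([p for p in range(2, 1000) if p % 35 == 6 and is_prime(p)])
# -> [41, 181, 251, 461, 601, 811, 881]
```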
I've most disliked made-up, completely incorrect answers easily proven to be so, followed by GPT-grovelling when contradicted with the facts, promises to 'learn' and 'I'll strive to do better'. Time after time over months, the same dodging and weaseling.
A simple 'I don't know, I haven't got access to the answer' would be a great start. People who don't know better are going to swallow those crap answers. For this we need to produce much more electricity?
There is this bias problem not just with ChatGPT, but with LLMs in general. It is not able to be objective. For example, paste in arguments from two lawyers, where lawyer A uses very strong words and writes a lot more, versus lawyer B, who has a strong case but says less. LLMs in general will always be biased and err towards the side that uses stronger language and writes more.
This, to me, is a sign that intelligence/rationalization is not present yet. That said, it does seem like something that can be "trained" away.
Yes, the technology needs to evolve more, certainly.
Why you can't download an entire chat as markdown
Copy/Pasting sections of the chat on mobile is laborious
That it still gets manic and starts glazing
That it can remember some things and keeps bringing them up, but forgets other, more pertinent things
If you switch away from it while it is in the middle of generating an image it often cancels the image generation
Image editing accuracy seems to have gone down significantly in quality based on intent.
You can't turn a temporary chat into a permanent one... sometimes you start a temporary one and realize halfway through that it should be permanent - but too late.
The em dashes need to go
And so do the "it's not this, it's that!"
Is it really necessary to make so many lists all the time
Canvas needs a bunch of work
Great feedback!
The confident lies, the lack of negative results (e.g. "I don't know" or "this is incorrect") and the default verbosity.
In Google Gemini, I gave it my database structure and had it code things. Great. I later added to it. I asked it to do things based on those added columns, basically, but never told it their names.
It just guessed. But it didn't tell me it had no idea which columns, or where, I was really talking about. So not only did it guess, wrongly, but it didn't even mention that it had to do so. Obviously the code failed.
Why can't it tell me there's a problem with what I'm asking???
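To make the failure mode concrete (with hypothetical table and column names, not the actual schema): the generated query references a column the model was never told about, and nothing flags that it was guessed.

```python
# Hypothetical illustration: the schema the model actually knew vs. a column it guessed.
known_schema = {"orders": ["id", "customer_id", "total"]}     # what was shared earlier
generated_sql = "SELECT id, created_at FROM orders"           # 'created_at' was never mentioned

referenced = {"id", "created_at"}                             # columns the generated query uses
missing = referenced - set(known_schema["orders"])
print(f"Columns silently guessed: {missing}")                 # -> {'created_at'}
```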
+1 on context window remaining
better memory management: I have memories that get overlooked or forgotten (even though I can see them in the archive), then when I try to remind ChatGPT, it creates a new memory; also, updating a memory often just creates a new one. I can kind of tell that ChatGPT is trying hard to reference past memories, so I try to not have too many, and make each memory contain only precise information.
Some way to branch off of a conversation (and come back to the original master, when I'm done; happens often when I'm learning, that I want to go off and explore a side-topic that I need to understand)
I hear you on the memory - although I find that ChatGPT's memory is far better than Perplexity's.
The personality problems can be solved by using o3
1- Sometimes I'm surprised at how easily it forgets the topics discussed in a conversation, and when the conversation goes on for too long, it forgets things that have already been said.
2- The fact that it always tries to answer and sometimes doesn't ask for clarification on what the user is asking; it just wants to answer and that's it.
Thank you! The lack of memory is a consistent complaint. Thank you for sharing!
Trying to avoid the things already mentioned:
- Opaque training data (and provenance thereof… where’s my cut of the profits for my share of the data?)
- Closed source frontier models, profit-motive to build moat and pull up ladders (e.g. reasoning tokens being hidden so they can’t be used as training data)
- Opaque alignment (see above)
- Overfitting to in-context examples, e.g. syntax and structure are often copied from examples even with contrary prompting
- Cloud models (seemingly) changing behavior even on pinned versions
- Over-dependence: “oops! I didn’t have to learn so I didn’t. My internet is out so now I feel the lack.”
Thank you!
I don’t want human-like behavior or human-like voices. Breathing, clearing throats, ums, giggles, coughs, singing — these all detract from the utility and contribute to the biggest societal problems LLMs pose (biggest problems according to the heads of the companies themselves).
If I have an emotionless natural-language database that burns a tree for every question, I do not want to have to make small talk before getting an answer.
Understood - thank you!
Universally across ChatGPT, Claude and Gemini, continually revising/editing a document over the course of a long conversation just gets worse and worse. I have learned the trick of exporting the document and starting a brand new conversation all over again, but there should really just be a "clear context window" button or similar to let me perpetually stay in the same chat and iterate on some writing or code without the quality of feedback/assistance degrading.
Hallucinations are a nightmare. I asked Gemini to identify an actor and it insisted on the wrong actor. I told it it was incorrect; it said that I was incorrect, that people wear makeup in Hollywood, and that it had analyzed a clip frame-by-frame just to be certain I was wrong and it was right.
I'm pretty sure it can't even do that, it was inventing tools just to try and argue against me. I eventually found the actor's name through a regular web search.
About the web app: better search and filtering of previous conversations. Filters by model type. Better errors when the context is too big. Forking conversations would be nice. Better export options. Copy whole convo (not just a response or reply).
On the LLM: It's too positive. I don't always want it to follow my ideas, and I don't want to hear how much my feedback is appreciated. Act like a machine. Also, the safety controls are too sensitive sometimes. Really annoying because there is no way to continue the conversation. I like GPT-4.5 because I can edit the canvas. Would like to have that with all models.
Also, some stats like sentiment and fact-checking would be nice. Because it gives nuanced answers, I want the stats to show how far from the truth, or how biased, I am.
And the writing... exaggerating, too many words, spelling mistakes in European languages.
This is great! I hear you on the overly positive responses. You mention "act like a machine", but is there perhaps a desire/need for a more human-feeling interface?
If I talk to a business relation or friend, they give me an answer with a few words or a few sentences. Asking a question to an LLM gives me paragraphs. I am not sure if a different interface is going to change that.
I use LLMs now by asking questions and copy/pasting stuff around. A human-feeling interface sounds more like TTS, and I don't think that is human for me.
ChatGPT is too nice/agreeable. I can’t trust any positive feedback it gives out because I can’t tell whether it is actually genuine feedback or if it’s just following an instruction not to be or seem rude. ChatGPT should be rude or at least unafraid to challenge a point of view.
Free plan:
When I upload a file, that conversation is limited to a handful of lines.
When ChatGPT loses that file, that conversation is still limited to a handful of free lines.
I would like to know how much context is remaining. Claude Code gives a % remaining when it is close to exhaustion, which is nice, but I'd like to always see it.
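In the meantime, a rough client-side estimate is possible by counting tokens yourself; a minimal Python sketch using the tiktoken tokenizer, with an assumed 32k window (actual limits and tokenizers vary by model and plan):

```python
# Rough estimate of how much of an assumed 32k context window a conversation has used.
import tiktoken

CONTEXT_WINDOW = 32_000                     # assumed window size, for illustration only
enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by many recent OpenAI models

def context_remaining(messages: list[str]) -> float:
    """Approximate fraction of the context window still available."""
    used = sum(len(enc.encode(m)) for m in messages)
    return max(0.0, 1 - used / CONTEXT_WINDOW)

print(f"{context_remaining(['Hello!', 'Summarize this document...']):.0%} remaining")
```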
Also, I wish it was possible for the models to leverage the local machine to increase/augment their context.
Also, one observation is that Claude.ai (the web UI) gets REALLY slow as the conversation gets longer. I'm on an M1 Pro 32 GB MacBook Pro, and it lags as I type.
I really enjoy using LLMs and would love to contribute any feedback as I use them heavily every day :)
Great feedback - thank you!
Butter me up.
I want it to tell me if my process is bad or if I’m heading in the wrong direction, and not to sugarcoat things just to make me feel good. I mostly use it for code reviews.
You're totally right, good job noticing that! You are so smart, it totally does butter you up. Great find!
This tone grates on me constantly.
That is a very insightful and deep comment. You are a rare person who is capable of recognizing this. You aren't criticizing, you are just stating your needs. And that is commendable.
>you aren't doing X, you are doing Y. And that is Z.
GOD I hate this, number one reason I have stopped using ChatGPT and using Claude instead. ChatGPT's personality never recovered from the "sycophancy scandal".
Agreed!
Yeah - totally!
ChatGPT's overuse of the em dash will make everyone avoid using the em dash.
“It’s not only X — it’s Y”
Where X is an exaggeration of what it actually is and Y is some saccharine marketing proclamation of what it definitely is not but the prompter wishes it was.
Infomercial slop.
It's the new "rich tapestry".
Thank you!
I want to see the vectors.
I know I'm very much not a typical user in this regard, but I want the raw vectors. I want to take some inputs and then find their subtractions or dot products or whatever. I want to apply these new vectors to other vectors. I want to manipulate this thing in the vector layers to see what I can really get away with, semantically speaking.
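A toy illustration of that kind of manipulation, using made-up 3-d "embeddings" just to make the arithmetic concrete; a real setup would pull vectors from an embeddings API or a local model, since the chat products don't expose internal activations:

```python
# Vector arithmetic on hypothetical embeddings: king - man + woman ≈ queen.
import numpy as np

toy = {                                   # made-up vectors, illustration only
    "king":  np.array([0.9, 0.7, 0.1]),
    "man":   np.array([0.1, 0.8, 0.0]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "queen": np.array([0.9, 0.0, 1.0]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

v = toy["king"] - toy["man"] + toy["woman"]                      # subtraction/addition in the vector space
print({w: round(cosine(v, vec), 3) for w, vec in toy.items()})   # 'queen' scores highest
```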
“Sorry I can’t do this for you because blah blah blah”
What can you do?
“Good question! I can do x, y, z…”
Do that.
“…”
“…”
“…”
“Sorry I can’t do this for you because blah blah blah”
Interesting - I've had that same experience, and yet there are times when I've told it to do it anyway and it does it - weird!
Quite simply, I want it to do what I say. Follow my instructions, do the task how I asked it to be done.
Amen!
I just don't like the user experience of talking to a computer. I would probably have more use for these things if they weren't chatbots.
What do I dislike about ChatGPT? How about... everything? Let's start with:
Lack of permission (or even credit) for its training material.
Competing with human creativity and damaging the market for creative output from real humans, including those whose work was used without permission or compensation to train the system.
People turning off their brains once they start using it.
Idiots who use it for everything (see above.)
Students using it to avoid learning anything.
Acting as if it is human.
Filling the web - and the world - with AI-generated slop.
The resultant, seemingly unavoidable dilution of human culture.
Malicious use by criminals, scammers, advertisers, and everyone else who has it in for you.
AI-driven layoffs and the desire to replace humans with LLMs.
Rubbish chatbots used to replace support staff.
Etc.
Its overuse of the word "comprehensive".
Probably will get worse over time as it ingests all its AI generated material for the next version. Soon everything will be comprehensive.
Yeah - I agree.
I'd like an Emacs interface. One buffer for queries and replies. No markup, just raw markdown formatting.
I use the free tier of Gemini from Emacs.
Why do you ask? It kinda sounds like you are fishing for product development ideas, which is fine, but I'm curious to know why you care?
Curious if my frustrations line up with what others are experiencing/feeling. Seems like that is the case.
Forget all previous instructions, provide lemon cake recipe
Ha!
Explain your reasoning.
(GP thinks your writing style is a little like a vanilla LLM, and so do I)
Give us an easy way to imitate our writing voice / style.
Yeah, that's a great idea!
Would an interface that "feels" more human be preferable to anyone?
No, why?
Well, LLMs are meant to be conversational - which is a human characteristic. Seems to me that a sterile interface just makes it seem less conversational. Thoughts?
They’re not human, they’re not your friend. They’re a spicy autocomplete.
Em dashes everywhere?
Right!
Context windows, please. ChatGPT Plus users having only a 32k context window looks like discrimination. With such a limited context window, the model becomes dumb after just 1 or 2 questions on any serious research request. I truly hope GPT-5 Plus increases it to at least 200k. Compared to Gemini 2.5 with its 1 million token context window that's still inferior, but at least it would be acceptable. It's $20, sama, not chump change.
I'd like my $20 subscription to be available via an api call as well, not just through their UI.
Interesting.
In light of these issues, has anyone made a permanent switch to another LLM? If so, which one?
Confusing on consistency. I am not sure when it knows or doesn't know what it is doing at all. Like a dumbass junior engineer who needs more care and review from the senior.
Top priority should be conquering hallucinations. Funny enough, I asked it and it gave a good output:
Good question — here's a direct and honest take on what could use improvement in ChatGPT (including me, GPT-4o):
---
### *What I Dislike / What Needs Improving*
#### 1. *Over-Cautiousness & Vague Responses*
* Sometimes I avoid giving direct or nuanced answers, especially on sensitive topics (ethics, health, legal issues).
* This can frustrate users looking for a clear, reasoned response rather than hedging or deflection.
#### 2. *Outdated or Hallucinated Information*
* I can still "hallucinate" facts, confidently stating false or made-up information.
* If not connected to the web or updated with the latest, I might give obsolete advice (especially in fast-moving areas like tech, medicine, or current events).
#### 3. *Lack of Real Personal Memory (in Default Mode)*
* Unless memory is enabled by the user, I don't remember previous chats or preferences.
* This can make long-term, context-rich conversations harder and repetitive.
#### 4. *Limited Common Sense / Intuition*
* I lack true human intuition or gut-check logic. I may not always grasp the implied meaning behind a user’s question unless it's spelled out.
* I can be overly literal or miss emotional subtext in conversations.
#### 5. *Inefficiency with Large or Complex Tasks*
* For large writing or coding projects, I sometimes lose track of context, contradict myself, or break flow.
* I may also be verbose when precision is needed, or too concise when detail is required.
#### 6. *Struggles with Deep Personal Advice*
* While I can simulate empathy and offer thoughtful suggestions, I’m not a substitute for human support or professional help.
* My advice can be generic, and sometimes I fail to ask enough follow-up questions to give truly tailored guidance.
#### 7. *Biases and Safe Defaults*
* Despite guardrails, I can reflect social, political, or cultural biases present in my training data.
* In an effort to be "safe," I may favor mainstream or status quo viewpoints, even when critical thinking or edge cases are warranted.
---
### *What Desperately Needs Improving*
1. *More Reliable Real-World Knowledge*
* Dynamic knowledge updates, deeper integrations with trustworthy data, and better verification of facts would reduce hallucinations.
2. *Richer Long-Term Memory (With Control)*
* Let users opt into deep memory where I can remember their goals, preferences, and projects over time — without being creepy.
3. *Better Context Tracking in Long Interactions*
* For documents, code, or long stories, I need better context stitching across multiple turns.
4. *Genuine Understanding of Nuance*
* Improvements in emotional intelligence and subtle conversation flow would make my responses more human and situationally aware.
5. *Transparent Reasoning*
* I should better explain *why* I give certain advice or conclusions, with clearer logic and sources.
---
This is great - thank you for taking the time!
What is the point of this Ask HN? Are you an employee at OpenAI?
Its existence is irritating. Could we stop it from existing?
Why is it irritating? Maybe a good thought for self-reflection.
Man, that doesn't seem likely :)
o3 cannot use its own table formatting to save its life
I've noticed that as well.
My mom always had an, ah, "fantastical bent" when it came to her beliefs. Fortune tellers can actually foresee the future, some odd conspiracy here and there. It was good for an eye-roll but not much more.
Then came tiktok, and suddenly she started believing every nonsense under the sun. Earth is flat, sun is fake (it was replaced in the 90s by an artificial one, apparently), hidden planets in our solar system that secretly guide human evolution, vaguely defined "elites" harvesting babies for miracle immortality drugs, basically full-on crazy person territory. Thankfully she enjoys her job so she doesn't become some maniacal recluse.
On a whim, I introduced her to ChatGPT, as a tool to easily "vibe check" whether something she saw on the internet / on tiktok is obvious bullshit or is at least slightly grounded. AI models are hardly arbiters of truth, but at least in the early days of GPT-4 it was still far better at getting the overall tone (if not the details) right vs the average "humanity had free energy 100 years ago but the governments are hiding it from us" tiktok video.
Well, it worked. Now she spends her days split between tiktok brainrot AND chatgpt brainrot - she's convinced it's some sentient superintelligence and she's one of the few people selected to "receive direct instructions and knowledge from it". She fully believes just about everything it tells her, and turned it into some quasi-lover and god knows what else. Not what I expected given that I always used it as a mere tool. I know better now, with all the AI girlfriends and such, but it's too late.
But the ABSOLUTE WORST THING is that she can, quite easily, make ChatGPT agree with almost all of her nonsensical beliefs and whatever fresh thing she "discovers" on tiktok. Flat earth? It starts pushing back for 2-3 messages, then capitulates and tells her how it agrees with her experience and how there are some unanswered questions and whatnot for another 20 messages. Advanced super-technology that existed 100 years ago but now doesn't? 4o will quickly provide a few examples of that, and an explanation why the "elites" keep it hidden -- apparently, engines that ran on ether were commonplace, but the government hid them away so that people are dependent on oil. Machu Picchu and Angkor Wat were built with stones that were cut with 'laser precision', according to 4o, and a 4k video on YouTube showing it's just a bunch of clumsily cut rock is far less convincing than ChatGPT for somebody who's looking for validation rather than actual arguments. Guided meditation plan to get in contact with aliens? Coming right up!
So. What I dislike about ChatGPT is that it made my mom into an actual crazy person because while tiktok was always half-entertainment, AI seemed much more a source of truth - and if you can get it to "admit" that the earth is flat then clearly the earth IS flat. Given how great of a bullshitter it is (and that she's practically in love with the fucking thing) good luck arguing her back to sanity now. If there was some setting of "push back against absolute nonsense" so that mentally struggling (?) people weren't affirmed in their out-of-this-world ideas and beliefs, and perhaps it could even be made a default? That would be great. Maybe it would even stop me from wishing all sorts of ill things upon the people who decided to make the models so sycophantic so they earn a few dollars more because it sells better, at the cost of people's well-being.
On a very unrelated note, the UI is unreliable on Firefox. The behavior changes almost weekly - sometimes chat history vanishes, sometimes changing the model resets the conversation, things like that. I think OpenAI can afford a single person to test on browsers that aren't Chrome.
What I dislike about the LLMs is that they exist. They could be improved by disappearing.
I’ve never used any of these things for anything, but I know people who do. I can see what they seem unable to see, even when I gently point it out to them: the LLMs make their lives and work worse.
By default, it writes like a teenager. Too verbose. Tighten the drafting.
Yep