Meanwhile they are pushing AI transcription and note taking solutions hard.
Patients are guilted into allowing the doctors to use it. I have gotten pushback when I asked to have it turned off.
The messaging is that it all stays local. In reality it doesn’t: when I last looked, it was running on Azure OpenAI in Australia.
I spoke to a practice nurse a few days ago to discuss this.
She said she didn’t think patients would care if they knew the data would be shipped off site. She said people’s problems are not that confidential and their health data is probably online anyway, so who cares.
It's honestly such a big problem. One of my colleagues uses an AI scribe. I can't rely on any of his chart notes because the AI sometimes hallucinates (I've already informed him). It also tends to write a ridiculous amount of totally unnecessary detail while leaving out important details, so I still need to comb through patient charts for those (med rec, consults, etc). In the end it creates more work for me. And if my colleague ever gets a college complaint I have no clue how he's gonna navigate any AI-generated errors. I'm all for AI and it's great for things like copywriting, brainstorming and code generation. But from what I'm seeing, it's creating a lot more headache in the clinical setting.
If you're wondering why this guy doesn't just check the AI scribe notes: probably because, with the amount of detail it writes, he'd be better off writing a quick SOAP note.
> I'm all for AI and it's great for things like copywriting, brainstorming and code generation
It's funny how the assumption is always that LLMs are very useful in an industry other than your own.
I mean they are not wrong.
For all the whinging about bugs and errors around here, the software industry in general (some niche sub-fields excepted) long ago decided 80% is good enough to ship and we will figure the rest out later. This entire site is based on startup culture, which largely prided itself on MVP moonshots.
Plus plenty of places are perfectly fine with tech debt, and the AI fire hose is effectively tech debt on steroids. But while it creates that debt at scale, it can also help in understanding it.
It is its own panacea in a way.
I think it is gonna be a while before the industry figures out how to handle this better, so we might as well just ride the wave and not worry too much about it in software.
Still, software is not medicine, even if software is required in basically every industry now. Medicine should be more conservative and wait till things settle down before jumping in.
It feels very much like AI is creating AI lock-in (if not AI _vendor_ lock-in) by creating so much detailed information that it's futile to consume it without AI tools.
I was updating some GitLab pipelines and some simple testing scripts and it created 3 separate 300+ line README-type metadata files (I think even the QUCIKSTART.md was 300 lines).
My (extensive) experience with LLM code generation is that it has the same issues you describe in your field: hallucinations, over-engineering, missed requirements/patterns.
But engineers have these same problems. The key is that the content creator (engineers for codegen, doctors for medicine) is still responsible for the output of the AI, as if they wrote it themselves. If they make a mistake with an AI (eg, include false data - hallucinations), they should be held accountable in the same way they would if they made a mistake without it.
Okay, but since we know how humans actually behave, they will fully trust the nondeterministic machine and give away their thinking. Sadly there is a large swath of humans who will act like this, maybe 20-30%.
Are you willing to put your life in the hands of these people fully using the machines to do everything?
Acting like smart people aren't getting one-shotted by these machines is very dangerous. Even worse is how quickly your skills actually degrade. If I knew my doctor was using anything LLM related, I would switch doctors.
> I'm all for AI and it's great for things like copywriting, brainstorming and code generation
That's funny. I would have said the same thing about your field prior to reading your comment.
sounds like they need a better instructions.md
Is there nothing like HIPAA there or what?
Very few protections. The entire medical records of a significant percentage of the NZ population were stolen recently and put up for sale online. Zero consequences for the medical practices who adopted the hacked software.
Interesting, a person was telling me recently that NZ privacy laws were quite strong. Perhaps a different category.
https://news.ycombinator.com/item?id=44564349
The laws are; the policing is not. At least not for medical data.
Many AI companies, including Azure with their OpenAI hosting, are more than willing to sign privacy agreements that allow processing sensitive medical data with their models.
The devil is in the details. For example, OAI does not have regional processing for AU [0] and their ZDR does not cover files [1]. Anthropic's ZDR [2] also does not cover files, so you really need to be careful, as a patient/consumer, to ensure that your health or other sensitive data that is being processed by SaaS frontier models is not contained in files. That is asking a lot of the medical provider, who would need to know how their systems work. They won't, which is why I will never opt in.
[0] https://developers.openai.com/api/docs/guides/your-data#whic...
[1] https://developers.openai.com/api/docs/guides/your-data#stor...
[2] https://platform.claude.com/docs/en/build-with-claude/zero-d...
Azure OpenAI is not the same as paying OpenAI directly. While you may not be able to pay OpenAI for them to run models in Australia, you can pay Azure: https://azure.microsoft.com/en-au/pricing/details/azure-open...
The models are licensed to Microsoft, and you pay them for the inference.
There is no way to upload files as part of context with Azure deployments; you have to use the OAI API [0]. Without an architecture diagram of the solution, I am not going to trust it, given the known native limitations of Azure's OAI implementation.
[0] https://github.com/openai/openai-python/issues/2300
I spec'd up an implementation of this that uses a hardware button with colors that is in reach of either party. The customer went with a different vendor based on price/"complexity"/training.
The wilful ignorance and total apathy is appalling.
I've had similar experiences in Australia. I emailed one of my docs' practices asking if they use Heidi AI (or anything similar) and stating that I do not consent. They were using it without my consent.
In the consultation, he tried to give me the spiel, including the 'it stays local' thing. The Heidi AI website has the scripts for clinicians; he ran through them all.
Oh, their documents for clinicians also mention every two sentences that patient/client consent is not required at all. I wonder why they keep saying that? Hmm.
This doctor knows I am a developer. When I asked him to explain what he meant by 'local data', he said the servers were in Australia. I almost flipped the desk. Aside from the fact that it is mandatory (it's the law! they do not have a choice!), it's ...kind of meaningless where the servers are, especially when he (on behalf of Heidi AI) was trying to sell it as a security or privacy feature. When I pointed that out, he just couldn't wrap his head around it. Of course he can't, he doesn't understand.
AHPRA's "Meeting your professional obligations when using Artificial Intelligence in healthcare" guideline[0] (not any kind of enforceable requirement, unfortunately) has great stuff in it. It encourages using it with the informed consent of patients. Even if my doctor read it and agreed with it, and cared about getting consent, how the hell can he inform patients sufficiently when he has absolutely no idea about, well, anything?
He keeps pushing it and asking me about whether I've changed my mind about allowing him to use it. No! He keeps asking me questions that only confirm he hasn't even done a perfunctory web search about why some people hate LLMs, especially in the context of PII and PHI.
I really do feel for clinicians, but these products are not the answer.
[0] https://www.ahpra.gov.au/Resources/Artificial-Intelligence-i...
The New Zealand Chief Digital Officer allowed Australian cloud providers to be used as there weren't suitable NZ data centers, and that was many years ago.
Health NZ adopted Snowflake. It was about costs/fancy tech. We have always had data centres. Nobody *needs* Snowflake. They could have used Apache Spark.
What are you talking about? NZ has had suitable DCs for decades now.
I genuinely think people “care” about it (in quotes). It’s one of those things where nobody cares unless something bad happens, and when a bad thing happens, they shrug it off and forget about it a week later.
I’d go as far as saying she’s right. And we’re in a tiny minority for even thinking about it.
Didn't Health NZ just suffer a major data breach and have patient records ransomed?
There were two serious breaches recently but they were at private companies not HNZ.
The union rep gets it - people improvise when you cut their tools and then threaten discipline for improvising.
That memo is how you make staff hide things instead of asking for help.
The scarier part though is that LLM-written clinical notes probably look fine. That's the whole problem. I built a system where one AI was scoring another AI's work, and it kept giving high marks because the output read well. I had to make the scorer blind to the original coaching text before it started catching real issues. Now imagine that "reads well, isn't right" failure mode in clinical documentation.
Nobody's re-reading the phrasing until a patient outcome goes wrong.
ASR models can output a confidence score along with the text, but it is rarely used in the UI to display the results, or it may be lost entirely in a subsequent LLM layer.
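To make that concrete, here is a minimal sketch of what surfacing per-word confidence in a UI could look like. The `Word` type, the 0.8 threshold, and the `[?...?]` marker style are all made up for illustration; real ASR engines expose confidence in their own formats, and nothing here reflects how Heidi or any other scribe product works.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    confidence: float  # 0.0-1.0, as reported by a hypothetical ASR engine


def flag_low_confidence(words, threshold=0.8):
    """Render a transcript, wrapping uncertain words in [?...?] markers."""
    rendered = []
    for word in words:
        if word.confidence < threshold:
            # Keep the uncertainty visible instead of silently dropping it.
            rendered.append(f"[?{word.text}?]")
        else:
            rendered.append(word.text)
    return " ".join(rendered)


words = [
    Word("metoprolol", 0.55),
    Word("50", 0.97),
    Word("mg", 0.99),
    Word("daily", 0.92),
]
print(flag_low_confidence(words))
```

The point is only that the score has to survive all the way to the reviewer; if a later LLM layer rewrites the text, the word-level scores no longer line up with anything.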
Physicians need to have it pounded into them that every hallucination is downstream harm. AI has no place in medicine. If they insist on it, then all transcripts must be stored with the raw audio, which should be accessible side by side, with the lines of the transcript time-coded. It's the only way to actually use these safely while guarding against hallucinations.
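A rough sketch of the time-coded idea, assuming each transcript line keeps its start and end offsets into the raw recording. `Segment`, `fmt_timecode` and `find_segments` are hypothetical names, not any product's API, and the sample lines are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start_s: float  # offset into the raw audio, in seconds
    end_s: float
    text: str


def fmt_timecode(seconds: float) -> str:
    """Format a second offset as MM:SS for display next to the transcript."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def find_segments(segments, phrase):
    """Return the segments mentioning a phrase, so a reviewer can replay that audio."""
    needle = phrase.lower()
    return [s for s in segments if needle in s.text.lower()]


segments = [
    Segment(0.0, 4.2, "Any chest pain since the last visit?"),
    Segment(4.2, 9.8, "No chest pain, but some shortness of breath on stairs."),
    Segment(75.4, 81.0, "Let's keep the metoprolol at 50 mg daily."),
]

for seg in find_segments(segments, "metoprolol"):
    print(f"[{fmt_timecode(seg.start_s)}-{fmt_timecode(seg.end_s)}] {seg.text}")
```

With offsets stored like this, a claim in the generated note can be checked against the few seconds of audio it supposedly came from, rather than by replaying the whole consult.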
Raw audio is a cool idea! I've seen a similar approach in other domains, "keep the source of truth accessible so you can verify the AI output against it".
I wouldn't go as far as "no place in medicine" though. The Heidi scribe tool mentioned in the article is a good example, because in the end it's the doctor who reviews and signs off.
IMO the problem is AI doing the work with no human verification step, but I can 100% agree I don't want to have vibe-doctor for my next surgery/consult :D
> Physicians need to have it pounded into them that every hallucination is downstream harm.
I think any person using 'AI' knows it makes mistakes. In a medical note, there are often errors at present. A consumer of a medical note has to decide what makes sense and what to ignore, and AI isn't meaningfully changing that. If something matters, it's asked again in follow up.
I know at least one GP who has stopped using Heidi Health for transcription. He has noticed many errors (as have I, in transcriptions from my own medical professionals), far too many to be comfortable. Things might improve, but not yet.
This is where I'm at as a GP. Every few months I give Heidi another try, but I haven't noticed any real improvement over the last two years. It spends lots of words on trivial nonsense and misses clinically significant points and sometimes entire issues. It takes far more time to review and fix the notes than it saves in typing. Presumably it will be good enough one day, but it's not there yet.
It's Gell-Mann amnesia: you notice the errors in fields where you're an expert.
https://en.wiktionary.org/wiki/Gell-Mann_Amnesia_effect
Yeah, no privacy or security there. There are some tools explicitly designed to help healthcare providers produce better notes faster, and a couple of them are AMAZING. I'm an AI-half-empty guy, I'm keenly aware of its shortcomings and deploy it thoughtfully, and even with my skepticism there are a couple of tools that are just plain great. I think using LLMs to create overviews and summaries is a great use of the tech.
The one my doctor was using got my obs numbers completely wrong.
We had to correct them at the end of the consultation.
Gotta break a few eggs to save 2 minutes of thinking and work
I have seen the evolution of these tools and I think they are going to push a fundamental change to medical care. Notes have been getting more and more abused, at least in the US. Big health systems want them for a lot of reasons that have nothing to do with helping a practitioner improve the care of their patient. They want to capture every billable moment of that encounter and potentially prep things like labs, appointments, clinical trial screening, pre-auths, etc. Some of this is good for the patient but a lot isn't. Also, the reality is that many practitioners spend as much, or more time, on the note than on the patient. That clearly isn't to their benefit. There is a reason they sit there and type constantly while talking to you and that doesn't stop when you leave the room. The demands on them to document everything so that all the accounting can happen are actually harming healthcare.
I think there is a chance that these systems will lead to a change where the note isn't the fundamental record of the encounter. Instead different artifacts are created specifically for each entity that needs it. Billing gets their view, and scheduling gets theirs, and, etc etc... It will, hopefully, give the practitioners a chance to get back to focusing on the patient and not ensuring their note quality captured one more billable code. Of course the negative is also likely to happen here too. As practitioners spend less time on the note they will likely not get that back in time with individual patients, but instead on seeing more patients. It will also likely lead to higher bills as the health systems do start squeezing more out of every encounter. There is no perfect here when profit is the driving motivator but with this much change happening I can only hope that it causes the industry as a whole to shake up enough to maybe find a new better optimum to land in.
>I think there is a chance that these systems will lead to a change where the note isn't the fundamental record of the encounter. Instead different artifacts are created specifically for each entity that needs it. Billing gets their view, and scheduling gets theirs, and, etc etc..
This is what an EHR does somewhat. The discrete data elements in the DB and the way they are displayed in the system are a better record than free text notes.
The problem is creating standards so this data is easily exchanged. Anyone can read and parse a free text note - but if we had standards this would be less necessary.
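As a toy illustration of "discrete data elements, different views per consumer": the `Encounter` fields and the two view functions below are hypothetical and much simpler than any real EHR schema or exchange standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Encounter:
    patient_id: str
    diagnoses: List[str] = field(default_factory=list)
    procedure_codes: List[str] = field(default_factory=list)
    follow_up_days: Optional[int] = None  # None means no follow-up requested


def billing_view(enc: Encounter) -> dict:
    """Only what billing needs: who was seen and which codes apply."""
    return {
        "patient_id": enc.patient_id,
        "procedure_codes": list(enc.procedure_codes),
    }


def scheduling_view(enc: Encounter) -> dict:
    """Only what scheduling needs: who to book and when."""
    return {
        "patient_id": enc.patient_id,
        "follow_up_days": enc.follow_up_days,
    }


enc = Encounter(
    patient_id="example-001",
    diagnoses=["hypertension"],
    procedure_codes=["EXAMPLE-CODE-1"],
    follow_up_days=30,
)
print(billing_view(enc))
print(scheduling_view(enc))
```

Each consumer reads from the same structured record instead of parsing free text, which is exactly where agreed standards for the field names and codes start to matter.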
This will always happen as long as there is a combative relationship between private insurers and providers over reimbursements. Each side is using documentation or lack thereof to make their financial case.
FYI, AI adoption in health in NZ is moving forward, for example https://www.rnz.co.nz/news/national/589774/emergency-doctors...
This is just about not using free/public AI tools.
that's mentioned in OP article.
Heidi is frustratingly consistent at hallucinating stuff. I've seen it in almost all of the dozen or so summaries I've had from medical people recently (surgeon, physio, consultant). A GP I know tried for a month and then was like 'it's not worth the risk exposure to me or my patients'.
AI doesn't forget, and soon all New Zealanders will have their health histories internalised by AI so it can individually calculate insurance premiums without knowing why....
This is a ridiculous sentence. Of course inference-only AI forgets. You can literally just program it that way.
In fact, it's human transcribers who choose whether to forget the details of a case or whether to share the details of an especially funny patient with their buddies at the bar.
This is a blatant violation of patient privacy. That the output is often hallucinated doesn't even matter here. If the hospital wants to use LLMs, better deploy them on-premise or a trusted network at least.
Noticing that people in the west would rather speak about privacy than more efficient or cheaper healthcare.
Enterprises are ok sharing their code base with OpenAI. I think it should be okay for patients.
Enterprises do that if they choose. Patients can choose as well. And it's their choice, not anyone else's