Nice, I'm looking forward to seeing how this performs in practice. FFmpeg's previous AAC encoder produced poor quality output and often had irritating chirping artifacts, so I've always had to install Apple's Core Audio encoder on any computer I do video recording on to get decent sound. I've done A/B/X comparisons and found that a 320kbps MP3 sounds better than a 320kbps AAC encoded by FFmpeg, but about the same as a 256kbps AAC encoded by Core Audio. If installing Core Audio is no longer necessary, that'll be a huge improvement and people who use something like OBS to do screen recordings or streaming will get a massive sound quality boost the next time they update.
Why not use a lossless codec if you care about quality? Or use Opus: it's decent for speech and works pretty much anywhere these days.
> Why not use a lossless codec if you care about quality?
(1) Lossy codecs are transparent at half the file size (or less) of FLAC/ALAC.
(2) AAC (strictly, AAC-LC) is universal, where FLAC and Opus are not yet there.
Man what a showcase for Opus this is.
Don't get me wrong, this sort of thing is a valuable exercise and we are better off with better encoders for these older codecs. But look at the numbers for Opus on this benchmark. It simply blows all the AAC encoders out of the water even at 64 kbps.
> Man what a showcase for Opus this is.
I take it you mean this Opus (https://en.wikipedia.org/wiki/Opus_(audio_format)) not that Opus (https://en.wikipedia.org/wiki/Claude_(AI)).
I read almost all the way through your comment thinking there was a decent probability you were saying this new AAC encoder was written with Claude Opus.
The biggest advantage for having a good AAC encoder isn't efficiency, it's that for nearly the past 2 decades the de facto standard for live streamed video has been RTMP with H.264 video and AAC audio. There is basically no support for any other codecs. If you want to send a video stream to Youtube or Twitch, you will be sending H.264 and AAC. If you want an idea of how ubiquitous this is, I just checked in OBS and it will not even let you select different video and audio codecs in streaming mode, it just (correctly) assumes that anybody who's streaming will be streaming H.264 and AAC.
Plus, at 96+ kbps (assuming an Apple-quality AAC-LC encoder) Opus loses its quality advantage. So at higher bitrates, the benefit of choosing Opus is that encoders/decoders are royalty-free.
Sample-accurate editing with AAC is a pain though. Especially if you also have video, because the audio frame length and the video frame rate usually don't line up.
If you want flexibility without fully transcoding both audio and video, Opus is your friend
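To put rough numbers on the frame mismatch described above, here is a minimal sketch. It assumes 48 kHz audio, the 1024-sample AAC-LC frame, and Opus at its common 20 ms frame (960 samples at 48 kHz); the function name `realign_period` is made up for illustration. It only computes how often audio and video frame boundaries coincide, and ignores encoder delay and priming samples.

```python
from fractions import Fraction
from math import lcm  # Python 3.9+

SAMPLE_RATE = 48_000  # Hz, assumed for both codecs
AAC_FRAME = 1024      # samples per AAC-LC frame
OPUS_FRAME = 960      # samples per 20 ms Opus frame at 48 kHz (assumed frame size)


def realign_period(audio_frame, fps, rate=SAMPLE_RATE):
    """Return (audio_frames, video_frames, seconds) until an audio frame
    boundary and a video frame boundary fall on the same sample again."""
    # Exact video frame length in samples; fractional for rates like 30000/1001.
    video_frame = Fraction(rate) / Fraction(fps)
    # Smallest span that is a whole number of audio frames and of video frames.
    # With video_frame in lowest terms this is lcm(audio_frame, numerator).
    span = lcm(audio_frame, video_frame.numerator)
    return span // audio_frame, int(span / video_frame), Fraction(span, rate)


for name, frame in (("AAC-LC", AAC_FRAME), ("Opus 20 ms", OPUS_FRAME)):
    for fps in (25, 30):
        a, v, secs = realign_period(frame, fps)
        print(f"{name:>10} @ {fps} fps: {a} audio frames = {v} video frames "
              f"({float(secs) * 1000:.0f} ms)")
```

Under these assumptions, a cut on a video frame boundary lands on an AAC frame boundary far less often than on an Opus one, which is one way to read the "pain" mentioned above.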
Choosing a lossy audio codec has become such a no-brainer. Either use Opus and be done with it or, if for some reason Opus cannot be used, use AAC for compatibility at an insanely high bitrate, which gets you good quality without having to research which encoder and mode to pick.
Still, having a good-quality default AAC encoder is great. Though I don't get why it is mainly CBR.
I would like Opus, but I’m using a subsonic client on iOS and my choice has been Flac (Alac?), MP3, or AAC. Opus wouldn’t play (There are some that supported it, but I didn’t like their UX).
You might like Poppy (in beta), which supports all media servers (including OpenSubsonic/Navidrome) and Opus as a first-class music format. https://www.reddit.com/r/PoppyApp/comments/1tiyki0/about_pop...
The older I get, the more it seems possible to ping-pong between rewrites for good reasons (e.g. here, the metric is maxed out, but I find it hard to believe VBR and non-48 kHz rates are silly things that aren't worth investing in).
> The encoder was mainly optimized for 48Khz audio. Get over it. It's 2026, resampling is free, 48Khz is the standard. 44.1Khz will work, and so will 96Khz but use 48Khz if you want the best quality.
Is 48kHz really the standard nowadays?
48kHz makes alignment between video and audio so much easier. (I.e.: Lip synchronization after edits)
AAC has a strange quirk that the window size is dependent on the sampling rate, thus requiring a complete psychoacoustics reoptimization of all encoder parameters for each sampling rate, since a 20msec window sounds very different than a 60msec window, to human ears.
This was of course fixed in Opus.
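A quick way to see the quirk described above: an AAC-LC frame is a fixed 1024 samples, so its duration in time scales inversely with the sample rate. The sketch below just does that division; the list of sample rates is an arbitrary selection for illustration.

```python
AAC_LC_FRAME = 1024  # samples per AAC-LC frame (the long transform window spans 2048)


def frame_ms(sample_rate_hz, frame_samples=AAC_LC_FRAME):
    """Duration in milliseconds of one fixed-size frame at a given sample rate."""
    return 1000.0 * frame_samples / sample_rate_hz


for rate in (16_000, 44_100, 48_000, 96_000):
    print(f"{rate:>6} Hz: {frame_ms(rate):6.2f} ms per frame")
```

At 48 kHz this gives roughly 21 ms per frame and at 16 kHz 64 ms, which lines up with the "20msec" versus "60msec" figures in the comment.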
I know the opus codec assumes everything is 48kHz and will resample inputs to that.
For one, audio transcription services that use Whisper will resample the input down to 16 kHz mono first.
48kHz has been the recommended setting with Premiere Pro as long as I can remember.
44.1kHz, isn't that what LAME (MP3) uses by default?
It's what CDs use, so it would make sense for mp3 encoders to follow suit.
Yes, pretty much all new hardware uses it as default output setting as well (by that I mean laptops, phones, smart speakers, etc.)
More or less. Streaming is often done at 48 kHz, and video content has been 48 kHz for a while now, so unless you still produce content for CDs it is the standard.
44100 Hz had reasons that no longer really apply (storing audio as 3 samples per line on VHS: 490 lines × 3 samples × 30 fps = 44100 samples/s).
Quality-wise both are more than enough, and 99.99% of people would not be able to tell them apart in a blind test. Higher sample rates than 48kHz only needed when you want to pitch down ultrasonic recordings (of whales, bats and other such animals for example).
Aside from this, sample rates higher than 48 kHz may have only downsides, like increased size and potential distortion in the ultrasonic range that produces sidebands in the audible range. Yet there is a persistent but unscientific "more-is-better" crowd in the hi-fi sector.
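The 44100 arithmetic in the comment above can be checked directly. The NTSC-style figures (490 lines, 3 samples per line, 30 frames per second) are the ones from the comment; the PAL-style figures (588 lines, 25 frames per second) are not from the thread and are added here only as the commonly cited counterpart. The function name is made up for illustration.

```python
def video_pcm_rate(lines_per_frame, samples_per_line, frames_per_second):
    """Audio samples per second when storing a fixed number of samples per video line."""
    return lines_per_frame * samples_per_line * frames_per_second


ntsc_style = video_pcm_rate(490, 3, 30)  # figures from the comment
pal_style = video_pcm_rate(588, 3, 25)   # commonly cited PAL counterpart (not from the thread)

print(ntsc_style, pal_style)
```

Both combinations land on the same rate, which is usually given as the reason 44.1 kHz was convenient for video-based recorders.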
> Higher sample rates than 48kHz only needed when you want to pitch down ultrasonic recordings (of whales, bats and other such animals for example).
There are numerous use cases for higher sample rates that go beyond this but it's hard to talk about it without starting flame wars filled with junk science.
Say it or don't but "I have evidence otherwise but don't think I should say" is just as bad a flame war gateway as tempting the junk science audiophiles directly.
Higher sample rates are lower latency for the same block size and resampling is not "free" (pick 2: performance, aliasing, latency) so there can be advantages to working with audio archived at higher sample rates.
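The latency point is plain division: a processing block of N samples takes N divided by the sample rate to fill. A small sketch, where the 256-sample block size is a hypothetical example and not something from the thread:

```python
def block_latency_ms(block_size, sample_rate_hz):
    """Time in milliseconds needed to fill one block of `block_size` samples."""
    return 1000.0 * block_size / sample_rate_hz


BLOCK = 256  # hypothetical block size in samples

for rate in (44_100, 48_000, 96_000):
    print(f"{BLOCK} samples @ {rate} Hz: {block_latency_ms(BLOCK, rate):.2f} ms")
```

Doubling the sample rate halves the time per block, at the cost of processing twice as many samples per second.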
But all the advantages come down to professional or editing use cases. There's next to zero advantage to using it as a storage format for listening. Just like 24 bit audio (do you have an amp with 96dB SNR?).
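For context on the 96 dB figure: the theoretical dynamic range of linear PCM is often approximated as 20·log10(2^bits), about 6 dB per bit. The sketch below uses that simple approximation (it ignores dither and the small extra term some formulas add for a full-scale sine).

```python
import math


def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in decibels."""
    return 20.0 * math.log10(2 ** bits)


for bits in (16, 24):
    print(f"{bits}-bit PCM: about {dynamic_range_db(bits):.1f} dB")
```

By this estimate 16-bit audio already covers about 96 dB, so the extra range of 24-bit only matters if the rest of the playback chain has a noise floor below that.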
Just personally, I have seen little evidence (personally, professionally, or academically) that there is any advantage for lossless audio for consumer applications. For professional applications there are plenty, and it's endlessly tiring to convince people that "no, actually I need 96kHz for my use case."
I know that with oscilloscopes, it’s recommended to sample at 5x the highest frequency you want to capture instead of the Nyquist 2x, but the most reasonable argument I’ve heard for higher than 48kHz sampling is digital audio effects.
But for the end result 48kHz is more than necessary. I can’t even hear any frequency above 17kHz.
>FFmpeg's AAC DEcoder is busted with regards to stereo PNS, and the bug may be in other AAC decoders too, so we work around it in the encoder. Since no other encoder used PNS, the bug was not found until now.
I don't know what PNS is, but I bet this has been bothering someone's niche use-case for 20 years
https://www.audiolabs-erlangen.de/content/resources/aesCodin...
A very welcome addition, hopefully I can replace fdk-aac
Flagged for the wrong link.
Hopefully they see this - there's still time to edit the submission link.
It doesn't let me edit the link, but I'm confused by what even happened here... I posted this from my phone and that wrong link doesn't show up in my clipboard history.
Link should be: https://hydrogenaudio.org/index.php/topic,129691.0.html
It's fixed now.
Our software follows redirs and somehow we got a 302 to our own IP. Perhaps it is someone's idea of a bot detector?
Your options are:
* quick email to HN@ycombinator.com with a "Help Me please!!" and the link (mods can edit the link in and sideline (hide) these comments)
* Just live with the rotting fish head of public boo boo (we've all made mistakes, as the Dalek said whilst climbing down off the dustbin)
* I can kill the whole thing dead.