Astro/Solid - Hacker News

$twelvechairs 6 hours ago

Almost all of this is solved by basically putting quotes around strings.

Yaml has its use cases where you want things json doesn't do, like recursion or anchors/aliases/tags. Or at least it has had - perhaps cue/dhall/hcl solve things better. Jsonnet is another. I haven't tried them enough to test how much better they are.

$lucideer 3 hours ago

I feel like these two tenets - (1) yaml should require quotes & (2) the value in yaml is in recursion/anchors - are fundamentally the opposite of why yaml exists & why people use it.

The distinguishing draw of yaml is largely the "easiness" of not having explicit opening or - more importantly - closing delimiters. This is done using a combination of white-space delimiting for structure, & heuristic parsing for values. The latter is fundamentally flawed, but yaml fans think the flaws are a worthwhile trade-off. If you're going to bring delimiters in as a requirement, imho yaml loses its raison d'être.

Recursion/anchors/etc. on the other hand are optional extras that few use & some parsers don't even support. If they were the driving value of yaml they'd be more ubiquitous.

Disclaimer: I hate yaml & wish it didn't exist, but I do understand why it does & I frankly don't have a great suggestion for alternatives that would fill those needs. Toml is also flawed.

$com2kid 12 minutes ago

Genuinely curious - What major flaws does TOML have? I've used it before and it seems like a simple no-nonsense config language. Plenty of blog articles about the flaws behind YAML, I don't really see complaints about TOML!

$darkwater 2 hours ago

I see where you are coming from but YAML anchors are definitely a great and powerful feature that deserves more attention. The other day I was refactoring a broken [1] k8s deployment based on a 3rd-party Helm chart and since I didn't have the time to migrate to a better chart, YAML anchors permitted me to easily reduce YAML duplication, with everything else (Helm, Kustomize, Flux, Kubernetes) completely unaware of anything. Just a standard YAML pattern.

[1] the broken part was due to an ex-coworker that cheated his way out of GitOps and left basically "fake code" committed, and modified by hand (with Lens) the deployment to make it work

$munificent 3 hours ago

> The distinguishing draw of yaml is largely the "easiness" of not having explicit opening or - more importantly - closing delimiters.

Along with a coworker, I wrote the package manager for Dart, which uses YAML for its main manifest file (pubspec.yaml). The lack of delimiters is kind of nice but wasn't instrumental in the choice to use YAML.

It's because JSON doesn't have comments.

If there was a JSON+comments that was specified and widely compatible, we would have used that. YAML really is a brittle nightmare, and the lack of delimiters causes problems as often as it solves them. We wrote a YAML parser from scratch and I still get the indentation on lists wrong sometimes.

But YAML lets you actually, you know, comment out a line of text in it temporarily, and that's really fucking handy. I think if Crockford had left comments in JSON, YAML would be dead.

$rendaw an hour ago

JSONC is JSON with comments (and trailing commas) and it's fairly widely supported, namely because VS Code ships with support built in and they use it for all their config files. I've seen libraries for a number of languages.

VS Code defaults to complaining about trailing commas though (the warnings can be turned off though (it feels like a hack and they didn't properly document it though (it is an officially sanctioned procedure though))).
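
To make "JSON with comments and trailing commas" concrete, here is a rough Python sketch that reduces such a file to plain JSON before handing it to the standard `json` module. This is not how VS Code or any particular JSONC library does it; the helper names `strip_comments` and `strip_trailing_commas` are made up for illustration, and the only real care taken is to leave string contents untouched.

```python
import json


def strip_comments(text):
    """Drop // line comments and /* */ block comments that sit outside strings."""
    out, i, in_string = [], 0, False
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                i += 1
                out.append(text[i])  # keep the escaped character verbatim
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            newline = text.find("\n", i)
            i = len(text) if newline == -1 else newline  # the newline itself is kept
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            i = len(text) if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def strip_trailing_commas(text):
    """Drop a comma when the next non-blank character closes an array or object."""
    out, i, in_string = [], 0, False
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                i += 1
                out.append(text[i])
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch == "," and text[i + 1:].lstrip()[:1] in ("]", "}"):
            pass  # trailing comma: skip it
        else:
            out.append(ch)
        i += 1
    return "".join(out)


source = """{
  // editor settings
  "url": "http://example.com/a,]",  /* the string must survive intact */
  "sizes": [1, 2, 3,],
}"""

settings = json.loads(strip_trailing_commas(strip_comments(source)))
print(settings)
```

The two passes are kept separate so that a comment sitting between a comma and a closing bracket is already gone by the time trailing commas are checked.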

$lucideer 2 hours ago

> It's because JSON doesn't have comments.

This is a big plus but JSON5 has pretty widespread language library support - probably equal to that of YAML tbh (e.g. Swift has native JSON5 support, I don't know that anyone natively supports YAML). Any reason not to opt for it here?

$Diti an hour ago

Most protocols defined in RFCs require the use of regular JSON. You don’t have a choice.

$puzzlingcaptcha 4 hours ago

from the article:

>Many of the problems with yaml are caused by unquoted things that look like strings but behave differently. This is easy to avoid: always quote all strings.

$lillesvin 4 hours ago

> Almost all of this is solved by basically putting quotes around strings.

Yeah, that was my first thought as well. I personally don't mind YAML, but I've also made a habit out of quoting strings. And, I mean, you're quoting both keys and strings in JSON, so you're still saving approx. 2 double quotes per key/value pair in YAML if that's a metric that's important to you.

$montroser 3 hours ago

As the article points out with the `on` example, you really have to quote yaml keys as well, if you want the defense to work...

$lillesvin 31 minutes ago

The argument was that most of the mentioned problems could be solved by quoting the values. I don't have a problem with avoiding "on" as a key, and I apparently haven't used it ever, because I've never run into this particular problem in my 15+ years using YAML.

So, sure, if you want to play it super safe, quote keys as well. But I'm personally fine with the trade-off in not quoting keys.

$everforward 4 hours ago

JSON doesn’t do them as part of the spec, but there’s nothing stopping you from doing them as post-processing. Eg OpenAPI does it by using a special $ref key where the post processor swaps in the value referenced there.

That’s effectively what jsonnet/cue/hcl do, though as a preprocessor instead of a postprocessor.
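
A minimal Python sketch of that post-processing idea, assuming only local references of the form `#/path/to/node` and a made-up document layout (`defaults`, `web`, `worker` are illustrative names, not taken from OpenAPI). Real `$ref` handling also covers external files, JSON Pointer escapes and cycles, none of which this attempts.

```python
import json


def resolve_pointer(root, pointer):
    """Follow a local reference such as '#/defaults' starting from the document root."""
    if not pointer.startswith("#/"):
        raise ValueError(f"only local '#/...' references are handled: {pointer!r}")
    node = root
    for part in pointer[2:].split("/"):
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def expand_refs(node, root):
    """Return a copy of node where every {"$ref": "..."} object is replaced by its target."""
    if isinstance(node, dict):
        if set(node) == {"$ref"}:
            # Assumption: a reference object holds nothing but the "$ref" key.
            return expand_refs(resolve_pointer(root, node["$ref"]), root)
        return {key: expand_refs(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_refs(item, root) for item in node]
    return node


text = """
{
  "defaults": {"cpu": "100m", "memory": "128Mi"},
  "web": {"resources": {"$ref": "#/defaults"}},
  "worker": {"resources": {"$ref": "#/defaults"}}
}
"""

doc = json.loads(text)
expanded = expand_refs(doc, doc)
print(json.dumps(expanded["web"]))
```

The parser stays plain JSON throughout; the sharing that YAML anchors provide is layered on as a separate pass over the parsed data.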

$bjackman 4 hours ago

Yeah and this is enforced by default in yamllint.

It's very fair to cry "why the hell do I need a linter for my trivial config file format", and these footguns are a valid reason to avoid YAML.

But overall YAML's sketchiness is a pretty easy problem to solve and if you have a good reason to keep/choose YAML, and a context where adding a linter is viable, it's not really a big deal IMO.

And as hinted in the post, there's really no well-established universal alternative. TOML is a good default but it's only usable for pretty straightforward stuff. I'm personally a fan of the "just use Nix" approach but you can't put a Nix interpreter everywhere. And Cue is way overpowered for most use cases.

I guess the tldr is that the takeaway isn't "don't use YAML" but just "beware of YAML footguns, know the alternatives".

$danmur 2 hours ago

Jsonnet is pretty nice but the library support isn't quite as good. There are some nice libraries for yaml that do round trip processing for example so you can modify a yaml programmatically and keep comments. Yaml certainly has some warts (and a few things that are just frankly moronic) but it deserves some credit for hitting the sweet spot in a bunch of ways.

$zyx321 3 hours ago

It's very counter-intuitive to me that 22:22 would need to be a quoted string, since functionally it's a K-V-pair. YAML itself even uses : in the Dict syntax!

$darkwater 3 hours ago

It's a key-value pair only to whatever reads the YAML and then assigns some meaning to that string. In YAML you need to put a space between the colon and the value.

$raincole 5 hours ago

The n, no, off thing is just sad. It's a 100% avoidable issue. But whoever put that into spec was just so clever that they overflew and became stupid.

$phpnode 3 hours ago

Whoever thought supporting sexagesimal numbers was a good idea needs to spend some extended time away from their computer to reflect on what they’ve done

$microtherion 2 hours ago

Presumably that was to support time values.

$Ajedi32 2 hours ago

That makes sense, but I think the vast majority of tools that need time values would actually expect users to just input a string and parse that themselves.

IMO anything other than the basic types supported by JSON (number, true, false, null) ought to be parsed as a string. Or if you really insist, some kind of special syntax to make it clear it's not a string would probably be acceptable.
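
To make the difference concrete, here is a small Python sketch of the two resolution strategies for an unquoted scalar. The boolean word list and the base-60 integer pattern are paraphrased from the YAML 1.1 type definitions; real parsers differ in which of these forms they actually implement, and the function names `resolve_yaml11` and `resolve_json_only` are invented for this example. Floats, nulls and the other YAML 1.1 types are left out of the first function on purpose.

```python
import re

# Word lists and patterns paraphrased from the YAML 1.1 bool and int types.
YAML11_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
YAML11_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}
DECIMAL = re.compile(r"[-+]?(?:0|[1-9][0-9_]*)")
SEXAGESIMAL = re.compile(r"[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+")

# JSON's number grammar, used by the stricter resolver.
JSON_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?")


def resolve_yaml11(scalar):
    """Guess a type the way the YAML 1.1 rules for bool and int describe."""
    if scalar in YAML11_TRUE:
        return True
    if scalar in YAML11_FALSE:
        return False
    if DECIMAL.fullmatch(scalar):
        return int(scalar.replace("_", ""))
    if SEXAGESIMAL.fullmatch(scalar):
        sign = -1 if scalar.startswith("-") else 1
        total = 0
        for part in scalar.lstrip("+-").replace("_", "").split(":"):
            total = total * 60 + int(part)  # each colon shifts by one base-60 digit
        return sign * total
    return scalar


def resolve_json_only(scalar):
    """Only JSON's literals get a type; everything else stays a string."""
    if scalar == "true":
        return True
    if scalar == "false":
        return False
    if scalar == "null":
        return None
    if JSON_NUMBER.fullmatch(scalar):
        return float(scalar) if any(c in scalar for c in ".eE") else int(scalar)
    return scalar


for raw in ["no", "22:22", "off", "42", "true"]:
    print(repr(raw), "->", repr(resolve_yaml11(raw)), "vs", repr(resolve_json_only(raw)))
```

Under the first resolver `no` becomes `False` and `22:22` becomes the integer 1342 (22 * 60 + 22); under the second both stay strings.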

$chrisandchris 3 hours ago

We wanted a file format that's easy to read and less verbose than xml and all we got was something that is so full of pitfalls that it would be easier just not to use it.

$__alexs 5 hours ago

This is basically every problem in YAML. Someone couldn't resist adding more stuff and either didn't realise or didn't care about the ambiguities it created.

$kevincox 4 hours ago

It basically feels like overfitting. They saw some use case so they added it. But they didn't think about how this would generalize and now this nice use case is disproportionately supported at the cost of surprising everyone who doesn't need time-of-day fields in their file.

$baobrien 4 hours ago

Too clever by half

$bertman 5 hours ago

Discussion from 3 years ago, when this was originally posted:

https://news.ycombinator.com/item?id=34351503 , 566 points, 358 comments

$natebc an hour ago

I think this article gets posted about every quarter.

$al_borland 7 hours ago

The Norway problem drives me a bit nuts.

In a lot of the Ansible documentation, yes/no are used instead of true/false. When seeing this in the official docs, I used it, figuring this was the preferred convention in Ansible. These days it now throws warnings or lint errors, so I’m updating it all over the place as I find it. Yet the Ansible documentation still commonly uses it.

$gchamonlive 4 hours ago

Ansible isn't a gold standard for docs. The docs are updated and maintained, but the underlying interfaces aren't consistent and that leaks to the docs. One can only wonder why, maybe different developers with different ideas for conventions without a style guide.

Ansible is a wonderful tool though, if you can excuse these idiosyncrasies.

$sofixa 3 hours ago

> Ansible is a wonderful tool though, if you can excuse these idiosyncrasies.

The only advantage Ansible has is how easy it is to start with it - you don't need to deploy agents or even understand a lot about how it works.

Trouble is, it doesn't really scale. It's pretty slow when running against a bunch of machines, and large configurations get unwieldy quickly (be it because of YAML when in large documents it's impossible to orient/know what is where/at what level, or because of the structure of playbooks vs roles vs whatever, or because templating a whitespace-as-logic-"language" is just hell). It's also fun to debug "missing X at line A, but the error can be somewhere else". Cool, thanks for the tip.

So it's pretty great to get started with, or at a home lab. Big organisations struggling with it is a bit weird.

$al_borland 2 hours ago

I found job slicing speeds up jobs dramatically. In a test I did recently it dropped the time from nearly 4 hours, down to 17 minutes, for an inventory of about 4500 hosts.

$dreamcompiler 2 hours ago

Seems like the right answer is "bootstrap your daemon installs with Ansible and then use something that scales better that runs on those daemons."

What are the best practices along these lines? What's the "something better"?

$TheTaytay 2 hours ago

Curious about this myself!

$tgv 5 hours ago

It depends on how they parse/decode/unmarshal the file. If they use a "generic" yaml parser, no will be translated to false. But if the parser knows the types of the data structure, or can be instructed not to replace certain strings, or has hooks, it can treat no as a string. So it might be that the linter doesn't operate like the parser.
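
A toy Python illustration of that distinction, assuming a two-field config and a hand-written `schema` dict (both made up here; no real YAML library is involved). The boolean table is a simplified, case-insensitive subset of the YAML 1.1 words.

```python
# Simplified subset of the YAML 1.1 boolean words, matched case-insensitively.
YAML11_BOOLS = {"yes": True, "no": False, "on": True, "off": False,
                "true": True, "false": False}


def decode_generic(scalar):
    """Generic decoding: the text alone decides the type."""
    return YAML11_BOOLS.get(scalar.lower(), scalar)


def decode_typed(scalar, expected):
    """Typed decoding: the target field's declared type decides, not the text."""
    if expected is str:
        return scalar
    if expected is bool:
        try:
            return YAML11_BOOLS[scalar.lower()]
        except KeyError:
            raise ValueError(f"expected a boolean, got {scalar!r}") from None
    return expected(scalar)


raw = {"country": "no", "enabled": "no"}      # unquoted scalars as read from the file
schema = {"country": str, "enabled": bool}    # hypothetical type information

generic = {key: decode_generic(value) for key, value in raw.items()}
typed = {key: decode_typed(value, schema[key]) for key, value in raw.items()}

print(generic)  # both fields turn into False
print(typed)    # country stays the string "no"
```

A linter that works like `decode_generic` will flag `country: no` even when the consuming program, working like `decode_typed`, would have read it correctly.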

$maxbond 4 hours ago

Halloween isn't for a few more weeks, but this framework for creating bespoke YAML dialects that can only be parsed by a specific implementation and with the correct type annotations will scare the pants off of your devops colleagues around the campfire.

(In case I haven't succeeded in hitting the right tone, this is intended to be good-natured jest and not snark.)

$tgv 2 hours ago

Well, JSON cannot represent dates (nor Sets, Maps, NaN, etc.), so quite a few applications with a JSON parser have their own conversion (e.g. seconds since epoch, string parsing, object with date fields). Is that a bespoke JSON dialect that scares the pants off?

Now, JSON is more suited for machine-to-machine, but YAML works fairly well for humans. It's a pity, but a few domain-specific dialects don't really hurt, since you can't copy some bit of YAML and paste it in an entirely different config anyway.

PS campfire story? "When we were still working in the old building, deep down in the cellar, there was a colleague who had been there since the early days. Nobody saw him arrive at work or leave. It was as if he was always there. One of the things he had written was a custom parser ... FOR YAML!"

$Y-bar 3 hours ago

Has this really been a problem in the last ten years? Version 1.2 of the spec (if I recall) fixed it in 2009.

$aranw 4 hours ago

I find it remarkable that YAML has become our go-to for configuration when it is riddled with parsing traps and inconsistent behaviour that catches out even experienced developers

$foobarian 3 hours ago

And furthermore I find it remarkable how much people like the visual format where you indent nested things with whitespace. I'm pretty sure it's the main reason Python took off as well.

$mcdonje 4 hours ago

It's because other config formats aren't as expressive.

$aranw 3 hours ago

> It's because other config formats aren't as expressive.

Oh yeah it is literally the best of a bad bunch in my opinion

I'm hopeful of languages like CUE https://cuelang.org/

$esafak 3 hours ago

See starlark, dhall, jsonnet, cuelang, toml, etc.

$mcdonje 3 hours ago

IMO, JSON, YAML, and TOML should all interpret all keys as strings, and only enforce quotes when syntactically necessary.

So, `key1` is a string and doesn't need to be quoted. `12345` as a key is interpreted as a string (because keys are strings) and doesn't need to be quoted. `"key 1"` has a space, so it needs to be quoted.
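
That rule is small enough to sketch in Python. The set of characters allowed in an unquoted key is an assumption borrowed from TOML's bare keys (letters, digits, `_` and `-`); `format_key` and `parse_key` are names invented for this example, and quoting is delegated to the `json` module.

```python
import json
import re

# Assumed bare-key alphabet: ASCII letters, digits, underscore, hyphen.
BARE_KEY = re.compile(r"[A-Za-z0-9_-]+")


def format_key(key):
    """Write a key bare when that is unambiguous, quoted otherwise."""
    return key if BARE_KEY.fullmatch(key) else json.dumps(key)


def parse_key(token):
    """Read a key token; the result is always a string, never a number or boolean."""
    return json.loads(token) if token.startswith('"') else token


for key in ["key1", "12345", "key 1", "on"]:
    token = format_key(key)
    print(token, "->", repr(parse_key(token)))
```

With keys never resolved to another type, `12345` and `on` round-trip as the strings `"12345"` and `"on"` without quotes.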

$sceptic123 2 hours ago

What does IMO configuration look like

$psnehanshu 2 hours ago

IMO means "in my opinion", or if you were being sarcastic, putting /s helps.

$edoceo 3 hours ago

We'd have to change the spec and then all the core libs. Big task.

Use more quotes, use yamllint.

Like bash, more quotes and shellcheck.

$mcdonje 2 hours ago

Specs change from time to time. It requires effort. Nothing new here. It's necessary sometimes. Dealing with annoyances and footguns also takes effort.

$bilekas 2 hours ago

I have always thought that there is a place for YAML but I do tend to avoid it when I can. I will say while working with terraform I have absolutely fallen in love with HCL. It makes a lot of sense to me and there is a lot of validation you can do along the way, leading to much more confidence in larger setups. IaC in my case at least.

$h1fra 5 hours ago

not only is YAML a pain but JSON has a native parser in major languages, while YAML does not. I find it crazy some people are still actively choosing this over JSON (or alternatives)

$loudmax 4 hours ago

This is a case of the right tool for the right job. YAML is far easier to read and parse as a human than JSON.

If you're passing data between processes, and you still want the data to be human readable, then JSON is a good choice.

If you're writing a configuration file that's going to be edited by a human, then YAML is easier to look at and understand.

$Kostarrr 5 hours ago

So... what are the good alternatives to yaml?

For quite some time I thought toml, but the way you can spread e.g. lists all over the document can also cause some headaches.

Dhall is exactly my kind of type fest but you can hit a hard brick wall because the type system is not as strong as you think.

$endgame 5 hours ago

I wish I had a good answer for you. I've been dissatisfied with Dhall, Nickel, Cue, and possibly others. Dhall's type system is both too strong (you have to plumb type variables by hand if you want to do any kind of routine FP idioms) and too weak (you can't really _do_ much with record types - it's really hard to swizzle and rearrange deeply nested records).

On top of that, the grammar is quite difficult to parse. You need a parser that can keep several candidate parses running in parallel (like the classic `Parser a = Parser (String -> [(a, String)])` type) to disambiguate some of the gnarlier constructs (maybe around file paths, URLs, and record accesses? I forget). The problem with this is that it makes the parse errors downright inscrutable, because it's hard to know when the parse you actually intended was rejected by the parser when the only error you get was "Unexpected ','".

Oh, and you can't multiply integers together, only naturals.

Maybe Nix in pure eval mode, absurd as that sounds?

I think the best thing for tools to do is to take and return JSON (possible exception: tools whose format is simple enough for old-school UNIX-style stdin/stdout file formats). Someone will come up with a good functional abstraction over JSON eventually, and until then you can make do with Dhall, YAML, or whatever else.
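
For readers who have not met the `Parser a = Parser (String -> [(a, String)])` type mentioned above, here is a rough Python rendering of the idea: a parser is a function from the input to a list of every (value, remaining input) pair it could produce. The combinator names and the URL-scheme example are made up for illustration and have nothing to do with Dhall's actual grammar.

```python
def literal(text):
    """Parser that succeeds once if the input starts with text, otherwise not at all."""
    def parse(source):
        return [(text, source[len(text):])] if source.startswith(text) else []
    return parse


def either(*parsers):
    """Run every alternative and keep all results instead of committing to the first."""
    def parse(source):
        return [result for parser in parsers for result in parser(source)]
    return parse


def sequence(first, second):
    """Run second on whatever each successful parse of first left over."""
    def parse(source):
        return [((a, b), rest2)
                for a, rest1 in first(source)
                for b, rest2 in second(rest1)]
    return parse


# "http" is a prefix of "https", so both candidates stay alive after the first step.
scheme = either(literal("http"), literal("https"))
url_start = sequence(scheme, literal("://"))

print(scheme("https://x"))     # two candidate parses
print(url_start("https://x"))  # only the "https" candidate survives the "://"
print(url_start("ftp://x"))    # empty list: failure, with no hint of where or why
```

The last line shows the error-reporting problem: when every candidate dies, all that is left is an empty list, so a useful message has to be reconstructed some other way.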

$ruuda 4 hours ago

> Maybe Nix in pure eval mode, absurd as that sounds?

It doesn’t sound absurd, it’s pretty nice. What do you think about https://rcl-lang.org?

$rswail 5 minutes ago

Just been reading the docs, I like it :)

Gonna have to set aside some time to play with it compared to HCL where I spend a lot of time.

$bmacho 4 hours ago

What about KDL (https://kdl.dev/) or Pkl (https://pkl-lang.org/)?

$Ajedi32 2 hours ago

For configuration I dislike the XML object model KDL is built around. It needlessly complicates things to have two different incompatible ways (properties and children) of nesting configuration keys under an element.

Pkl seems syntactically beautiful and powerful, but having types and functions and loops makes it a lot more complicated than the dead-simple JSON data model that YAML is based on.

$speed_spread an hour ago

In JSON I often end up recreating an equivalent of XML attributes for metadata fields and using custom prefixes to differentiate those fields from actual data. I find the data/metadata separation at the language level nice.

$Ajedi32 an hour ago

Can you give an example of metadata you would put in a config file that isn't configuration and isn't a comment?

$simonask 3 hours ago

KDL is really, really nice. And lightweight.

$lazystone 4 hours ago

No one mentioned HashiCorp HCL so far, though it's really a shame that it didn't get much traction...

$cousin_it 4 hours ago

How about textproto? And the proto definition gives the schema.

$speed_spread an hour ago

The article mentions

> A simple subset of yaml

Which already exists and is called StrictYAML. It's just strings, lists and dicts. No numbers. No booleans. No _countries_. No anchors. No JSON-compatible blocks. So, essentially it's what most of us think of as proper YAML, without all the stupid/bad/overcomplicated stuff. Just bring your own schema and types where required.

https://hitchdev.com/strictyaml/

$maweki 4 hours ago

We found yaml to be a great exchange format for electronic exam data. It allows us to put student submitted answers and source code into a yaml file and there is no weird escaping. It's very readable with a text editor. And then we just add notes and a score as a list below and then there's the next submission.

For readability of large blocks of texts that may or may not contain various special characters and newlines the only other alternative we have seen was XML, but that is very verbose.

So what the author finds as a negative, the many string formats, are exactly what drew us to yaml in the first place.

$dreamcompiler 2 hours ago

Somebody in these discussions always correctly points out that s-expressions are as expressive as XML but without the excess line noise, so it might as well be me.

$privatelypublic 3 hours ago

What is so verbose about a cdata directive? Everybody complains about XML being verbose, never once heard complaints about HTML being too verbose.

$YouWhy 2 hours ago

I came to regard YAML as a kind of a syntactic HFC syrup, a bearable idea that was taken too far.

Alas, YAML is just about everywhere, so the chances for a replacement that'll be both better behaved and as ubiquitous are unfortunately slim.

$BobbyTables2 6 hours ago

I’m amazed how sane the “document from hell” looks.

The author didn’t even get into the weird stuff GitLab does with YAML too!

$seiferteric 4 hours ago

I wonder if you could make a new standard based on yaml where every value was prefixed by a type so there is no ambiguity.

$juliend2 2 hours ago

We'd need a "YAML, the good parts".

$speed_spread an hour ago

It's called StrictYAML.

$Titan2189 4 hours ago

Obligatory https://xkcd.com/927/

$psnehanshu 2 hours ago

Yup, author made RCL

$vjvjvjvjghv 3 hours ago

It's really interesting that after all these years we still don't have a document format that just works. They all suck in their own sweet ways and we still have culture wars over them.

$kzrdude 5 hours ago

Yaml is an interesting case study that we can learn (and have learned) a lot from. Mistakes to avoid. :)

$simonask 3 hours ago

I never really understood why nobody ever just forked YAML and took out the ugly bits. It’s not a very complicated parser.

In the mean time, I’m very much enjoying KDL.

$esafak 3 hours ago

TOML

$thomasfl 4 hours ago

Not many know that the inventor of the YAML specification built a fully working pendulum clock as a teenager. With Lego bricks. YAML is a good standard for simple settings files. For more complex data structures, use JSON.

$rossant 5 hours ago

Wow, I wasn't aware there were so many magic and arcane features in yaml. Great post. Thanks.

$lerp-io 3 hours ago

the problem is that yaml came from geeked out devops employees that used bash whereas json came from javascript.

$vivzkestrel 4 hours ago

stupid question: why don't they announce a newer version of YAML that is not backwards compatible and allow only quoted strings in their parser?

$mystifyingpoi 4 hours ago

> that is not backwards compatible

This would be a massive breaking change for Kubernetes. There are piles and piles of YAML all around the opensource that would need updating. It would be very hard to adopt.

Also, quoting strings 100% of the time just looks ugly in my opinion. Not a big deal with autogenerated YAML, or YAML that I do not maintain, but for anything handwritten it's annoying.

$phito 4 hours ago

how is it annoying...? it's literally like that in almost every single language out there. IMO seeing unquoted strings in YAML feels weird.

$mystifyingpoi 4 hours ago

As I said, it's subjective. I like this

    image: my-repo.com/my-app:v1
    imagePullPolicy: Always

more than this

    image: "my-repo.com/my-app:v1"
    imagePullPolicy: "Always"

That's all. Not sure about quoting keys though.

$wingi 2 hours ago

The norway problem is well known.

$xenator 6 hours ago

This one is amazing, I almost pissed myself laughing reading it. So true about YAML. Another caveat is using --- as a section separator in the file. It will start a new document inside your existing file.

Still love it.

$mavamaarten 4 hours ago

I despise yaml. On top of the points from the article, I never know where to indent and how whitespace is handled on multiline fields.

Just a yucky standard all-around

$al_borland 3 hours ago

Whitespace gets weird with indenting code.

I use block scalars constantly now, with liberal use of the trimming dashes all over the place.

Any time I need to preserve some indentation in my result, I always hate the formatting I’m left with, especially if there is logic involved.

$shadowgovt 3 hours ago

Perfectly normal YAML document detected.

More seriously: this is a good overview of the reasons I dislike YAML as a web configuration language. There's too much overlap between the "friendly" auto-type-determination in YAML and the symbols used in web tech, from colons to Norway having a TLD. It wouldn't be so bad if yaml parsers could use the expected type of each value as a hint, but that's not a feature in any parser I've met, so I'd rather just not use yaml for anything that's going to end up describing a web service.

$privatelypublic 3 hours ago

Can't take this seriously if XML isn't listed as an alternative.

$Someone 3 hours ago

FTA: Xml is noisy and annoying to write by hand

$privatelypublic 18 minutes ago

So, at what point does YAML needing magic incantations, wrapping everything in quotes, avoiding any form of templating, etc. stop being less verbose (oops, meant noisy), and "annoying?"

Reality is, clunky XML is badly designed, or simply has no schema attached.

$secondcoming 5 hours ago

It's honestly absurd how prevalent YAML is. It's clearly dumb.

The YAML Document from Hell