blog

git clone https://git.ce9e.org/blog.git

commit
d5a38e8a790d85c9b24b1b52af24cf013419da50
parent
1a265de64e4e3c401a0b3237689f0460532dbfd0
Author
Tobias Bengfort <tobias.bengfort@posteo.de>
Date
2025-04-27 08:47
post on TOML

Diffstat

A _content/posts/2025-04-27-toml/index.md 140 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

1 files changed, 140 insertions, 0 deletions


diff --git a/_content/posts/2025-04-27-toml/index.md b/_content/posts/2025-04-27-toml/index.md

@@ -0,0 +1,140 @@
   -1     1 ---
   -1     2 title: Some thoughts on TOML
   -1     3 date: 2025-04-27
   -1     4 tags: [code]
   -1     5 description: "I wonder if there is room in the world for a variant of TOML that allows overwriting values"
   -1     6 ---
   -1     7 
   -1     8 I recently read some things that made me think about configuration files,
   -1     9 namely [TOML](https://toml.io/).
   -1    10 
   -1    11 ## Comparison to INI
   -1    12 
   -1    13 The first text was [An INI critique of
   -1    14 TOML](https://github.com/madmurphy/libconfini/wiki/An-INI-critique-of-TOML)
   -1    15 which compares TOML (quite unfavourably) to INI. As a quick reminder, INI is a
   -1    16 family of formats that have a set of sections with key-value pairs, like this:
   -1    17 
   -1    18 ```ini
   -1    19 [server1]
   -1    20 hostname = foo
   -1    21 cores = 16
   -1    22 online = true
   -1    23 tags = linux, europe
   -1    24 
   -1    25 [server2]
   -1    26 hostname = bar
   -1    27 cores = 8
   -1    28 online = false
   -1    29 tags = bsd, asia
   -1    30 ```
   -1    31 
   -1    32 I think it is fair to say that TOML is a rather distant relative in the INI
   -1    33 family. The main critique in the article seems to be that TOML has types,
   -1    34 similar to JSON. With INI, it is typically the application and not the
   -1    35 configuration file that says how a value should be interpreted. So `false`
   -1    36 could be interpreted as a boolean, but also as a string, or a list with a
   -1    37 single element. Based on that difference, the author has multiple complaints
   -1    38 about TOML:
   -1    39 
   -1    40 -   Users must know the correct type for each value
   -1    41 -   Users must use quotes around strings and square brackets around lists
   -1    42 -   Date-related types are bad for some reason
   -1    43 -   The application still has to interpret values for types that are not
   -1    44     covered by this simple system, e.g. enums
   -1    45 
   -1    46 While I understand the critique, I don't think it is all that strong. I believe
   -1    47 it is actually a good thing that users can understand the type of values from
   -1    48 reading the configuration file. I don't really mind the quotes, square
   -1    49 brackets, and dates. And my experience with JSON has taught me that a few
   -1    50 simple types can go a long way. On the other hand, my experience with INI has
   -1    51 taught me that having to call the correct typed getter on each use of a config
   -1    52 value can become tedious.
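
To illustrate the typed getters, here is a small sketch using Python's `configparser` as one example of an INI parser (the post does not name a specific library, so this choice is an assumption). Splitting `tags` on commas is a convention of this sketch, not something INI itself defines.

```python
import configparser

INI_TEXT = """
[server1]
hostname = foo
cores = 16
online = true
tags = linux, europe
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
server = parser["server1"]

# Every value is stored as a string; the application picks the interpretation.
raw_online = server["online"]          # the plain string "true"
cores = server.getint("cores")         # interpreted as an integer
online = server.getboolean("online")   # interpreted as a boolean
# There is no built-in list getter, so the application splits the value itself.
tags = [tag.strip() for tag in server["tags"].split(",")]
```

Each call site has to remember which getter matches which key, which is where the tedium comes from.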
   -1    53 
   -1    54 The proper way to handle this is to validate the configuration before using it.
   -1    55 If any values have the wrong type (TOML) or cannot be interpreted as the
   -1    56 correct type (INI), the application should exit with a descriptive error
   -1    57 message.
   -1    58 
   -1    59 ## Overwrites
   -1    60 
   -1    61 The second text was the [UAPI Group Configuration Files
   -1    62 Specification](https://uapi-group.org/specifications/specs/configuration_files_specification/).
   -1    63 This one is less about the format of configuration files themselves than about
   -1    64 their location in a Linux system. Crucially, it defines the concept of
   -1    65 drop-ins, those `*.d` folders that can contain snippets of configuration
   -1    66 that are combined.
   -1    67 
   -1    68 Combining multiple configuration files is important in two situations:
   -1    69 
   -1    70 -   You want to overwrite some specific values and otherwise use the defaults
   -1    71     provided by the distro or vendor
   -1    72 -   Other packages should be able to add their own config, e.g. crontabs or
   -1    73     apparmor profiles
   -1    74 
   -1    75 The concept of drop-ins is well established. I am not convinced that it should
   -1    76 be required for every single configuration file, but a lot of projects would
   -1    77 benefit from it. So I was a bit surprised when I learned that TOML does not
   -1    78 allow overwriting values. (I was also surprised that this limitation is not
   -1    79 even mentioned in the INI article.) TOML is compatible with the second use case
   -1    80 (adding new sections), but not with the first. And [that is not going to
   -1    81 change](https://github.com/toml-lang/toml/issues/697).
   -1    82 
   -1    83 Why is that and can it be fixed?
   -1    84 
   -1    85 ## Hierarchy
   -1    86 
   -1    87 INI doesn't have much of a hierarchy. There are sections, keys, and values, and
   -1    88 that's it. TOML on the other hand interprets dots in sections and keys as
   -1    89 additional levels of hierarchy:
   -1    90 
   -1    91 ```toml
   -1    92 [servers.foo]
   -1    93 cores = 16
   -1    94 status.online = true
   -1    95 status.has_errors = false
   -1    96 tags = ["linux", "europe"]
   -1    97 ```
   -1    98 
   -1    99 This also means that sections are basically just common prefixes and can be
   -1   100 avoided entirely:
   -1   101 
   -1   102 ```toml
   -1   103 servers.foo.cores = 16
   -1   104 servers.foo.status.online = true
   -1   105 servers.foo.status.has_errors = false
   -1   106 servers.foo.tags = ["linux", "europe"]
   -1   107 ```
   -1   108 
   -1   109 With this structure, overwriting would be simple: Later values simply overwrite
   -1   110 earlier ones. However, the big issue is lists. How would you add, remove, or
   -1   111 modify individual list items? For simple lists it might be ok to just replace
   -1   112 them completely. But TOML provides not one but two ways to nest tables inside
   -1   113 of lists:
   -1   114 
   -1   115 ```toml
   -1   116 # inline tables
   -1   117 servers = [
   -1   118     {hostname = "foo", cores = 16},
   -1   119     {hostname = "bar", cores = 8},
   -1   120 ]
   -1   121 
   -1   122 # array of tables
   -1   123 [[servers]]
   -1   124 hostname = "foo"
   -1   125 cores = 16
   -1   126 
   -1   127 [[servers]]
   -1   128 hostname = "bar"
   -1   129 cores = 8
   -1   130 ```
   -1   131 
   -1   132 ## Conclusion
   -1   133 
   -1   134 Currently, we have the option to either not use TOML, not use overwrites, or
   -1   135 merge the config after parsing (which might lead to multiple,
   -1   136 incompatible implementations).
   -1   137 
   -1   138 I wonder if there is room in the world for a variant of TOML that allows
   -1   139 overwriting values and bans (or at least discourages) the use of tables inside
   -1   140 of lists.