I admit I am new here, and I don't know how you will react, but for the few minutes it takes to read this post I ask simply that you lend me your ears (eyes?), as I post this topic in the interest of inciting naught but thought about this subject. If you feel I am wrong about this, I am very much interested in your opinion.
Basically, the raws are a flat format that has been coerced into becoming a pseudo-hierarchical format with light scripting capabilities. This has, or will eventually, lead to the format becoming unmanageable.
I'm using the word "scripting" very loosely here.
There is also a TL;DR near the bottom, so scroll down if you collide with the text block wall and burst.
First, some related links. There have been a few calls for, ideas for, and attempts at editors for the raws. This is normal for any programming language that sees a high level of editing.
Then, there's a project that converts raws to and from XML. This is a huge red flag. Not that someone did this (at least one person has tried to coerce any bit of data you can think of into XML), but rather that XML is not intended for this sort of application, and yet appears to perform better (in a sense...) than the raws.
In short, XML is designed to be machine-readable. Legibility by humans is coincidental. It is not easily editable. The ability for XML to better present the data found in the raws lies solely in the fact that XML is designed to be hierarchical.
The raws are a flat format because there is no explicit ending to any given object or group. It's not obvious until you consider the CASTE tag. In the DWARF entry in creature_standard, for example, it begins the FEMALE caste and sets some things for it. Then it begins the MALE caste--which also implicitly ends the FEMALE caste--and sets some more things for that. As there is no explicit ending to an object or grouping, the next tag also belongs to the MALE caste, and so on. There's a logical end that the format can't recognize until the command SELECT_CASTE is issued, which implicitly ends the MALE caste and opens up a previously existing caste for continued modification, in this example, the ALL caste.
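To make the implicit closing concrete, here is a rough Python sketch of how a reader of the format has to track "which caste am I in right now". The sample text is a simplified excerpt I wrote for illustration, not a verbatim copy of creature_standard, and the function name group_by_caste is made up. This is not how the game actually parses raws, just the bookkeeping the format forces on anyone reading it.

```python
import re

# Matches one bracketed token such as [CASTE:FEMALE] and captures its contents.
TOKEN_RE = re.compile(r"\[([^\[\]]+)\]")

# Simplified, illustrative excerpt (an assumption, not the real file contents).
SAMPLE = """
    [CASTE:FEMALE]
        [FEMALE]
    [CASTE:MALE]
        [MALE]
        [BODY_DETAIL_PLAN:FACIAL_HAIR_TISSUES]
    [SELECT_CASTE:ALL]
        [INTELLIGENT]
"""

def group_by_caste(raw_text):
    """Assign every token to whichever caste is currently 'open'.

    Nothing ever closes a caste explicitly: CASTE and SELECT_CASTE simply
    replace the current target, and indentation is ignored entirely.
    """
    castes = {"ALL": []}
    current = "ALL"
    for match in TOKEN_RE.finditer(raw_text):
        body = match.group(1)
        parts = body.split(":")
        if parts[0] in ("CASTE", "SELECT_CASTE"):
            # Opening or re-selecting a caste implicitly ends the previous one.
            current = parts[1]
            castes.setdefault(current, [])
        else:
            castes[current].append(body)
    return castes

if __name__ == "__main__":
    for caste, tokens in group_by_caste(SAMPLE).items():
        print(caste, tokens)
```

Note that the only thing pulling INTELLIGENT out of the MALE caste is the SELECT_CASTE line. Delete that one line and the token silently changes owner, with no error anywhere.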
The various levels of the hierarchy are indented, though as far as I can tell, this means exactly nothing and is as decorative as the commentary in the file.
It should have been possible to place the caste-specific detail at the end of the creature definition, as then the next creature would implicitly close everything, not necessitating a specific command to essentially move up one level in the creature detail hierarchy. All the generic stuff was already listed. Unfortunately, castes are not the only groupings that work like this. Tissue layers also implicitly group subsequent tags, and so both tissue layers and castes have selection commands to move around the hierarchy.
The format has also taken on scripting capabilities, which I believe to be a result of limitations of not only the lack of explicit hierarchy, but also those of the template system.
In the dwarf example, the very first instance of a token that performs data manipulation rather than data description comes when eyebrows and nails are added. In that case, the general body has already been pieced together from structure and material templates, and the details needed are not described in the templates.
To get around this, it is possible to include a tissue template into a new body part, and to position a body part with finer control than in the templates. I don't believe there is a way to accomplish this goal here without either these commands or over-parameterizing the templates. However, it could be solved elsewhere without these rudimentary scripting capabilities; if templates could use other templates and position parts with the same precision as the creature object, the entire body could be built with full detail before being included wholesale into the creature object.
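As a rough illustration of what I mean by templates using other templates, here is a small Python sketch. Everything in it is hypothetical: the template names, the "extends" key, and the part attributes are invented for the example and do not correspond to anything in the actual raws.

```python
# Hypothetical template table: each template may extend one parent and adds
# or overrides parts. None of these names exist in the real raws.
TEMPLATES = {
    "HUMANOID_BASE": {
        "extends": None,
        "parts": {
            "HEAD": {"tissue": "SKIN"},
            "HAND": {"tissue": "SKIN"},
        },
    },
    "HUMANOID_DETAILED": {
        "extends": "HUMANOID_BASE",
        "parts": {
            # Detail positioned relative to parts defined by the parent.
            "EYEBROW": {"tissue": "HAIR", "attached_to": "HEAD"},
            "NAIL": {"tissue": "NAIL", "attached_to": "HAND"},
        },
    },
}

def resolve(name, templates):
    """Return the full part list for a template, parents first."""
    template = templates[name]
    parts = {}
    if template["extends"] is not None:
        parts.update(resolve(template["extends"], templates))
    # Child entries are applied last, so they add to or override the parent.
    parts.update(template["parts"])
    return parts

if __name__ == "__main__":
    for part, attrs in resolve("HUMANOID_DETAILED", TEMPLATES).items():
        print(part, attrs)
```

With something like this, the creature entry would only need to name HUMANOID_DETAILED; the eyebrows and nails arrive already attached, and no selection commands are needed after the fact.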
I know that there is at least one mod that does something like this to streamline creature creation, though I have not had the chance to look into it (read: I forgot the name and can't find it now). Whether it is actually chaining templates for progressive detail or just providing more building blocks I do not know.
It is also possible to add selections together, creating a super-selection, which is used to set identical attributes (modifiers like hair color and length, nail length, and skin color) to multiple similar body parts at once. Again, these selections are implicitly closed by the next group or the end of the entry, and again, with template chaining this could be done without those commands.
TL;DR: In conclusion, the raw format coerces a flat structure to be hierarchical by implicitly closing objects and groups, then uses non-data commands to navigate this hierarchy to apply additional data, turning creature definitions at least into spaghetti code. With explicit closures, you could work in a group like a caste, and closing it would automatically drop you back to the previous level.
Further, to work around limitations of the template system, additional non-data commands select and manipulate selections of groups, as well as position detail in relation to parts defined elsewhere, which contributes to the spaghetti code.
In the future, an explicit data hierarchy will likely be needed, and a more powerful template system would allow much of the selection magic to be dropped, resulting in cleaner creature definitions and easier-to-understand templates at every level.
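For anyone who wants to see the explicit closure idea in action, here is a minimal Python sketch. The [END] token is purely my invention for the sake of the example (the raws have nothing like it), and the sample text is likewise made up. The point is only that with an explicit close, a plain stack is enough to know where every token belongs, and closing a caste drops you back into the creature with no SELECT_CASTE required.

```python
import re

# Matches one bracketed token and captures its contents.
TOKEN_RE = re.compile(r"\[([^\[\]]+)\]")

# Hypothetical syntax: [END] closes the most recently opened group.
EXPLICIT_SAMPLE = """
[CREATURE:DWARF]
    [INTELLIGENT]
    [CASTE:FEMALE]
        [FEMALE]
    [END]
    [CASTE:MALE]
        [MALE]
    [END]
    [CANOPENDOORS]
[END]
"""

# Tokens that open a nested group in this made-up format.
OPENERS = ("CREATURE", "CASTE")

def parse_explicit(raw_text):
    """Build a tree of groups using a stack; [END] pops one level."""
    root = {"name": "ROOT", "tokens": [], "children": []}
    stack = [root]
    for match in TOKEN_RE.finditer(raw_text):
        body = match.group(1)
        name = body.split(":")[0]
        if name in OPENERS:
            node = {"name": body, "tokens": [], "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif name == "END":
            if len(stack) == 1:
                raise ValueError("END without a matching opener")
            stack.pop()
        else:
            # Ordinary data token: belongs to whatever group is on top.
            stack[-1]["tokens"].append(body)
    if len(stack) != 1:
        raise ValueError("unclosed group: " + stack[-1]["name"])
    return root

if __name__ == "__main__":
    dwarf = parse_explicit(EXPLICIT_SAMPLE)["children"][0]
    print(dwarf["name"], dwarf["tokens"])
    for caste in dwarf["children"]:
        print("  ", caste["name"], caste["tokens"])
```

A side benefit is that a missing or extra [END] becomes a detectable error, instead of a token quietly landing in the wrong caste.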