Bay 12 Games Forum

Topic: About the raw format itself...

Artanis00

About the raw format itself...
« on: August 04, 2010, 09:13:48 pm »

I admit I am new here, and I don't know how you will react, but for the few minutes it takes to read this post I ask simply that you lend me your ears (eyes?). I post this topic in the interest of inciting naught but thought about the subject. If you feel I am wrong about this, I am very much interested in your opinion.



Basically, the raws are a flat format that has been coerced into becoming a pseudo-hierarchical format with light scripting capabilities. This has led, or eventually will lead, to the format becoming unmanageable.

I'm using the word "scripting" very loosely here.

There is also a TL;DR near the bottom, so scroll down if you collide with the wall of text and burst.



First, some related links. There have been a few calls for, ideas about, and attempts at editors for the raws. This is normal for any programming language that sees a high level of editing.
Then, there's a project that converts raws to and from XML. This is a huge red flag. Not that someone did this--I mean, at least one person has tried to coerce any bit of data you can think of into XML--but rather that XML is not intended for this sort of application, and yet appears to perform better (in a sense...) than the raws.

In short, XML is designed first and foremost to be machine-readable. Legibility by humans is a secondary concern, and it is not pleasant to edit by hand. The ability of XML to better present the data found in the raws lies solely in the fact that XML is designed to be hierarchical.



The raws are a flat format because there is no explicit ending to any given object or group. It's not obvious until you consider the CASTE tag. In the DWARF entry in creature_standard, for example, it begins the FEMALE caste and sets some things for it. Then it begins the MALE caste--which also implicitly ends the FEMALE caste--and sets some more things for that. As there is no explicit ending to an object or grouping, the next tag also belongs to the MALE caste, and so on. There's a logical end that the format can't recognize until the command SELECT_CASTE is issued, which implicitly ends the MALE caste and opens up a previously existing caste for continued modification, in this example, the ALL caste.
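To make the implicit closing concrete, here is a minimal Python sketch of a toy reader. It is not the game's actual parser. The function name `group_by_caste` and the abbreviated sample entry are made up for illustration, and, following the description above, "ALL" is modeled as a bucket of its own.

```python
import re

# Matches one bracketed token such as [CASTE:FEMALE].
TOKEN = re.compile(r"\[([^\[\]]+)\]")


def group_by_caste(raw_text):
    """Toy reader: assign each token to whichever caste is currently open."""
    castes = {"ALL": []}  # assumption: "ALL" is modeled as its own bucket
    current = "ALL"
    for match in TOKEN.finditer(raw_text):
        body = match.group(1)
        name, _, argument = body.partition(":")
        if name == "CREATURE":
            continue  # start of the entry; nothing to group yet
        if name in ("CASTE", "SELECT_CASTE"):
            # Either token silently closes the open caste and switches target.
            current = argument
            castes.setdefault(current, [])
        else:
            castes[current].append(body)
    return castes


# Abbreviated, illustrative entry; not copied from the game files.
SAMPLE = """
[CREATURE:DWARF]
    [CASTE:FEMALE]
        [FEMALE]
    [CASTE:MALE]
        [MALE]
    [SELECT_CASTE:ALL]
        [INTELLIGENT]
"""

print(group_by_caste(SAMPLE))
```

In this sketch the only thing that keeps INTELLIGENT out of the MALE caste is the SELECT_CASTE line. Delete that line and the token is filed under MALE, because nothing ever told the reader that MALE had ended.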

The various levels of the hierarchy are indented, though as far as I can tell, this means exactly nothing and is as decorative as the commentary in the file.

It should have been possible to place the caste-specific detail at the end of the creature definition, as then the next creature would implicitly close everything, not necessitating a specific command to essentially move up one level in the creature detail hierarchy. All the generic stuff was already listed. Unfortunately, castes are not the only groupings that work like this. Tissue layers also implicitly group subsequent tags, and so both tissue layers and castes have selection commands to move around the hierarchy.

The format has also taken on scripting capabilities, which I believe to be a result of limitations of not only the lack of explicit hierarchy, but also those of the template system.

In the dwarf example, the very first instance of a token that performs data manipulation rather than data description comes when eyebrows and nails are added. In that case, the general body has already been pieced together from structure and material templates, and the details needed are not described in the templates.

To get around this, it is possible to include a tissue template into a new body part, and to position a body part with finer control than in the templates. I don't believe there is a way to accomplish this goal here without either these commands or over-parameterizing the templates. However, it could be solved elsewhere without these rudimentary scripting capabilities: if templates could use other templates and position parts with the same precision as the creature object, the entire body could be built with full detail before being included wholesale into the creature object.
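A rough Python sketch of what template chaining could look like. Everything here is hypothetical: the template names, the `uses` and `parts` keys, and the `expand` function are invented for illustration and do not correspond to anything in the actual raws.

```python
# Hypothetical templates: each may pull in other templates via "uses".
TEMPLATES = {
    "HEAD_BASIC": {"uses": [], "parts": ["head", "eyes"]},
    "HEAD_DETAILED": {"uses": ["HEAD_BASIC"], "parts": ["eyebrows"]},
    "HAND_DETAILED": {"uses": [], "parts": ["hand", "nails"]},
    "BODY_DETAILED": {
        "uses": ["HEAD_DETAILED", "HAND_DETAILED"],
        "parts": ["torso"],
    },
}


def expand(name, templates, _seen=()):
    """Return the full part list for a template, expanding chained templates."""
    if name in _seen:
        raise ValueError("template cycle: " + " -> ".join(_seen + (name,)))
    template = templates[name]
    parts = []
    for used in template["uses"]:
        # Parts from chained templates come first, in the order listed.
        parts.extend(expand(used, templates, _seen + (name,)))
    parts.extend(template["parts"])
    return parts


print(expand("BODY_DETAILED", TEMPLATES))
```

With something along these lines, a creature entry would reference one fully detailed body template and never need to go back and patch eyebrows or nails onto a body that was already assembled.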

I know that there is at least one mod that does something like this to streamline creature creation, though I have not had the chance to look into it (read: I forgot the name and can't find it now). Whether it is actually chaining templates for progressive detail or just providing more building blocks I do not know.

It is also possible to add selections together, creating a super-selection, which is used to set identical attributes (modifiers like hair color and length, nail length, and skin color) to multiple similar body parts at once. Again, these selections are implicitly closed by the next group or the end of the entry, and again, with template chaining this could be done without those commands.
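As an illustration of how such a super-selection behaves, here is a small Python sketch. The command names `select`, `select_additional`, and `set` are placeholders chosen for this example, not the real tokens, and the part names and attributes are made up.

```python
def apply_selection_commands(commands):
    """Run (command, argument) pairs and return the attributes set per part."""
    selection = []   # parts currently selected
    attributes = {}  # part name -> {attribute: value}
    for command, argument in commands:
        if command == "select":
            # A fresh select implicitly drops whatever was selected before.
            selection = [argument]
        elif command == "select_additional":
            selection.append(argument)
        elif command == "set":
            key, value = argument
            for part in selection:
                attributes.setdefault(part, {})[key] = value
        else:
            raise ValueError("unknown command: " + command)
    return attributes


COMMANDS = [
    ("select", "hair"),
    ("select_additional", "beard"),
    ("select_additional", "eyebrows"),
    ("set", ("color", "brown")),
    ("select", "nails"),  # implicitly ends the hair/beard/eyebrows selection
    ("set", ("length", "short")),
]

print(apply_selection_commands(COMMANDS))
```

Note that the result depends on the order of the commands and on state carried from one line to the next, which is what makes this behave like a small script rather than a description.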



TL;DR:

In conclusion, the raw format coerces a flat structure into a hierarchy by implicitly closing objects and groups, then uses non-data commands to navigate this hierarchy and apply additional data, turning creature definitions, at the very least, into spaghetti code. With explicit closures, you could work in a group like a caste, and closing it would automatically drop you back to the previous level.
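A minimal Python sketch of the explicit-closure idea, assuming a made-up line syntax in which `object:` and `group:` open a level and a bare `end` closes it. The syntax and the `parse_explicit` function are hypothetical. The sketch handles a single top-level object.

```python
def parse_explicit(lines):
    """Build a nested tree from lines that close every group with 'end'."""
    top = None
    stack = []  # currently open nodes, innermost last
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(("object:", "group:")):
            node = {"name": line.split(":", 1)[1], "items": [], "groups": []}
            if stack:
                stack[-1]["groups"].append(node)
            else:
                top = node
            stack.append(node)
        elif line == "end":
            if not stack:
                raise ValueError("'end' with nothing open")
            stack.pop()  # drop back to the enclosing level
        else:
            if not stack:
                raise ValueError("item outside any object: " + line)
            stack[-1]["items"].append(line)
    if stack:
        raise ValueError("unclosed: " + stack[-1]["name"])
    return top


EXAMPLE = """
object:creature
    item1
    group:caste_female
        item2
    end
    group:caste_male
        item3
    end
    item4
end
""".splitlines()

tree = parse_explicit(EXAMPLE)
print(tree["items"])                        # items owned by the creature itself
print([g["name"] for g in tree["groups"]])  # its castes
```

Here item4 ends up on the creature itself without any selection command, simply because both castes were closed before it appeared.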

Further, to work around limitations of the template system, additional non-data commands select and manipulate selections of groups, as well as position detail in relation to parts defined elsewhere, which contributes to the spaghetti code.

In the future, an explicit data hierarchy will likely be needed, and a more powerful template system would allow much of the selection magic to be dropped, resulting in cleaner creature definitions, and easier to understand templates at every level.

soul4hdwn

Re: About the raw format itself...
« Reply #1 on: August 05, 2010, 03:22:03 am »

I have no idea whether I'm helping or adding to the "problem", but someone has managed to use the new creature variation file to condense some/most of the existing organization in the files. It might be more hierarchical or step-based, as you'd have to start from body/material/tissue file -> body gloss/structure -> variation for setup -> actual creature -> smaller, more specific necessities.

Grimlocke

Re: About the raw format itself...
« Reply #2 on: August 05, 2010, 10:43:13 am »

Just adding that there is also a tag to select additional castes/tissues/materials. With that, I don't really see any difference between this and a system where you have to explicitly tell the system to deselect a caste/material/tissue, as long as you know that it deselects unless you use select_additional.

Artanis00

Re: About the raw format itself...
« Reply #3 on: August 06, 2010, 03:15:05 pm »

Quote from: Grimlocke on August 05, 2010, 10:43:13 am
Just adding that there is also a tag to select additional castes/tissues/materials. With that, I don't really see any difference between this and a system where you have to explicitly tell the system to deselect a caste/material/tissue, as long as you know that it deselects unless you use select_additional.

The main thing missing from that is that there's no way to say when the last one ends.

An abstract example:
Code: [Select]
object:root
// some initial items in root
item1
item2
item3
group:group1
// adding items 4 & 5 to group1
item4
item5
group:group2
// adding items 6 and 7 to group2
item6
item7
// here we add item8 to root
item8

What the computer sees with only implicit endings:
Code: [Select]
root
    item1
    item2
    item3
    group1
        item4
        item5
    group2
        item6
        item7
        item8 // wrong, but there's no way to do otherwise unless item8 is before group1
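The same outcome can be reproduced with a short Python sketch of an implicit-ending reader. The function name `parse_implicit` is made up, and the input is the abstract example above, not real raw syntax.

```python
def parse_implicit(lines):
    """Assign each item to the most recently opened object or group."""
    tree = {}
    current = None
    for raw in lines:
        line = raw.split("//")[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith(("object:", "group:")):
            # Opening anything implicitly ends whatever was open before.
            current = line.split(":", 1)[1]
            tree[current] = []
        else:
            tree[current].append(line)
    return tree


SOURCE = """
object:root
// some initial items in root
item1
item2
item3
group:group1
// adding items 4 & 5 to group1
item4
item5
group:group2
// adding items 6 and 7 to group2
item6
item7
// here we add item8 to root
item8
""".splitlines()

for name, items in parse_implicit(SOURCE).items():
    print(name, items)
```

The comment saying item8 belongs to root is invisible to the reader, so item8 is filed under group2.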

Raws currently have the selection commands to work around this, but what I am trying to say is that building data manipulation capabilities into a data format is a bad idea. It makes a format that should be static become dynamic.

It's like writing in a (plain) text editor on paragraph 10, and upon deciding you don't like paragraph 2 you type:
Code: [Select]
select paragraph 2; insert "New Sentence. " at position BEGIN+15;

The editor then goes back and makes the paragraph look like it changed while you go back and continue writing. The changes you made in paragraph 2 aren't written to the disk, rather the bit of text manipulation that performed the change is. Paragraph 2 is still as it was before that, and in fact no editor knows what paragraph 2 (or any bit of text) looks like until it finishes reading and executing the entire text file.

Back to the example above--now with select commands--what belongs to what?
Code: [Select]
object:root
item1
group:group1
item4
select root
item8
item2
group:group2
item7
select group1
item5
select group2
item6
select root
item3
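One way to work it out is to execute the file, which is rather the point. Below is a minimal Python sketch, assuming `select NAME` simply switches the current target to an already defined object or group. The `resolve` function is hypothetical and only mirrors the abstract example.

```python
def resolve(lines):
    """Replay object/group/select lines and report which items land where."""
    members = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(("object:", "group:")):
            current = line.split(":", 1)[1]
            members[current] = []
        elif line.startswith("select "):
            target = line.split(None, 1)[1]
            if target not in members:
                raise ValueError("select of unknown group: " + target)
            current = target
        else:
            members[current].append(line)
    return members


SOURCE = """
object:root
item1
group:group1
item4
select root
item8
item2
group:group2
item7
select group1
item5
select group2
item6
select root
item3
""".splitlines()

for name, items in resolve(SOURCE).items():
    print(name, items)
```

Under that assumption, root ends up with item1, item8, item2, and item3, group1 with item4 and item5, and group2 with item7 and item6. Nothing in the layout of the file shows that at a glance.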

Spoiler: Answer (click to show/hide)