Status report! I had been a little haphazard about adding the mammals that DF already includes to our lists, and many of the fish and insects hadn't been covered yet either. They're all listed now. I also noticed that the stingray family had somehow been mostly overlooked, so I fixed that up.
The next thing I'll be working on, when I get a chance, is reorganizing rodents, which are just one giant group right now. After that I'll be back to bony fish and maybe insects.
Birds are still a ways from being done, arachnids are going to need some major reorganization and expansion, and snakes are a little sparse.
Firstly, XML will not be necessary here, since we can include any necessary metadata in comment lines.
Yeah, any kind of comment would work; I just mentioned XML for ease of parsing.
Also, geographical metadata would not be enough, even combined with the taxonomic classifications. For example, we may decide that five frogs in a family are redundant with each other, but the sixth should always be included because of its distinct poison. We would need a system that would allow a program to identify the five frogs, select two of them for inclusion, and always include the sixth. Even worse, there will be cases where a group of redundant creatures includes some family-level entries and some genus-level entries.
We could add our own redundancy groups to the raws in comment lines. If we added a line containing '*normal_frogs' to the raws of the redundant frogs, a program could easily identify them, group them, and randomly choose which to include in the final raws without a geographical identifier.
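To make the idea concrete, here's a rough sketch of what such a program could look like. It assumes that anything outside square brackets in the raws is treated as a comment, and that a group marker is an asterisk followed by a name somewhere in that comment text. The function names, the sample creature IDs, and the "keep N per group" rule are all made up for illustration; nothing here is an existing tool.

```python
import random
import re
from collections import defaultdict

CREATURE_RE = re.compile(r"\[CREATURE:([^\]]+)\]")
TOKEN_RE = re.compile(r"\[[^\]]*\]")   # any bracketed raw token
MARKER_RE = re.compile(r"\*(\w+)")     # hypothetical '*group_name' marker

# Hypothetical sample raws; the creature IDs are invented for this sketch.
SAMPLE_RAWS = """\
[CREATURE:FROG_GREEN]
*normal_frogs
[CREATURE:FROG_BROWN]
*normal_frogs
[CREATURE:FROG_TREE]
*normal_frogs
[CREATURE:FROG_POISON_DART]
"""

def find_groups(raw_text):
    """Map each group marker to the creature IDs it appears under."""
    groups = defaultdict(list)
    current = None
    for line in raw_text.splitlines():
        creature = CREATURE_RE.search(line)
        if creature:
            current = creature.group(1)
        # Only look for markers in the comment part of the line.
        comment = TOKEN_RE.sub("", line)
        marker = MARKER_RE.search(comment)
        if marker and current is not None:
            groups[marker.group(1)].append(current)
    return dict(groups)

def choose_from_groups(groups, keep, rng=None):
    """Randomly keep at most `keep` creatures from each redundancy group."""
    rng = rng or random.Random()
    return {
        name: rng.sample(ids, min(keep, len(ids)))
        for name, ids in groups.items()
    }
```

Creatures with no marker (the poison dart frog in the sample) never enter a group, so a compiler built on this would always include them.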
How does that sound?
I have a few objections:
1. One of our stated criteria for including creatures in the Ark Project at all is that they shouldn't be redundant with each other. Sure, some of them are more interesting than others, but we already have a hard enough time making the binary decision of whether to include them or not.
2. Right now, creatures are grouped in one straightforward and easily comprehended way -- a taxonomic tree. Groups that cross taxonomic lines ('*normal_frogs') would be difficult to work with -- there wouldn't be any way to see, at a glance, what groups exist and what creatures are in them.
3. Since the groups would be inherently built around whatever animals we currently have, adding new animals later would be a pain at best, and could potentially require that the groups be totally restructured -- and restructuring could mean removing strings from dozens or hundreds of creatures. It's not flexible at all.
4. I don't see a reason why certain creatures should always be guaranteed. Every continent has plenty of fascinating animals to go around.
#2 is also true for geographical data, but it's only a real problem when #1 and #3 come into play.
A similar system could be used for the dialect problem discussed before:
[PREFSTRING:coloration] *en-US
[PREFSTRING:colouration] *en-UK
The compiler would include only one of the two lines.
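As a rough illustration of that compile step, here's a minimal sketch. It assumes the dialect tag always sits at the end of the line in the form shown above ('*en-US', '*en-UK'), and that untagged lines are kept for every dialect. The function name `select_dialect` is made up for this example.

```python
import re

# Trailing dialect tag such as '*en-US' or '*en-UK' (format assumed from the example above).
DIALECT_RE = re.compile(r"\s*\*(en-[A-Z]{2})\s*$")

def select_dialect(raw_text, dialect):
    """Keep untagged lines plus lines tagged for `dialect`, with the tag stripped."""
    kept = []
    for line in raw_text.splitlines():
        tag = DIALECT_RE.search(line)
        if tag is None:
            kept.append(line)                 # no tag: shared by all dialects
        elif tag.group(1) == dialect:
            kept.append(line[:tag.start()])   # matching tag: keep the line, drop the tag
        # any other dialect: drop the line entirely
    return "\n".join(kept)
```

Calling `select_dialect(text, "en-UK")` on the two lines above would leave only `[PREFSTRING:colouration]`.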
I'd prefer to avoid redundant data of this kind if possible. Prefstrings aren't so bad, but it would get ugly with the description tag.
Early in the thread I found some utilities for automatically doing the American-British conversion. If those work, they're the cleanest option for us by far, but nobody seems interested in checking them out.