OK, after a difficult (but mostly worthwhile) rewrite of the file reader and string parsers, I am at least back to the point where things were working before. The rewrite cut about 400 lines of code, and should improve memory performance by a factor of n or something.
After that I got to work on the biggest hurdle so far: the handling of sub-entries (for example, caste-level tags that affect only one sub-entry of a creature, the caste, rather than the creature as a whole). The solution I came up with ended up being not TOO ugly, and I think it should work. It's all in place, but needs some testing.
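To make the sub-entry idea concrete, here is a minimal Python sketch. It is not the actual solution, just one way it could look. It assumes (as a simplification) that every tag following a CASTE tag belongs to that caste until the next CASTE or CREATURE tag; the function name `parse_creatures` and the dictionary layout are made up for illustration.

```python
import re

# Matches one bracketed raw entry such as [NAME:dog:dogs:canine].
ENTRY_RE = re.compile(r"\[([^\[\]]+)\]")

def parse_creatures(text):
    """Return {creature_id: {"tags": [...], "castes": {caste_id: [...]}}}.

    Hypothetical layout: tags seen before any CASTE tag are stored on the
    creature; tags seen after a CASTE tag are stored on that caste.
    """
    creatures = {}
    creature = None    # creature entry currently being filled
    caste_tags = None  # tag list of the current caste, if any
    for match in ENTRY_RE.finditer(text):
        tag, *tokens = match.group(1).split(":")
        if tag == "CREATURE" and tokens:
            creature = {"tags": [], "castes": {}}
            creatures[tokens[0]] = creature
            caste_tags = None  # a new creature resets the caste context
        elif creature is None:
            continue  # ignore anything before the first creature
        elif tag == "CASTE" and tokens:
            caste_tags = creature["castes"].setdefault(tokens[0], [])
        elif caste_tags is not None:
            caste_tags.append((tag, tokens))
        else:
            creature["tags"].append((tag, tokens))
    return creatures

sample = """
[CREATURE:DOG]
    [NAME:dog:dogs:canine]
    [CASTE:FEMALE]
        [FEMALE]
    [CASTE:MALE]
        [MALE]
"""
creatures = parse_creatures(sample)
print(creatures["DOG"]["castes"])
```

The real raws have more ways of selecting a caste than this handles, so treat it as an illustration of the nesting only.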
Before I can get to any of that, though, the time has finally come to build a comprehensive DF raw tag database. If anyone knows whether Toady has already released such a thing, I would be glad to see it. Otherwise, I'm going to be slogging through the raws for the next couple of weeks, carefully documenting each new type of tag. Luckily, because of the way the program reads the raws, the tokens (that is: [tag:token1:token2:etc.]) do NOT need to be entered into any files, since they are simply read and documented as the program runs. Once I get a better handle on linking tags to other tags, however, this may end up changing.
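As a rough illustration of "read and documented as the program runs", the Python sketch below scans raw text and records, for each tag, how many tokens it has been seen with. The names `parse_entries` and `build_tag_database` are hypothetical, and the real program may well record more than token counts.

```python
import re
from collections import defaultdict

# Matches one bracketed raw entry such as [BODY:QUADRUPED:TAIL].
ENTRY_RE = re.compile(r"\[([^\[\]]+)\]")

def parse_entries(text):
    """Yield (tag, tokens) for every bracketed entry in the text."""
    for match in ENTRY_RE.finditer(text):
        tag, *tokens = match.group(1).split(":")
        yield tag, tokens

def build_tag_database(text):
    """Map each tag name to the set of token counts observed for it."""
    database = defaultdict(set)
    for tag, tokens in parse_entries(text):
        database[tag].add(len(tokens))
    return dict(database)

sample = """
[CREATURE:DOG]
    [NAME:dog:dogs:canine]
    [CASTE:FEMALE]
        [FEMALE]
    [CASTE:MALE]
        [MALE]
"""
db = build_tag_database(sample)
for tag in sorted(db):
    print(tag, sorted(db[tag]))
```

Nothing about the tokens is declared up front here; the database only grows from what appears in the files, which is why only the tag names themselves would need documenting by hand.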
Oh, also, the entire database ends up being a bunch of linked lists that are held together by other linked lists - some of that should probably be changed to trees at some point so that when processing the whole DF database the program doesn't end up with an embolism.
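For a sense of what the tree version could look like, here is a small Python sketch of a plain (unbalanced) binary search tree keyed by entry name. The `Node`, `insert`, and `find` names are made up for this example, and nothing here reflects the program's actual data structures.

```python
class Node:
    """One entry in a binary search tree keyed by name."""
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.left = None
        self.right = None

def insert(root, key, value):
    """Insert key (or overwrite its value) and return the tree's root."""
    if root is None:
        return Node(key, value)
    node = root
    while True:
        if key == node.key:
            node.value = value
            return root
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key, value))
            return root
        node = child

def find(root, key):
    """Return the value stored under key, or None if it is absent."""
    node = root
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None

root = None
for name in ["DOG", "CAT", "ELEPHANT", "BADGER"]:
    root = insert(root, name, name.lower())
print(find(root, "CAT"), find(root, "YAK"))
```

A lookup only walks one branch instead of the whole list. The catch is that if keys arrive in sorted order an unbalanced tree degenerates back into a linked list, so a balanced tree or a hash table may be the safer choice.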