Bay 12 Games Forum


Poll

How should the parser format tag names?

Don't make any changes.
- 0 (0%)
Capital RAW names, lower-case XML names.
- 1 (33.3%)
Capital RAW names, leave XML names alone.
- 0 (0%)
Leave RAW names alone, lower-case XML names.
- 1 (33.3%)
Make both options toggle-able settings.
- 1 (33.3%)
Other - Please post...
- 0 (0%)

Total Members Voted: 3


Pages: [1] 2

Author Topic: Convert Dwarf Fortress Raws to and from XML! (Update Mar. 18, 2010)  (Read 3535 times)

timmeh

  • Bay Watcher
    • View Profile
    • My Portfolio

After I finished the initial work on my World History/Sites and Population-to-XML program, I went back and re-read the post that inspired it.  I realized that the history and sites were only half the information (if that) that would be needed to build a database of everything in a given Dwarf Fortress world, as the history and site files contain no information about the items, creatures, materials, etc. found in the world.  So, I started considering the idea of writing another program to parse the raw files.

It took me about 30 seconds to realize that the sheer number of tokens that can be used in the raws would make this a monumental task, one that I would be lucky to finish before the next version was released and some (many?) of the tokens were made obsolete.  So I turned my thoughts to the idea of using external files to describe how to parse the raws.  I ended up settling on writing a graphical application instead of a command-line/terminal app, so that the external files would be easy to add to and change.  The result: the Dwarf Fortress Raw Parser.


Features:
  • Uses external "tag list" files, which can be edited by anyone with the time to spare, to determine how to parse the raws, and what the resulting XML should look like.
  • I've done my best to keep the code platform-portable, so while I don't have a Mac or Linux computer, and as such cannot compile the app for either OS, the code should compile okay on either one, should someone have the time and patience to do so for me.  (Although compiling is slightly complicated by a dependency on both wxWidgets and Boost...)
  • Until and unless Tarn decides to completely overhaul the raw system (i.e. totally ditching the [TAGNAME:Argument1:Argument2:...] format), it should work with all future versions.
  • In case something should happen to me (like college eating all my spare time...), I've released the source code as well (under the GNU General Public License), and will set up an SVN repository as soon as I get the chance.
  • Full documentation, with a tutorial on tag editing (although it's hardly complicated enough to need it).
  • Includes a tag-list that can be used to parse the "creature_*.txt" files.
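
For readers unfamiliar with the [TAGNAME:Argument1:Argument2:...] format mentioned above, here is a minimal Python sketch of splitting raw text into tag names and arguments. It is not the parser's actual code (which depends on wxWidgets and Boost); the function name `parse_raw_tags` and the sample text are made up for illustration.

```python
import re

# Matches one [TAGNAME:Argument1:Argument2:...] token; text outside brackets is ignored.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")

def parse_raw_tags(text):
    """Return a list of (tag_name, [arguments]) pairs found in raw text."""
    tags = []
    for match in TAG_PATTERN.finditer(text):
        # The first colon-separated field is the tag name, the rest are arguments.
        name, *args = match.group(1).split(":")
        tags.append((name, args))
    return tags

# Hypothetical sample, not copied from any real raw file.
sample = """
[CREATURE:RANDOM_CREATURE]
    [NAME:Blah:Blahs:Blah]
    [FAT:1]
"""
print(parse_raw_tags(sample))
```

A tag list would then decide what each name and argument means; the sketch only covers the bracket-and-colon syntax.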


Links:
Project Page
Google Code Page
Download - 3.3MB
Source Download - 90.7KB


Screen-shots:



Enjoy!
« Last Edit: March 26, 2010, 09:56:41 am by timmeh »
Logged
On the Wall is a Masterfully engraved carving of Urist McHardcastle and Goblins. Urist McHardcastle is surrounded by the Goblins. The Goblins are stamping on Urist McHardcastle. Urist McHardcastle is laughing at the Goblins. The carving relates to the prolonged and bloody death of Urist McHardcastle in the Fall of 1659, the Winter of 1659, and the Spring of 1660. On the engraving is an image of Cheese.

Rochndil

  • Bay Watcher
  • Mad Modder
    • View Profile
Re: Convert Dwarf Fortress Raws to XML!
« Reply #1 on: March 14, 2010, 09:49:29 am »

Good morning!

I think this is a brilliant idea, because it's one I've had myself in the past! :)

I'm not a programmer though, so there's no way I could actually make it happen. The potential uses for a DF-->DB-->DF application are significant, especially for modding/multi-modding users.

One of the biggest problems with modding is visually parsing the raws to make sure that all the necessary data is there, and all the UN-necessary data is not. The fact that the raws are not always stored in a consistent format or order does NOT help much.

By being able to take the raws data out and dump them into a database, manipulate the data there, and then write them BACK into raw-formatted files, would be a tremendous improvement. For example, this would make the process of MERGING raw files referencing the same data MUCH easier. Also, finding subtle conflicts, like creatures that are defined with very slightly differing names, would become practical.

I'll definitely be taking a close look at this project of yours, because there are a dozen different ways I can use it already.

Rochndil, who plays with a stone-soup of mods...
Logged
Build an Elf a fire, keep him warm for a day.
Drown an Elf in magma, keep him warm for the rest of his life!

jaked122

  • Bay Watcher
  • [PREFSTRING:Lurker tendancies]
    • View Profile
Re: Convert Dwarf Fortress Raws to XML!
« Reply #2 on: March 14, 2010, 12:38:23 pm »

This seems as though it would be better in Suggestions, unless you intend to make a tool that will convert DF raws to and from XML for the purposes of editing.
It's a good idea in general, but I've already learned the tokens, so I think the XML should look more like the Dwarf Fortress raws.

timmeh

  • Bay Watcher
    • View Profile
    • My Portfolio
Re: Convert Dwarf Fortress Raws to XML!
« Reply #3 on: March 15, 2010, 04:48:07 pm »

@Rochndil - Thanks!  I'm looking into converting the XML files back into RAWs now; shouldn't be too difficult, just need to decide if I'd rather do something similar to the current RAW->XML method, reading through it one tag at a time, or if I'd rather do something more based on the XML tree...  Anyways, I should be able to start programming soon, just need to finish up a precis for my literature class...

@Jaked122 - Um, the XML vs. RAWs thing was a joke, right?  XML has been around for years, it's more popular, more powerful, and better (more consistently/concretely, that is) formatted... that, and one of the things I'd like to see is the ability to generate a wiki-like display of a Dwarf Fortress world, which is infinitely easier with XML...

Anyways, the SVN repository is now available on the Google Code page.  I left out the wx_pch.c.gch folder, as its contents are included with wxWidgets (which the converter is dependent on anyways) and total ~100 MB, which would make the repository reach the space limit Google set fairly quickly...

Astarch

  • Bay Watcher
    • View Profile
Re: Convert Dwarf Fortress Raws to XML!
« Reply #4 on: March 15, 2010, 07:49:12 pm »

Well, DF RAWs may be a bit annoying to edit, but they're nothing if not easy to read, whereas XML generally is not.

timmeh

  • Bay Watcher
    • View Profile
    • My Portfolio
Re: Convert Dwarf Fortress Raws to XML!
« Reply #5 on: March 15, 2010, 08:07:10 pm »

@Astarch - That may be true (although I've seen files as parts of mods that were a real mess...), but only if the XML is examined in Notepad or similar.  I can open an XML file in something like XML Marker, and get a nice, clean tree-view of everything in the file.

Eagleon

  • Bay Watcher
    • View Profile
    • Soundcloud
Re: Convert Dwarf Fortress Raws to XML!
« Reply #6 on: March 16, 2010, 10:38:57 am »

XML is hideous IMHO (most especially automatically generated XML), but it's still easier to parse with another program, if only because hundreds of libraries have been made to do just that. I have to say if DF's raws became XML-based, without some sort of editor we'd lose most of our modders.

This is awesome stuff and exactly what we need to create an editor suite, though. Toady obviously has plans to include more editors (the map editor in world-gen is one early example), but that's probably a fair ways off. Looking at the raws for the new version, it's going to get ugly fast - lots of "Fun" where someone modifies some material template that another mod depends on to have certain characteristics, etc. hehe. That's my prediction at least. I'll be curious to see what kind of utilities emerge to handle it - a while back I made a quickie Material Template editor, but it's all basic VB crap with a ton of copy-paste because I didn't know better :P This would have probably made it much easier.
Logged
Agora: open-source, next-gen online discussions with formal outcomes!
Music, Ballpoint
Support 100% Emigration, Everyone Walking Around Confused Forever 2044

timmeh

  • Bay Watcher
    • View Profile
    • My Portfolio
Re: Convert Dwarf Fortress Raws to XML!
« Reply #7 on: March 16, 2010, 02:14:38 pm »

@Eagleon - Again, I definitely don't disagree if you're trying to edit it by hand in something like Notepad... but there are numerous free tools for editing XML files that make it look a lot nicer and cleaner... the only reason the raws are so easy to read at the moment is because they come indented and otherwise formatted cleanly.  Problem is, there are no rules there: I could lay out the raw file exactly like an XML file (just with square brackets and no closing tags...), and it would look similar, or not indent any of the tags, or heck, I don't believe DF would complain if I threw all the tags on the same line, without even putting a space between them...
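
To illustrate the layout point, here is a small Python sketch showing that a tokenizer which only looks at square brackets sees the same tokens however the file is laid out. It says nothing definitive about how DF itself reads the files; the function name `tokens` and the sample strings are assumptions for the example.

```python
import re

def tokens(raw_text):
    """Extract the contents of every [...] token, ignoring all text between them."""
    return re.findall(r"\[([^\[\]]+)\]", raw_text)

# Hypothetical sample data: the same three tags in two different layouts.
indented = "[CREATURE:EXAMPLE]\n    [NAME:Blah:Blahs:Blah]\n    [FAT:1]\n"
one_line = "[CREATURE:EXAMPLE][NAME:Blah:Blahs:Blah][FAT:1]"

# Both layouts yield the same token sequence.
print(tokens(indented) == tokens(one_line))
```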

But that's getting off topic again, I'm a big advocate of XML, so I'm more than a little biased here... sorry... :P

Anyways, thank you :D  Progress continues on the XML->RAW front, I've found the library I'm going to use, and I just need to play around with it and get a better feel for it before I can wrap this up pretty quickly...

[EDIT]:  There's a small mistake in the included tag list, where I had it recording the ID in the [CREATURE:ID] tag in a <name> tag.  Just had to rename the argument in the creature tag to "id", for anyone who cares.  Progress with the XML->RAW is smooth and continuous, shouldn't be too hard.
« Last Edit: March 16, 2010, 02:43:42 pm by timmeh »

FatedTemp

  • Bay Watcher
    • View Profile
Re: Convert Dwarf Fortress Raws to XML!
« Reply #8 on: March 16, 2010, 02:40:23 pm »

Quote
XML is hideous IMHO (most especially automatically generated XML), but it's still easier to parse with another program, if only because hundreds of libraries have been made to do just that. I have to say if DF's raws became XML-based, without some sort of editor we'd lose most of our modders.

I disagree with this. And here's why:
Spoiler (click to show/hide)
That's the RAW as it is now, editing in Notepad from that, and keeping all the same data I made this:
Spoiler (click to show/hide)

Of course it could be written differently, more like the RAWs or more efficiently, but the point is that it's no harder to read or edit. Anyway, xml stuff aside I support this project, the more tools we have to help us play, explore and mod DF the better!
Logged

timmeh

  • Bay Watcher
    • View Profile
    • My Portfolio
Re: Convert Dwarf Fortress Raws to XML!
« Reply #9 on: March 16, 2010, 02:57:31 pm »

@Fated Temp - Thanks :)

Just about got the XML->RAW thing figured out, just have to get it to handle tags that have both attributes of their own, and have children-with-children (I.E. "grandchildren").  As it is, it knows to write XML tags containing only text (I.E. <fat>4</fat>) as [tagname:attribute], it knows to write those with no attributes as [tagname], and currently takes any tags that don't have grandchildren, and outputs them as [tagname:attr1:attr2:attr3...]  Just have to get it to handle tags like [OBJECT:ID] and [CREATURE:ID] properly, and it should be good.
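
The three rules described above can be sketched roughly as follows. This is a Python illustration using the standard `xml.etree.ElementTree` module, not the converter's real code; the function name `element_to_raw`, the upper-casing of names, the sample XML, and the handling of the "grandchildren" case are all assumptions.

```python
import xml.etree.ElementTree as ET

def element_to_raw(elem, out):
    """Append RAW-style tags for elem (and its descendants) to the list out."""
    name = elem.tag.upper()  # upper-casing is an assumption, not stated in the post
    children = list(elem)
    text = (elem.text or "").strip()
    if not children:
        # Text-only tag -> [TAG:value]; empty tag -> [TAG]
        out.append(f"[{name}:{text}]" if text else f"[{name}]")
    elif all(len(child) == 0 for child in children):
        # Children but no grandchildren -> [TAG:arg1:arg2:...]
        args = ":".join((child.text or "").strip() for child in children)
        out.append(f"[{name}:{args}]")
    else:
        # Grandchildren present: the case the post describes as unfinished.
        # Placeholder behaviour (an assumption): emit a bare tag, then recurse.
        out.append(f"[{name}]")
        for child in children:
            element_to_raw(child, out)
    return out

# Hypothetical sample document.
xml_text = """
<creature>
  <name><singular>Blah</singular><plural>Blahs</plural><adjective>Blah</adjective></name>
  <fat>4</fat>
  <flier/>
</creature>
"""
print("\n".join(element_to_raw(ET.fromstring(xml_text.strip()), [])))
```

Note that the placeholder branch writes a bare [CREATURE] rather than [CREATURE:ID], which is exactly the gap the post says still needs handling.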

[EDIT]:  Got it working in the testing project:

Just need to transfer it over to the original program, and make a small change to the way it handles tags like [OBJECT:ID] and [CREATURE:ID], and it should be ready.  One thing of note, until I find a better way to handle it, be careful not to let the attribute names of tags like object/creature match the names of the possible child tags, as it may try to pick the wrong information out of the XML if they share the same name (although this should probably be avoided anyways, simply on account of keeping clean XML...)


[EDIT2]:  Couple quick formatting questions.  The program now tracks two names for the tags, one for the RAW files, and one for the XML.  I.E. When converting to XML, it checks the tags against the RAW name, then writes them with the XML name.  This is completely transparent at the moment, the only difference is that when re-naming a tag, there is an extra text-box that allows you to set a different XML name, as it defaults to the same as the RAW name.

Now, from what I understand, "standard" RAW tags are in all caps, while standard XML tags are in all lowercase.  My questions are these:
1.  Should I make it automatically change the RAW tag names to all caps?  (Technically this doesn't matter for converting to XML, since the check isn't case-sensitive, but it'll matter when I set it up to write RAW files from the XML files...)
2.  Should I make it automatically change the XML tag names to all lowercase?  (Again, won't really matter when reading the XML file, but it will matter when writing one...)
3.  Theoretically, I could make this a check-box option for each, but this could be time consuming, as I'd have to teach myself how to properly set up check-boxes, and I'd have to set up some sort of settings file, probably either INI or XML... would it be worth the trouble?  If this is something you'd like to see, would it be better in a separate options menu, or as a check-box next to the "Add Tag" button or something?
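
As a rough illustration of the two-name scheme and the capitalization options being asked about, here is a Python sketch. The class `TagEntry`, the functions `find_tag` and `output_names`, and the toggle names `caps_raw` and `lower_xml` are all hypothetical, as are the sample entries; the real program's data structures may look nothing like this.

```python
from dataclasses import dataclass

@dataclass
class TagEntry:
    raw_name: str
    xml_name: str = ""

    def __post_init__(self):
        # The XML name defaults to the RAW name, as described in the post.
        if not self.xml_name:
            self.xml_name = self.raw_name

def find_tag(tag_list, raw_name):
    """Case-insensitive lookup of a tag entry by its RAW name."""
    for entry in tag_list:
        if entry.raw_name.lower() == raw_name.lower():
            return entry
    return None

def output_names(entry, caps_raw=True, lower_xml=True):
    """Return (raw_name, xml_name) as they would be written, given the two toggles."""
    raw = entry.raw_name.upper() if caps_raw else entry.raw_name
    xml = entry.xml_name.lower() if lower_xml else entry.xml_name
    return raw, xml

# Hypothetical tag list with one default and one custom XML name.
tag_list = [TagEntry("Creature"), TagEntry("PREFSTRING", "preference")]
entry = find_tag(tag_list, "creature")
print(output_names(entry))                                   # both toggles on
print(output_names(entry, caps_raw=False, lower_xml=False))  # names left alone
```

With both toggles as plain booleans, the "settings file" question reduces to persisting two flags.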

Feel free to comment or use the poll or whichever, just need some input, given my horrible lack of visual/GUI-design skill :P
« Last Edit: March 16, 2010, 11:06:18 pm by timmeh »

Eagleon

  • Bay Watcher
    • View Profile
    • Soundcloud
Re: Convert Dwarf Fortress Raws to XML!
« Reply #10 on: March 17, 2010, 10:48:45 am »

Quote
   <body size="5" fat="1" layering="100" homeotherm="10070" flesh="standard" type="quadruped">
      <head eyes ="2" ears="2" nose="1" mouth="1" brain="1"/>
      <torso_upper neck="1" throat="1" heart="1" lungs="2" spine="1" organs="1"/>
      <torso_lower tail="1" guts="1"/>
      <bodygloss="paw"/>
   </body>
See, this is the kind of stuff that bothered me first looking at XML. I know from modding DF that each is a property of the first, but if I were completely new to the format, is 'body size' an identifier, or is it 'body' and then the subtag of 'size'? Is each consecutive identifier working on a new subtype, or is it part of the first? DF's raws you can pretty much just intuitively understand - that's why I say we'd lose a lot of modders that don't want to deal with learning anything to do with programming/markup, and don't want to download an XML editor. That's if we don't have a good editor :P You have to look back each time you look at any part of an XML file, and there are multiple tag break styles to deal with - I'm used to it, but it's not something you can 'get' just by opening a file.

Anyway, enough of my ranting, haha. On topic, I'd say go with whatever is easiest to parse by other XML editors, i.e. the standard for that format. Wouldn't worry too much about getting the raws exactly the same beyond preserving strings and identifiers such as creature names, so I'd just UCASE all the tag names as they go out.

timmeh

  • Bay Watcher
    • View Profile
    • My Portfolio
Re: Convert Dwarf Fortress Raws to XML!
« Reply #11 on: March 17, 2010, 11:11:12 am »

Actually, something like that would look more like this anyways:
Code:
<creature>
  <id>RANDOM_CREATURE</id>
  <name>
    <singular>Blah</singular>
    <plural>Blahs</plural>
    <adjective>Blah</adjective>
  </name>
  <size>5</size>
  <fat>1</fat>
  <body>
    <part>2EYES</part>
    <part>2EARS</part>
    <part>etc...</part>
  </body>
</creature>

I typically use separate tags for holding information about the thing described in the XML, and attributes only for describing the tags themselves, rather than the information in them...  personal preference I guess...
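
A minimal Python sketch of the RAW-to-XML direction in this child-tag style might look like the following. The `TAG_LIST` dictionary is a hypothetical stand-in for a tag-list file (the real file format is not shown in this thread), and the sketch produces flat elements only; it does not attempt the nesting under <creature>.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical tag list: RAW name -> (XML name, argument names).
TAG_LIST = {
    "NAME": ("name", ["singular", "plural", "adjective"]),
    "FAT": ("fat", ["value"]),
}

def raw_to_elements(raw_text):
    """Convert each known [TAG:...] token into an XML element (flat, no nesting)."""
    elements = []
    for body in re.findall(r"\[([^\[\]]+)\]", raw_text):
        raw_name, *args = body.split(":")
        if raw_name.upper() not in TAG_LIST:
            continue  # unknown tags are skipped in this sketch
        xml_name, arg_names = TAG_LIST[raw_name.upper()]
        elem = ET.Element(xml_name)
        if len(args) == 1:
            elem.text = args[0]  # a single argument becomes the element text
        else:
            for arg_name, value in zip(arg_names, args):
                ET.SubElement(elem, arg_name).text = value  # one child tag per argument
        elements.append(elem)
    return elements

for elem in raw_to_elements("[NAME:Blah:Blahs:Blah][FAT:1]"):
    print(ET.tostring(elem, encoding="unicode"))
```

Information ends up in child tags rather than attributes, matching the preference stated above.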

Anyways, thanks for the input :)

timmeh

  • Bay Watcher
    • View Profile
    • My Portfolio
Re: Convert Dwarf Fortress Raws to XML!
« Reply #12 on: March 17, 2010, 10:23:59 pm »

It basically works, with the capitalization as check-box options at the bottom, although it won't remember settings yet.  Just need to clean it up a bit, and I'll get it uploaded for everyone :D

(As fair warning, an XML file can only really be converted back into RAW format with the same tag-list that was used to change it into XML in the first place.  So if you've got an XML file you may want to change back, but need to make changes to a tag-list, keep a backup of the original, so that it'll still work.  The changes may or may not break it, but better safe than sorry.)
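
To see why the tag list matters for the return trip, consider this simplified Python sketch. It models a tag list as a plain {RAW name: XML name} dictionary, which is an assumption; the function names and sample entries are hypothetical, and how the real program behaves on a mismatch is not stated here.

```python
def to_xml_names(raw_names, tag_list):
    """Map RAW names to XML names using a {raw: xml} tag list."""
    return [tag_list[name] for name in raw_names]

def to_raw_names(xml_names, tag_list):
    """Map XML names back to RAW names; names missing from the list map to None."""
    reverse = {xml: raw for raw, xml in tag_list.items()}
    return [reverse.get(name) for name in xml_names]

original_list = {"CREATURE": "creature", "PREFSTRING": "prefstring"}
edited_list = {"CREATURE": "creature", "PREFSTRING": "preference"}  # XML name changed later

xml_names = to_xml_names(["CREATURE", "PREFSTRING"], original_list)
print(to_raw_names(xml_names, original_list))  # round trip with the same list
print(to_raw_names(xml_names, edited_list))    # 'prefstring' is no longer known
```

An XML file written with the original list still contains the old XML name, so the edited list cannot map it back.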

[EDIT]:  Done :D  See the first post for the download links.  The Google Code page and repository have been updated.  I'm gonna go update my website real quick and then I need to get some sleep, I have to get up in 6-7 hours :P

[EDIT2]:  Almost forgot the change list...

Version 1.10 Release

Added check-boxes for auto-caps
Added support for conversions from XML to RAW
Tags now have 2 names, one for XML one for RAW

[EDIT3]:  Okay then, maybe not on the website, FTP server is down... guess it ought to be expected, free host and all :P  Anyways, I'll update the website and snapshot some time tomorrow.
« Last Edit: March 17, 2010, 11:23:25 pm by timmeh »

timmeh

  • Bay Watcher
    • View Profile
    • My Portfolio
Re: Convert Dwarf Fortress Raws to and from XML! (Update Mar. 18, 2010)
« Reply #13 on: March 20, 2010, 08:12:03 pm »

No luck so far with the website FTP, although I haven't tried today.  I just realized that I really screwed up the tag-list loading, in that any lists saved from version 1.10 can't be opened yet...  Working on that now, I should be able to get a fixed version up tomorrow night, assuming I'm conscious... tomorrow will be crazy... awesome, but crazy...

timmeh

  • Bay Watcher
    • View Profile
    • My Portfolio
Re: Convert Dwarf Fortress Raws to and from XML! (Update Mar. 18, 2010)
« Reply #14 on: March 21, 2010, 08:43:53 pm »

Okay, due to a long, complicated series of events I will not have access to my laptop until further notice... 

On a slightly more positive note, I did manage to fix the loading bug.  I won't be able to upload it until I get my laptop back, but it's fixed.

Oh, and typing using a PS3 controller is a pain :p