Bay 12 Games Forum

Author Topic: Programming: Parsing Raws in C++  (Read 1501 times)

Cuppasoup

  • Escaped Lunatic
Programming: Parsing Raws in C++
« on: January 06, 2018, 12:43:01 pm »

Hey, I was wondering if there was any chance anyone could suggest some textbooks or documentation on parsing with C++.

I was interested in doing my own stuff similar to the DF raws, except I don't want to be reinventing the wheel the whole time, hence the textbooks/documentation. 

Thanks :)
Logged

Reelya

  • Bay Watcher
Re: Programming: Parsing Raws in C++
« Reply #1 on: January 07, 2018, 05:35:38 am »

I think if you look for "parsing documentation" then you'll be up to your eyeballs in technical details that you don't need to know.

Raw C++ has no "parsing" built in. It all depends which library for file handling (and optionally, for string handling) you choose to add to your project. If you do it with e.g. iostream + std::string then you need to write a low-level state machine and your own data structures. This is very flexible, however, it's also brittle in that if the file structure can deviate from the expected structure, you need to write your code to take account of all the things that could go wrong with poorly-formatted user files. So, you end up with very complex code with tons of if-statements, special clauses, error checking at every stage, and it only gets more complex and more "spaghettified" as you add in more types of tags, nesting, or tag parameters that can exist.
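
To give a feel for what the hand-rolled approach involves, here is a minimal sketch of a two-state machine for DF-style [TAG:arg:arg] tokens, using only iostream and std::string. The RawTag struct and the parseRaws function are names made up for this example, and it assumes anything outside square brackets is a comment. That is an assumption for illustration, not a full description of how DF reads its raws.

```cpp
#include <cassert>
#include <istream>
#include <sstream>    // std::istringstream, handy for quick tests
#include <stdexcept>
#include <string>
#include <vector>

// One parsed token: [BODY:QUADRUPED:2EYES] -> name "BODY", args {"QUADRUPED", "2EYES"}.
// (Hypothetical type for this sketch.)
struct RawTag {
    std::string name;
    std::vector<std::string> args;
};

// Hypothetical helper: reads every [TAG:...] token from a stream.
// Assumption: text outside brackets is a comment and is skipped.
std::vector<RawTag> parseRaws(std::istream& in) {
    enum class State { Outside, Inside };
    State state = State::Outside;
    std::vector<RawTag> tags;
    std::vector<std::string> fields;  // colon-separated fields of the current tag
    std::string field;                // field currently being accumulated
    int line = 1;                     // for error messages
    char c;
    while (in.get(c)) {
        if (state == State::Outside) {
            if (c == '\n') {
                ++line;
            } else if (c == '[') {
                state = State::Inside;
                fields.clear();
                field.clear();
            }
            // any other character outside a tag is ignored as comment text
        } else if (c == ':') {
            fields.push_back(field);
            field.clear();
        } else if (c == ']') {
            fields.push_back(field);
            assert(!fields.empty());
            if (fields.front().empty())
                throw std::runtime_error("line " + std::to_string(line) + ": empty tag name");
            tags.push_back(RawTag{fields.front(),
                                  std::vector<std::string>(fields.begin() + 1, fields.end())});
            state = State::Outside;
        } else if (c == '[' || c == '\n') {
            throw std::runtime_error("line " + std::to_string(line) + ": unterminated tag");
        } else {
            field += c;
        }
    }
    if (state == State::Inside)
        throw std::runtime_error("line " + std::to_string(line) + ": unterminated tag at end of file");
    return tags;
}
```

Passing it a std::ifstream opened on a raws file (or a std::istringstream while testing) returns the tags in file order. This only catches empty names and unterminated tags; nesting rules, typed parameters, and friendlier diagnostics are where the extra if-statements start piling up.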

Basically, if you want to avoid a massive time investment in making your custom parser work just-so, then your best bet is to use some pre-existing file structure, and grab libraries that can read it. XML or JSON are the main choices. I prefer JSON as the main choice, since I know that PHP and JavaScript both have built-in commands for reading/writing data structures to these files. The advantage is that a JSON reader can turn the "raws" file straight into a data structure in the programming language without needing any "manual" parsing from your own code.

If you want something more hands-on that gives you more control over how the files look, I recommend that instead of writing your own parser, you use regular expressions. C++11 and later ship std::regex in the standard library, so you may not even need a separate regex library. Using regexes instead of manual parsing would save a lot of work, and regex rules are far more robust than what you can probably code by hand.
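
As a rough illustration of the regex route, here is a sketch that pulls DF-style [TAG:arg:arg] tokens out of a string with std::regex from the standard library. RawTag and parseRawsRegex are made-up names for this example, and the pattern is just one guess at what a tag should look like, not an official grammar for DF raws.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical type for this sketch: one [NAME:arg:arg...] token.
struct RawTag {
    std::string name;
    std::vector<std::string> args;
};

// Hypothetical helper: finds every [NAME:arg:...] token in the text.
// Group 1 is the tag name, group 2 is the (possibly empty) ":arg:arg" tail.
std::vector<RawTag> parseRawsRegex(const std::string& text) {
    static const std::regex tagPattern(R"(\[([^\[\]:]+)((?::[^\[\]:]*)*)\])");
    std::vector<RawTag> tags;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), tagPattern); it != end; ++it) {
        assert(it->size() == 3);  // whole match plus two capture groups
        RawTag tag;
        tag.name = (*it)[1].str();
        const std::string tail = (*it)[2].str();  // e.g. ":QUADRUPED:2EYES" or ""
        // Every argument in the tail is preceded by exactly one ':'.
        std::size_t pos = 0;
        while (pos < tail.size()) {
            std::size_t next = tail.find(':', pos + 1);
            if (next == std::string::npos) next = tail.size();
            tag.args.push_back(tail.substr(pos + 1, next - pos - 1));
            pos = next;
        }
        tags.push_back(std::move(tag));
    }
    return tags;
}
```

One trade-off worth knowing: text that doesn't match the pattern (say, a tag missing its closing bracket) is silently skipped rather than reported, so if you want error messages for modders you still have to add your own checks on top.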
« Last Edit: January 07, 2018, 08:41:10 am by Reelya »
Logged

Magistrum

  • Bay Watcher
  • Skilled Fortresser
Re: Programming: Parsing Raws in C++
« Reply #2 on: January 07, 2018, 09:57:52 am »

Seconding Reelya here.
Use XML, or look forward to a month of suffering just to get the basics set up.
Logged
In a time before time, I had a name.

Reelya

  • Bay Watcher
Re: Programming: Parsing Raws in C++
« Reply #3 on: January 07, 2018, 10:14:28 am »

Personally I prefer JSON's style to XML. XML's open and close tags are really not necessary, since you can use nested scope to signify the same thing.

https://www.w3schools.com/js/js_json_xml.asp

XML is significantly more verbose to convey the same data. JSON lets you skip a lot of that, and the end format is much closer to e.g. DF Raws.
« Last Edit: January 07, 2018, 10:16:43 am by Reelya »
Logged

Magistrum

  • Bay Watcher
  • Skilled Fortresser
Re: Programming: Parsing Raws in C++
« Reply #4 on: January 07, 2018, 10:23:14 am »

For web I would go with JSON, just about anything has native functions to deal with it.

I am a bit more acquainted with XML, since I started out with it for using Xpath, Xquery... It feels a bit more flexible.

You are right though, JSON is way lighter.
Logged
In a time before time, I had a name.

milo christiansen

  • Bay Watcher
  • Something generic here
Re: Programming: Parsing Raws in C++
« Reply #5 on: January 07, 2018, 12:23:57 pm »

I have written several raw parsers, and raws aren't hard to parse as data languages go. However, for new work I strongly suggest JSON; as a data file format, raws suck.
Logged
Rubble 8 - The most powerful modding suite in existence!
After all, coke is for furnaces, not for snorting.
You're not true dwarven royalty unless you own the complete 'Signature Collection' baby-bone bedroom set from NOKEAS

McTraveller

  • Bay Watcher
  • This text isn't very personal.
Re: Programming: Parsing Raws in C++
« Reply #6 on: January 07, 2018, 02:44:04 pm »

XML is terribly verbose, so I don't like it, but yeah, in general you don't want to roll your own unless you want to learn, or you need a very simple format and aren't trying to interoperate with existing formats. Sometimes it takes longer to learn an existing XML or JSON library, or to deal with its licensing (even OSS licences can be annoying), than to write the code yourself. If you want something really simple, writing it yourself isn't nearly as bad as some others make it out to be.
Logged
This product contains deoxyribonucleic acid which is known to the State of California to cause cancer, reproductive harm, and other health issues.

Telgin

  • Bay Watcher
  • Professional Programmer
Re: Programming: Parsing Raws in C++
« Reply #7 on: January 08, 2018, 10:51:06 am »

Another vote for JSON over XML.  Even besides XML's verbosity, I've found that literally every parser has a different interpretation of what an XML document means, which is infuriating.  Attributes on elements in particular seem to be a free for all when it comes to deciding how the value is represented in your programming language of choice.  JSON never has this problem.

I'm pretty sure that there are some JSON parsing libraries you can use directly with C++, but if not, I know that a parser can be written with minimal fuss since I've done it before.  I do recall that I chose to write it myself even after searching for a library, but I don't remember why.  The one I found might have been Boost based, which I have not had good experiences with.
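
For a sense of scale, here is a rough sketch of what a small hand-written JSON reader can look like. This is not the parser described above; the Json struct, JsonParser class, parseJson, and findMember are all names invented for this example. It is a subset only: it assumes no \uXXXX escapes, and it leans on std::strtod for numbers, which accepts some things strict JSON does not.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical value type for this sketch (not from any real library).
struct Json {
    enum class Kind { Null, Bool, Number, String, Array, Object };
    Kind kind = Kind::Null;
    bool boolean = false;
    double number = 0.0;
    std::string text;
    std::vector<std::string> keys;  // object member names, parallel to items
    std::vector<Json> items;        // array elements or object member values
};

class JsonParser {
public:
    explicit JsonParser(const std::string& src) : s(src) {}
    Json parseDocument() {
        Json v = parseValue();
        skipSpace();
        if (pos != s.size()) fail("trailing characters");
        return v;
    }
private:
    const std::string& s;
    std::size_t pos = 0;

    [[noreturn]] void fail(const std::string& why) const {
        throw std::runtime_error("JSON error at offset " + std::to_string(pos) + ": " + why);
    }
    void skipSpace() {
        while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
    }
    bool consume(char c) {
        skipSpace();
        if (pos < s.size() && s[pos] == c) { ++pos; return true; }
        return false;
    }
    void expect(char c) {
        if (!consume(c)) fail(std::string("expected '") + c + "'");
    }
    bool consumeWord(const std::string& w) {
        if (s.compare(pos, w.size(), w) != 0) return false;
        pos += w.size();
        return true;
    }
    std::string parseString() {
        expect('"');
        std::string out;
        while (pos < s.size() && s[pos] != '"') {
            if (s[pos] == '\\') {  // simplified: \n and \t are translated, others kept literally
                if (++pos >= s.size()) fail("bad escape");
                const char e = s[pos];
                out += (e == 'n') ? '\n' : (e == 't') ? '\t' : e;
            } else {
                out += s[pos];
            }
            ++pos;
        }
        if (pos >= s.size()) fail("unterminated string");
        ++pos;  // skip the closing quote
        return out;
    }
    Json parseValue() {
        skipSpace();
        if (pos >= s.size()) fail("unexpected end of input");
        Json v;
        if (s[pos] == '{') {
            ++pos;
            v.kind = Json::Kind::Object;
            if (!consume('}')) {
                do {
                    v.keys.push_back(parseString());
                    expect(':');
                    v.items.push_back(parseValue());
                } while (consume(','));
                expect('}');
            }
        } else if (s[pos] == '[') {
            ++pos;
            v.kind = Json::Kind::Array;
            if (!consume(']')) {
                do { v.items.push_back(parseValue()); } while (consume(','));
                expect(']');
            }
        } else if (s[pos] == '"') {
            v.kind = Json::Kind::String;
            v.text = parseString();
        } else if (consumeWord("true")) {
            v.kind = Json::Kind::Bool;
            v.boolean = true;
        } else if (consumeWord("false")) {
            v.kind = Json::Kind::Bool;
        } else if (!consumeWord("null")) {
            const char* start = s.c_str() + pos;
            char* stop = nullptr;
            v.number = std::strtod(start, &stop);
            if (stop == start) fail("unexpected character");
            v.kind = Json::Kind::Number;
            pos += static_cast<std::size_t>(stop - start);
        }
        return v;
    }
};

Json parseJson(const std::string& text) { return JsonParser(text).parseDocument(); }

// Returns the named member of an object, or nullptr if it is absent.
const Json* findMember(const Json& obj, const std::string& key) {
    if (obj.kind != Json::Kind::Object) return nullptr;
    assert(obj.keys.size() == obj.items.size());
    for (std::size_t i = 0; i < obj.keys.size(); ++i)
        if (obj.keys[i] == key) return &obj.items[i];
    return nullptr;
}
```

Calling parseJson on the contents of a file gives you a tree you can walk with findMember. For anything beyond a hobby project an existing library is probably still the safer bet, since this sketch skips Unicode escapes and strict number validation.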
Logged
Through pain, I find wisdom.

Puzzlemaker

  • Bay Watcher
Re: Programming: Parsing Raws in C++
« Reply #8 on: January 08, 2018, 11:14:06 am »

Yeah, use JSON or YAML.  XML is a huge pain to set up and isn't necessary unless you have an uber-complicated config. 
Logged
The mark of the immature man is that he wants to die nobly for a cause, while the mark of the mature man is that he wants to live humbly for one.