Very nice! There are a couple of cosmetic issues though: the output contains the input twice over (I suspect a bug), and a final pass to remove any line consisting entirely of square brackets would be nice. Once those are dealt with, I think that looks more human-readable as well as better for diffs.
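For the bracket clean-up, something along these lines might do as a final pass. This is only a sketch: the function name is made up, and I'm assuming "consisting entirely of square brackets" means lines with nothing but `[`, `]` and whitespace. Blank lines are left alone.

```python
import re

# Matches a line containing at least one bracket and otherwise only
# brackets and whitespace (assumption about what counts as "bracket-only").
BRACKETS_ONLY = re.compile(r"^\s*[\[\]][\[\]\s]*$")

def drop_bracket_only_lines(text):
    """Return text with bracket-only lines removed (hypothetical helper)."""
    kept = [line for line in text.splitlines()
            if not BRACKETS_ONLY.match(line)]
    return "\n".join(kept)
```

Real tags such as `[CREATURE:DWARF]` are kept because they contain something other than brackets.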
The problem with relying on a manifest is that it massively cuts down on your potential input. At the moment, that's every mod out there: if it can work installed on vanilla DF (without graphics or dfhack), that install can be fed into the tool / converted to our format / stupidly merged. If we require a manifest, that shrinks the pool to those mods made with this tool in mind or manually updated for compatibility by someone, which is to say no mods at all. So requiring a manifest is fine, so long as we can derive a sensible one from the mod alone, and the defaults aren't too restrictive. Good ways to use it would be to show extra information to the user (author, update link, etc.), or helpful but non-critical info for the program, e.g. known non-conflicting mods (useful for mods that were split up; the program then ignores detected conflicts between them as false), base raws version, etc. Again though, the program should work well enough without a manifest file.
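To make the "derive a sensible one from the mod alone" idea concrete, here is a rough sketch. Everything named here is an assumption on my part: the `manifest.json` filename, the JSON format, and the field names are hypothetical, chosen only to mirror the info listed above. The point is that a missing manifest just yields permissive defaults.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path

@dataclass
class Manifest:
    # All fields except name are optional extras, never required.
    name: str
    author: str = "unknown"
    update_link: str = ""
    non_conflicting: list = field(default_factory=list)
    base_raws_version: str = ""

def load_manifest(mod_dir):
    """Build a manifest from the folder alone, then overlay any
    values found in an optional manifest.json (hypothetical filename)."""
    mod_dir = Path(mod_dir)
    manifest = Manifest(name=mod_dir.name)  # default derived from the mod folder
    path = mod_dir / "manifest.json"
    if path.is_file():
        data = json.loads(path.read_text(encoding="utf-8"))
        for key, value in data.items():
            if hasattr(manifest, key):  # silently ignore unknown keys
                setattr(manifest, key, value)
    return manifest
```

With this shape, any existing mod folder loads without changes, and authors who care can add the extra fields later.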
Complicated problems in the raws following multiple merges: there's probably something possible with comparing the before and after line identifiers in diffs and working out if two are intending to apply to the same area, even if the content at that line has changed. This would deal with a couple of the problems noted so far. The order might be:
- transform all mod folders into our format, permanently (once ever per mod)
- create (temporary) flattened raws for vanilla and all mods (once per run)
- create a diff between flat vanilla and each flat mod (once per run)
- select mods to load and load order, and each time this changes:
* analyse diffs in order; if changed areas overlap reject merge and return first overlapping mod
- attempt merges in order; if any fail reject the merge and return first non-merging mod
- if merge was rejected inform the user which mod caused failure, otherwise offer new raws
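As a rough illustration of the diff and overlap steps in the list above (not the merge itself), here is a sketch using Python's standard `difflib`. The function names are made up, and I'm assuming each flattened raw is simply a list of lines. It is deliberately conservative: a pure insertion is treated as touching the vanilla line it lands in front of.

```python
import difflib

def changed_ranges(vanilla, mod):
    """Return (start, end) ranges of vanilla line indexes that the mod changes."""
    matcher = difflib.SequenceMatcher(a=vanilla, b=mod, autojunk=False)
    ranges = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Widen zero-width insertions so two inserts at one spot collide.
            ranges.append((i1, max(i2, i1 + 1)))
    return ranges

def first_overlapping_mod(vanilla, mods_in_order):
    """mods_in_order is a list of (name, lines) in load order.
    Return the name of the first mod whose changes overlap an earlier
    mod's changes, or None if no changed areas overlap."""
    claimed = []
    for name, lines in mods_in_order:
        ranges = changed_ranges(vanilla, lines)
        for a1, a2 in ranges:
            if any(a1 < b2 and b1 < a2 for b1, b2 in claimed):
                return name
        claimed.extend(ranges)
    return None
```

The returned name is what would be reported to the user as the mod that caused the failure; only if this returns None would the actual merges be attempted.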
I think at this point we're mostly talking over how to do the starred point to avoid false positives in merge-ability. While I think this would work enough of the time to ignore the edge cases, I don't have enough modding experience to really tell. Trying for zero false positives (i.e. merges that should have been refused) is probably futile, because we'd be back to one mod at a time. I'm still focussing on normal diffs rather than something raw-aware because it's so much faster to build; we can improve the system after it exists.
And in the end we can add a "use at own risk" label, explain that modding can lead to a broken game and that while we try, it's not perfect, and if the new world they generate is broken, well, that's what modding is like sometimes!