Yes, any variation from the vanilla file structure will mean files have to be copied instead of merged - which massively increases the chance of conflicts. You can't really avoid new files with some mods, but trying to 'fix' the folder structure should make them easier to spot.
This looks like a very good start, though I'm not sure why it couldn't handle stuff outside the raw folder (like Stonesense XML or a new embark profile) so long as it doesn't attempt any flattening there.
I tried to look through Python's documentation to see if I could be of any more help on the project, but it looks like it will be a while before I do anything other than tweak other people's code. I did, however, see that Python has a nice ElementTree structure that can be used to hold (among other things) file directory information or XML data. Not sure if an ElementTree would be easier than shutil, but the XML parser massively simplifies the problem of using a manifest file.
<?xml version="1.0" ?>
<!-- This XML contains enough information to make an educated guess about how two mods would merge *without processing any files.*
If it sees multiple mods affecting the same file, it notifies the user that it has to try a merge to figure out compatibility.
This means that the tool can download a bunch of these manifests in bulk to preview mods. -->
<modpack token="Example"> <!-- The token is the name of the folder holding the mod, and how mods refer to one another. -->
<title>The Example Mod</title>
<version>0</version><subversion>90</subversion> <!-- Would display as "v0.90". Can also add a <revision> tag for "v0.90r2" -->
<author>Dirst</author>
<thread>http://www.bay12forums.com/smf/index.php?topic=000000</thread>
<download>http://dffd.wimbli.com/file.php?id=0000</download>
<description>This mod is an example used to illustrate the optional manifest file.</description>
<dependencies>
<dfhack/>
<mod token="Basic" min_version="0" min_subversion="01" order="before"/>
</dependencies>
<compatibility>
<core max_version="40" max_subversion="07"> <!-- The lack of min_ will match anything v40.07 or earlier. -->
<color>red</color>
<message level="3">The Example Mod is not compatible with Dwarf Fortress before v40.08.</message>
</core>
<core min_version="40" min_subversion="08" max_version="40" max_subversion="11">
<color>green</color>
</core>
<core min_version="40" min_subversion="12">
<color>orange</color>
<message level="3">The Example Mod has not been tested with this version of Dwarf Fortress.</message>
</core>
<dfhack min_version="40" min_subversion="08" min_revision="2" max_version="40" max_subversion="10" max_revision="1">
<color>green</color>
<message level="1">"The Example Mod" employs DFHack's interactionTrigger and lua scripts.</message>
</dfhack>
<dfhack min_version="40" min_subversion="10" min_revision="2">
<color>orange</color>
<message level="3">The Example Mod has not been tested with this version of DFHack.</message>
<message level="1">"The Example Mod" employs DFHack's interactionTrigger and lua scripts.</message>
</dfhack>
<mod token="CLA" min_version="40" max_version="40">
<color>green</color>
</mod>
<mod token="Ironhand" min_version="40" max_version="40">
<color>green</color>
</mod>
<mod token="MasterworkDF"> <!-- Lack of min_ and max_ will match any version -->
<color>red</color>
<message level="3">The Example Mod is known to be incompatible with Masterwork DF.</message>
</mod>
<mod token="Mayday" min_version="40" max_version="40">
<color>green</color>
</mod>
<mod token="OldGenesis">
<color>red</color>
<message level="3">The Example Mod is known to be incompatible with Old Genesis.</message>
</mod>
<mod token="Phoebus" min_version="40" min_subversion="07" max_version="40">
<color>green</color>
</mod>
<mod token="Spacefox" min_version="40" max_version="40">
<color>green</color>
</mod>
<mod token="Other" min_version="1" min_subversion="0" max_version="1" max_subversion="35">
<color>yellow</color>
<message level="2">There have been no reports of conflicts with this version of That Other Mod.</message>
<message level="1">"The Example Mod" &amp; "That Other Mod" both add to entity_default.txt, but do not appear to affect one another's edits.</message>
</mod>
<mod token="Basic" min_version="0" min_subversion="01" max_version="0" max_subversion="20">
<color>green</color>
<message level="1">"The Example Mod" is an extension of "The Basic Mod" and will not function without it.</message>
</mod>
<mod token="Basic" min_version="0" min_subversion="21">
<color>orange</color>
<message level="3">"The Example Mod" has not been tested with this version of "The Basic Mod".</message>
<message level="1">"The Example Mod" is an extension of "The Basic Mod" and will not function without it.</message>
</mod>
</compatibility>
<manifest>
<!-- Upon merge, any files with no path info are copied into an LNP documentation subfolder named after the token. -->
<!-- The parse attribute defaults to "flatten" which also strips out anything outside of []s. -->
<file parse="text">manifest.xml</file> <!-- Encode linefeeds if needed, but do not flatten this file. -->
<file parse="text">readme.txt</file>
<file parse="none">The Example Mod v0.90.pdf</file> <!-- Pass this file exactly as-is. -->
<file parse="none">data/art/font.TTF</file>
<file>data/init/embark_profiles.txt</file> <!-- Should be able to add a profile to the end of the user's set. -->
<file parse="text">data/init/overrides.txt</file> <!-- TWBT uses # to comment out inactive lines, so can't trust flattening. -->
<file parse="text">raw/onLoad.init</file> <!-- Mods that use DFHack will probably want to add lines to this file. -->
<file>raw/graphics/graphics_example.txt</file>
<file parse="none">raw/graphics/example/example.png</file>
<file>raw/objects/creature_example.txt</file>
<file>raw/objects/entity_default.txt</file> <!-- Many mods will need to add lines into this file. -->
<file>raw/objects/interaction_example.txt</file>
<file>raw/objects/reaction_example.txt</file>
<file parse="text">raw/objects/text/secret_wisdom.txt</file>
<file parse="text">raw/scripts/example-script.lua</file>
<file parse="text">stonesense/index.txt</file> <!-- Any Stonesense content will need to add a line to this file. -->
<file parse="text">stonesense/example/index.txt</file>
<file parse="none">stonesense/example/example.png</file>
<file parse="text">stonesense/example/example.xml</file>
</manifest>
</modpack>
The first part contains some identifying information about the mod as it might appear in a GUI loader. A set of <dependencies> lets the tool know if DFHack or any other mod is required for this mod to function. A structure of <compatibility> data allows for hand-crafted overrides for specific combinations of mods. This can be displayed to the user even before a merge is attempted. The actual <manifest> at the end identifies all of the files. We could hardcode how certain folders and certain extensions are handled, but this seems more future-proof.
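For what it's worth, here is a rough sketch of how a tool might read such a manifest with ElementTree and spot files that two mods both touch, without opening any of the mod files themselves. The function names (read_manifest, shared_files) and the two trimmed-down manifests are made up for illustration, and treating "no slash in the path" as "no path info" is my own assumption about how documentation files would be recognised.

```python
import xml.etree.ElementTree as ET

# Trimmed-down manifests in the proposed format (hypothetical mods).
EXAMPLE = """<modpack token="Example">
  <title>The Example Mod</title>
  <version>0</version><subversion>90</subversion>
  <manifest>
    <file parse="text">readme.txt</file>
    <file>raw/objects/creature_example.txt</file>
    <file>raw/objects/entity_default.txt</file>
    <file parse="none">raw/graphics/example/example.png</file>
  </manifest>
</modpack>"""

OTHER = """<modpack token="Other">
  <title>That Other Mod</title>
  <version>1</version><subversion>35</subversion>
  <manifest>
    <file parse="text">readme.txt</file>
    <file>raw/objects/entity_default.txt</file>
  </manifest>
</modpack>"""


def read_manifest(xml_text):
    """Return (token, display label, {path: parse mode}) for one manifest."""
    root = ET.fromstring(xml_text)
    version = "v{}.{}".format(root.findtext("version", "0"),
                              root.findtext("subversion", "0"))
    revision = root.findtext("revision")
    if revision:
        version += "r" + revision
    files = {}
    for node in root.iterfind("manifest/file"):
        # The parse attribute defaults to "flatten" when it is absent.
        files[node.text.strip()] = node.get("parse", "flatten")
    label = "{} {}".format(root.findtext("title", root.get("token")), version)
    return root.get("token"), label, files


def shared_files(files_a, files_b):
    """Paths both mods touch. Files with no path info are skipped because
    they would go into per-mod documentation folders (assumption: no slash
    in the name means no path info)."""
    return sorted(p for p in files_a.keys() & files_b.keys() if "/" in p)


token, label, example_files = read_manifest(EXAMPLE)
_, _, other_files = read_manifest(OTHER)
print(label)
print(shared_files(example_files, other_files))
```

Anything returned by shared_files would be the cue to tell the user that a trial merge is needed to find out whether the two mods are actually compatible.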
One issue with <compatibility> is that there could be zero, one or two bits of XML that are relevant:
0: No pre-merge feedback to the user because these two mods have never heard of each other before.
1: Trust the override that is present and display the message in the GUI before the merge is attempted. The minimum message level is user configurable. All messages might get sent to a log.
2: Color the mods according to the "redder" data, and present both sets of messages to the user so they can decide. Maybe cull identical messages.
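The three cases above could be handled along these lines. This is only a sketch under some assumptions of mine: the severity order green, yellow, orange, red (the manifest only names the colors), version parts compared as integers, a missing min_ part counting as 0 and a missing max_ part as "no upper limit". The helper names and the two cut-down manifests are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed order from least to most "red".
SEVERITY = ["green", "yellow", "orange", "red"]
PARTS = ("version", "subversion", "revision")

EXAMPLE = """<modpack token="Example"><compatibility>
  <mod token="Other" min_version="1" min_subversion="0" max_version="1" max_subversion="35">
    <color>yellow</color>
    <message level="2">No reports of conflicts.</message>
    <message level="1">Both mods add to entity_default.txt.</message>
  </mod>
</compatibility></modpack>"""

OTHER = """<modpack token="Other"><compatibility>
  <mod token="Example" min_version="0" min_subversion="90">
    <color>orange</color>
    <message level="3">Not tested with this version.</message>
    <message level="1">Both mods add to entity_default.txt.</message>
  </mod>
</compatibility></modpack>"""


def in_range(node, version):
    """True if a (version, subversion, revision) tuple of ints falls inside
    the node's min_/max_ attributes; absent attributes do not restrict."""
    low = tuple(int(node.get("min_" + p, 0)) for p in PARTS)
    high = tuple(float(node.get("max_" + p, "inf")) for p in PARTS)
    return low <= version <= high


def entries_about(root, token, version):
    """<mod> entries in one manifest that apply to the other mod's version."""
    return [node for node in root.iterfind("compatibility/mod")
            if node.get("token") == token and in_range(node, version)]


def combine(entries, min_level=1):
    """Merge zero, one or two entries into (color, messages)."""
    if not entries:
        return None, []  # case 0: the mods have never heard of each other
    color = max((e.findtext("color", "green") for e in entries),
                key=SEVERITY.index)  # the "redder" color wins
    messages = []
    for entry in entries:
        for msg in entry.iterfind("message"):
            if int(msg.get("level", 1)) >= min_level and msg.text not in messages:
                messages.append(msg.text)  # identical messages are culled
    return color, messages


def pre_merge_feedback(xml_a, version_a, xml_b, version_b, min_level=1):
    """Collect what each manifest says about the other, then combine."""
    root_a, root_b = ET.fromstring(xml_a), ET.fromstring(xml_b)
    entries = (entries_about(root_a, root_b.get("token"), version_b)
               + entries_about(root_b, root_a.get("token"), version_a))
    return combine(entries, min_level)


print(pre_merge_feedback(EXAMPLE, (0, 90, 0), OTHER, (1, 35, 0)))
```

The min_level argument stands in for the user-configurable minimum message level; a log could simply call combine with min_level=1 to keep everything.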
The manifest is the only thing that a modder would be asked to do specifically for the loader, and even that is optional.
Edit: Fixed missing word and added link to Python docs on ElementTree.