The single largest problem is changes to vanilla raws. These are sometimes necessary for a variety of reasons, and detecting them with general algorithms is difficult at best. The worst example is probably reactions: if you add a new reaction, it must also be merged into entity_default (as a PERMITTED_REACTION) if you want dwarves to use it. Multiple mods which add new reactions thus cannot be easily merged with standard mod installation (i.e. merging the mod folder in), since each mod brings its own edited copy of entity_default.
PyLNP has a mod merger included but it's, uh, not nearly good enough. I've thought about coming up with some sort of bespoke algorithm for mod merging, but it's not something that can keep my attention for long enough to do. It requires keeping a list of every token every object has, which is essentially the same thing as writing an entire raw parser from scratch, primarily due to the complexities of creature modding, syndromes and related object-in-object stuff.
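To give a sense of where the easy part ends, here is a minimal sketch of the tokenizing step only. It is not PyLNP's merger or anyone's actual parser; `tokenize` and `TOKEN_RE` are names made up for illustration. It just pulls every bracketed token out of a raw file and makes no attempt to work out which object (or caste, material, syndrome) a token belongs to, which is the part that gets complicated.

```python
import re

# Matches one bracketed raw token, e.g. [CREATURE:DWARF].
# Hypothetical helper; text outside brackets is treated as a comment and skipped.
TOKEN_RE = re.compile(r"\[([^\[\]]+)\]")


def tokenize(raw_text):
    """Return (name, args) for every bracketed token, in file order."""
    tokens = []
    for match in TOKEN_RE.finditer(raw_text):
        # Naive split on ':'; a quoted ':' character argument would be split too.
        name, *args = match.group(1).split(":")
        tokens.append((name, args))
    return tokens


sample = """entity_default

[OBJECT:ENTITY]

[ENTITY:MOUNTAIN]
    [CREATURE:DWARF]
    [PERMITTED_REACTION:TAN_A_HIDE]
"""

for name, args in tokenize(sample):
    print(name, args)
```

Getting from this flat list to something mergeable means knowing, for every token, what it nests under and whether it can repeat, and that is the list-of-every-token problem.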
My first thought as to how to do this: discourage modding vanilla creatures entirely, except by way of creature variations. Warn that making in-place mods to vanilla creatures rather than using creature variations to make all changes will lead to unexpected behavior, and don't even bother attempting to merge creature changes. Creatures are seriously that complex. The subtleties of creature-level vs caste-level tokens, materials, tissues and the like make it nigh-infeasible to merge multiple mods that change one creature together unless such changes are entirely in the form of adding creature variations.
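The warning itself is cheap to produce. A rough sketch, assuming the tool has the text of a vanilla creature file and of a mod's creature file in hand (the function names and the GIANT_SLUG creature are invented for the example): flag any creature ID that the mod defines and vanilla already has.

```python
import re

# Hypothetical helper names; only files declaring [OBJECT:CREATURE] are scanned,
# because [CREATURE:...] also appears inside entity definitions.
CREATURE_RE = re.compile(r"\[CREATURE:([^\]:]+)\]")


def creature_ids(raw_text):
    """Return the set of creature IDs defined in a creature raw file."""
    if "[OBJECT:CREATURE]" not in raw_text:
        return set()
    return set(CREATURE_RE.findall(raw_text))


def redefined_vanilla_creatures(vanilla_text, mod_text):
    """Return the sorted creature IDs that the mod defines in place of vanilla ones."""
    return sorted(creature_ids(vanilla_text) & creature_ids(mod_text))


vanilla = "creature_standard\n\n[OBJECT:CREATURE]\n\n[CREATURE:DWARF]\n[CREATURE:ELF]\n"
mod = "creature_mymod\n\n[OBJECT:CREATURE]\n\n[CREATURE:DWARF]\n[CREATURE:GIANT_SLUG]\n"

for creature in redefined_vanilla_creatures(vanilla, mod):
    print(f"warning: mod edits vanilla creature {creature} in place")
```

This only detects the in-place edit so the installer can warn about it. It deliberately does nothing to merge the two definitions.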
For the rest, any nested stuff is simple to deal with. PERMITTED_REACTION's placement in the entity does not matter whatsoever; PyDwarf adds them by just putting them on the same line as [ENTITY:MOUNTAIN] (or whatever they're set to), which works perfectly.
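A minimal sketch of that same-line trick, written from scratch rather than taken from PyDwarf (`permit_reactions` and the MOD_A_SMELT / MOD_B_BREW reaction IDs are made up for the example). It assumes the entity file is available as a string and that the target entity's block runs until the next [ENTITY: token.

```python
def permit_reactions(entity_text, entity_id, reaction_ids):
    """Append PERMITTED_REACTION tokens right after the [ENTITY:<id>] header.

    Hypothetical helper, not PyDwarf's API. Reactions already permitted in
    that entity's block are skipped, so running it twice changes nothing.
    """
    header = f"[ENTITY:{entity_id}]"
    start = entity_text.find(header)
    if start == -1:
        raise ValueError(f"no {header} in entity file")

    # The entity's block ends at the next [ENTITY:...] header or end of file.
    end = entity_text.find("[ENTITY:", start + len(header))
    if end == -1:
        end = len(entity_text)
    block = entity_text[start:end]

    tokens = [f"[PERMITTED_REACTION:{rid}]" for rid in reaction_ids]
    added = "".join(tok for tok in tokens if tok not in block)

    insert_at = start + len(header)
    return entity_text[:insert_at] + added + entity_text[insert_at:]


entity_default = (
    "entity_default\n\n[OBJECT:ENTITY]\n\n"
    "[ENTITY:MOUNTAIN]\n\t[CREATURE:DWARF]\n"
    "[ENTITY:FOREST]\n\t[CREATURE:ELF]\n"
)

merged = permit_reactions(entity_default, "MOUNTAIN", ["MOD_A_SMELT", "MOD_B_BREW"])
print(merged)
```

Since each mod's reactions can be appended this way one after another, two mods that only add reactions no longer need to ship competing copies of entity_default.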