Bay 12 Games Forum


Author Topic: Proposal: a standard format for mods in a diff/patch Mod Starter Pack  (Read 42432 times)

PeridexisErrant

  • Bay Watcher
  • Dai stihó, Hrasht.
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #210 on: August 21, 2014, 07:49:08 pm »

<use a nonstandard diff format>
If you want to write one, that would be awesome.  It should take two files as input, derive the diff, and be able to apply that to another file. 
Operating on the whole raw folder instead of per-file would be a nice bonus, but the former is enough to use it. 
The only catch is that so far no one who could wants to dedicate the time to do that.

<diff3 is awesome>
Amen.  Unfortunately I want this to work on random Windows computers too  :'(

However, I've since found... this 3-Way Text Merging Algorithm.  It looks perfect for us!
Quote
For a class called 6.033 (Computer Systems Engineering) at MIT, I was required to design a collaborative, distributed text editor. It’s supposed to work like Git—if two users make concurrent changes to the document, the editor should try to automatically merge the two changes (or report a merge conflict).

However, the merge algorithm had to be more intelligent than Git’s line-by-line diff merger—if one user moved a paragraph of text (for example) to a different location in the document, and another user concurrently edited the wording of that paragraph, then the merger should be able to detect that the paragraph was both edited and moved, automatically.

I wanted the merge algorithm to be even more general. It shouldn’t have any notion of “paragraph” or “sentence.” Rather, if any piece of text is moved and edited concurrently, this should be resolved automatically. Furthermore, it should work even if two (or more) users edit the text and another user moves it. Moreover, I wanted it to even work recursively: for example, if one user moves a chapter in a book, while another user moves a paragraph within that chapter, and yet another user moves a sentence within that paragraph, and still another user changes the wording of that sentence, the merge algorithm should be able to handle all of that without any user intervention, and without any notion of “chapter,” “paragraph,” “sentence,” etc. It should just “do the right thing,” whether you’re working on code, a novel, a scientific paper, or any other text-based document.

While I wasn’t required to actually implement the algorithm, I ended up doing it in Python anyway. I later discovered that I had independently invented the operational transformation.
So far as I can tell, this solves basically all of our problems if we can get permission to use it.
Logged
I maintain the DF Starter Pack - over a million downloads and still counting!
 Donations here.

King Mir

  • Bay Watcher
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #211 on: August 21, 2014, 08:10:23 pm »

Yeah, I saw that project. I don't know how good it is. It might be too smart, or it might be too dumb. Thing is, just because two mods can be combined in some predictable way doesn't mean that it's safe to do so. I didn't really look at what that code does in such cases.

I'll reiterate this test case:
Code: (vanilla) [Select]
[CREATURE:GIANT_LEOPARD_GECKO]
[COPY_TAGS_FROM:GECKO_LEOPARD]
[APPLY_CREATURE_VARIATION:GIANT]
[CV_REMOVE_TAG:CHANGE_BODY_SIZE_PERC]
[APPLY_CURRENT_CREATURE_VARIATION]
[GO_TO_END]
[SELECT_CASTE:ALL]
[CHANGE_BODY_SIZE_PERC:400700]
[GO_TO_START]
[NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
[CASTE_NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
[DESCRIPTION:A large monster in the shape of a gecko.]
[POPULATION_NUMBER:10:20]
[CLUSTER_NUMBER:1:1]
[CREATURE_TILE:'G']
[COLOR:6:0:1]
[PETVALUE:500]
[MOUNT_EXOTIC]
[GO_TO_END]
[PREFSTRING:amazing sticky feet]
[PREFSTRING:coloration]
[APPLY_CREATURE_VARIATION:STANDARD_QUADRUPED_GAITS:900:657:438:219:1900:2900] 40 kph
[APPLY_CREATURE_VARIATION:STANDARD_SWIMMING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[APPLY_CREATURE_VARIATION:STANDARD_CRAWLING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[APPLY_CREATURE_VARIATION:STANDARD_CLIMBING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
Code: (mod adding pet tag) [Select]
[CREATURE:GIANT_LEOPARD_GECKO]
[COPY_TAGS_FROM:GECKO_LEOPARD]
[APPLY_CREATURE_VARIATION:GIANT]
[CV_REMOVE_TAG:CHANGE_BODY_SIZE_PERC]
[APPLY_CURRENT_CREATURE_VARIATION]
[GO_TO_END]
[SELECT_CASTE:ALL]
[CHANGE_BODY_SIZE_PERC:400700]
[GO_TO_START]
[NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
[CASTE_NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
[DESCRIPTION:A large monster in the shape of a gecko.]
[POPULATION_NUMBER:10:20]
[CLUSTER_NUMBER:1:1]
[CREATURE_TILE:'G']
[COLOR:6:0:1]
[PETVALUE:500]
[MOUNT_EXOTIC]
[GO_TO_END]
[PREFSTRING:amazing sticky feet]
[PREFSTRING:coloration]
[APPLY_CREATURE_VARIATION:STANDARD_QUADRUPED_GAITS:900:657:438:219:1900:2900] 40 kph
[APPLY_CREATURE_VARIATION:STANDARD_SWIMMING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[APPLY_CREATURE_VARIATION:STANDARD_CRAWLING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[APPLY_CREATURE_VARIATION:STANDARD_CLIMBING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
        [PET]
Code: (mod adding creature) [Select]
[CREATURE:GIANT_LEOPARD_GECKO]
[COPY_TAGS_FROM:GECKO_LEOPARD]
[APPLY_CREATURE_VARIATION:GIANT]
[CV_REMOVE_TAG:CHANGE_BODY_SIZE_PERC]
[APPLY_CURRENT_CREATURE_VARIATION]
[GO_TO_END]
[SELECT_CASTE:ALL]
[CHANGE_BODY_SIZE_PERC:400700]
[GO_TO_START]
[NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
[CASTE_NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
[DESCRIPTION:A large monster in the shape of a gecko.]
[POPULATION_NUMBER:10:20]
[CLUSTER_NUMBER:1:1]
[CREATURE_TILE:'G']
[COLOR:6:0:1]
[PET_EXOTIC]
[PETVALUE:500]
[MOUNT_EXOTIC]
[GO_TO_END]
[PREFSTRING:amazing sticky feet]
[PREFSTRING:coloration]
[APPLY_CREATURE_VARIATION:STANDARD_QUADRUPED_GAITS:900:657:438:219:1900:2900] 40 kph
[APPLY_CREATURE_VARIATION:STANDARD_SWIMMING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[APPLY_CREATURE_VARIATION:STANDARD_CRAWLING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[APPLY_CREATURE_VARIATION:STANDARD_CLIMBING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[CREATURE:DESERT TORTOISE]
[DESCRIPTION:A tiny shelled reptile that lives in the desert.]
[NAME:desert tortoise:desert tortoises:desert tortoise]
[CASTE_NAME:desert tortoise:desert tortoises:desert tortoise]
[CHILD:1][GENERAL_CHILD_NAME:desert tortoise hatchling:desert tortoise hatchlings]
[CREATURE_TILE:'t'][COLOR:6:0:0]
[PETVALUE:50]
[BENIGN][NATURAL][PET_EXOTIC]
[BIOME:ANY_DESERT]
[LARGE_ROAMING]
[POPULATION_NUMBER:10:30]
[CLUSTER_NUMBER:1:1]
[PREFSTRING:shells]
[PREFSTRING:longevity]
[CANNOT_JUMP]

Can these be merged? Does the order of merging affect the result? I posit that this merge must not be allowed. Does diff3 merging allow it? Does Stephan Boyer's code?
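To make the clash in this test case concrete, here is a minimal Python sketch using the standard difflib module. It is not any tool from this thread, and the helper name insert_points and the shortened raws are hypothetical stand-ins for the full listings above. It records where each mod inserts lines relative to vanilla and reports positions that both mods insert at.

```python
import difflib

def insert_points(base, mod):
    """Map base index -> lines that `mod` inserts at that position."""
    sm = difflib.SequenceMatcher(None, base, mod, autojunk=False)
    return {i1: mod[j1:j2]
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag == "insert"}

# Shortened stand-ins for the three files in the test case (assumption).
vanilla = ["[CREATURE:GIANT_LEOPARD_GECKO]", "[COLOR:6:0:1]",
           "[PETVALUE:500]", "[PREFSTRING:coloration]"]
pet_mod = vanilla + ["[PET]"]
creature_mod = (vanilla[:2] + ["[PET_EXOTIC]"] + vanilla[2:]
                + ["[CREATURE:DESERT TORTOISE]", "[PET_EXOTIC]"])

a = insert_points(vanilla, pet_mod)
b = insert_points(vanilla, creature_mod)
clash = sorted(set(a) & set(b))  # base positions both mods insert at
print(clash)  # [4]: both mods append at the end of the vanilla file
```

Both mods add lines at the very end of the vanilla file, so a merger has to either pick an order (and one order puts [PET] on the tortoise) or refuse.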
« Last Edit: August 21, 2014, 08:12:16 pm by King Mir »
Logged

thistleknot

  • Bay Watcher
  • Escaped Normalized Spreadsheet Berserker
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #212 on: August 21, 2014, 08:16:23 pm »

Diff3 works on Windows. I'm on Windows. Just include the executable.

When I get home I'll run your merge test case, but tell me which is the common ancestor?
« Last Edit: August 21, 2014, 08:25:59 pm by thistleknot »
Logged

King Mir

  • Bay Watcher
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #213 on: August 21, 2014, 08:28:58 pm »

Quote
Diff3 works on Windows. I'm on Windows. Just include the executable.

When I get home I'll run your merge test case, but tell me which is the common ancestor?

The first one is the ancestor. If you do it, be sure to try the other two in both orders.

PeridexisErrant

  • Bay Watcher
  • Dai stihó, Hrasht.
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #214 on: August 21, 2014, 08:59:56 pm »

Quote
Diff3 works on Windows. I'm on Windows. Just include the executable.

If it can be included in the compiled version of the PyLNP, that would be great.  See below though, I'm not sure we need it.

Quote
Yeah, I saw that project. I don't know how good it is. It might be too smart, or it might be too dumb. Thing is, just because two mods can be combined in some predictable way doesn't mean that it's safe to do so. I didn't really look at what that code does in such cases.

I'll reiterate this test case:<snip>

Can these be merged? Does the order of merging affect the result? I posit that this merge must not be allowed. Does diff3 merging allow it? Does Stephan Boyer's code?

Yes, they can be merged by clever use of a two-way diff, by diff3, or by Boyer's tool (though I haven't tested this specifically).

According to https://en.wikipedia.org/wiki/Diff3, we could get an effect similar to diff3 by taking the diff from vanilla to the mod and applying it to the already-merged mods, wrapped in something that catches merge conflicts.  That gives us a basic mod merger.  It probably won't handle merging many big changes, but it would be a working script that can give the user feedback about the validity of their load order, e.g.:

Code: [Select]
>>> (input load order)
  $mod0 <GREEN> (because first mod is always OK)
  $mod1 <ORANGE> (because validity unknown)
  $mod2 cannot be merged, try a different load order. <RED> (merge failed)
>>> (input load order)

Raws post-merge can be:
1 - Error, merge is impossible (RED)
2 - Invalid raws; DF crashes because of improper formatting or the deletion of something crucial (ORANGE)
3 - Nonfunctional raws; DF can read them but some things have no effect (likely due to missing dependencies, which sometimes is the above) (YELLOW)
4 - Functional raws; everything works as it should.  (GREEN)
Then color pessimistically - anything of unknown validity is orange.  Anything of unknown functionality is at least yellow.
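The four outcomes and the pessimistic colouring rule can be written down as a small Python sketch. The names Status and colour are hypothetical, and None is used here as an assumed way of saying "unknown".

```python
from enum import IntEnum

class Status(IntEnum):
    GREEN = 0   # functional raws
    YELLOW = 1  # DF reads them, but some things may have no effect
    ORANGE = 2  # raws may be invalid
    RED = 3     # merge failed

def colour(merged, valid=None, functional=None):
    """Colour a merge result; None means 'unknown' and is treated pessimistically."""
    if not merged:
        return Status.RED
    if valid is not True:       # known-invalid or unknown validity
        return Status.ORANGE
    if functional is not True:  # known-nonfunctional or unknown functionality
        return Status.YELLOW
    return Status.GREEN

print(colour(True).name)                    # ORANGE
print(colour(True, valid=True).name)        # YELLOW
print(colour(True, True, True).name)        # GREEN
```

Because the enum is ordered, the worst colour seen so far in a load order can be tracked with max().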

Right now, I just want to get *some* mods past stage 1.  More advanced standard diff/merge implementations will be able to squeeze more changes through stage 1 without problems (by recognising movement etc.).  Distinguishing outcome 2 from 3 or 4 is harder, and I think it should be a post-1.0 goal.  My proposal further up the thread was, for now, to just guess based on how many mods were selected and fill this in later.  I'd probably try for a simple script where you feed it a raw folder and it just returns whether the raws are valid or not, extensible to also check functionality later, and call that on the mixed folder after each mod is added.  It can also be used to check that the mods themselves are valid, which would be nice!
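A rough Python sketch of what such a folder checker could look like follows. The thread does not define what "valid" means, so the only check here is an assumed placeholder: every token is a [...] group that opens and closes on one line without nesting. The function names are hypothetical, and real checks would be slotted into line_problem.

```python
import os

def line_problem(line):
    """Return a description of a bracket problem on this line, or None."""
    depth = 0
    for ch in line:
        if ch == "[":
            depth += 1
            if depth > 1:
                return "nested '['"
        elif ch == "]":
            depth -= 1
            if depth < 0:
                return "stray ']'"
    return "unclosed '['" if depth else None

def check_raw_folder(folder):
    """Return a list of (path, line_number, problem); empty means no problems found."""
    problems = []
    for root, _dirs, files in os.walk(folder):
        for name in sorted(files):
            if not name.endswith(".txt"):
                continue
            path = os.path.join(root, name)
            # latin-1 is used only because it can decode any byte sequence.
            with open(path, encoding="latin-1") as handle:
                for number, line in enumerate(handle, 1):
                    problem = line_problem(line)
                    if problem:
                        problems.append((path, number, problem))
    return problems
```

Calling check_raw_folder on the mixed folder after each mod is added, and on each mod's own folder, would give the "valid or not" answer as `not problems`.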
Logged

thistleknot

  • Bay Watcher
  • Escaped Normalized Spreadsheet Berserker
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #215 on: August 21, 2014, 09:11:40 pm »

Update
I see the issue your test case raises.

It replaces the pet token with the new creature instead of injecting the new creature either before or after the pet token.

Spoiler (click to show/hide)

Sample 1 = base (common ancestor)

compare sample1:sample3 apply to sample2

diff3 -e -m sample2.txt sample1.txt sample3.txt > ...
Spoiler (click to show/hide)

compare sample1:sample2 apply to sample3

diff3 -e -m sample3.txt sample1.txt sample2.txt > diffs1s2applytos3.txt
Spoiler (click to show/hide)

By the way, that 3-way Python diff tool looks handy.  I'm all for better solutions.  I'm just glad we're thinking about n-way [common ancestor] merging at this point.  We can obviously see it can be done without a git system.

Quote
we could get a similar effect to diff3 by simply applying a diff from vanilla to the mod to the already-merged-mods wrapped in something to catch merge conflicts

You mean revert a diff from currentVersion to commonAncestor?  Apply changes to commonAncestor, then un-reverse the reversed patch?

Quote
I'd probably try for a simple script where you feed it a raw folder and it just returns whether the raws are valid or not; extensible to also check functionality later, and call that on the mixed folder after each mod is added.  It can also be used to check that the mods themselves are valid, which would be nice!

What you're asking for is some post-merge processing of the raws, beyond simple patching.

As to merging using a common ancestor and diff3, I was able to merge Fortress Defense, Accelerated Modest Mod, Civilization Forge 2.8, and 40_09 together with hardly any fuss.  I used the same method as I had with git, but this time with diff3.  When I tried it with git, I checked the error log and the only errors were related to new creatures that had been brought in without their tokens being updated for 40_09.  I think those are the kinds of things you mean when you ask whether something is compatible or not; to handle them, I would think you'd need some regexp processing of raw objects.  For new creatures from older mods that were never updated to 40_09 standards, some template would have to be applied to ensure body detail plans were updated to include the new _neck, things of that nature.

But speaking of a tool that merges raws as-is with little to no error, I would say diff3 can do that.

Mixing mods from 34_11 to 40_09 is where my errors came in.  However, I think one could use diff3 with even less drama if mixing mods of the same version (40_09) vs across versions.

Update
I think a lot of the concern with diff3 and its "block matching" can be alleviated if we parse the raws before we diff3 them.  I recommend (if possible) alphabetizing each .txt file's objects, then parsing them to remove whitespace and put each token on its own line.  That way, when mod folders are processed in the same manner, the diff between base and the mod will be at the individual token level.  I think diff3 is smart enough to catch inserted blocks.
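The "one token per line, no whitespace" part of this idea could look roughly like the Python sketch below. It is a stand-in for the sed script, not the script itself, and the function name is hypothetical. Sorting objects is deliberately left out, and comments that share a line with a token are dropped in this simplified version.

```python
import re

TOKEN_RE = re.compile(r"\[[^\[\]]*\]")  # one [...] token, no nesting

def one_token_per_line(text):
    """Strip indentation and put each [TOKEN] on its own line."""
    out = []
    for line in text.splitlines():
        tokens = TOKEN_RE.findall(line)
        if tokens:
            out.extend(tokens)
        elif line.strip():
            out.append(line.strip())  # keep token-free lines such as the file header
    return "\n".join(out) + "\n"

sample = "[CHILD:1][GENERAL_CHILD_NAME:a:b]\n\t[PET]\n"
print(one_token_per_line(sample), end="")
```

Running the same normalisation over vanilla and every mod before diffing means each changed token shows up as its own changed line.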
« Last Edit: August 21, 2014, 09:41:04 pm by thistleknot »
Logged

King Mir

  • Bay Watcher
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #216 on: August 21, 2014, 09:39:42 pm »

Here's an updated merge script. I haven't tested my algorithm thoroughly yet, so it probably has bugs.

It can be run as a script that takes 3 files as arguments: the mod file, the vanilla raw file, and the generated raw file. If the generated raw file does not exist, the script will create it. It will do a 3-way merge of the mod into the generated raw file. The script will return 1 if it fails to merge, 0 if it succeeds.

It can also be run with 0 arguments (or any number other than 3), in which case it tests the merge algorithm.

You can also import it as a module and call do_merge_files directly, with the same 3 arguments. do_merge_seq  is for comparing lists, which may be useful if you already have the contents of the file as a list of lines.

Spoiler (click to show/hide)
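For readers without access to the spoilered script, here is an independent, minimal sketch of a line-based 3-way merge built on Python's difflib. It is not King Mir's code; the name merge3_lines and the convention of returning None on conflict are assumptions made for illustration. Lines unchanged in both descendants act as sync points, and each region between sync points is taken from whichever side changed it.

```python
import difflib

def _stable_map(base, other):
    """Map each base index that survives unchanged to its index in `other`."""
    sm = difflib.SequenceMatcher(None, base, other, autojunk=False)
    mapping = {}
    for blk in sm.get_matching_blocks():
        for k in range(blk.size):
            mapping[blk.a + k] = blk.b + k
    return mapping

def merge3_lines(base, ours, theirs):
    """Return merged lines, or None if both sides changed the same region."""
    map_a, map_b = _stable_map(base, ours), _stable_map(base, theirs)
    merged = []
    i_base = i_a = i_b = 0  # start of the pending chunk in each sequence
    for i in range(len(base) + 1):
        at_end = i == len(base)
        if not at_end and not (i in map_a and i in map_b):
            continue  # base line i was touched by at least one side
        j_a = len(ours) if at_end else map_a[i]
        j_b = len(theirs) if at_end else map_b[i]
        chunk_base, chunk_a, chunk_b = base[i_base:i], ours[i_a:j_a], theirs[i_b:j_b]
        if chunk_a == chunk_base:
            merged.extend(chunk_b)   # only "theirs" changed here (or nobody did)
        elif chunk_b == chunk_base or chunk_a == chunk_b:
            merged.extend(chunk_a)   # only "ours" changed, or both made the same change
        else:
            return None              # overlapping edits: refuse to guess
        if not at_end:
            merged.append(base[i])
            i_base, i_a, i_b = i + 1, j_a + 1, j_b + 1
    return merged

vanilla = ["[CREATURE:GECKO]", "[COLOR:6:0:1]", "[PETVALUE:500]", "[PREFSTRING:coloration]"]
pet_mod = vanilla + ["[PET]"]
creature_mod = vanilla[:2] + ["[PET_EXOTIC]"] + vanilla[2:] + ["[CREATURE:TORTOISE]"]
print(merge3_lines(vanilla, pet_mod, creature_mod))  # None
```

With this rule the shortened gecko test case is rejected in either argument order, because both mods add lines at the end of the file.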

thistleknot

  • Bay Watcher
  • Escaped Normalized Spreadsheet Berserker
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #217 on: August 21, 2014, 09:51:46 pm »

I tried the py file you guys are talking about and this latest one. I just have no luck with Python.

Spoiler (click to show/hide)

King Mir

  • Bay Watcher
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #218 on: August 21, 2014, 09:59:17 pm »

Quote
Update
I see the issue your test case raises.

It replaces the pet token with the new creature instead of injecting the new creature either before or after the pet token.

Ok, I think that's a problem. Putting the pet token on the wrong creature is an incorrect merge.

Quote
Update
I think a lot of the concern with diff3 and its "block matching" can be alleviated if we parse the raws before we diff3 them.  I recommend (if possible) alphabetizing each .txt file's objects, then parsing them to remove whitespace and put each token on its own line.  That way, when mod folders are processed in the same manner, the diff between base and the mod will be at the individual token level.  I think diff3 is smart enough to catch inserted blocks.
There are two problems with that:
1) The order of objects in the raws has in-game effects.
2) Parsing the raws and comparing them is more complicated. It'd be nice to have a dumber merge tool that doesn't mis-merge compatible mods first.

King Mir

  • Bay Watcher
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #219 on: August 21, 2014, 10:01:19 pm »

Quote
I tried the py file you guys are talking about and this latest one. I just have no luck with Python.

Spoiler (click to show/hide)
I think the problem is I'm using Python 2.7 (because that's what's installed), and you probably have Python 3.x. But I'll fix those errors for you.

Here we are:
Spoiler (click to show/hide)

Sorry for the trouble. I should have made sure since it's the same problem as before.

EDIT: run as
python mergemod.py mod_file.txt vanilla_file.txt target_file.txt
« Last Edit: August 21, 2014, 10:12:00 pm by King Mir »
Logged

thistleknot

  • Bay Watcher
  • Escaped Normalized Spreadsheet Berserker
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #220 on: August 21, 2014, 10:12:59 pm »

I'm kinda bummed that your [pet] new-creature situation kind of breaks merging.  I would think that if the diff program were smart enough to figure out which token it was next to, either before or after the tokens of the diff, it could still be added.  The fact that the other mod didn't add/remove that specific token bothers me.  The diff app should have seen that two mods were bringing different tokens to the same line, so both should have been accommodated by checking the before/after tokens on adjacent lines...

Oh well.  1st world problems right.

King Mir

  • Bay Watcher
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #221 on: August 21, 2014, 10:26:29 pm »

Quote
I'm kinda bummed that your [pet] new-creature situation kind of breaks merging.  I would think that if the diff program were smart enough to figure out which token it was next to, either before or after the tokens of the diff, it could still be added.  The fact that the other mod didn't add/remove that specific token bothers me.  The diff app should have seen that two mods were bringing different tokens to the same line, so both should have been accommodated by checking the before/after tokens on adjacent lines...

Oh well.  1st world problems right.

I can't think of a way diff3 could be smarter at merging.

PS
Just realized diff3 will write into the first argument. my script writes into the 3rd. I should change that.

PPS try diff -m without the -e. That seems to merge correctly but show conflicts as needed.
« Last Edit: August 21, 2014, 10:45:11 pm by King Mir »
Logged

thistleknot

  • Bay Watcher
  • Escaped Normalized Spreadsheet Berserker
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #222 on: August 21, 2014, 11:09:46 pm »

Well, it seems merger fails with your sample test as well.

It seems you could run a test like the one you've derived: if a conflict occurs from writing to the same line (as merger will throw), throw an error.

Otherwise, do a diff3 -m -e merge?

...

I just realized: given the way we remove ALL WHITESPACE, could this type of issue occur outside of this context?  I'm thinking that since we remove all whitespace, when we change a token we are always overwriting the old token.  However, since all our patches are based on before and after snapshots of whole files, it's easier to get the exact diff from the document and merge them.  So the only time an issue may occur is when two files try to alter the exact same line.

Btw, I tried diff -m and it was a no-go with 3 files.

I also tried merger.py on entity_default in a 3 way and it crapped itself all over my console.  Something about maximum recursion depth.
« Last Edit: August 21, 2014, 11:12:33 pm by thistleknot »
Logged

PeridexisErrant

  • Bay Watcher
  • Dai stihó, Hrasht.
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #223 on: August 21, 2014, 11:21:47 pm »

Code: [Select]
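import os
import difflib

context_lines = 2  # enough to match within an object but not between objects

def write_patch(vanilla_raw_folder, mod_raw_folder, mixed_raw_folder, file):
    """Write a unified diff (vanilla -> mod) for one raw file into the mixed folder."""
    with open(os.path.join(vanilla_raw_folder, file)) as f:
        vanilla_lines = f.readlines()
    with open(os.path.join(mod_raw_folder, file)) as f:
        mod_lines = f.readlines()
    patch = difflib.unified_diff(vanilla_lines, mod_lines, n=context_lines)
    # Mode 'w' overwrites any stale patch, so no separate os.remove is needed.
    with open(os.path.join(mixed_raw_folder, file + '.patch'), 'w') as item:
        item.writelines(patch)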

Creating a unified patch file with a few lines of context (two lines matches within but not between objects) fixes the [pet] issue, but I can't work out how to apply a unified patch with python.  Argh.
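One possible way to apply such a patch in pure Python is sketched below. It is an assumption-laden illustration, not a replacement for a real patch tool: the helper names are hypothetical, it assumes a single-file diff as produced by difflib.unified_diff on lines that all end in a newline, and it places each hunk by searching the target for the hunk's old-side lines (context plus removals), preferring the match closest to the line number in the hunk header.

```python
import difflib
import re

HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

def parse_hunks(patch_lines):
    """Yield (hint, old_block, new_block) for each hunk of a unified diff."""
    i = 0
    while i < len(patch_lines):
        m = HUNK_RE.match(patch_lines[i])
        i += 1
        if not m:
            continue  # skip the '---' / '+++' file headers
        old_left = int(m.group(2) or 1)  # an omitted length means 1
        new_left = int(m.group(4) or 1)
        hint = max(int(m.group(1)) - 1, 0)  # rough 0-based position in the old file
        old_block, new_block = [], []
        while old_left > 0 or new_left > 0:
            tag, text = patch_lines[i][0], patch_lines[i][1:]
            i += 1
            if tag in " -":
                old_block.append(text)
                old_left -= 1
            if tag in " +":
                new_block.append(text)
                new_left -= 1
        yield hint, old_block, new_block

def apply_patch(target_lines, patch_lines):
    """Apply the patch to target_lines; raise ValueError if a hunk cannot be placed."""
    out, pos = [], 0
    for hint, old_block, new_block in parse_hunks(patch_lines):
        n = len(old_block)
        candidates = [i for i in range(pos, len(target_lines) - n + 1)
                      if target_lines[i:i + n] == old_block]
        if not candidates:
            raise ValueError("hunk context not found; treat as a merge conflict")
        idx = min(candidates, key=lambda i: abs(i - hint))
        out.extend(target_lines[pos:idx])
        out.extend(new_block)
        pos = idx + n
    out.extend(target_lines[pos:])
    return out

# Shortened stand-ins for the [pet] test case (assumption).
vanilla = ["[CREATURE:GECKO]\n", "[COLOR:6:0:1]\n", "[PETVALUE:500]\n",
           "[GAIT:A]\n", "[GAIT:B]\n"]
pet_mod = vanilla + ["[PET]\n"]
merged_so_far = (vanilla[:2] + ["[PET_EXOTIC]\n"] + vanilla[2:]
                 + ["[CREATURE:TORTOISE]\n", "[PET_EXOTIC]\n"])
patch = list(difflib.unified_diff(vanilla, pet_mod, n=2))
result = apply_patch(merged_so_far, patch)
```

In this shortened case the two context lines anchor [PET] to the end of the gecko, before the tortoise. Identical context in several creatures (for example repeated gait lines) can still make the placement ambiguous, which is why the header line number is used as a tie-breaker.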

<Python 3.x compatible version>
Well, it no longer freaks out about the print statement  :)  Unfortunately it also outputs the contents of the vanilla file  :(

My testing, though I don't follow the various opcodes, shows that the output_file_temp returned by do_merge_seq() is the same as the contents of the vanilla file. 
« Last Edit: August 22, 2014, 02:09:48 am by PeridexisErrant »
Logged

thistleknot

  • Bay Watcher
  • Escaped Normalized Spreadsheet Berserker
    • View Profile
Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
« Reply #224 on: August 22, 2014, 05:15:49 am »

I was thinking and thinking about it.

Your script to join adjacent tokens to nearby tokens only works if there is whitespace.  I have an idea: modify the regexp sed script to keep the default whitespace but remove the whitespace that is created when splitting tokens onto their own lines.  I can do so by replacing the initial whitespace with a special marker token that is replaced with whitespace again at the very end?