Some notes:
* All of this is very much "YMMV" / "At Your Own Risk". DF is not designed to be run this way, and (as you have observed) does not have a lot of the complex overhead it would need to make sure it doesn't step on itself. That said...
* As noted, random generators are frequently initialized with environmental stuff like the clock. Starting things precisely at the same time (or perhaps even with precise offsets, if it's using, say, only the fractional seconds) can lead to less randomness than expected. The classic example of this was things that failed to be random because they ran at a predictable time after boot on a system with a dead clock (or no clock battery at all), so it was always Jan 1, 1980 or whatever. (There's a small illustration of this failure mode at the end of these notes.)
* Most modern operating systems rotate single-core workloads among multiple cores to spread heat buildup out over the entire CPU die. This rotation is very slow by computer standards (anywhere from a few times per second to once every few seconds) but fairly quick by human perception standards, so reporting tools that take snapshots every second or so can give misleading or confusing results here. As a side note, this is a large part of why, on some CPUs, the Intel Turbo Boost speeds are much higher when fewer cores are in use; it lets the CPU use what amounts to a "fallow field" arrangement, where some cores are effectively allocated to cooling off, in rotation.
Note that because of this, setting CPU core affinity can actually cause *worse* performance, especially in cases where thermals are a limiting factor. (If you want to test it on your own machine anyway, there's a pinning sketch at the end of these notes.)
* I've noticed that a lot of systems are not really CPU constrained; they are memory bus and/or hard drive constrained. DF tends to behave more like a scientific or engineering program than a typical game, and it's quite likely that this is a big part of your problem. If the largest limiting factor is moving data to and from memory, then having multiple copies contending for the same bus resources will not help much, and could even make things worse through higher cache miss rates. Note that no CPU I know of has enough on-die cache to run DF without constantly going out to main memory, and some of that cache is shared between cores, so each copy gets even less of it in a multiple-copies situation.
* As for VMs... note that putting things in a VM doesn't magically give you any more performance; you still only have one real hard drive, memory bus, L3 cache, etc. That said, it can fix certain other problems with copies stepping on each other. I'd recommend looking at something like Docker, which is not a full VM but a comparatively lightweight framework that gives each process its own isolated space while calling through to the underlying kernel for efficiency. Set up a Docker image with all your DF configs and start several copies from it, ideally with a slight delay between launches so that they're not all asking for exactly the same resources at exactly the same time. (There's a launcher sketch at the end of these notes.)
* Another comment: it's usually good to leave one core free for the OS itself (drive overhead, user interface stuff, etc.), especially if you're running VMs. So, if you've got a 4-core system, try running only 3 copies, not 4.
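
To make the seeding point concrete: DF's own RNG isn't exposed, so the dice-rolling functions below are just stand-ins I made up, but this minimal Python sketch shows the general failure mode. Seed two copies from the whole-second clock and they produce identical "random" sequences if they start in the same second; mix in something unique per instance (the PID here) and they diverge again.

```python
import os
import random
import time

def clock_seeded_rolls(n=5):
    """Seed from the whole-second wall clock only: two instances started
    in the same second produce identical 'random' sequences."""
    rng = random.Random(int(time.time()))
    return [rng.randint(1, 20) for _ in range(n)]

def better_seeded_rolls(n=5):
    """Mix in something unique per instance (here the PID) so that
    simultaneous starts still diverge."""
    rng = random.Random(hash((int(time.time()), os.getpid())))
    return [rng.randint(1, 20) for _ in range(n)]

if __name__ == "__main__":
    # Launch this script twice within the same second: the first list
    # repeats between the two runs, the second one differs.
    print("clock-only seed :", clock_seeded_rolls())
    print("clock + pid seed:", better_seeded_rolls())
```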
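On the affinity question: if you want to measure for yourself whether pinning helps or hurts on your machine, here's a rough, Linux-only sketch using Python's `os.sched_setaffinity`. The DF binary path is a placeholder; from a shell, `taskset` does the same job.

```python
import os
import subprocess

# Placeholder path: point this at your own DF install.
DF_BINARY = "./df_linux/df"

def launch_pinned(core: int) -> subprocess.Popen:
    """Start one copy of DF and pin it to a single CPU core.

    os.sched_setaffinity is Linux-only; on other platforms use the
    OS's native tools instead."""
    proc = subprocess.Popen([DF_BINARY])
    os.sched_setaffinity(proc.pid, {core})
    return proc

if __name__ == "__main__":
    # Pin three copies to cores 1-3, leaving core 0 for the OS.
    # Benchmark with and without the sched_setaffinity call: as noted
    # above, on a thermally limited machine the pinned version may well
    # end up slower.
    procs = [launch_pinned(core) for core in (1, 2, 3)]
    for p in procs:
        p.wait()
```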
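And for the Docker route, a rough launcher sketch, assuming you've already built an image with DF installed and configured. The image name, save paths, and container mount point below are all hypothetical placeholders. It starts one fewer copy than you have cores and staggers the launches so they aren't all hammering the disk at the same instant.

```python
import os
import subprocess
import time

# Hypothetical names and paths: substitute your own image and save locations.
IMAGE = "my-dwarf-fortress:latest"
SAVE_ROOT = os.path.expanduser("~/df-saves")
LAUNCH_DELAY_SECONDS = 30  # stagger startups so they don't hit the disk together

def launch_copies():
    # Leave one core's worth of work free for the OS, drive I/O, etc.
    copies = max(1, (os.cpu_count() or 2) - 1)
    for i in range(copies):
        save_dir = os.path.join(SAVE_ROOT, f"instance-{i}")
        os.makedirs(save_dir, exist_ok=True)
        subprocess.run(
            [
                "docker", "run",
                "--rm",                 # clean up the container on exit
                "--detach",             # run in the background
                "--name", f"df-{i}",
                # Per-instance save folder; the in-container path depends
                # on how your image is laid out.
                "-v", f"{save_dir}:/df/data/save",
                IMAGE,
            ],
            check=True,
        )
        time.sleep(LAUNCH_DELAY_SECONDS)

if __name__ == "__main__":
    launch_copies()
```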