Topic: Dwarf Fortress benchmarking (not world-gen tests) (Read 1729 times)

jecowa · « **on:** April 11, 2021, 04:53:23 pm »

I want to be able to see how well computers can run Dwarf Fortress. The world-gen tests are easy to do, but creating a new world is not the same as playing the game. It'd be nice to test the kind of lags players deal with by benchmarking computers using laggy save files. Maybe have different save files to test different types of lag (e.g. Maybe like Pathfinding lag, Fluid Lag, or Too many Socks lag, and others).

A good way to test might be to have these save files start paused, then testers would start their stop watches at the same time they unpause and time how long it takes to run a year or whatever of the fortress. If anyone knows ways to automate the timing, and dismissing in-game notifications that pause the game, that'd be great.

Problems that might pop up are random events that won't happen in all tests (like invasions), so it might be best to have test forts out-of-range of undead, goblins, and other hostile sites. And maybe embark in peaceful regions so random fights aren't common. Pocket worlds with very short histories would be nice too (to keep filesizes of test worlds to a minimum).

Anyone have any ideas for Dwarf Fortress benchmarking? Any ideas for types of benchmark tests to do or ways to make benchmark testing easier?

Shonai_Dweller · « **Reply #1 on:** April 11, 2021, 06:57:02 pm »

It can be tricky to pinpoint the cause of lag.
You may get a "too many socks" save, only to find that the fps death is actually being caused by a cat trying to pass through a locked door (will destroy FPS on the most powerful machines). So, big job, but hopefully will produce some interesting results.

Starver · « **Reply #2 on:** April 11, 2021, 07:30:30 pm »

There are plenty of other issues, as you identify in your plan, but the announcements.txt file should let you set everything to not pause.

Also, if you have d_init.txt with anything but [AUTOSAVE:NONE] then make sure it's [AUTOSAVE_PAUSE:NO].

(If you have a reasonable "can last a long time without major[1] happenings requiring player intervention" ma prepared, ready to be run on test machines, having an actual Seasonal Save could give you a benchmark record by checking the season-save directory times and seeing how far apart they were, to suggest one approach. You could also rerun (new copy) on the same machine multiple times with no changes to get an average[2] performance. Or ditto again but with key factors changed, like a door made pet-proof, to benchmark the potential impacts from them by invocation.)

[1] Or, indeed, too many random events, taking the passively-played testgame onto a very new trajectory. But that might mean a lot of turning off of Invasions, etc, etc, to try to avoid divergences.
[2] Or 'least hit by random slowing/spoiling stuff', so you'd look for the minimum durations.

jecowa · « **Reply #3 on:** April 11, 2021, 08:09:24 pm »

Quote from: Shonai_Dweller on April 11, 2021, 06:57:02 pm

It can be tricky to pinpoint the cause of lag.
You may get a "too many socks" save, only to find that the fps death is actually being caused by a cat trying to pass through a locked door (will destroy FPS on the most powerful machines). So, big job, but hopefully will produce some interesting results.

It'd be bad to get locked door lag in any of the test files. Especially since pet/locked-door lag is severe and more of a bug. It would probably be best to keep test forts minimal. Cheat heavily to reach the conditions needed to avoid getting too many other variables. Could probably verify that excessive clothing is the source of lag with DFHack commands to remove clothing or by atom smashing it and seeing if that clears up the lag.

Quote from: Starver on April 11, 2021, 07:30:30 pm

There are plenty of other issues, as you identify in your plan, but the announcements.txt file should let you set everything to not pause.

Also, if you have d_init.txt with anything but [AUTOSAVE:NONE] then make sure it's [AUTOSAVE_PAUSE:NO].

(If you have a reasonable "can last a long time without major[1] happenings requiring player intervention" ma prepared, ready to be run on test machines, having an actual Seasonal Save could give you a benchmark record by checking the season-save directory times and seeing how far apart they were, to suggest one approach. You could also rerun (new copy) on the same machine multiple times with no changes to get an average[2] performance. Or ditto again but with key factors changed, like a door made pet-proof, to benchmark the potential impacts from them by invocation.)

[1] Or, indeed, too many random events, taking the passively-played testgame onto a very new trajectory. But that might mean a lot of turning off of Invasions, etc, etc, to try to avoid divergences.
[2] Or 'least hit by random slowing/spoiling stuff', so you'd look for the minimum durations.

Good tips. I've never messed with announcements or seasonal saves before. Didn't realize seasonal saves created a new save file each time. Could have each test bench save file start just before a new season so it immediately auto-saves once, then auto-saves a second time.

Salmeuk · « **Reply #4 on:** April 12, 2021, 11:31:11 pm »

You would likely need a large sample size, with hardware information for every test participant. Deciding your sample size would be a great starting point for organizing this sort of benchmark test.

The largest issue with DF benchmarks has always been isolating variables. You would want to test each world size, and with a set of various worldgen lengths - say 1, 50, 100, 250, 500, 1000. Comparing world size effect and gen length would be very helpful.

I suspect DF has a compounding sort of lag, where multiple variables work constructively to cause slowdown. So like a large number of items laying around is not so bad, unless you also have gigantic open spaces all over your fortress - then it becomes a FFA for the dwarven pathfinding and things slow down.

Spoiler (click to show/hide)

So the development of the test-bed fortresses would need to be methodical, in that you would need to separate these confounding variables as best as possible. Kind of a warped logic puzzle, in a way.

Otherwise, you would merely be producing empirical data sets that, while still interesting and worthwhile, wouldn't set anything in stone regarding exact causes of lag.

I think more questions could be directly asked of Toady, since a knowledge of the programming might help us unpack the confounding variables. Though honestly I don't think it's a focus for him, so I'm not sure this would be fruitful. He's a busy man, you know!

To return to the question of world age, using 1 year old worlds would provide a very fast way to benchmark, since a 1000 year old world would likely run like trash from the get-go. However testing all ages of worldgen would still be important.

DwarfStar · « **Reply #5 on:** April 13, 2021, 06:43:02 am »

My guess is that, in general, lag/FPS scales fairly linearly with processor speed and game size. So a fort with twice as many units would run about twice as slow.

Except, if you don't have enough memory. I suspect most cases of catastrophic lag (aka FPF death) are due to the working set (the memory the CPU needs to touch every frame) not fitting in the main memory of the computer, causing the OS to start paging. When that happens, your FPS will plummet by a factor of x10 or more, and start to freeze randomly for several seconds at a time. You'll also notice other apps run slower and hang for several seconds while DF is running, as these processes get swapped out to disk to make room for Urist to find the path to his workshop.

I had that exact situation when I started playing DF recently on a new computer. My fort ran fine at first, but only when I had a handful of dwarves. Once the game was in full swing, the FPS quickly became unbearable. But then i added 64 GB to my memory and it's been smooth sailing since then.

Bumber · « **Reply #6 on:** April 13, 2021, 12:53:10 pm »

Quote from: DwarfStar on April 13, 2021, 06:43:02 am

I suspect most cases of catastrophic lag [...] are due to the working set [...] not fitting in the main memory of the computer, causing the OS to start paging.

I doubt it. When my forts begin to suffer FPS death, they're still using less than half my RAM, and the freezing only effects DF.

DwarfStar · « **Reply #7 on:** April 13, 2021, 04:04:37 pm »

Quote from: Bumber on April 13, 2021, 12:53:10 pm

Quote from: DwarfStar on April 13, 2021, 06:43:02 am
I suspect most cases of catastrophic lag [...] are due to the working set [...] not fitting in the main memory of the computer, causing the OS to start paging.

I doubt it. When my forts begin to suffer FPS death, they're still using less than half my RAM, and the freezing only effects DF.

I guess there are more than one kind of FPS death. Just to be sure though, you may want to check for page I/O traffic rather than current RAM usage, since depending on what you're looking at (and what operating system) it might still be paging when RAM does not appear to be full.

em312s0n · « **Reply #8 on:** July 14, 2021, 05:48:12 am »

I remember a thread from before where another user had the same idea. I think he made a google doc accessible to anyone and had people update it noting their cpu and fps that they get on a save.

News:

Author Topic: Dwarf Fortress benchmarking (not world-gen tests) (Read 1729 times)

jecowa

Dwarf Fortress benchmarking (not world-gen tests)

Shonai_Dweller

Re: Dwarf Fortress benchmarking (not world-gen tests)

Starver

Re: Dwarf Fortress benchmarking (not world-gen tests)

jecowa

Re: Dwarf Fortress benchmarking (not world-gen tests)

Salmeuk

Re: Dwarf Fortress benchmarking (not world-gen tests)

DwarfStar

Re: Dwarf Fortress benchmarking (not world-gen tests)

Bumber

Re: Dwarf Fortress benchmarking (not world-gen tests)

DwarfStar

Re: Dwarf Fortress benchmarking (not world-gen tests)

em312s0n

Re: Dwarf Fortress benchmarking (not world-gen tests)