Oh wow - thank you very much for taking the time to look into in detail!
Ah, the code is still full of constructs that I installed to help me get my bearings and to (dis)prove my assumptions how DF and the plugin works.
A lot of those can and should be removed or properly commented if it makes sense to keep them for debugging purposes.
Now to your comments:
Taking care of the real world takes precedence over the virtual one...
I'm just starting to look at what you're at, and probably misunderstand half of it, so the comments should be taken with that in mind.
Very true, especially if the real world does not give you a choice about that
Having read your comments I can tell you: Even if one of them did not point out an error or misconception of mine it at least made clear that I need to externalize my thoughts (by comments!) and remove old code more eagerly. Especially since code turns into legacy code so fast, even if I'm myself the author.
- defs.h; inorganics: There are more than 256 of those, so they won't fit in unit8_t (assuming that's what's supposed to be stored).
That seems true for my test world (but might even have been wrong there) but this was a put there to try improve performance where vector<bool> seemed slow - which they are in the debug build as some optimizations only are made for release builds. Has been removed.
- index.h: The plural of "index" is actually "indices".
Changed - hehe, in my mother tongue it would have been me to point this out
- index.h: I don't see the point of creating maps of the names of various inorganics, as you can easily extract those from the DF structures. If it's some kind of backward indexing I'd just store the index of the corresponding inorganic (in the DF structure).
I'm lazy and wanted to be able to look the name up during debugging and have it ready when writing the "report" at the end of the survey/index phase.
As those names also are used as part of the file names when writing the indices onto the disk (Index::outputContents) I thought it might be nice to have them around so the writing process was not being made more complicated or slower by the sequential (?) look up. But the idea that that could be complicated might have its root in my narrow understanding of the DF structures.
- index.h: As discussed earlier, it would make sense to merge the three (four?) inorganics vectors of maps into one, but it makes sense trying to get things to work first before trying to optimize them.
Exactly my thought - the fourth vector (inorganics) will replace the three others as soon as I'm sure that I'm not missing anything.
- Index::setup: It seems you make the assumption all worlds have x and y dimensions that are the same. That doesn't hold (mine are 17 * 129, for instance).
Oh boy - that one is - I never even knew that is possible!
Fixed this by removing world_dims and using worldgen_parms.dim_x and worldgen_parms.dim_y instead
- Index::createKey (and setup, maxKeyValue): Does the range matter? If it doesn't I'd just make the key out of the lower 12 bits of x and y, plus the lower 4 bits of i and k, with a sanity check to ensure the world dimensions don't exceed the limits (which they can't in vanilla, and it seems from another thread they can't approach those values). The next point indicates the value range does matter, though.
The range of the key values by itself does not really matter - apart from the fact that there is an upper limit (MAX(uint32_t)) - but the value is currently used to reserve memory for the vector keys_in_order (of addition that is), which in turn is used to help test performance. Roaring works best if the keys come in strict ascending order. It still works fine of they are "random", but in that case it performs better if it gets a little help now and then by calling runOptimize and shrinkToFit. That is what you stumbled upon in your next comment.
Sorry, I'm still a little slow and bit manipulation was never my strong suite - would taking 12 bits from x and y and 4 bits from i and k still lead to a 32bit key? Since that's the format Roaring consumes...
- Index::add: It seems this operation relies on the order in which the keys are added, and that order does not work well with either DF's feature shells or an efficient processing of world tiles without feature shells (unless there's some way to move directly from the last world tile of a row to the first world tile of the next one). A comment explaining what's special with 511 would be useful: I can't figure it out.
As said above if the keys don't come in strictly ascending order helping Roaring by calling runOptimize and shrinkToFit improves the memory consumption during the survey/index phase.
511 seemed to be the offset from last position the previous 16*16 mid_level_tile survey run (i = 15 and k = 15) to the position of the current (i = 0 and k = 0) if y is odd - which is wrong and already removed from the code - but bear with me for another moment
Correct me if I'm wrong, but to efficiently process feature shells the iteration goes like this:
x=0,y=0 => x++ x=15,y=0
>----->------>----->---->----->---->---->----
↓ y + 1
x=0,y=1 <= x-- x=15,y=1
-----<------<-----<----<-----<----<----<----<
↓ y + 1
x=0,y=2 => x++ x=15,y=2
>----->------>----->---->----->---->---->----
....
So every other/odd row is being processed "backwards", in descending x order, right?
.... I just added the x,y,i,k coordinates to the output of the keys_in_order data... now I really understand what you meant when you said:
...
Also of importance is that the movement pattern is complicated enough as it is (it took me a fair while to get it to work correctly).
...
Boy oh boy!
The iteration spans 4 feature shells in large worlds before the movement pattern repeats, wow! Mad respect for that!
Anyway this feeds Roaring a lot out of order keys which can lead to temporary sub-optimal behavior memory-wise.
I'll have to look into ways of mitigating that. This might take the form of an option for the user: Fast search which consumes more memory temporarily until the end of the survey/index phase or slow search that optimizes/"defrags" often and thus needs less RAM.
- Index::add: I'd change the code checking for duplicate keys to a one liner (which I usually avoid) and comment outthe whole line when not used, rather than just the output. The compiler will have to perform the function call unless it can somehow deduce it doesn't have side effects even if there's nothing to do when true.
Again that's just an debugging artifact that helped me to get a better understanding of what happened - so yeah that can & will be removed.
- Index::add: Is there a reason for the omission of code inside the candy check? The later storage of level?
Yes exactly, the empty block is already removed in my current working copy.
- Index::add: I'm not sure what you're after with the (explicitly redundant) checks for various layer materials (not sure if it covers veins). If activated, it would exclude gems, for instance, and regardless, if someone has hacked in something, I don't see why it should be filtered out (And I just realized someone hacking in steel as a layer would get it filtered out if all inorganics were merged, but the issue would still be isolated to metals, and it would be possible to handle metals only separately).
You mean
world->raws.inorganics[mineralIndex]->flags.is_set(df::inorganic_flags::SEDIMENTARY) ||
world->raws.inorganics[mineralIndex]->flags.is_set(df::inorganic_flags::IGNEOUS_EXTRUSIVE) ||
world->raws.inorganics[mineralIndex]->flags.is_set(df::inorganic_flags::IGNEOUS_INTRUSIVE) ||
world->raws.inorganics[mineralIndex]->flags.is_set(df::inorganic_flags::METAMORPHIC) ||
world->raws.inorganics[mineralIndex]->flags.is_set(df::inorganic_flags::SOIL)
?
Those I took from embark_assist::finder_ui::ui_setup (case fields::mineral_1) - once the inorganics are merged into one vector it can be dropped for sure - saying that when I experimented without this "filter" I got indices that had different names but the same content, for example galena and lead or garnierite and nickel - are those the pairings of the mineral containing a metal and the metal?
Ok, that was a lot - thanks again for taking the time to look into this very preliminary version of the code!
And for reading this wall of text till here
Now one more question: Is there a river size field in the DF structures on the level of embark tiles?
Or is it the same for all embark tiles of a region?
What happened in the meantime?
I spend quite some time obsessing over the performance and the memory consumption of the index.
Memory-wise I can say the current scope of the index (still missing some fields of the finder) is around 30MB serialized for a 257*257 region.
How much it actually consumes living in the RAM still eludes me - as the memory profiling does not work properly. This is complicated by the fact that an index created at run-time with out of order keys seems to be bigger than the same index that has been reloaded from serialized data - but I'll get there.
The standalone tests suggest that indexing during the survey/index phase shouldn't be a concern performance-wise, but I'll do integrated runtime checks just to be sure.
My next steps are roughly as follows:
- experiment with a variant of Index::add that takes the whole 16 x 16 mid_level_tiles at once instead of 1 mid_level_tile at a time.
- implementing the missing fields/indices (biome river_size, river_elevation, bad weather and effects,...), excluding incursions for now - I'll probably have some questions about some of those - but in essence some steps that currently reside in the matcher will have move into the survey, continuous numerical fields will either live in a vector (e.g. temperature) or probably in a map-like structure if not every embark tile has/needs them (e.g. river_elevation). Getting most or all of them will allow me to make an informed decision if the memory consumption is still tolerable. Having an analogue to the region might be a way to mitigate the memory costs for all attributes that are the same for all embark tiles with the same x/y.
- get queries running - I have a pretty clear idea of how to do that so that it performs good or even excellent in most cases. If the results aren't fast this will be a deal-breaker. The first version of this will run the query phase only after the survey/index phase have finished. (yeah I know, user feedback and responsiveness
) That way I hope to avoid all kinds of problems I might need to solve later. But the second iteration could run as soon as one region is completely indexed. The third iteration - well see the next point.
- get the incursions in there - I think this will integrate nicely with the concept of the indexing, but I'll have questions for sure. Adding incursions complicates running queries during the survey phase, but I'm have a low priority mental process working in the background on how to know when all relevant neighbors of a region tile have been processed. Solving that would allow for a third iteration of the queries, that could produce results during the survey phase that are "incursion-ready".
- as an optional feature: currently I save the indices to disk for debugging and performance tests - but seeing how easy it is it might be a nice option to allow the users to reload the indices (automatically) on a subsequent load of the world map and allowing for fast searches right away. Adding a version hash would make it possible to tell if the data is compatible with the code. What do you think? Even automatically creating the indices in batch mode might be an option.
Currently I (still) think it is feasible to move all information into the index and thus having pure queries during the search phase, which will avoid any additional iterations/calculations on the level of the embark tiles. But I'm willing to compromise on that if there are cases that lead to enormous, unproportional memory requirements or to hacky code (using an index-hammer on an algorithm-screw).
Also I haven't forgotten about your remark concerning the following possibility
I've been thinking, and believe it's a mistake to try to massage the current inorganics presence storage. All you really NEED is two bytes to store the first and last layer of the geo biome present for each MLT (in addition to the index of the geo biome itself, which is currently stored), as all the rest of the info is available from the geo biome itself (And since the two values are in the range 0-15, you can actually store both of them in a single byte). To get the first/last layers we could cut away a fair bit of the code from the modified Prospector code of the MLT processing (the removed code would still be needed elsewhere to extract the actual inorganics, though).
With that basic information stored, you can either process the geo biome each time you need the data, or you can try to pre process the geo biomes to speed up the information extraction.
- The layers potentially worn away by erosion are all soil layers, and DF never seems to generate more than 4 of those. It's possible to hack the geo biome to get more soil layers and/or deeper soil, and DF can erode up to 10 Z levels, if I remember correctly. The suggestion below doesn't actually make any use of this info, though.
- DF doesn't use more than 16 layers of the geo biome even if hacking has added more (DF stretches the last one to fill the gap to the magma sea if needed).
This means that one possible approach would be to make a bit array for each layer of each geo biome and then merge the ones you have in each MLT with OR operations (16 layers * 33 byte bit array * X geo biomes). Even if DF doesn't croak at a silly max size PSV world with a checkerboard layout (forcing each world tile to get its own geo biome), you'd still not use more than 35 MB to store the info in a more convenient format than the geo biomes themselves.
Okay, sleep. now.