I'd forgotten to add exportmap to own_scripts, and tweakmap wasn't the latest version in either place (fixed now).
PSV = Pre Set Values. Those are the temperature, savagery, rainfall, elevation, volcanism, and drainage values that DF's advanced world gen lets you specify per world tile.
RelWithDebInfo may well be a factor in the sluggishness. Unfortunately, my attempts to compile it with Release have met with failure so far (the latest one being a spontaneous reboot of the computer for no apparent reason), so I haven't been able to test it yet.
The reason for keeping inorganics in three vectors is ease of searching, on the guess that it ought to be faster than more complex logic. Also note that once the decision to regenerate/discard info has been made, storage size is much less important than speed. There isn't anything I'm aware of that would stop an inorganic from being both a metal and something that's involved in a non-metal-related (modded) reaction, and I thought the minerals list was complete, i.e. listed all minerals, regardless of whether they are present in the other lists (with clays showing up in both lists). In the vanilla game, metals won't show up in that list because they're never actually present in their metallic form (native gold isn't actually a metal itself, but a source for a reaction that produces the metal).
It's quite possible to have a single list of everything found either natively or resulting from a metal production reaction, and then have three separate global lists that contain the indices of the inorganics that match the criteria for each list. Thus, to identify the minerals, you'd iterate over the "minerals" list and check whether the corresponding index in the compound list for a tile is true or not, and similarly for the two other (much shorter) lists. Note that to generate the lists of minerals in DF you'd have to figure out a way to determine which inorganics the game can include in all the geo-biome positions, which may be non-trivial. When I asked, the forum didn't know how to determine if an area is "alluvial", and some minerals (opals, I think) appear only in those places. That may not be an issue here, though: it may well be that just adding the "alluvial" ones to those that can readily be identified is sufficient.
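To make the index-list idea concrete, here is a minimal sketch in Python (not the plugin's language, and not its actual data structures). The inorganic names, their indices, and the helper `present_in_category` are all made up for illustration.

```python
# Hypothetical global table of inorganics; names and order are made up
# for illustration and do not match DF's raw indices.
INORGANICS = ["granite", "hematite", "iron", "native gold", "gold", "clay", "opal"]

# Three global lists holding indices into INORGANICS, one per category.
# An index may appear in more than one list (clay is in two of them here).
MINERALS = [0, 1, 3, 5, 6]
ECONOMICS = [1, 5]
METALS = [2, 4]


def present_in_category(tile_presence, category_indices):
    """Return the names from one category that are flagged present for a tile."""
    return [INORGANICS[i] for i in category_indices if tile_presence[i]]


# One compound presence list per tile: True where the inorganic occurs
# natively or can result from a metal production reaction.
tile = [False] * len(INORGANICS)
for i in (0, 1, 2, 5):
    tile[i] = True

minerals_here = present_in_category(tile, MINERALS)
economics_here = present_in_category(tile, ECONOMICS)
metals_here = present_in_category(tile, METALS)
```

The per-tile storage is a single presence list, and the category lists exist only once globally, so the cost of categorising is paid at lookup time rather than in storage.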
There are probably other ways to achieve the same thing, so if you're trying to save space it's definitely possible to store the info in a single vector and process it to determine where the different pieces fit, and you'd only do that for searches if the search criteria look for that specific info. For the embark location overlay you'd have to do the complete processing, but that's only for the tiles in that embark, and that changes at human speed, which means the processing time isn't important.
Edit:
Finally got Release working (it helps to try to run the right DF copy...). As you suggested, it showed dramatically improved performance at 7:30, with the high and final memory usage at 1601.7 MB.
Edit 2:
I've been thinking, and believe it's a mistake to try to massage the current inorganics presence storage. All you really NEED is two bytes to store the first and last layer of the geo biome present for each MLT (in addition to the index of the geo biome itself, which is currently stored), as all the rest of the info is available from the geo biome itself. Since the two values are in the range 0-15, you can actually store both of them in a single byte. To get the first/last layers we could cut away a fair bit of the modified Prospector code used for the MLT processing (the removed code would still be needed elsewhere to extract the actual inorganics, though).
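Packing the two layer numbers into one byte could look something like this. It's only a sketch in Python; the function names are hypothetical, and putting the first layer in the high nibble is an arbitrary choice.

```python
def pack_layers(first, last):
    """Pack two layer numbers (each in the range 0-15) into one byte value."""
    if not (0 <= first <= 15 and 0 <= last <= 15):
        raise ValueError("layer numbers must be in the range 0-15")
    # High nibble holds the first layer, low nibble the last layer.
    return (first << 4) | last


def unpack_layers(packed):
    """Recover (first, last) from a value produced by pack_layers."""
    return packed >> 4, packed & 0x0F


# One byte per MLT. Assumption: 16 * 16 MLTs per world tile, so the
# layer ranges for a whole world tile fit in 256 bytes.
layer_ranges = bytearray(16 * 16)
layer_ranges[5] = pack_layers(2, 11)
first, last = unpack_layers(layer_ranges[5])
```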
With that basic information stored, you can either process the geo biome each time you need the data, or you can try to pre process the geo biomes to speed up the information extraction.
- The layers potentially worn away by erosion are all soil layers, and DF never seems to generate more than 4 of those. It's possible to hack the geo biome to get more soil layers and/or deeper soil, and DF can erode up to 10 Z levels, if I remember correctly. The suggestion below doesn't actually make any use of this info, though.
- DF doesn't use more than 16 layers of the geo biome even if hacking has added more (DF stretches the last one to fill the gap to the magma sea if needed).
This means that one possible approach would be to make a bit array for each layer of each geo biome and then merge the ones you have in each MLT with OR operations (16 layers * 33 byte bit array * X geo biomes). Even if DF doesn't croak at a silly max size PSV world with a checkerboard layout (forcing each world tile to get its own geo biome), you'd still not use more than 35 MB to store the info, and it would be in a more convenient format than the geo biomes themselves.
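A rough sketch of the bit array approach, in Python with plain integers standing in for the 33 byte bit arrays. The helper names and the example geo biome are hypothetical, and the worst case arithmetic assumes a 257 * 257 world with one geo biome per world tile.

```python
BITS = 33 * 8      # a 33 byte bit array has room for 264 inorganics
MAX_LAYERS = 16    # DF doesn't use more than 16 layers of a geo biome


def layer_mask(inorganic_indices):
    """Build the bit array for one layer from the inorganic indices in it."""
    mask = 0
    for i in inorganic_indices:
        if not 0 <= i < BITS:
            raise ValueError("inorganic index does not fit in the bit array")
        mask |= 1 << i
    return mask


def merge_layers(geo_biome_masks, first, last):
    """OR together the masks of layers first..last (inclusive) for one MLT."""
    merged = 0
    for mask in geo_biome_masks[first:last + 1]:
        merged |= mask
    return merged


def is_present(mask, inorganic_index):
    """Check a single inorganic against a (merged) bit array."""
    return bool((mask >> inorganic_index) & 1)


# Hypothetical geo biome with three layers; the indices are made up.
geo_biome = [layer_mask([5]), layer_mask([0, 1]), layer_mask([0, 6])]
mlt = merge_layers(geo_biome, 1, 2)   # this MLT only has layers 1 and 2

# Worst case storage, assuming one geo biome per tile of a 257 * 257 world.
worst_case_bytes = MAX_LAYERS * 33 * 257 * 257
```

Under those assumptions the worst case comes to 34,873,872 bytes, which matches the "not more than 35 MB" estimate.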
Returning to the logic of the 3 lists (as per the current implementation), they don't help with embark matching, as you can just as well check for a True in a merged list as you can in a list dedicated to the category you're checking. The place where it is of some help is when generating the embark location resource lists displayed, but it's a marginal extra effort to iterate over Economics and Metal indices lists to check against a presence in a common list, and that processing happens at human speed anyway. Thus, it could have been implemented better.