Thistleknot, it takes most of my focus, concentration and linguistic ability for me to understand you, but I'll try to answer nonetheless.
the more and more i think about it, the more i dislike skill % being modified by skill rate. skill % is already modified by skill rate by attaining skill faster. to me it would be better to have a group by option for it, or its own category (such as attributes, skills, traits, preferences, skill rate), or an option to modify existing skill by skill rate (but not enabled by default)
Current skill and the speed of gaining that skill are two different things. Both matter and should be separate categories.
what could be done to make it its own category is to read all rate modifiers for castes, multiply this rate modifier by the caste's frequency, then do something like set the lowest value to 0 and the highest to 100. Something similar is being done for attributes
That could work, but this method ignores how skill rates really work. If you had two castes, one with 80% and another with 130% learning rates, the relative difference is +62.5% one way or -38.5% the other. With your method one would be set to 0 and the other to 100, which loses information about their relative importance. For a more extreme example, consider two castes having 95 and 105 learning rates in some other skill. That's not that big of a difference, but they still get set to 0% and 100%.
One way to fix this method would be to set the minimum to 0 and never let the scaling maximum drop below 100 (i.e. divide by max(highest rate, 100)). That way, with the first example you'd get weights 61.5 and 100, and for the second example 90 and 100, respectively. If for all castes the highest learning rate of something was 10%, you'd get 10 or less for all. Come to think of it, setting the minimum to always 0 is more important with this method than what you do with the maximum.
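To make that concrete, here is a minimal sketch of the scaling I mean (function and variable names are mine, nothing from DT):

#include <algorithm>
#include <vector>

// Scale per-caste learning rates so the scale always starts at 0 and the divisor
// never drops below 100: weight = 100 * rate / max(highest rate, 100).
std::vector<double> scale_learning_rates(const std::vector<double> &rates)
{
    if (rates.empty())
        return {};
    double divisor = std::max(100.0, *std::max_element(rates.begin(), rates.end()));
    std::vector<double> weights;
    weights.reserve(rates.size());
    for (double r : rates)
        weights.push_back(100.0 * r / divisor); // 80/130 -> 61.5/100; 95/105 -> 90.5/100; all-10% -> 10
    return weights;
}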
Even with all that, I still say my proposition is better.
here's the code for attribute bins[...]
I don't see why you'd need bins like this for learning rates. I don't care about probability distributions. For traits, attributes and learning rates, just knowing min and max for all castes and everyone in the fort would be enough. Yes, you can get a nicer curve with bins and set the average properly, but I don't need this kind of behaviour. Besides, it might look nice in the grid view, but for weighting attributes and traits into recommendations, just multiplying their values by weights should be better.
"Cumulative distribution function" is What you call "density". Probability density would be (temp.probability / ((temp.max - temp.min !=0) ? (temp.max - temp.min) : 1)), since DF assumes uniform distributions within bins.
The code for traits looks OK, but the code for attributes looks strange. Here is what I see in there:
* What you call "density" (the cumulative distribution function) doesn't add up. For each bin it should be equal to that bin's probability plus its value from the previous bin; see the sketch after this list. It just needs two fixes.
* Why do you need a zeroth bin? By your code a Dwarf has a 0.01 chance of getting an attribute below the minimum for that caste. Is this the case? Does this emulate attribute rust? What is it?
* If you have the zeroth bin, why not also add a last one with min as the caste maximum and max as 5000? NVM, the code after the loop seems to be responsible for this, assuming raws[7] is always 5000. It seems to forget to add 0.01 to temp.density, though. There is nothing above it anyway.
* What is 0.16333333333? Did you mean (1.0 - 0.01 - 0.01) / 6.0 ? Just let the compiler calculate that for you.
* The first bin generated by the loop has "density" 0. Did you mean 0.01? If so, delete the "if(i>0)".
* The max value for this bin should be one less than the min value for next bin.
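To show what I mean by both terms, here is a sketch of how I'd compute the cumulative values and densities from per-bin probabilities (the Bin struct and its field names are made up, adapt them to whatever your code uses):

#include <vector>

struct Bin {
    int min, max;        // attribute range covered by this bin
    double probability;  // chance of rolling into this bin
    double cumulative;   // P(value <= this bin's max), i.e. the cumulative distribution function
    double density;      // probability per attribute point, assuming a uniform spread inside the bin
};

void fill_cumulative_and_density(std::vector<Bin> &bins)
{
    double running_total = 0.0;
    for (Bin &b : bins) {
        running_total += b.probability;   // each bin's value is its own probability plus the previous bin's value
        b.cumulative = running_total;
        int width = b.max - b.min;
        b.density = b.probability / (width != 0 ? width : 1);
    }
    // running_total should end up at 1.0, within floating point error
}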
the traits and attributes use known knowledge to find their mean of 50. I did the research on the traits myself using 16,000 dwarfs, which provided me with about 46k data points to draw the statistics from.
Ah, so you did the research. The 0.01 probabilities for being below or above the racial minimum and maximum were there, yes?
The attributes were verified once it was found dfhack was exporting names of attributes in the wrong order; I started a big fight on the wiki saying stuff was inaccurate... the attributes follow a simple 1/6 breakdown per bin category (remember what I was saying about frequency bins, at least that's what I told you a frequency bin was).
I remember dfhack (the command to dump Dwarves to xml), Dwarf Therapist and Runesmith not quite agreeing on what attributes of my Dwarves were in 31.25. I didn't know which was correct, but I think I assumed dfhack to be wrong.
we hope to incorporate better traits for castes by using the caste's min/max values that vanilla traits don't have. The formula is pretty basic, it's[...]
Yeah, so long as Min and Max are correct, this will behave well. The Mean will be correct and everything above and below will scale linearly. That said, I'd be content with just:
return 100 * (Value-Min) / (Max-Min);
we hope to multiply these values by the caste's frequency, so we can see where a dwarf fits within the whole fortress mode race.
Yes, that's the logical thing to do once you went that far to get the average. I imagine you get your averages by taking the middles of all bins, multiplying that by their probability, then dividing that by the number of bins (or rather 6/8 the number of bins).
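As an aside, if the bin probabilities already sum to 1, a probability-weighted mean of the bin midpoints needs no further division. A minimal sketch, reusing the hypothetical Bin struct from the sketch above:

double mean_from_bins(const std::vector<Bin> &bins)
{
    double mean = 0.0;
    for (const Bin &b : bins)
        mean += 0.5 * (b.min + b.max) * b.probability;  // midpoint weighted by its probability
    return mean;
}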
But once again, I'd be pretty content with just linear scaling between absolute minimum and maximum for all castes and current citizens or displaying rounded values when I set the grid to display numbers.
I finally get what your formula is doing
double jobs_left = (29000 - XP) / rate; // Ignore division by 60.
means treat each job * rate as the # of exp gained.
Interesting concept.
Yes, I omitted the 60 XP per job (or 120 per three mugs or whatever you get per weapon swing), but those are constants for any given skill (or reaction, in case of mugs) and therefore don't matter for comparison. All skill learning rates do is alter the number of XP gained. My formula is just an approximation, since I imagine XP is an integer while I use doubles. It answers the question "How many jobs will it take this dwarf to max out his skill?" (except that it ignores the 60 XP per job).
It is also a good reason to ignore probability distribution for learning rates. You simply don't need this information to decide who is better for a job.
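Spelled out as a function, with the 60 XP per job put back in (names are mine; this is a sketch, not DT code):

double jobs_to_max(double xp, double rate_percent)
{
    // Same idea as (29000 - XP) / rate above, just with the per-job XP written out.
    if (rate_percent < 1.0)
        rate_percent = 1.0;                   // "disabled" skills: avoid division by zero
    double xp_left = 29000.0 - xp;            // same 29000 as in the formula above
    if (xp_left <= 0.0)
        return 0.0;                           // already maxed out
    double xp_per_job = 60.0 * rate_percent / 100.0;  // learning rate only scales the XP gained
    return xp_left / xp_per_job;
}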
Here's another hopefully simple suggestion (should be much easier to fix than the current optimization/sorting formulas discussion). The optimizer has an "exclude active military" option. How hard would it be to change that to (or add an additional) "exclude squad members" option to avoid having soldiers that just happen to not have active orders at the moment getting included for optimization?
When I look for dwarves to assign them civilian labours, a workaround I use is to sort by military status, but I don't use the "train all the time" schedule, so the soldiers get off-duty and get mixed up with civilians. It would be useful to have an option to treat everyone having a squad as military, regardless of schedules, or if their alert is set to active/training.
I just use the grid view to do this, I didn't even know there was a job optimizer other than all the suggestions for roles and sorting by them.
I propose "soft capping" learning rates by 1/x function, not just cutting them off at 50%. That way for things like 0-10% learning rates you get values close to -2 (and you normally really don't want a dwarf with learning rate below 50 to perform a job) and for 50 and above you get a nice linear progression. While Masterwork never goes below 50%, I've seen people "disabling" skills in their mods by setting their rates to 0 or 10 or something like that, therefore for generality you want to take that possibility it into account. To patch this in replace with:
if(aspect_value < -1.0)
aspect_value = -1.0;
if(aspect_value < -1.0)
aspect_value = -2.0 - 1.0/aspect_value;
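In isolation the soft cap looks like this; how aspect_value is derived from the raw rate stays whatever it currently is, I'm only illustrating the 1/x tail:

double soft_cap(double aspect_value)
{
    // Continuous at -1 (returns exactly -1 there) and approaching -2 as the
    // input goes to negative infinity, instead of a hard clamp to -1.
    if (aspect_value < -1.0)
        return -2.0 - 1.0 / aspect_value;     // -2 -> -1.5, -10 -> -1.9, -100 -> -1.99
    return aspect_value;                      // -1 and above pass through unchanged
}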
Looking at the code some more:
skill_rate_value = (skill_rate_value < 100) ? 100: skill_rate_value;
Wait, what? You're clamping skill rate to a minimum of 100%?! It should only be clamped to a minimum of 1%, and solely to avoid division by 0!
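That is, something more like:

skill_rate_value = (skill_rate_value < 1) ? 1 : skill_rate_value; // floor at 1% purely to avoid dividing by zero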
if(a->is_neg)
aspect_value = 1-aspect_value;
This looks suspicious, but I have no idea what it is.
Ah, so skill ratios are factored in last... Hm, looking at this, if skill ratios are very low, then the Recommendation can go below 0%. Whether that is desired behaviour is up for debate.
Well, since I can't fight job recommendations, I might as well try to help you improve them. I believe the choice of the right Dwarf for the job (or the military) comes down to answering and prioritising these fundamental questions:
1) "Which one of you has the highest skill now?"
2) "Which one of you has the best attributes and traits and likes for the job?" OR "Which one of you will make the most quality items when you max out your skill?"
3) "Which one of you can get to legendary+5 with the fewest number of jobs, thus wasting the least materials?"
4) "Which one of you is the fastest?"
Are there any more fundamental questions? I'll give you my perspective on them:
1) Old DT used to sort by this exclusively. You already have this included, but I'd like to add a twist. I was wrong when I wanted this to be sorted by XP; that would work for sorting by skills alone, but not when it has to be aggregated with other criteria. I have another proposal. Instead of just taking integer skill levels into consideration, scale them linearly toward the next level according to XP. Just keep an array of the XP thresholds, with 0 as the minimum for Dabbling.
The XP thresholds are listed at http://dwarffortresswiki.org/index.php/Experience. You might do something more sophisticated here, like fit a spline, but good old linear interpolation is simple and good enough. Examples:
A Proficient (lvl 5) Dwarf with 3600 XP has an effective lvl of 5.1.
A Proficient (lvl 5) Dwarf with 4000 XP has an effective lvl of 5.5.
double eff_skill_level = dwarf.getSkill(skill_no);
if (eff_skill_level < 20.0) // Don't care anymore for Legendary+5
{
    // double is important: you want floating point arithmetic for this.
    // Assumes getSkillXP() returns total XP, as in the examples above;
    // if it returns XP within the current level, drop the "- min_XP".
    double skill_xp = dwarf.getSkillXP(skill_no);
    double min_XP = XP_per_level[(int) eff_skill_level];     // static_cast<int>(eff_skill_level) if the compiler screams
    double max_XP = XP_per_level[1 + (int) eff_skill_level];
    eff_skill_level += (skill_xp - min_XP) / (max_XP - min_XP);
}
// Scale eff_skill_level from <0, 20> to whatever you need.
// Take it into account for recommendations, sorting or whatever.
If you want whole skill levels to stand further apart, choose an alpha from <0.5, 1> and use this instead:
eff_skill_level += alpha * (skill_xp - min_XP) / (max_XP - min_XP);
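For completeness, the XP_per_level lookup these snippets assume would be filled from the wiki table linked above (cumulative XP at the start of each level, 0 through 20), if I'm reading it right:

static const double XP_per_level[21] = {
        0,   500,  1100,  1800,  2600,  3500,  4500,  5600,  6800,  8100,
     9500, 11000, 12600, 14300, 16100, 18000, 20000, 22100, 24300, 26600,
    29000
};

Index 5 (Proficient) is 3500 and index 6 (Talented) is 4500, which is where the 5.1 and 5.5 in the examples come from; index 20 is the 29000 used earlier for Legendary.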
2) This is actually a series of questions about attributes and traits and likes. I have very little clue how this works, you are way ahead of me, so I don't have anything to contribute. In DT you can already have a grid view with attributes and see personality traits. There is even a normally hidden window where preferences can be seen and filtered by.
3) For Vanilla this is the same as 1, really. For mods with learning rates, this is a separate category, because a dwarf with 29000/2 XP and 50% learning rate is obviously better than one with 0 XP and 100% learning rate; even though they will both reach Legendary+5 after the same number of jobs, the first one will be better up to that point and work faster, while creating higher quality items. We have recently discussed this point.
4) Speed affects everything: walking to the workshop, gathering materials and completing the job itself. There should be some way to get it from the game, and I wouldn't mind a column in the grid view with speed. The speed that corresponds to "wait 9 frames" should be considered average. I'd scale the rest with a 1/x function before using it for your aggregated goal function, but I would need some additional data to submit a proposal. A downside of using speed from the game might be that a dwarf's speed drops drastically when he is carrying a heavy cage or bin or something.
Right, so unless someone wants to discuss any of this, that's it from me for now.