Many thanks to Thuvian: we brainstormed a way to avoid frequency distributions altogether. We can now get an exact % by taking (instance-mean)/range = a perfect of ~-50%/~+50%, which will also work for skewed distributions! The distribution used is the exact distribution of the sample, so there's no guessing with frequency distributions!
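To make the (instance-mean)/range idea concrete, here is a minimal Python sketch. It assumes "range" means sample max minus sample min; the function name and the sample values are made up for illustration.

```python
def deviation_from_mean(value, sample):
    """Signed offset of value from the sample mean, as a fraction of the sample range."""
    lo, hi = min(sample), max(sample)
    if hi == lo:
        # Degenerate sample: every value is identical, so there is no spread.
        return 0.0
    mean = sum(sample) / len(sample)
    return (value - mean) / (hi - lo)


sample = [100, 200, 300, 400, 500]  # hypothetical, symmetric sample
print(deviation_from_mean(500, sample))  # 0.5
print(deviation_from_mean(100, sample))  # -0.5
```

With a symmetric sample the ends land on exactly -50% and +50%; with a skewed sample they will not, which is where the "~" comes from.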
Sorry, that was a bit difficult to comprehend. Is "range" relative to the population or the absolute range of the aspects? What's "a perfect of ~-50%/~+50%" supposed to mean?
It could go either way. I'm going to calculate it from the population (which I know some of you guys hate), for two reasons.
One is that only traits are static.
Skills and attributes change: attributes are trained by skills and vice versa, and player strategy can result in different ranges for different players.
This new approach should account for skew bias (without the need for fancy frequency distributions, which are not as accurate anyway).
And instead of ~-50%/~+50%, I'm going to calculate a 0-50% band and a 50-100% band, with 0% representing the population minimum, 50% the median, and 100% the population maximum.
Lastly,
Skills will now be based on the median of the non-zero skill values.
The same concept can be applied using averages. However, I feel medians are more appropriate for "skewed" data.
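A rough Python sketch of what a non-zero median could look like, assuming skill levels are plain numbers and untrained skills are stored as 0. The function name and the sample levels are hypothetical.

```python
from statistics import median


def nonzero_median(levels):
    """Median of the skill levels that are not 0; returns 0 if nothing is trained."""
    trained = [x for x in levels if x != 0]
    return median(trained) if trained else 0


levels = [0, 0, 0, 2, 4, 9]  # hypothetical: half the population is untrained
print(median(levels))          # 1.0, dragged down by the zeros
print(nonzero_median(levels))  # 4
```

Dropping the zeros keeps a large untrained population from pulling the midpoint toward 0.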
"Perfect" meant that if you added the -% to the +% (as absolute values), they would equal 100%. But I decided that's not good: even though a max may be 53% above average, that extra 3% would bias the scale when combining it with other attributes. So I scaled each side separately, from median to min and from median to max, by dividing (instance-median) by (median-min) below the median and by (max-median) above it.
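As a sketch of that two-sided scaling in Python, mapped onto the 0-100% scale with the median at 50%: the function name, the clamping of out-of-range values, and the example numbers are my own assumptions, not anything taken from the actual tool.

```python
def median_scaled_percent(value, lo, median, hi):
    """Piecewise-linear map: lo -> 0, median -> 50, hi -> 100."""
    if value <= median:
        if median == lo:
            # Degenerate lower half: min and median coincide.
            return 50.0
        pct = 50.0 * (value - lo) / (median - lo)
    else:
        if hi == median:
            # Degenerate upper half: anything above the median is at the top.
            return 100.0
        pct = 50.0 + 50.0 * (value - median) / (hi - median)
    # Clamp, in case value falls outside the sampled min/max (an assumption).
    return max(0.0, min(100.0, pct))


# Hypothetical skewed population: min 0, median 100, max 500.
print(median_scaled_percent(50, 0, 100, 500))   # 25.0
print(median_scaled_percent(300, 0, 100, 500))  # 75.0
```

Because each half is scaled by its own width, a long upper tail no longer pushes the top past 100% or squeezes the bottom half.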
Update:
What I could do, Dreiche2,
is use the median (or mean) from the wiki, then take the mins/maxes I already have for the various stats (taken from a sample of 3300 dwarves) and increase/decrease them by 10% to allow for error, and that should be good. There you go, everyone's happy. Mins, maxes, medians, and means are all static. I would still let skills be determined dynamically, but attributes/traits could use a static median/mean setup. I would need to collect some data on traits though.
I was considering modifying the formula so that anything between the median and the mean counts as 50%. I'm not really all for that, though it would bridge the gap between median and mean.
Currently the mins/maxes I have are:
- + ++
Min's 146 198 444 695
Max's 1595 2108 2250 2537
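For the 10% error margin mentioned above, a minimal Python sketch. It assumes the margin is taken as 10% of each bound's own value (min lowered, max raised) and that the bounds are positive; the function name is made up.

```python
def widen_bounds(lo, hi, margin=0.10):
    """Lower the min and raise the max by a fraction of their own values."""
    return lo * (1 - margin), hi * (1 + margin)


# First column of the table above: min 146, max 1595.
wide_lo, wide_hi = widen_bounds(146, 1595)
print(round(wide_lo, 1), round(wide_hi, 1))  # 131.4 1754.5
```

If the margin was meant as 10% of the range rather than of each value, the widened bounds would come out differently.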