Topic: Dwarf Therapist (LATEST 0.6.12 5/23/12 see first post) (Read 870767 times)

thistleknot · « **Reply #1380 on:** March 11, 2012, 03:10:50 am »

What is this? I see categories and percents, but what do the percents represent? A distribution curve?

Code: (dgc strength function. x = strength)

    // dgc strength rating: maps a strength value x onto a percentage using
    // six linear segments. The function name is illustrative; the original
    // snippet showed only the function body.
    function strengthPct(x) {
        var xpct;
        if (x <= 949) {
            xpct = (0.0333988 * x) - 15.0295;
        } else if (x <= 1149) {
            xpct = (0.0837487 * x) - 62.8943;
        } else if (x <= 1249) {
            xpct = (0.168333 * x) - 160.249;
        } else if (x <= 1349) {
            xpct = (0.168343 * x) - 160.429;
        } else if (x <= 1549) {
            xpct = (0.0837538 * x) - 46.4006;
        } else {
            // 1550 and above: the last segment runs on indefinitely
            xpct = (0.0238412 * x) + 46.3801;
        }
        return xpct;
    }

thistleknot · « **Reply #1381 on:** March 11, 2012, 03:34:35 am »

One problem with my formulas is that they only factor in attributes and traits.

I like the idea schismatise posted:
skills > traits > attributes.

One could argue skills and labors are the same, but some skills (like appraisal) can't be assigned as labors. Though you may have a good dwarf for a labor based on attributes... which is a problem. There might not be too many labors like that, though.

I just assumed that since labors improve faster and more efficiently based on attributes, I would focus everything on attributes and let the labors work themselves out.

However, I'm having a problem with a broker script that gives me someone with no appraisal skill.

And then there's skills.

So something needs to be done to ensure the proper order and weights are applied.

Kogut · « **Reply #1382 on:** March 11, 2012, 04:07:02 am »

BTW, traits and attributes are without any effect on quality of items produced by legendary dwarves.

thistleknot · « **Reply #1383 on:** March 11, 2012, 04:23:00 am »

Quote from: Kogut on March 11, 2012, 04:07:02 am

BTW, traits and attributes are without any effect on quality of items produced by legendary dwarves.

That could be used as an overriding factor. In fact, if a dwarf is close to legendary status, maybe an if/else statement can be used to change the course of a decision to favor such a dwarf. But for all the new hires, I think the trait/attribute approach is a good fit.

Update:
Can't sleep, clowns will eat me.

I simplified the formula for those following.

Formula

for traits:
(trait-50)*
(average((each attribute's mean - attribute minimum)+(each attribute's max - mean)/2))/50
*weight

for attributes:
average((each attribute - attribute's average)*weight)>
((15% of all attributes averages)*(averages of all weights))

When adding traits and attributes, treat the trait as an additional attribute (i.e. include its weight in the averages of all weights, and include it in the average((each attribute...)...)).

God so tired.

splinterz · « **Reply #1384 on:** March 11, 2012, 05:47:45 am »

Quote from: thistleknot on March 11, 2012, 03:10:50 am

What is this? I see categories and percents, but what do the percents represent? A distribution curve?

Code: (dgc strength function. x = strength)

    var xpct;
    if (x <= 949) {
        xpct = ((0.0333988 * x) - 15.0295)
    } else if (x >= 950 && x <= 1149) {
        xpct = ((0.0837487 * x) - 62.8943)
    } else if (x >= 1150 && x <= 1249) {
        xpct = ((0.168333 * x) - 160.249)
    } else if (x >= 1250 && x <= 1349) {
        xpct = ((0.168343 * x) - 160.429)
    } else if (x >= 1350 && x <= 1549) {
        xpct = ((0.0837538 * x) - 46.4006)
    } else if (x >= 1550) {
        xpct = ((0.0238412 * x) + 46.3801)
    };
    return xpct;

from dgc:

Quote

I weight all attributes attached to a skill equally, and assume that a high percentile in one cancels out an equally low percentile in another. I scale attributes to percentiles in the six default ranges, and let the lines of the first and last ranges run off infinitely, meaning very high and very low attrs slowly stop making sense. For instance you get default strength of 100% at 2250, but a strength of 5000 will only report 166%.
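The figures in that quote can be checked against the posted coefficients. Below is a minimal sketch, assuming the six segments are exactly the ones in the strength code above; the names `STRENGTH_SEGMENTS`, `strengthPct` and `skillRating` are made up for illustration and are not taken from DGC.

```javascript
// Piecewise-linear strength mapping. Coefficients are copied from the
// code posted above; each entry is [inclusive upper bound, slope, intercept].
const STRENGTH_SEGMENTS = [
    [949, 0.0333988, -15.0295],
    [1149, 0.0837487, -62.8943],
    [1249, 0.168333, -160.249],
    [1349, 0.168343, -160.429],
    [1549, 0.0837538, -46.4006],
    [Infinity, 0.0238412, 46.3801], // last range runs on indefinitely
];

// Hypothetical helper name: returns the percentage for a strength value.
function strengthPct(x) {
    for (const [upper, slope, intercept] of STRENGTH_SEGMENTS) {
        if (x <= upper) return slope * x + intercept;
    }
}

// Equal-weight average of the percentages of all attributes tied to a
// skill, so a high value in one cancels an equally low value in another.
function skillRating(pcts) {
    return pcts.reduce((a, b) => a + b, 0) / pcts.length;
}

console.log(strengthPct(2250).toFixed(1)); // 100.0
console.log(strengthPct(5000).toFixed(1)); // 165.6
console.log(strengthPct(1500).toFixed(1)); // 79.2
```

Because the first segment is also extended indefinitely, very low values come out negative with these coefficients (a strength of 0 gives about -15%).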

thistleknot · « **Reply #1385 on:** March 11, 2012, 06:17:53 am »

If you guys are interested, here are my scripts.

I wish I could figure out the standard deviations (then I could use them for threshPct instead of just basing threshPct on the mean itself). I don't know if I can average the standard deviations across all the attributes.

http://dffd.wimbli.com/file.php?id=5850

Super late update (Jan 25, 2013):
Yes, yes you can...

you can use root mean square to combine standard deviations

man, I love answering my past
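A minimal sketch of the root-mean-square idea, with one caveat worth stating: RMS reproduces the standard deviation of the combined data only when the groups are the same size and share the same mean. When the means differ, the spread between the means has to be added in as well. The function names and the group layout (`n`, `mean`, `sd`) are assumptions for illustration, and `sd` is taken to be a population standard deviation.

```javascript
// Root mean square of several standard deviations.
function rmsStd(sds) {
    const meanSquare = sds.reduce((acc, s) => acc + s * s, 0) / sds.length;
    return Math.sqrt(meanSquare);
}

// Population standard deviation of the combined data, computed from
// per-group summaries { n, mean, sd } without needing the raw values.
function pooledStd(groups) {
    const total = groups.reduce((acc, g) => acc + g.n, 0);
    const grandMean = groups.reduce((acc, g) => acc + g.n * g.mean, 0) / total;
    const variance = groups.reduce(
        (acc, g) => acc + g.n * (g.sd * g.sd + (g.mean - grandMean) ** 2),
        0
    ) / total;
    return Math.sqrt(variance);
}

// Same size, same mean: both approaches agree.
const sameMean = [
    { n: 100, mean: 1000, sd: 300 },
    { n: 100, mean: 1000, sd: 400 },
];
console.log(rmsStd([300, 400]).toFixed(1), pooledStd(sameMean).toFixed(1));
```

If the group means differ, `pooledStd` comes out larger than `rmsStd`, so the plain RMS should be read as a lower bound in that case.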

AnnanFay · « **Reply #1386 on:** March 11, 2012, 07:43:37 am »

I feel I should step into the discussion to say some things.

First of all, splinterz's above quote is correct. I've not changed the README; however, my current version of DGC still uses the old formula (though refactored!).

This formula assumes that the information in the wiki is correct, and without seeing more well-documented tests with evidence I think it is correct. Attributes get raised by using skills, and I believe thistleknot used dwarves from two large fortresses, which would have seen attribute growth.

The DGC formula takes an attribute that falls within the generated range and maps it from 0-100. This roughly is how much of the population this is better than. If it's 80% it means that given 100 generated dwarves this will be above ~80 of them.

This is a very good formula to use when embarking! However it suffers huge problems as soon as you get immigrants, babies or any of your dwarves do anything since attributes will deviate from the generated averages.

If you have a very large fort you don't want to know how good your dwarves are above the generated average, you want to know how good they are in relation to all your other dwarves. Hence, two formulas are needed, one for embark and one afterwards.

A third formula may be useful and can also be used on any other creatures. This is to say that attributes are between 0-5000, so scale all attributes by dividing by 50. 100 will be the maximum value possible and 0 the minimum.

So yeah, I'm planning on having all three formulas in the next release of DGC. Hope this info helps people!
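The second and third ratings described in this post could look roughly like the sketch below. This is not DGC's code: `fortPercentile`, `absoluteRating` and the sample fort are hypothetical, and the fort-relative rating is assumed to mean "percentage of dwarves in the fort with a strictly lower value".

```javascript
// Rating relative to the current fort: the percentage of dwarves in
// `population` whose value is strictly lower than `value`.
function fortPercentile(value, population) {
    const below = population.filter(v => v < value).length;
    return 100 * below / population.length;
}

// Absolute rating: scale the full 0..5000 attribute range onto 0..100.
function absoluteRating(value, max = 5000) {
    return 100 * value / max;
}

// Hypothetical strength values for a five-dwarf fort.
const fortStrength = [450, 900, 1250, 1500, 2100];

console.log(fortPercentile(1500, fortStrength)); // 60
console.log(absoluteRating(1500));               // 30
```

The fort-relative rating changes whenever the population changes, while the absolute rating stays fixed for a given attribute value.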

chrisadams3997 · « **Reply #1387 on:** March 11, 2012, 08:24:00 am »

I'm only lightly following all this right now, got enough RL things to keep me busy, but I wanted to pop in and say I agree with splinterz that columns would be the stronger approach over filter scripts. Any formula should aim to result in a standardized range (i.e. 0-100), as splinterz's and AnnanFay's approaches do, for the numbers to really mean anything and to support visual drawing methods. The formulas themselves I'll leave to you guys for now.

splinterz · « **Reply #1388 on:** March 11, 2012, 08:45:22 am »

Quote from: AnnanFay on March 11, 2012, 07:43:37 am

I feel I should step into the discussion to say some things.

First of all, splinterz's above quote is correct. I've not changed the README; however, my current version of DGC still uses the old formula (though refactored!).

This formula assumes that the information in the wiki is correct, and without seeing more well-documented tests with evidence I think it is correct. Attributes get raised by using skills, and I believe thistleknot used dwarves from two large fortresses, which would have seen attribute growth.

The DGC formula takes an attribute that falls within the generated range and maps it from 0-100. This roughly is how much of the population this is better than. If it's 80% it means that given 100 generated dwarves this will be above ~80 of them.

This is a very good formula to use when embarking! However it suffers huge problems as soon as you get immigrants, babies or any of your dwarves do anything since attributes will deviate from the generated averages.

If you have a very large fort you don't want to know how good your dwarves are above the generated average, you want to know how good they are in relation to all your other dwarves. Hence, two formulas are needed, one for embark and one afterwards.

A third formula may be useful and can also be used on any other creatures. This is to say that attributes are between 0-5000, so scale all attributes by dividing by 50. 100 will be the maximum value possible and 0 the minimum.

So yeah, I'm planning on having all three formulas in the next release of DGC. Hope this info helps people!

hi! thanks for chiming in on this, much appreciated! alright so as i've shamefully admitted i am terrible at the maths, but it seems to me that the reason for having multiple formulas is due to ratings being generated relative to other dwarves in a fort, relative to the generated levels or relative to the absolutes. now you've been doing this much longer than i have so perhaps you can explain a bit more why this is necessary rather than just rating based on the wiki page's min/max values?

for example if i have a dwarf with 1500 strength, dgc says that he's 79% better than the generated ranges, correct? so if you applied the same thing to 0-5000 rather than the generated, obviously the rating percentage would be lower, but wouldn't they still have the same overall rank relative to the other dwarves?

am i just tired and missing something obvious here, or is it simply a matter of preference?
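One way to look at this question: for a single attribute, any increasing mapping keeps the same order, so the rank would not change. Once several attributes are averaged into one rating, the two scales can disagree. The sketch below is only an illustration under loud assumptions: both attributes are pretended to share the strength ranges posted earlier (in the game each attribute has its own ranges), and the two dwarves are invented.

```javascript
// Strength segments from the code posted earlier in the thread:
// [inclusive upper bound, slope, intercept].
const SEGMENTS = [
    [949, 0.0333988, -15.0295],
    [1149, 0.0837487, -62.8943],
    [1249, 0.168333, -160.249],
    [1349, 0.168343, -160.429],
    [1549, 0.0837538, -46.4006],
    [Infinity, 0.0238412, 46.3801],
];

// Percentage relative to the generated ranges (hypothetical helper name).
function genPct(x) {
    for (const [upper, slope, intercept] of SEGMENTS) {
        if (x <= upper) return slope * x + intercept;
    }
}

const mean = xs => xs.reduce((a, b) => a + b, 0) / xs.length;

// Two invented dwarves with two attributes each.
const dwarfA = [1250, 1250]; // average in both
const dwarfB = [450, 2100];  // very low in one, very high in the other

const pctA = mean(dwarfA.map(genPct));      // about 50.0
const pctB = mean(dwarfB.map(genPct));      // about 48.2
const absA = mean(dwarfA.map(x => x / 50)); // 25
const absB = mean(dwarfB.map(x => x / 50)); // 25.5

console.log(pctA > pctB, absA > absB); // true false
```

Here dwarf A ranks higher on the generated-range scale and dwarf B ranks higher on the flat 0-5000 scale, so with averaged attributes the choice of scale is more than a matter of preference.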

thistleknot · « **Reply #1389 on:** March 11, 2012, 09:21:02 am »

I beseech anyone to do a test of their dwarves' attributes (I used Advanced XML Converter, free for trial use, and scalc to get it done). Some of the attributes were averaging LOWER than the wiki's! So do dwarves lose attributes too? Anyway, that is why I was proposing an average based on the current population of dwarves rather than the wiki. However, since I have had two independent samples arrive at the same averages, I'm sticking with my averages until I see a different sample that proves the wiki is more accurate.

Also, using the current population's average makes sense (if the average can indeed drift from the wiki's). I feel it would be better tuned to your current population and would avoid arguments over what the "correct" averages are.

I also have a recommendation for how to get a % from my formula using statistics. I wasn't sure how to combine the different attributes, but I believe I know how to do it correctly. You basically combine the attribute arrays into one super array of values and derive the mean and standard deviation from that; then, using elementary statistics, you can derive what % you're at relative to the mean (this assumes a bell curve, WHICH THE WIKI SAYS THERE ISN'T, but from my sample it's plainly obvious that there is a bell curve of some kind). The only issue I see is if the distribution curve is skewed to the left or the right; I don't remember if that matters when using standard deviation and want to look into it. However, I hope to have an answer today. I'm also hoping that I can just average the standard deviations of all the attributes involved and get a new standard deviation to apply to the group of attributes tied to a specific role, rather than having to derive it from one large super sample of all the involved attributes.

UPDATE: Using a 340+ sample of dwarves, averaging standard deviations of 315.15 and 400.67 gave me 357.91, while combining the two samples into one and taking its standard deviation gave 367.1, so I guess the averaging gave me a bias of about 2.5%.

Upon further reading, I'm finding that I shouldn't average the standard deviations but use the entirety of the dataset, something I don't know how to do with the Dwarf Therapist script engine (it took me forever to figure out how to declare a var!).

I recommend using standard deviation, it's quite easy to find.
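For reference, a minimal sketch of a population standard deviation (dividing by the number of values rather than by one less). The function names are illustrative and the data is a textbook-style toy set, not dwarf data.

```javascript
// Arithmetic mean of an array of numbers.
function mean(xs) {
    return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Population standard deviation: square root of the mean squared
// deviation from the mean (divides by n, not n - 1).
function stdevp(xs) {
    const m = mean(xs);
    return Math.sqrt(mean(xs.map(x => (x - m) ** 2)));
}

console.log(stdevp([2, 4, 4, 4, 5, 5, 7, 9])); // 2
```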

I have yet to re-figure out how to find a % along the distribution curve using standard deviation, but that should be easy. It's important that when getting a mean from a superset, we apply the weights to each set of attributes individually before we combine them into the superset; you can't apply weights to the superset afterwards (because they're all combined into one dataset). For example, with a strength weight of 1.2 and a willpower weight of 0.8, we would multiply the strength dataset by 1.2 and the willpower dataset by 0.8, then combine them into a superset, then find the mean and standard deviation.

I think I got it right. We then might need to take an individual dwarf's attributes and apply the weights to them before comparing against the superset (to make them match the superset weights); I'm not 100% sure on this. Weights make it confusing for me, but I do know we need to apply the weights to the datasets BEFORE combining them into one superset to get the mean and standard deviation.
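The weight-then-combine procedure described above can be sketched as follows. The sample values are invented, the weights 1.2 and 0.8 are the ones from the example, and applying the same weights to the individual dwarf before comparing is the step the post itself is unsure about, so treat that part as an assumption.

```javascript
const mean = xs => xs.reduce((a, b) => a + b, 0) / xs.length;
const stdevp = xs => {
    const m = mean(xs);
    return Math.sqrt(mean(xs.map(x => (x - m) ** 2)));
};

// Invented per-attribute samples for a small fort.
const strength = [900, 1250, 1600];
const willpower = [700, 1000, 1300];
const weights = { strength: 1.2, willpower: 0.8 };

// Weight each dataset first, then combine into one superset.
const superset = [
    ...strength.map(v => v * weights.strength),
    ...willpower.map(v => v * weights.willpower),
];
const mu = mean(superset);
const sigma = stdevp(superset);

// Distance of each weighted attribute of one dwarf from the superset
// mean, measured in standard deviations (assumes the dwarf's values
// are weighted the same way as the datasets).
function zScores(dwarf) {
    return [
        dwarf.strength * weights.strength,
        dwarf.willpower * weights.willpower,
    ].map(v => (v - mu) / sigma);
}

console.log(mu.toFixed(1), sigma.toFixed(1));
console.log(zScores({ strength: 1600, willpower: 700 }));
```

Note that multiplying a dataset by a weight multiplies both its mean and its standard deviation by that weight.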

Update:
Okay, I re-remembered how to use standard deviations. Unfortunately, there's no simple closed-form formula for it; you have to reference a table (which probably means coding it into an array, or something convenient for referencing). It's called the Table of the Standard Normal Distribution, aka the z-table (http://www.fmi.uni-sofia.bg/vesta/virtual_labs/tables/tables1.html). You take the attribute value you want to compare, measure how many standard deviations it is away from the mean (its z-value), and look that up in the table; the result is your % away from 50% (either above or below 50%). The table I supplied can be mirrored for the other side of 50% using the same z column; if that is confusing I can provide a more complete table, but it's simply counting down from 50% the same distance as it counts up from 50%.
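As an alternative to hard-coding the table into an array, the lookup can be approximated numerically. The sketch below uses the Abramowitz and Stegun polynomial approximation of the error function (formula 7.1.26, absolute error around 1.5e-7), which is a swapped-in technique and not something from the thread. The function names and the example mean and standard deviation are invented.

```javascript
// Error function, Abramowitz & Stegun approximation 7.1.26.
function erf(x) {
    const sign = x < 0 ? -1 : 1;
    const ax = Math.abs(x);
    const t = 1 / (1 + 0.3275911 * ax);
    const poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
        - 0.284496736) * t + 0.254829592) * t;
    return sign * (1 - poly * Math.exp(-ax * ax));
}

// Fraction of a standard normal distribution lying below z.
function normalCdf(z) {
    return 0.5 * (1 + erf(z / Math.SQRT2));
}

// Percentile of `value` given a mean and standard deviation,
// assuming the attribute is roughly normally distributed.
function percentile(value, mean, sd) {
    return 100 * normalCdf((value - mean) / sd);
}

// Invented example: one standard deviation above the mean.
console.log(percentile(1500, 1250, 250).toFixed(1)); // 84.1
```

A value exactly at the mean gives 50%, and values the same distance above and below the mean give percentages that add up to 100%, which matches the mirrored-table reading described above.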

As a note, the standard distribution curve is used for normal distributions, which dwarf attributes may not be made up of (there is a curve though). However... this method is probably the best we're going to get, and uses the same "normal" logic I was using with my min/max range method (that method was assuming a flat distribution vs a curve, but still symmetrical). Good thing I took statistics in High School, otherwise this stuff would have been harder for me.

Update:
I realized that multiplying a dataset by a weight and THEN finding its average ~~is the same as multiplying the dataset average by the weight (which means you can average all the new weighted averages to get the superset average)~~ (Update: no it's not; multiplying the distances from the mean means the stdevp is multiplied by the weight). However, a superset of all role attributes (after being weighted) still needs to be built to determine the variation from the new mean and properly figure the standard deviation.
The (sample size - 1) divisor can be skipped and the plain sample size used when dividing the summed squared deviations from the mean.

Example:
Agility average: 879.59
Analytical average: 1052.42
Creativity average: 1044.32
Sum: 2976.33
# of attributes: 3

Update:
My formulas had a messed-up weighted mean: I had addition occurring before multiplication, when it should be the other way around.

It's ((attribute mean * weight) + (attribute mean * weight) + ... (for each attribute)) / (# of attributes) = weighted mean.
((1250*1.1)+(900*1.2))/2 = 1227.5

Not
average of attribute averages: (1250 + 900)/2 = 1075
average of weights: 1.1 + 1.2 = 2.3; 2.3/2 = 1.15

multiply attribute average by weight average: 1236.25

The weighted mean WAS ONLY USED TO DETERMINE THE > threshold, which was a % (10-20%) of the weighted average.
So the impact on the formulas is minimal; the rest of the formula was sound.
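The two orders of operation from this update can be compared directly. A minimal sketch using only the numbers given above (attribute means 1250 and 900, weights 1.1 and 1.2); the variable names are illustrative.

```javascript
const avg = xs => xs.reduce((a, b) => a + b, 0) / xs.length;

const attributeMeans = [1250, 900];
const weights = [1.1, 1.2];

// Corrected order: weight each attribute mean first, then average.
const weightedThenAveraged = avg(attributeMeans.map((m, i) => m * weights[i]));

// Earlier order: average the means and the weights separately, then multiply.
const averagedThenWeighted = avg(attributeMeans) * avg(weights);

console.log(weightedThenAveraged.toFixed(2)); // 1227.50
console.log(averagedThenWeighted.toFixed(2)); // 1236.25
```

The gap between the two is under 1% here, consistent with the remark that the impact on the threshold is small.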

Update:
I think the original reason I requested a user definable column, was to be able to do my own sort. Whether I wanted to use flat out numbers, or %. That way I could create a script (hopefully) in dt, and then see my own new column that I can do a sort operation on.

thistleknot · « **Reply #1390 on:** March 11, 2012, 02:15:54 pm »

Apparently I was misusing the term "weighted mean". My formulas don't use weighted means, but they do weight one dataset over another, which does affect each dataset's mean.

Update:
I did a simple_embark from 31.25 for 1500 dorfs. Then I exported their attributes with Advanced XML Converter (I had to flip the .xml document internally to get the rest of the document exported since it's a trial version, but I verified the stats were not duplicated). The means, mins, and maxes matched my old sample and continued to confirm that the attribute wiki has incorrect dwarf median/mean attribute levels for a few attributes, which is important for those developing formulas for DT or DGC!!! (I don't mean to say the wiki is incorrect; I'm just wondering why the "center dwarven attribute" is not matching what the actual results are.)

Anyways, I'll include the pic as a followup pic for the other reference I have for the attribute wiki. Hopefully people catch on.

It's interesting to see the distribution isn't a bell curve, but it certainly is similar. The author of DF decided certain ranges were going to have more WEIGHT than others, for example, Agility has a higher range from 900-1100 than any other, and for Analytical, it's 900 to 1100 as well.

I'll update the post later with a link to the .ods spreadsheet I got this from.

Check it out:

The averages are off on a few of the attributes compared to the wiki!

http://dffd.wimbli.com/file.php?id=5854

Hell yeah, my formulas will hopefully be validated.

Although (because my formulas are best for a flat distribution curve)... I still recommend the standard deviation; otherwise you guys are going to have fun trying to mix different distributions of attributes together to come up with a % scale. If you mix enough attributes together, you will come up with a bell curve. So it's best to just use statistics to figure it out. Use standard deviation for future % values in Dwarf Therapist. Statistics is quite easy once you figure out the z-table.

Icefire2314 · « **Reply #1391 on:** March 11, 2012, 06:55:17 pm »

This is beautiful. I literally just opened it and I'm already in shock. Seriously, beautiful. Never used it before

Mike Mayday · « **Reply #1392 on:** March 11, 2012, 07:21:45 pm »

Dearest Engineers!
Is there a way to prevent social skills from showing in the tooltip when mousing over the name list?
Is it possible to make said name list colour coded according to profession?

Are there plans to allow for assigning to squads directly in DT?

Goncyn · « **Reply #1393 on:** March 11, 2012, 07:54:59 pm »

Quote from: Mike Mayday on March 11, 2012, 07:21:45 pm

Are there plans to allow for assigning to squads directly in DT?

That would be beyond brilliant because it would remove the most common source of trying to pick a dwarf from the little tiny list in-game.

Baselope · « **Reply #1394 on:** March 11, 2012, 09:08:01 pm »

Goncyn, you just raised a great point. Maybe we are both doing this wrong: when making squads, what I do is nickname them in DT and then pick the nicknames out into squads in DF. It is quite tedious, but I'm not complaining, because DT is so awesome I am blinded to such minor imperfections.

Still, I will second that function to be on the list for our inspired dwarf utility crafters to consider worthy of their time. I hope that it isn't a technological nightmare as it would be a huge improvement. I'd definitely like it better than nobles/vampire or other types of information.
