Well, without the fancy Greek and subscript notation, I think I suggested an experiment with exactly that (implicit) hypothesis.
I think lack of science is the problem, not lack of words...
I am willing to help with the testing. I have a pretty great machine, as well as a spare one that does literally nothing all day, that I could use for collecting data. I even have a background in statistics, so I would be happy to help analyse the data. I do not have a background in graphic design, so my analysis would consist of walls of numbers and/or text rather than pretty graphs like the ones used in the Bolts v. Armour research project. But before we get started on the research, we need to first define an exhaustive series of tests to be conducted, as well as control variables for these tests. For lack of a better metric, I am going to put forth an arbitrary armour set which I will use as the control for future testing, which I will generally designate as population 1.
Steel and Pigtail Fiber (Control Population)
☼steel helm☼
☼pigtail fiber hood☼
☼pigtail fiber hood☼
☼pigtail fiber hood☼
☼pigtail fiber hood☼
☼pigtail fiber hood☼
☼pigtail fiber hood☼
☼steel mail shirt☼
☼steel mail shirt☼
☼steel mail shirt☼
☼steel breastplate☼
☼pigtail fiber cloak☼
☼pigtail fiber cloak☼
☼pigtail fiber cloak☼
☼pigtail fiber cloak☼
☼pigtail fiber cloak☼
☼pigtail fiber cloak☼
☼steel gauntlets☼
☼pigtail fiber mittens☼
☼steel greaves☼
☼pigtail fiber trousers☼
☼pigtail fiber trousers☼
☼steel high boots☼
☼pigtail fiber socks☼
I do not include a shield or weapons in the armour set, because blocks and parries add unnecessary variables to the testing of the rigid armour. I will similarly use dwarfs with no skill in armour user or dodger, so that the variable being tested is isolated. In an actual combat situation, a given dwarf would obviously want as high a skill as possible in armour user and dodger, and to have a shield and one or more weapons, to provide extra opportunities to reduce incoming damage.
Our first step would be to establish a large volume of data about this control population. Ideally multiple samples with large sample sizes would be run through the arena, against groups of enemies in varying numbers and of varying types. This is the part of the testing that makes me somewhat wary, because I do not know how best to simulate a combat encounter. While a simple 100 v. 100 death match in which we record how long it takes for the dwarfs to eventually die to 100 stray kittens (or demons, or whatever) might give us a large volume of data, I wonder how specific that data would be. I also wonder which enemies we should test. While we could conceivably test every single enemy, for my initial testing I am going to limit the attackers to humanoids. I am thinking that for each run of the experiment, that is, for each armour set we want to test, we should do individual tests against battle axes, great axes, flails, halberds, large daggers, long swords, maces, mauls, picks, pikes, scimitars, scourges, short swords, spears, two handed swords, war hammers, and whips; and that each of these tests should be conducted twice, first with the attackers at the Proficient skill level and second with the attackers at the Great skill level. With 17 weapon types at 2 skill levels, this would give us 34 data sets for a given run of the experiment.
This is all just to establish a single sampling of the control population, for the above Steel and Pigtail Fiber armour set. Ideally the exact same experiment would be run several times so that we would have multiple samples from which to extrapolate population statistics. Conducting actual testing would require that an alternate armour set, preferably changing only one variable at a time (e.g., plant fiber clothes v. silk clothes, adamantine rigid armour v. steel rigid armour, or adamantine mail shirts v. steel mail shirts), be created and run through the same experiment; again, more samples are better, so that we can extrapolate population statistics. At the end of several rounds of testing, once we have extrapolated population statistics rather than sample statistics, we would arrive at a falsifiable hypothesis; for example, "the mean number of short sword strikes that an attacker at the Proficient skill level requires to strike down a dwarf wearing steel and pigtail fiber (μ_1A = 3) is greater than the mean number of short sword strikes that an attacker at the Great skill level requires to strike down the same dwarf (μ_1B = 2)." That much is intuitive, and ought to be true; i.e., an attacker who is better at hitting things requires fewer hits to kill a thing. We would make further comparisons against every population statistic we calculate.
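To make that kind of comparison concrete, here is a rough sketch of how two samples of strikes-to-kill counts could be compared with Welch's unequal-variance t-test. The choice of Welch's test is my assumption, not something settled above, and the function name `welch_t` and the numbers in the two lists are made-up placeholders, not results from any arena run.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's t-test on mean(sample_a) - mean(sample_b).

    Each sample is a list of strikes-to-kill counts, one entry per dwarf.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance is the sample variance (n - 1 denominator).
    se_a = statistics.variance(sample_a) / n_a
    se_b = statistics.variance(sample_b) / n_b
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# PLACEHOLDER data, invented purely to show the call; not real test results.
proficient_attackers = [3, 4, 3, 2, 4, 3, 3, 5]
great_attackers = [2, 2, 3, 1, 2, 3, 2, 2]

t, df = welch_t(proficient_attackers, great_attackers)
print(f"t = {t:.3f}, df = {df:.1f}")
```

A large positive t would support the claim that Proficient attackers need more strikes than Great attackers; the statistic would then be looked up against a t distribution with the computed degrees of freedom. Strike counts are small integers, so with small samples a different test might be more appropriate.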
It also just occurred to me that to be really thorough, we ought to replicate the test several times: a full run each for a few different attacking weapon materials, e.g., copper, iron, steel, and adamantine. That would give us 136 individual 100 v. 100 simulations for a single sample.
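The arithmetic can be checked by enumerating the test matrix. This is only a sketch: the weapon, skill, and material lists come from this post, while the function name `test_matrix` and the dictionary layout are just illustrative choices.

```python
from itertools import product

WEAPONS = [
    "battle axe", "great axe", "flail", "halberd", "large dagger",
    "long sword", "mace", "maul", "pick", "pike", "scimitar", "scourge",
    "short sword", "spear", "two handed sword", "war hammer", "whip",
]
SKILLS = ["Proficient", "Great"]
MATERIALS = ["copper", "iron", "steel", "adamantine"]


def test_matrix(weapons=WEAPONS, skills=SKILLS, materials=MATERIALS):
    """List every (material, weapon, skill) combination as one simulation."""
    return [
        {"material": m, "weapon": w, "skill": s}
        for m, w, s in product(materials, weapons, skills)
    ]


matrix = test_matrix()
per_material = len(WEAPONS) * len(SKILLS)
print(per_material)  # 34 data sets for one weapon material
print(len(matrix))   # 136 simulations for one full sample
```

Each entry in `matrix` is one 100 v. 100 simulation to set up, so the list doubles as a checklist for a single sample of one armour set.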
edit: All this testing could take a while, since as far as I know you have to individually add creatures one at a time.