Mind explaining in a bit more detail how you tested it? Did you pit one dwarf wearing a set of arms and armour against successive goblins until the dwarf died, while putting another dwarf in a different set of arms and armour through the same goblins? How well armoured were the goblins, what were their stats, etc.?
Sure thing:
A Lethal Dose 50 (LD50) is considered to be the most reliable test when it comes to the toxicity of chemical substances. It is, however, frowned upon in the scientific community due to ethical concerns, which we need not worry about in DF.
The setup is the following: you take a population of rats or mice (say 100) and increase the dose of the tested chemical until it kills exactly 50% of them. The dose needed to reach that result is the LD50 value of that chemical.
So I conducted a sort of reversed LD50 test with a standardized goblin (competent in all combat-related skills and clad in copper).
I first took 10 dwarves armored in copper, each wielding a single weapon plus a shield, and ran tests with increasing numbers of goblins until I could reliably kill 50% of the dwarves. Then I repeated the process with dual wielders.
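For anyone who wants the procedure spelled out, here is a rough Python sketch of the escalation loop. It is only an illustration: the real runs were done in-game, and the names `find_ld50`, `make_toy_trial`, `repeats` and `max_goblins` are made up for this example. The toy combat model in particular is invented and has nothing to do with DF's actual combat mechanics; it just stands in for "run the fight and count dead dwarves".

```python
import random
from typing import Callable, Optional


def find_ld50(run_trial: Callable[[int], int],
              squad_size: int = 10,
              repeats: int = 20,
              max_goblins: int = 100) -> Optional[int]:
    """Return the smallest goblin count whose mean dwarf death rate reaches 50%.

    run_trial(n_goblins) must return how many dwarves of the squad died.
    `repeats` (an assumption, not from the original test) averages several
    fights per goblin count so that the 50% mark is reached "reliably".
    """
    for n_goblins in range(1, max_goblins + 1):
        deaths = sum(run_trial(n_goblins) for _ in range(repeats))
        if deaths / (repeats * squad_size) >= 0.5:
            return n_goblins
    return None  # never reached 50% deaths within max_goblins


def make_toy_trial(kill_chance_per_goblin: float,
                   squad_size: int = 10,
                   seed: int = 0) -> Callable[[int], int]:
    """Build a stand-in fight. INVENTED model, not DF combat: each goblin
    independently has a fixed chance of killing any given dwarf."""
    rng = random.Random(seed)

    def run_trial(n_goblins: int) -> int:
        p_death = 1 - (1 - kill_chance_per_goblin) ** n_goblins
        return sum(rng.random() < p_death for _ in range(squad_size))

    return run_trial


if __name__ == "__main__":
    # Two hypothetical loadouts with made-up kill chances, purely for demo.
    loadout_a = make_toy_trial(0.08)
    loadout_b = make_toy_trial(0.05)
    print("loadout A LD50:", find_ld50(loadout_a, repeats=200))
    print("loadout B LD50:", find_ld50(loadout_b, repeats=200))
```

The loadout that needs more goblins to reach 50% dead dwarves is the sturdier one, which is the comparison the test is after.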
This test has many advantages over simple pitting. The two obvious ones are: 1) it tests the dwarves under similar conditions, and 2) the testing conditions are closer to actual field use of the tested gear.
The test result gave a slight edge to dual wielders, but it was a very close call. It is not as overpowered as people seem to think - it is however viable as a strategy.
TL;DR: "How many standard goblins do you need to kill 50% of a population of X?" is basically the question my test aims to answer.