I'm playing around with worldgens. If there's one thing that really seems to kill the speed, it's having a large number of sites that are connected to one another, especially by road. (Even when it is the same civilization that is trading amongst its own sites.)
When I use large maps with lower-than-normal population limits, and the civilizations don't expand aggressively for one reason or another, the simulation runs very quickly. I sometimes notice odd behavior where the speed slows down and then suddenly shoots back up, but since there's no reliable 3rd-party tool for probing legends mode at this time, I can't really trace what happened, although it seemed like there might have been a contraction in population.
Also, goblins are definitely easier on the complexity, at least as long as they aren't expanding or attacking anything.
I think another contributing factor may be that I tend to raise the x and y variance of the biome-altering map features like elevation, so that the map doesn't end up with one giant mountain blob taking up 1/3 of the map and instead has a chain of dozens of small mountainous regions. That gives dwarves far more chances to spread and come into contact with humans and elves.
Whether one of these map generations gets bogged down is extremely variable, though, as sometimes it ends up in a permanent Age of Myth with nobody really doing anything for 500 years.
EDIT: even making the population limits very small (200), while it seems to help some, does not stop a small region from experiencing a lot of slowdown. In fact, I find that medium or larger worlds generate faster with the same number of civilizations on the map, because the interaction between sites seems to be what drives this slowdown.
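To put a rough number on that hunch, here's a toy sketch. It is not how the game actually works internally; the site coordinates, the `contact_range` value, and the `pairs_in_contact` helper are all made up for illustration. It just counts how many pairs of sites sit close enough to interact when the same sites are packed into a small map versus spread over a map four times as wide.

```python
from itertools import combinations


def pairs_in_contact(sites, contact_range):
    """Count site pairs whose x and y offsets are both within contact_range.

    Hypothetical helper: "contact" here is a stand-in for trade or road
    links, not anything read from the game itself.
    """
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(sites, 2)
        if max(abs(x1 - x2), abs(y1 - y2)) <= contact_range
    )


# Made-up layout: the same 4 sites, packed tightly vs. spread out.
small_map = [(0, 0), (2, 1), (3, 3), (1, 4)]
large_map = [(x * 4, y * 4) for x, y in small_map]

print(pairs_in_contact(small_map, contact_range=5))  # 6: every pair is in range
print(pairs_in_contact(large_map, contact_range=5))  # 0: no pair is in range
```

With everything in range, n sites give n(n-1)/2 pairs, so the number of possible interactions grows much faster than the number of sites. That would at least be consistent with a crowded small map bogging down more than a roomy large one.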
One thing I am noticing, however, is that any time the world hits a Golden Age, the frame rate basically never recovers, while the super-fast generations all take place in the Age of Legends. Messing with the number of procedural supercreatures generated does not seem to have a direct impact on generation speed.