Andeerz: yea, been on the topic for a decade.
I think what Bumber mentioned would likely be the most dynamic and cost-efficient approach to start with; handling all the special cases would then require additional work. This system could also tie into dwarves treating places as dangerous, such as miners being afraid of sections with cave-ins and so on, eventually touching on the feeling-of-safety feature.
Reinforcement learning can, in general, solve problems like finding the safest route through a trapped maze, but the trouble with using machine learning is that the system won't "align" with our intentions very well, and odd behaviour will be the rule rather than the exception. As an example, an RL agent made to play Tetris eventually figured out a sneaky tactic to maximize its reward: once it was about to lose, it hit the pause button and stayed there forever. At least that way it didn't incur the heavy penalty for losing. And that took tens of thousands of training iterations, too; we wouldn't want the goblins finally learning from their mistakes a hundred sieges in. (Yes, there are methods to speed this up, but they usually involve computationally expensive techniques. In some cases you'd need a high-end PC calculating the next move for up to a minute, which is too long to wait in a game.)
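For the curious, the trapped-maze case can be sketched with tabular Q-learning. Everything here (the maze layout, the reward values, the training parameters) is made up purely for illustration; it just shows why this works on toy mazes but scales badly, since every state-action pair needs thousands of visits.

```python
import random

# Hypothetical 3x3 maze: 'S' start, 'G' goal, 'T' trap, '.' open floor.
# The only safe route is down the left side, then along the bottom.
MAZE = [
    "S.T",
    ".T.",
    "..G",
]
ROWS, COLS = len(MAZE), len(MAZE[0])
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < ROWS and 0 <= nc < COLS):
        nr, nc = r, c  # bump into a wall: stay put
    cell = MAZE[nr][nc]
    if cell == "T":
        return (nr, nc), -10.0, True   # trap: heavy penalty, episode ends
    if cell == "G":
        return (nr, nc), 10.0, True    # goal reached
    return (nr, nc), -0.1, False       # small step cost favours short routes

def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Plain tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated value
    for _ in range(episodes):
        state, done = (0, 0), False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(4)  # explore
            else:
                action = max(range(4), key=lambda a: q.get((state, a), 0.0))
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q.get((nxt, a), 0.0) for a in range(4))
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q

def greedy_path(q, max_steps=20):
    """Follow the learned policy greedily from the start."""
    state, path = (0, 0), [(0, 0)]
    for _ in range(max_steps):
        action = max(range(4), key=lambda a: q.get((state, a), 0.0))
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path

q = train()
path = greedy_path(q)
print(path)  # a trap-free route ending at the goal (2, 2)
```

Note the 2000 episodes needed for a 3x3 grid; that's the "tens of thousands of iterations" problem in miniature, and it only gets worse as the state space grows.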
So overall, at least for now, we're better off handling the AI ourselves with heuristics and algorithms that capture the essential features, instead of letting an ML solution figure something out for us. The machine doesn't really understand what's important to us and what isn't. Even large language models (ChatGPT etc.) don't understand what the text "means"; they just imitate what humans have written previously.