Once you see the tiles represented in context for the first time, a memory is formed and they become identifiable at a glance. It's no more abstract than colored text, and can be less so.
Claiming they're difficult to understand due to unfamiliarity is like claiming a language is hard to speak because you never studied it.
I'm not convinced that's true. Assuming you are familiar with the Roman alphabet, a letter carries information even if you've never seen it used that way before. I.e., a 'B' is going to be something large that begins with B, like a Bear. A 'g' is going to be something small that begins with g, like a goblin or groundhog. Thus, there are two pieces of information:
(1) The letter tells us what type of creature it probably is.
(2) The capitalization, or lack thereof, tells us about its size. (Capitals are human-sized or larger; lowercase letters are smaller than that.)
Further, those who have played rogue-like games will recognize some symbols right off (k for kobold, for instance, or D for Dragon). Honestly, I got a little confused when I first played because DF *didn't* use the *band system. But in some ways the DF system is probably more intuitive for those who haven't played a rogue-like, although it reduces the amount of information each letter conveys (when g can be both groundhogs and goblins, seeing a g is no longer as informative as it could be).
Tiles are uninformative because a tile carries no information until you have established a specific memory of it. Tiles have the same problem in rogue-likes (and are generally disfavored by that community for that reason).
Finally, what does a tileset do when creatures have weapon specialties? Are those all different tiles? Isn't it easier to glance at a color and know it's a hammer goblin or a spear goblin instead of trying to make out what its tiny weapon is? (I get confused every time I see screenshots with tiles.)
Basically, my point is that the abstraction of text is an advantage that aids in comprehension, and not a disadvantage at all.