Wouldn't the way it works right now be the easiest? I.e., there's a CREATURE_TILE/CASTE_TILE token which is used to assign tiles by number, so that a creature with tile 195 will have ├ as its tile (or with Unicode, Ã, assuming the number is still read as base 10, as it is now, rather than base 16).
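To make the two readings of "tile 195" concrete, here's a quick Python sketch. It assumes the current tileset is laid out like code page 437 (which is what gives ├ at 195); the function names are made up for illustration.

```python
def cp437_glyph(tile: int) -> str:
    """Glyph at this index, assuming the tileset follows code page 437."""
    return bytes([tile]).decode("cp437")


def unicode_glyph(tile: int) -> str:
    """Glyph if the same number is read directly as a Unicode codepoint."""
    return chr(tile)


print(cp437_glyph(195))      # ├  (U+251C)
print(unicode_glyph(195))    # Ã  (U+00C3), the base-10 reading
print(unicode_glyph(0x195))  # ƕ  (U+0195), if "195" were read as hex instead
```

So the same token value picks a different glyph depending on which table and which base you assume, and existing raws would need translating either way.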
Well, it varies. First of all, moving to Unicode at all (if you're using actual Unicode, and not just treating "any int up to 0x10FFFF" as an encoding space for ids) would require sparse tilemaps, since the useful bits are scattered all over the codepoint space. That means one file per tile, if you want it to be any kind of usable.
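A rough sketch of what "one file per tile" could look like, in Python. The `U+XXXX.png` naming scheme and both function names are my own assumptions, not anything the game does today.

```python
from pathlib import Path


def tile_path(tile_dir: Path, codepoint: int) -> Path:
    """Hypothetical naming scheme: one image per codepoint, e.g. U+251C.png."""
    return tile_dir / f"U+{codepoint:04X}.png"


def load_sparse_tilemap(tile_dir: Path) -> dict[int, Path]:
    """Index only the tiles that actually exist on disk, keyed by codepoint."""
    tiles: dict[int, Path] = {}
    for path in tile_dir.glob("U+*.png"):
        try:
            codepoint = int(path.stem[2:], 16)
        except ValueError:
            continue  # not a hex codepoint, ignore the file
        if 0 <= codepoint <= 0x10FFFF:
            tiles[codepoint] = path
    return tiles
```

The point of the dict is that nothing is allocated for the million-odd codepoints nobody drew a tile for; a lookup miss just falls back to whatever default you like.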
If you integrate Pango support, then you don't need the tileset rewrite (you use fonts, and generate the pixmaps), but at the cost of one more library. If you want users to be able to substitute tiles with this, then either users get good at editing fonts (which may be workable), or you go to option 3:
Some mapping between internal IDs and (Unicode) tile IDs, probably in a textual config file. You need internal IDs, rather than just mapping one codepoint to another, to avoid aliasing issues (I want to map my custom creature to c and cats to the cat emoji; wat do? CREATURE_TILE/CASTE_TILE alone can't do this properly without modding the vanilla files, which someone who wants a drop-in mod would find disagreeable). This is actually already sort of done for trees, as I understand it - so not a large leap to make. If you use the existing CREATURE_TILE/CASTE_TILE stuff as defaults, this mapping file can be pretty tiny.
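Here's a minimal Python sketch of that mapping idea. The `ID = tile` file format, the `CREATURE:CAT` style IDs, and the function names are all invented for the example; the defaults dict stands in for whatever CREATURE_TILE/CASTE_TILE already says in the raws.

```python
def parse_overrides(text: str) -> dict[str, str]:
    """Parse 'INTERNAL_ID = tile' lines; blank lines and # comments are skipped."""
    overrides: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        internal_id, sep, tile = line.partition("=")
        if sep:
            overrides[internal_id.strip()] = tile.strip()
    return overrides


def resolve_tile(internal_id: str, defaults: dict[str, str],
                 overrides: dict[str, str]) -> str:
    """An override wins; otherwise fall back to the tile from the raws."""
    return overrides.get(internal_id, defaults[internal_id])


# Both creatures default to 'c' in this made-up example.
defaults = {"CREATURE:CAT": "c", "CREATURE:MY_MOD_BEAST": "c"}

# Only entries that differ from the raws need listing (U+1F408 is CAT).
overrides = parse_overrides("""
# my_tiles.txt
CREATURE:CAT = \U0001F408
""")

print(resolve_tile("CREATURE:CAT", defaults, overrides))
print(resolve_tile("CREATURE:MY_MOD_BEAST", defaults, overrides))
```

Because the key is the creature's ID and not its old tile, cats and the custom creature can share 'c' in the raws and still end up with different tiles, without touching the vanilla files.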
However, once you're using Pango, then supporting full-on graphemes becomes almost trivial - you just map to strings rather than individual codepoints, and Pango handles the differences in rendering. You need to validate them to ensure you only use the first grapheme in the string, but Pango has helpers for that kind of thing.
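To show what "only use the first grapheme" means, here's a deliberately rough Python approximation. This is not Pango's API and not the full Unicode segmentation algorithm (UAX #29): it only glues combining marks, variation selectors, and ZWJ-joined characters onto the first character, and it will get things like emoji skin-tone modifiers and Hangul jamo wrong. In practice you'd let Pango's own helpers do this. The function names are hypothetical.

```python
import unicodedata

ZWJ = "\u200d"  # zero width joiner, used in emoji sequences


def _extends(ch: str) -> bool:
    """Combining marks (this includes variation selectors) attach to the previous char."""
    return unicodedata.category(ch) in ("Mn", "Mc", "Me")


def first_cluster(text: str) -> str:
    """Approximate first grapheme: base char plus marks and ZWJ-joined chars."""
    if not text:
        return ""
    end = 1
    while end < len(text):
        ch = text[end]
        if _extends(ch):
            end += 1
        elif ch == ZWJ and end + 1 < len(text):
            end += 2  # keep the joiner and the character it joins
        else:
            break
    return text[:end]


def is_single_tile(text: str) -> bool:
    """True if the configured string is exactly one (approximate) grapheme."""
    return bool(text) and first_cluster(text) == text
```

With this, a config value like "e" followed by a combining acute accent counts as one tile, while "ab" would be trimmed to "a" (or rejected, if you'd rather be strict).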