How many tiles would we need?
That depends on how many effects we can use to transform tiles (e.g. differently coloured versions of the same tile, superimposing another tile to indicate wear on items or different appearances of creatures, etc.).
How would they be indicated in the raws?
It would be practical to be able to reserve a row of tiles for each civ, so you could, for example, use position 45 for blacksmiths. If the rows were 100 long, 145 would then be dwarf blacksmiths, 245 human blacksmiths, 345 elven blacksmiths etc. The last one would of course be unused, but keeping the slot lets you maintain some structure, which will be useful for extensive mods.
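To make the arithmetic concrete, here is a rough sketch of that numbering scheme in Python. The row assignments and the function name are made up for illustration; only the row length of 100 and position 45 for blacksmiths come from the example above.

```python
ROW_LENGTH = 100  # tiles per civ row, as in the example above

# Hypothetical row assignments: row 0 is left free for shared tiles.
CIV_ROWS = {"dwarf": 1, "human": 2, "elf": 3}

BLACKSMITH = 45  # position within a row, same for every civ


def tile_index(civ, position):
    """Return the absolute tile number for a civ-specific tile."""
    if not 0 <= position < ROW_LENGTH:
        raise ValueError("position must fit inside one row")
    return CIV_ROWS[civ] * ROW_LENGTH + position


print(tile_index("dwarf", BLACKSMITH))  # 145
print(tile_index("human", BLACKSMITH))  # 245
print(tile_index("elf", BLACKSMITH))    # 345
```

The cost is that unused slots (like 345) still take up space in the image, which is the trade-off mentioned above.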
Here's how I picture it working most efficiently. First, I'm pretty sure that no matter what is added, Toady still intends to keep support for the ASCII set. That means each release will come by default pretty much as you see it now, so I don't see anything being added that would break that support. That still leaves a lot of room to work with, but keep it in mind.
The first stage in adding graphics support will be extending the number of graphics we can use for objects. This is actually quite easy to do and shouldn't have to rely on a specific size for an image file. What I first thought of was adding an extra entry on all objects in the raws for an alternate graphics tile. If graphics are enabled, the game uses the graphics entry; otherwise it uses the default ASCII entry. If an object has no graphics entry at all, it uses the entry for the parent object. If that has no entry either, it falls back to the default ASCII character.
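The lookup order I have in mind could be sketched like this in Python. The class and field names are invented for the example and are not anything from the game. I've also assumed the lookup keeps walking up through grandparents rather than stopping at the immediate parent.

```python
class GameObject:
    """Hypothetical raw object with an ASCII char and an optional tile."""

    def __init__(self, name, ascii_char, graphics_tile=None, parent=None):
        self.name = name
        self.ascii_char = ascii_char
        self.graphics_tile = graphics_tile  # None means "no graphics entry"
        self.parent = parent


def resolve_tile(obj, graphics_enabled):
    """Pick what to draw: own tile, then an ancestor's tile, then ASCII."""
    if graphics_enabled:
        node = obj
        while node is not None:
            if node.graphics_tile is not None:
                return node.graphics_tile
            node = node.parent  # assumption: keep climbing past the parent
    # Graphics off, or nothing in the chain has an entry.
    return obj.ascii_char


creature = GameObject("CREATURE", "c", graphics_tile=10)
dwarf = GameObject("DWARF", "d", parent=creature)
rock = GameObject("ROCK", "*")

print(resolve_tile(dwarf, True))   # 10, inherited from the parent
print(resolve_tile(dwarf, False))  # d
print(resolve_tile(rock, True))    # *
```

The nice part is that a half-finished graphics set still works: anything without an entry just shows up as it does today.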
Of course, with all the graphics sets out there and everybody having their own idea of how they should be arranged and how many tiles there should be, adding all these entries into the raws is going to get rough. What I suggest is that all the graphics entries go in a single, separate file that can be included with the graphics set and even go in the same folder. This would be the simplest way for the end users and the graphics artists, and would cause the fewest conflicts with mods.
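As a rough idea of what such a file might look like and how little it takes to read one, here is a Python sketch. The `GRAPHICS_TILE` token and the object names are purely hypothetical. I've only borrowed the bracketed `[TOKEN:arg:arg]` look of the existing raws and assumed that text outside brackets is ignored.

```python
import re

# Hypothetical contents of a graphics raw file shipped with a tile set.
SAMPLE = """\
graphics_example

[GRAPHICS_TILE:BLACKSMITH_DWARF:145]
[GRAPHICS_TILE:BLACKSMITH_HUMAN:245]
text outside brackets is treated as a comment
"""

# Matches [GRAPHICS_TILE:<object name>:<tile number>]
TOKEN = re.compile(r"\[GRAPHICS_TILE:([A-Z0-9_]+):(\d+)\]")


def load_tile_table(text):
    """Build an object-name -> tile-number table from the file text."""
    return {name: int(index) for name, index in TOKEN.findall(text)}


table = load_tile_table(SAMPLE)
print(table["BLACKSMITH_DWARF"])  # 145
```

Swapping graphics sets would then mean swapping the image and this one file, without touching the object raws at all.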
I think the next logical step would be to add support for things like ramps, which would need multiple base tiles to look right. At the same time Toady could add support for water/ground boundaries, which is pretty much the same thing. In fact, it's the same way smoothed walls work, so it shouldn't be too difficult to modify that code to work with more stuff. Some more complex raw entries would be needed.
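For anyone unfamiliar with the idea, one common way to pick a tile variant from its surroundings is a neighbour bitmask. The Python sketch below shows that general technique under my own assumptions (a four-direction check, variants stored right after a base tile). I don't know how the game's wall-smoothing code is actually written.

```python
NORTH, EAST, SOUTH, WEST = 1, 2, 4, 8

# Offsets are (dx, dy) with y growing downwards, like rows in a text map.
DIRECTIONS = ((NORTH, (0, -1)), (EAST, (1, 0)), (SOUTH, (0, 1)), (WEST, (-1, 0)))


def neighbour_mask(grid, x, y, kind):
    """Set one bit for each orthogonal neighbour that is of the given kind."""
    mask = 0
    for bit, (dx, dy) in DIRECTIONS:
        nx, ny = x + dx, y + dy
        inside = 0 <= ny < len(grid) and 0 <= nx < len(grid[ny])
        if inside and grid[ny][nx] == kind:
            mask |= bit
    return mask


def variant_tile(base_tile, mask):
    """Assumption: the 16 variants sit in consecutive slots after base_tile."""
    return base_tile + mask


# '~' is water, '.' is ground.
grid = ["~~.",
        "~..",
        "..."]

mask = neighbour_mask(grid, 1, 1, "~")
print(mask)                     # 9: water to the north and west
print(variant_tile(200, mask))  # 209
```

Ramps would use the same lookup with a different neighbour test (higher ground instead of water), which is why the raw entries would need to name a whole group of tiles rather than a single one.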
I figure that's enough to build off of. A lot of this stuff isn't very difficult to do, but it can be time-consuming, which is probably why it hasn't been done already. I'm sure Toady's philosophy is that gameplay takes precedence over pretty graphics, and I'm not going to argue with him over that. But this is definitely something that needs to get worked on eventually.
Edit: I guess I really didn't address your post very well. The point I was trying to get across is that, if it is done right, where the graphic tiles for dwarven/human/elven blacksmiths or whatever are located in the image file will be entirely up to the person making those tiles. It will just take them an extra step to add or update an entry in the graphics raw file for that particular object.