It's much cooler this way. You can add your own symbols and words to mine and we'll eventually have a complete, different language and symbol set, leading to very different names as it already does... assuming nobody just copies and pastes words and symbols en masse.
It may be cool, but it isn't as practical. Modifying the vanilla files would probably have been a better idea, because if someone creates a custom language for their creatures, then when more words are added, that file wouldn't be updated. With the current version, there is barely anything to build a language around unless someone dedicates a major amount of time to putting in all of the words. And if someone covers their bases and adds words that are yet to be implemented, without the copy-paste, there will be a large errorlog.txt each time.
Mm. Thing is, starting from a modified vanilla file wouldn't actually help... we would have the same problem anyway, even if we kept the vanilla language file and only added to it. If people add new word definitions (and I would prefer that), those words become part of ALL languages immediately. Translations for them must then be added to every language (translation) that exists, regardless of whether it's based on the vanilla files or written from scratch, or it will start looking glitchy. Random language gen utilities would be equally useless, as the new words would have to be manually added to any randomly generated language too. The only difference would be that we would also have the vanilla words drowning out all our own words, just as vanilla creatures would bury our creatures under their sheer numbers if they were there.
The only way to completely avoid the problem would be to freeze the language from the start and not let people add words, but that's just so limiting. For example, expanding the NATURE symbol with your new creatures is just good all around.
What to do? I'm thinking we could add a rule stating that if you add a word definition, you also add a translation for it to every existing language. It's as simple as copying and pasting it, in the form T_WORD:WORD_ID:translation, into that language's file. That means languages will eventually contain words added by many people, but it doesn't impose any real constraints.
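If anyone wants to check that rule mechanically, here is a rough sketch of how it could be done. It assumes word definitions appear as [WORD:WORD_ID] tokens and translations as [T_WORD:WORD_ID:translation] tokens; the function names and the sample word GLOWFROG are made up for illustration, so adjust it to whatever the files actually look like.

```python
import re

# Assumed token shapes (not verified against every raw file):
#   word definition:  [WORD:WORD_ID]
#   translation:      [T_WORD:WORD_ID:translation]
WORD_RE = re.compile(r"\[WORD:([^:\]]+)\]")
T_WORD_RE = re.compile(r"\[T_WORD:([^:\]]+):([^\]]*)\]")


def defined_words(words_text):
    """Return the set of word IDs defined in a word-definition file."""
    return set(WORD_RE.findall(words_text))


def translated_words(language_text):
    """Return the set of word IDs that have a translation in a language file."""
    return {word_id for word_id, _translation in T_WORD_RE.findall(language_text)}


def missing_translations(words_text, language_text):
    """Return a sorted list of word IDs that are defined but not translated."""
    return sorted(defined_words(words_text) - translated_words(language_text))


if __name__ == "__main__":
    # Tiny in-memory example; GLOWFROG is a hypothetical newly added word.
    words_text = "[WORD:ABBEY]\n[WORD:ACE]\n[WORD:GLOWFROG]\n"
    language_text = (
        "[TRANSLATION:MYLANG]\n"
        "[T_WORD:ABBEY:kulet]\n"
        "[T_WORD:ACE:tosid]\n"
    )
    for word_id in missing_translations(words_text, language_text):
        print("missing translation for", word_id)
```

Run against the real word file and each language file, this would list exactly the words that still need a T_WORD line, which is the part that would otherwise end up in errorlog.txt.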