For the two problems, we could have a RAW that defines some words so that Urist, Armok, Ber, Olan, Kogan and Usan are all still in the game.
I have good luck with dwarves named Mebzuth, myself. I don't know why.
As a student of linguistics, I support this thread. However, language is more complex than anyone who hasn't taken a couple syntax classes likely knows. Leaving English grammar in place is the least of all evils, I think. Even something as simple as going from SVO to SOV requires remodeling at deep structure level. Adding in procedural generation of syntax rules would be...intense. I wouldn't be at all disappointed or surprised if a robust language system proved to sit outside the bounds of the coveted goal of Modeling Everything. I'm not a programmer, though, and I'm only in my second year of being a Ling major, so maybe there's someone on the forum who knows some more on the subject than I do.
Words, though, okay. Phonology is easy (at least that's what the grades on my academic transcript tell me).
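To make the "words are the easy part" point concrete, here is a minimal sketch of a syllable-based word generator that also keeps a few hand-picked words fixed, along the lines of the RAW suggestion above. Everything in it is an assumption made for illustration: the sound inventories, the function names (make_syllable, make_word, make_lexicon), and the meaning attached to "urist" in the usage example. It is not how the game stores or generates its language files.

```python
import random

# Hypothetical sound inventory, invented for this sketch.
ONSETS = ["", "k", "g", "m", "b", "z", "th", "r", "st"]
VOWELS = ["a", "e", "i", "o", "u"]
CODAS = ["", "n", "r", "k", "th", "st"]


def make_syllable(rng):
    """Build one syllable as optional onset + vowel + optional coda."""
    return rng.choice(ONSETS) + rng.choice(VOWELS) + rng.choice(CODAS)


def make_word(rng, min_syllables=1, max_syllables=3):
    """Join a random number of syllables into a word."""
    count = rng.randint(min_syllables, max_syllables)
    return "".join(make_syllable(rng) for _ in range(count))


def make_lexicon(meanings, seed, reserved=None):
    """Assign a unique word to each meaning.

    Entries in `reserved` are kept exactly as given, so favourite
    words survive every regeneration of the world.
    """
    rng = random.Random(seed)  # same seed gives the same lexicon
    lexicon = dict(reserved or {})
    used = set(lexicon.values())
    for meaning in meanings:
        if meaning in lexicon:
            continue  # pinned word, leave it alone
        word = make_word(rng)
        while word in used:  # reroll on collisions
            word = make_word(rng)
        used.add(word)
        lexicon[meaning] = word
    return lexicon


if __name__ == "__main__":
    words = make_lexicon(
        ["axe", "dagger", "mountain"],
        seed=42,
        reserved={"dagger": "urist"},  # pinned entry, for illustration
    )
    for meaning, word in words.items():
        print(meaning, "=", word)
```

The generated words change with the seed while the pinned entry always stays "urist". Note that this only covers vocabulary; it says nothing about grammar, which is where the real trouble starts.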
Yeah, y'all should heed this. Language is way more complicated than you know. Even irregularities are, by their very nature, strange. This will rear its ugly head if any attempt is made to randomly generate a unique language in the truest sense of the word.
You might get far, but the most difficult thing about a language to simulate is the...I don't know if there is a term for it, but the feel of a language. The subtle interplay of vocabulary, connotation, definition, and the speaker's cognition that leads to how the language views and expresses reality is complex and very often indescribable. For instance, Spanish has a theme in its structure concerning the permanent/concrete/explicit and the impermanent/mutable/vague. There are two verbs for "to be," one for intrinsic, immutable things (ser) and one for general, relative, temporary things (estar). Furthermore, there are the indicative and subjunctive moods for verbs, which map similarly. Relative location words fall into this as well: you can say "here," "right there," both relating to things more or less in the immediate vicinity, and "over there," for things way the hell over there.
This thematic distinction in the language colors how you speak it and how you convey information, and it colors the vocabulary. How do you simulate such things randomly?
The problems get worse the more natural features of language you add. For instance, say the humans are speaking some inflecting language, kinda like Spanish or Latin (that is, words change depending on their grammatical purpose in a thought). Now you add a high rate of change to their language. Simulating the change is pretty easy; sound changes and contractions are always regular and follow patterns that a computer can easily understand. But that is the problem: the sound changes are regular, so if a specific sound change will screw up the possessive form of a whole category of words so they look like they are plural and part of a prepositional phrase, well, too bad. How will the computer deal with this? What humans do with it isn't even predictable; anything could happen. Hell, English has collapsed Dative and Accusative into one case, for Christ's sake.
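Here is a rough sketch of that exact problem, under made-up assumptions: the two sound-change rules, the noun "kogas" and its endings, and the function names (apply_sound_changes, find_mergers) are all invented for illustration and are not taken from any real language or from the game. The rules are applied blindly to every form, which is the easy part; the function then reports which grammatical slots have become indistinguishable, which is the part the computer has no obvious answer for.

```python
import re
from collections import defaultdict

# Hypothetical sound changes, applied in order to every word, no exceptions.
SOUND_CHANGES = [
    (r"[ms]$", ""),     # word-final m and s are lost
    (r"a[ie]$", "e"),   # word-final ai and ae both become e
]

# Invented paradigm for one noun: grammatical slot -> form before the changes.
PARADIGM = {
    "nominative_singular": "kogas",
    "accusative_singular": "kogam",
    "genitive_singular": "kogai",
    "nominative_plural": "kogae",
}


def apply_sound_changes(word, changes=SOUND_CHANGES):
    """Run every rule over the word, in order."""
    for pattern, replacement in changes:
        word = re.sub(pattern, replacement, word)
    return word


def find_mergers(paradigm, changes=SOUND_CHANGES):
    """Return {new_form: [slots]} for slots that now share one form."""
    by_form = defaultdict(list)
    for slot, form in paradigm.items():
        by_form[apply_sound_changes(form, changes)].append(slot)
    return {form: slots for form, slots in by_form.items() if len(slots) > 1}


if __name__ == "__main__":
    for form, slots in find_mergers(PARADIGM).items():
        print(form, "<-", ", ".join(slots))
```

With these rules the nominative and accusative singular both end up as "koga", and the genitive (possessive) singular ends up identical to the nominative plural, "koge". Detecting the collision takes a few lines; deciding what the speakers do about it (add a preposition, fix the word order, borrow an ending from another class) is the open-ended part.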
And then there is just the weird shit out there. Basque adds a word to every sentence composed of a particle for each word in the sentence that indicates the grammatical purpose and gender of that word. That'd be like "Red cars drive faster" becoming "Red cars drive faster nominative_feminine_adjective-nominative_feminine_noun-present_thirdPerson_verb-comparative," though in far fewer sounds. No other language does that, so how does one simulate strange-ass things like that? Should one even bother?
There are so many ways to build a language that it would be an epic undertaking in itself just to come up with a passable placeholder. Prepositions can't even be taken for granted: Latin uses two cases that more or less randomly associate with the prepositions, English just groups them with the noun phrases they modify, and Chinese probably dispenses with prepositions altogether. Polish has an extra half-dozen cases for prepositional statements alone.
If you want to get fancy, insert the noun inside the noun it is within for "on," or say the word "cat" a few tones below the word "table" which the cat is underneath.
For a game that simulates the tensile strength of materials, attempting to randomly generate language can't ignore these things; it would be against the spirit of the random content.
With that kind of possible complexity, it would be easier to just keep a static conlang for each race, or dub over English. If you're going to dub over English, why bother randomly generating it each time?