Bay 12 Games Forum

Author Topic: The amazing automatic language  (Read 5577 times)

PTTG??

  • Bay Watcher
  • Kringrus! Babak crulurg tingra!
    • http://www.nowherepublishing.com
The amazing automatic language
« on: June 29, 2009, 04:38:20 pm »

I was talking to Tahin about his dwarven translator (which, much as I would like to link to it, I will hold off on until he is ready to release), and I realized that we need a much larger language. While I realize that the current "X YZ A-B, the CD of E" is not permanent, I'm thinking on the scale of actual dictionaries.

In order to add a word to these dictionaries, Toady One would need to concoct each collection of dwarven letters (and human, and elven, and goblin), then tie it to any spheres it might have. If someone wants to mod in a race with a unique language, then they have to generate each word for that language.

My idea is thus: have a single word raw that stores all the English words that are in the game, along with all sphere relations, synonym relations, and anything else.
 
The in-game language equivalent would be generated randomly, using the sum of the values of the English letters as a seed. Thus the same English word would always lead to the same word within dwarven, within elven, within goblin, and within human. The generator could be fine-tuned and configured so that each language sounds different.
 
If a word only exists in one language, then a simple tag on that word such as [UNIQUEWORD:DWARVEN] could be added (e.g., a word like "a really feminine beard" would only be found in dwarven: boabkrund).
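A minimal sketch of the seeding idea above, in Python. The sound inventories, the function names and the two-syllable shape are all assumptions made for illustration; nothing here comes from the game's raws.

```python
import random

# Hypothetical sound inventories (onsets, vowels, codas), one per language.
INVENTORIES = {
    "DWARVEN": (["k", "b", "d", "r", "z", "th"], ["a", "o", "u", "e"], ["", "k", "n", "r", "st"]),
    "ELVEN": (["l", "f", "s", "th", "v"], ["a", "e", "i", "ie"], ["", "l", "n", "th"]),
}

def letter_sum(word):
    """Sum of letter values (a=1 ... z=26): the seed proposed in the post."""
    return sum(ord(c) - ord("a") + 1 for c in word.lower() if "a" <= c <= "z")

def translate(english, language, syllables=2):
    """Deterministically build a word for `english` in `language`."""
    onsets, vowels, codas = INVENTORIES[language]
    # A string seed keeps the result identical from run to run.
    rng = random.Random(f"{language}:{letter_sum(english)}")
    return "".join(
        rng.choice(onsets) + rng.choice(vowels) + rng.choice(codas)
        for _ in range(syllables)
    )

print(translate("beard", "DWARVEN"), translate("beard", "ELVEN"))
```

One thing the sketch makes visible: a plain letter sum gives anagrams such as "beard" and "bread" the same seed, so they come out as the same word. Seeding on the whole spelling instead would avoid that.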
 
 Problems:
English-centric: all languages have Germanic sentence structure, apparently. Potentially workaroundable.
Some favorite words would be lost: Urist, for instance. Please consider that while Urist might be lost (it may be possible to have custom words in addition to randomly generated ones), we will have hundreds or even thousands of times as many words available.
 
 Other Advantages:
Dialects possible: Small, localised language changes such as 30% more Rs or always having umlauts over Os would be far easier.
New races and new words easier.
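
The dialect idea could be sketched as a small post-processing pass over generated words. This is only an illustration: `apply_dialect`, its parameters and the reading of "30% more Rs" as "a 30% chance of an extra r after each vowel" are assumptions.

```python
import random

def apply_dialect(word, umlaut_o=True, extra_r_chance=0.3, seed="hill-dialect"):
    """Apply two hypothetical dialect rules to a lowercase word."""
    # Seed on dialect name plus word so a word always mutates the same way.
    rng = random.Random(f"{seed}:{word}")
    out = []
    for ch in word:
        if umlaut_o and ch == "o":
            ch = "ö"  # rule 1: always put an umlaut over o
        out.append(ch)
        # Rule 2: after a vowel, sometimes slip in an extra r.
        if ch in "aeiouö" and rng.random() < extra_r_chance:
            out.append("r")
    return "".join(out)

print(apply_dialect("kogan"))
```

Because the rules run on top of whatever the base language produces, the same pass would work for stock and generated vocabularies alike.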
A thousand million pool balls made from precious metals, covered in beef stock.

Mel_Vixen

  • Bay Watcher
  • Hobby: accidently thread derailment
Re: The amazing automatic language
« Reply #1 on: June 29, 2009, 05:16:01 pm »

Well, why can't we go the way where Dwarven, Elven and Goblin are stock languages, and the rather short-lived races (humans, kobolds), modded civs and animal-man tribes get a procedurally generated language?

Dialects, social-class language and similar things can be derived from the stock languages as well as from procedurally generated ones. The dialect words could just replace the normal ones.

In essence, I say we don't have to discard our beloved Dwarven to get additional procedural languages.

The problem with non-Germanic-structured languages is that translation gets much, much harder for the machine, since the structuring systems etc. are different, so I wouldn't mind if we stayed Germanocentric for a while.

Since every communication (right now) is text, I agree with the Latin alphabet (sorry, English speakers, you didn't invent the letters/the script), since I assume the words get presented as a phonetic rendering. Having unique symbols would just confuse the players. On the other hand, creating our own symbols isn't much of a problem if you handle it right, for example by showing the writing on two separate lines, where the letters/word symbols are on the first line and the actual English transcription is on the second.

We would also need a system of language handling that doesn't confuse the player. It's nice to play with different languages, but being a stranger in a strange land isn't much fun, because you don't understand the people you meet, nor do they understand you.

A way to learn a foreign language well enough to understand the people you are talking to, in a reasonable time, is also something we would need.
[sarcasm] You know what? I love grammar Nazis! They give me that warm and fuzzy feeling. I am so ashamed of my bad english and that my first language is German. [/sarcasm]

Proud to be a Furry.

Grek

  • Bay Watcher
Re: The amazing automatic language
« Reply #2 on: June 29, 2009, 05:49:59 pm »

Eventually, there will be a need for multiple languages for each race. Randomly generating words for each is a good idea. For the two problems, we could have a RAW that defines some words so that Urist, Armok, Ber, Olan, Kogan and Usan are all still in the game. The Germano-centrism isn't that big an issue once you stop to consider that the game is already going to have to parse grammar. Using non-Germanic grammar could actually be easier, as Toady could drop all of the things that make natural languages difficult to code, like multiple existential verbs, T-V distinctions, gendered words, missing pronouns, and other things like that.

Vattic

  • Bay Watcher
  • bibo ergo sum
Re: The amazing automatic language
« Reply #3 on: June 30, 2009, 02:27:53 am »

This isn't a bad idea at all, and by the sounds of things Toady did randomly generate the languages already in the game, then went over them to make sure it all looked OK and to make some edits of his own. Oh, and it's also going to get added for random game names when they get added (if Toady can get the AI playing well).

I know this would be a really basic way of messing with grammar, but couldn't different languages just change around the order in which certain types of words appear in sentences? The types of words being Subject, Verb and Object, along with Place, Manner and Time. I probably haven't thought this through properly, but I'd imagine something similar to this will be used if different grammatical systems get added to the game.
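
As a rough illustration of the slot-reordering idea, here is a naive sketch in which each language is just an ordering of role names. The role names, the `arrange` function and the example clause are made up for this example, and real syntax involves far more than shuffling slots.

```python
# Each "grammar" is simply the order in which roles are spoken.
SVO = ("subject", "verb", "object", "place", "manner", "time")
SOV = ("subject", "object", "place", "manner", "time", "verb")

def arrange(clause, order):
    """Join the parts of a clause in the order a language prescribes."""
    return " ".join(clause[role] for role in order if role in clause)

clause = {
    "subject": "the dwarf",
    "verb": "strikes",
    "object": "the goblin",
    "place": "in the cave",
}

print(arrange(clause, SVO))  # the dwarf strikes the goblin in the cave
print(arrange(clause, SOV))  # the dwarf the goblin in the cave strikes
```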
6 out of 7 dwarves aren't Happy.
How To Generate Small Islands

Aldaris

  • Bay Watcher
  • [LIBERAL] [WANNABE_DORF] [CAVE_ADAPTED]
Re: The amazing automatic language
« Reply #4 on: June 30, 2009, 09:40:43 am »

Actually, one of the ideas earlier in this thread strikes me as a rather good one, namely that dialogue would consist of two lines: one in the actual language (maybe with its own alphabet; there are already a few alternative alphabets on this forum) and one with the same line in English with Roman letters.
Human languages evolving quickly also sounds like a good idea. Maybe change a certain percentage of the words (if possible, put this in the RAWs) every once in a while when the game is running worldgen, but keep legends records in the language of the time.
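
The drift idea could be sketched like this: every so often a fixed fraction of the lexicon is replaced, and a snapshot per year is kept so that old records can still be shown in the language of their time. The function names, the toy word generator and the sample lexicon are assumptions for illustration only.

```python
import random

def make_word(rng):
    """Toy generator: two consonant-vowel syllables."""
    return "".join(rng.choice("ptkmnsl") + rng.choice("aeiou") for _ in range(2))

def drift(lexicon, fraction, rng):
    """Return a copy of `lexicon` with roughly `fraction` of its words replaced."""
    changed = dict(lexicon)
    count = round(len(lexicon) * fraction)
    for meaning in rng.sample(sorted(lexicon), count):
        changed[meaning] = make_word(rng)
    return changed

rng = random.Random(2009)
# Snapshots keyed by world year; legends from year 0 keep using history[0].
history = {0: {"axe": "kela", "beer": "suno", "cat": "mipa", "door": "tolu"}}
history[100] = drift(history[0], 0.25, rng)

print(history[0])
print(history[100])
```

The fraction (0.25 here) is the kind of value that could live in the RAWs, as suggested above.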
but Baron Aqizzar had the firm advantage, battering Cthulhu with his Mighty Chin.
^Totally not out of context, promise.
The Liberal Crime Squad Community game, now with a Liberal Overdose of Liberally aplied Liberalism. -Liberally. (UBER-Hiatus, next update somewhere between now and 2012.)

Seryntas

  • Bay Watcher
  • has created a masterpiece!
    • RPGWW Forums
Re: The amazing automatic language
« Reply #5 on: June 30, 2009, 09:55:46 am »

For the two problems, we could have a RAW that defines some words so that  Urist, Armok, Ber, Olan, Kogan and Usan are all still in the game.

I have good luck with dwarves named Mebzuth, myself.  I don't know why.

As a student of linguistics, I support this thread.  However, language is more complex than anyone who hasn't taken a couple syntax classes likely knows.  Leaving English grammar in place is the least of all evils, I think.  Even something as simple as going from SVO to SOV requires remodeling at deep structure level.  Adding in procedural generation of syntax rules would be...intense.  I wouldn't be at all disappointed or surprised if a robust language system proved to sit outside the bounds of the coveted goal of Modeling Everything.  I'm not a programmer, though, and I'm only in my second year of being a Ling major, so maybe there's someone on the forum who knows some more on the subject than I do.

Words, though, okay.  Phonology is easy (at least that's what the grades on my academic transcript tell me).
"Nectar and ambrosia are all the gods are allowed to eat in Greek mythology. In that way they're kind of like pandas. You know, in diet. From there the similarities break down." -my Greek Lit TA

Starver

  • Bay Watcher
Re: The amazing automatic language
« Reply #6 on: June 30, 2009, 11:06:21 am »

A way to learn a foreign language well enough to understand the people you are talking to, in a reasonable time, is also something we would need.

I don't know how many of you are familiar with the Discworld MUD, but its representation of languages unfamiliar to your character (or partial representations of text not familiar enough, when you have gained some skills in that spoken/written segment of the tree) is, while obviously a simplistic symbolic translation, quite sophisticated in appearance.

The result is that a non-Morporkian character, when looking at Morporkian writing or reading spoken words of that kind (whether from an NPC or from another player's "say" dialogue), gets to see "Latinium" (a kind of "mongrel latin"), whereas one unversed in the language of Djelibeybi (pseudo-Egypt) gets written descriptions of hieroglyphs (for reading) or the stochastic representation of the associated speech; ditto the orientally-themed Agatean with its (textual representation of) pictographic writing and syllabic speech.

These are of course grammatically neutral (or, to be precise, utterly biased towards the English that all 'native' languages and room descriptions are to the player), much the same as I understand the current Dwarven is, but it is effective. It means that there's no need to attempt to parse the player's/creator's original English phrase (when certainly the former "cud B cmpltly Mpossble 2 B red proply bi th cmpUter", although thankfully there are very few players who persist in such perversities in their communications), just token match the input (words, word components, and (if necessary) falling back on individual unaccounted-for characters) and analogue them to an equivalent (but not in a strict 1:1 mapping, so not completely reversible) output. Or at least, that is done for as many words/word components as aren't marked as within the acquired language capabilities of the character; those that are remain untouched.
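
A minimal sketch of that kind of token substitution, not the Discworld MUD's actual code: every word the listener does not know is swapped for a deterministic pseudo-word, and known words pass through untouched. The syllable list, the function names and the idea of a flat `known_words` set are assumptions.

```python
import hashlib
import re

# Hypothetical syllables for the "foreign" rendering.
SYLLABLES = ["ka", "zu", "mor", "eth", "il", "dan", "ur", "osh"]

def garble_token(token):
    """Map a word to 1-3 syllables chosen by a hash of the word."""
    digest = hashlib.sha256(token.lower().encode("utf-8")).digest()
    count = 1 + digest[0] % 3
    # Many different words land on the same output, so this is not reversible.
    return "".join(SYLLABLES[b % len(SYLLABLES)] for b in digest[1:1 + count])

def hear(sentence, known_words):
    """Render `sentence` as heard by someone who knows only `known_words`."""
    def swap(match):
        word = match.group(0)
        return word if word.lower() in known_words else garble_token(word)
    # Only runs of letters are replaced; spacing and punctuation survive.
    return re.sub(r"[A-Za-z']+", swap, sentence)

print(hear("We can sell you five tin cans of cat meat.", {"we", "you", "five"}))
```

Growing the `known_words` set as a skill improves would give the partial-understanding effect described above.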

Within the context of DF, of course, the grammatical parsing and rearranging is not so impossible, for a limited (and predefined) number of stock components are all that needs to be touched, and the held-in-raws/executable-embedded phrases can be marked up, turning "We can sell you five tin cans of cat meat that we can ourselves." into "{pluralpronoun}We {auxverb:nosubtext}can {verb}sell {singpronoun}you {numeric}five {materialmodifier:currentform}tin {object:pluralform}cans {preposition:composition}of {materialmodifier:sourcedfrom}cat {object:massform}meat {prepositionalobject}that {pluralpronoun}we {process}can {pluralpronoun:reflexive}ourselves"[1], so that a grammatical conversion could be performed.

e.g. to "packaged!tribalresponsibility!meatblock!cat(originated) within!container(cylindrical)!tin(formedmaterial)!fivefold-plurality sell!toyou!byus!ispossible(neutraltone)" prior to dictionary conversion to something probably completely unlike "SchuNaKragdarPurrid ikTrenPraIgso MellochEeAyKell" (<= not intended to be anything 'real', except within the context of this explanation).
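
The first step of such a conversion, reading the "{tag}word" markup, might look like the sketch below. The markup syntax is taken from the example in this post; the `parse_markup` function and the tuple layout are hypothetical.

```python
import re

# One token is "{category[:detail]}word", with tokens separated by spaces.
TOKEN = re.compile(r"\{([^}]+)\}(\S+)")

def parse_markup(phrase):
    """Split marked-up text into (word, category, detail) tuples."""
    tokens = []
    for tag, word in TOKEN.findall(phrase):
        category, _, detail = tag.partition(":")
        tokens.append((word, category, detail))
    return tokens

tokens = parse_markup("{pluralpronoun}We {auxverb:nosubtext}can {verb}sell {singpronoun}you")
for word, category, detail in tokens:
    print(word, category, detail)
```

A converter for a particular language could then regroup and reorder these tuples by category before the dictionary lookup.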

Compare with "We would sell you one tin... <rest unspoken, subtle gestures only>", which would be the English version of the still full and formal "SchuNaKragdarPurrid ikTrenPra MellochEeAyKepin" ("packaged!personally!meatblock!cat(originated) within!container(cylindrical)!tin(formedmaterial)!fivefold-plurality sell!toyou!byus!ispossible(doubtfultone:cause_is_disagreement_at_tribal_leader_level) <gesture towards traditional bribe-pocket in clothing>" in the grammatical markup) if you allow for a little extra markup to contextualise the nature of doubtfulness and the possible resolution thereof, and acknowledge that there will be information lost in translation (but also lost in non-translation, since the tribal leader disagreement is something that could be specified as a background feature of a human's speech, but not be expressed except if spoken by a human who is speaking to our hypothetical 'otherrace' in the 'otherracian' tongue with a sufficient proficiency in said language to avoid an effective conversion to "(doubtfultone:nonspecific)" or even "(joyoustone:just_witnessed_a_rainbow)" for the particularly inarticulate). ;)


Or is that a far too complicated treatment of the original problem?  :-\

[1] Noting that I've excluded stuff like tenses, first/third-person identification and the like, which might also be an essential ingredient in the re/de-construction of the new grammar.  And ignore the other simplifications or errors doubtlessly introduced during the recomposition of this post...

[Edit: Oh yeah, forgot to say that "accents" are also procedurally represented on the Discworld MUD, to some degree, amongst communications coming from those with the same tongue but another distinction (geography or species). I was going to write about that too, but looks like I overindulged on the above theoretical musings.]
« Last Edit: June 30, 2009, 11:10:11 am by Starver »

Granite26

  • Bay Watcher
Re: The amazing automatic language
« Reply #7 on: June 30, 2009, 01:35:01 pm »

Ties to spheres based on English roots seem to preclude wood being a 'good' word in elvish, but a 'bad' word in dwarvish.

PTTG??

  • Bay Watcher
  • Kringrus! Babak crulurg tingra!
    • http://www.nowherepublishing.com
Re: The amazing automatic language
« Reply #8 on: June 30, 2009, 01:43:50 pm »

I think that it would be possible to give "wood" tags like [SPHERE:GOOD:CIV_ONLY:FOREST][SPHERE:WEAK:CIV_ONLY:MOUNTAIN]. That would actually be something of an expansion from the current system.

Alternatively, wood could have [SPHERE:NATURAL] and elves would have [SELECT_SPHERE:NATURAL] while dwarves would have [CULL_SPHERE:NATURAL] or something; this is more like the current system, somewhat.
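
The second variant, where words carry neutral spheres and each civ selects or culls them, could be sketched as follows. The tag names mirror the hypothetical [SELECT_SPHERE] and [CULL_SPHERE] tags proposed here; the data and the `name_pool` function are invented for illustration and are not how the game currently stores this.

```python
# Word -> spheres it belongs to (illustrative data only).
WORD_SPHERES = {
    "wood": {"NATURAL"},
    "tree": {"NATURAL"},
    "gold": {"WEALTH"},
    "forge": {"CRAFT"},
}

# Civ -> spheres it favours (select) or refuses (cull).
CIVS = {
    "ELF": {"select": {"NATURAL"}, "cull": set()},
    "DWARF": {"select": {"WEALTH", "CRAFT"}, "cull": {"NATURAL"}},
}

def name_pool(civ):
    """Return (preferred, merely_allowed) words for a civ's naming."""
    rules = CIVS[civ]
    preferred, allowed = [], []
    for word, spheres in sorted(WORD_SPHERES.items()):
        if spheres & rules["cull"]:
            continue  # culled spheres never appear in this civ's names
        (preferred if spheres & rules["select"] else allowed).append(word)
    return preferred, allowed

print(name_pool("ELF"))
print(name_pool("DWARF"))
```

With this layout the word list stays the same for every race, and only the per-civ rules decide whether "wood" is a good word or one to avoid.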

The biggest advantage of all this is that a rather simple program could basically digest a real-world dictionary and increase the vocabulary several thousand fold at least.

Hm... Imagine, with enough of a dictionary, we could have a game mode where the UI is presented in Dwarven as well...

Mephansteras

  • Bay Watcher
  • Forger of Civilizations
Re: The amazing automatic language
« Reply #9 on: June 30, 2009, 01:51:32 pm »

I like this idea. My language generation script for DF uses a somewhat similar approach to this, although it's a bit crude. But I think the idea of having an automatic language generator would be cool, especially for dialects and such.
Civilization Forge Mod v2.80: Adding in new races, equipment, animals, plants, metals, etc. Now with Alchemy and Libraries! Variety to spice up DF! (For DF 0.34.10)
Come play Mafia with us!
"Let us maintain our chill composure." - Toady One

Granite26

  • Bay Watcher
Re: The amazing automatic language
« Reply #10 on: June 30, 2009, 02:31:19 pm »

Wood is maybe a bad example.

Try 'White', where in some cultures it's life and in some it's death. I know your 'civ only' tags fix it, but not expandably.

I like the idea of mutations for dialects though.

OTOH, I don't know that I like too in-depth fantasy languages, because then you're using brainpower to learn something that's basically useless tripe.  Just my .02 though

DennyTom

  • Bay Watcher
  • !!Urist!!
Re: The amazing automatic language
« Reply #11 on: June 30, 2009, 07:07:17 pm »

Interesting.

Personally, I would suggest using not English grammar but a simplified English grammar as the core: only one type of conditional, no irregular verbs, no inflecting verbs when the subject is he, she or it, no distinction between defining and non-defining "which" and "that" (just "that" everywhere), no "a" and "the", etc.

I love the idea of different meanings of words for different races.

I agree with Vattic - changes in sentence structure would bring nice diversity.
It is so much easier to measure life in experience points...

Starver

  • Bay Watcher
Re: The amazing automatic language
« Reply #12 on: July 01, 2009, 06:08:52 am »

Interesting.

Personally, I would suggest using not English grammar but a simplified English grammar as the core: only one type of conditional, no irregular verbs, no inflecting verbs when the subject is he, she or it, no distinction between defining and non-defining "which" and "that" (just "that" everywhere), no "a" and "the", etc.

Note that the lack of irregular verbs in a language usually indicates either an artificial creation or some historical upheaval of the language that might indicate an overriding grammatical reform on the part of the influential intelligentsia of the time. (Or a third possibility that's really a hybrid of the other two: revival of the language after its 'death'.)

To some extent, the other simplifications, as well.

Which is not to say that we can't be dealing with a simplified "traders' patois" of some kind (or a diplomatic lingua franca, or a non-verbal sign language useful for passing on the news of battles won and lost), with a consensus middle ground of "You, me give fifteen buffalo hides. Me, you give firewater bottle," that does not map grammatically (or even necessarily lexically) to any particular source language.

But I mention this only as a thought. A truly old and ancient race (or a very simple one) could probably have an irregularity-free language, by dint of scholarly influence and an "Académie française"-ish hold over time, or just because complex irregularities get ironed out by constantly naive handling of the language.

There are areas within my region where all 'proper' occurrences of "was" are replaced by "were" in the local dialect, and others where the reverse is true. Either way it simplifies the grammar. (In one case "I was running down the street, they was running after me" and in the other "I were running down the street, they were running after me".) Then again, a local dialect word around here is "tret", equivalent to "treated", for all senses of the word from "I was tret for a broken arm" to "I was[1] tret to an ice-cream for being a good boy."

[1] Or 'were'.  ;)

DennyTom

  • Bay Watcher
  • !!Urist!!
Re: The amazing automatic language
« Reply #13 on: July 01, 2009, 06:39:15 am »

You are absolutely right. I am just concerned about the programming part of this suggestion. Procedural generation of irregularities might be unnecessarily difficult (considering that players will never learn it, since the language and possibly the grammar too will change the next time they gen a new world).

OK, let me change my suggestion. Either use a very simplified grammar and procedural generation, OR make several natural-feeling but fixed languages. The former would save lots of work on problems no one (or almost no one) will care about; the latter would give the player the opportunity to learn dwarven and have fun (ehm... not that fun) with it.

Mel_Vixen

  • Bay Watcher
  • Hobby: accidently thread derailment
Re: The amazing automatic language
« Reply #14 on: July 01, 2009, 06:59:13 am »

Well, for building "irregular" verbs, tenses, genderisation etc. you could define "rules", and this can happen procedurally. As an example of such rules: many job names/titles in German take the suffix "-in" to indicate that a female is meant.

The "the" and "a" aren't much of a problem either, though it gets funny if you get gender-specific articles like "der/die/das" from German :P .
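
A tiny sketch of such a rule, using the German example: the "-in" suffix marks a female job holder, and the article follows the grammatical gender. The `job_title` function is hypothetical, and it assumes stems that take a plain "-in" (it does not handle forms that also change a vowel, such as Arzt/Ärztin).

```python
# Definite articles (nominative singular) by grammatical gender.
ARTICLES = {"masculine": "der", "feminine": "die", "neuter": "das"}

def job_title(stem, female):
    """Build 'article + title', applying the -in suffix rule for women."""
    noun = stem + "in" if female else stem
    gender = "feminine" if female else "masculine"
    return f"{ARTICLES[gender]} {noun}"

print(job_title("Lehrer", female=False))  # der Lehrer
print(job_title("Lehrer", female=True))   # die Lehrerin
```

A generated language could carry its own table of suffixes and articles in the same shape.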

edit:

The entire system for genderisation and tenses would just need a way to know who was meant and in which tense a sentence is written.

E.g., say we want to have "I have a Cow" translated from dwarven to English.

In the program such a sentence would be represented like this: "(Creature:person:self) (Verb:show_possession:present tense) (article:unspecific:genderless:depends on following phrase) (Creature:bovine:female)".
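
To make that representation concrete, here is a rough sketch that stores the sentence as tuples and renders it in English. The lexicon, the tuple layout and the `render` function are assumptions; the article is resolved last because it "depends on following phrase".

```python
# Language-neutral tokens -> English words (illustrative entries only).
LEXICON = {
    ("creature", "person", "self"): "I",
    ("verb", "show_possession", "present"): "have",
    ("creature", "bovine", "female"): "cow",
    ("item", "tool", "axe"): "axe",
}

ARTICLE = ("article", "unspecific", "genderless")

def render(tokens):
    """Render tokens in English; assumes each article precedes a plain word."""
    words = [None if t[0] == "article" else LEXICON[t] for t in tokens]
    for i, word in enumerate(words):
        if word is None:
            following = words[i + 1]
            # Crude a/an choice by first letter; ignores cases like "hour".
            words[i] = "an" if following[0].lower() in "aeiou" else "a"
    return " ".join(words)

sentence = [
    ("creature", "person", "self"),
    ("verb", "show_possession", "present"),
    ARTICLE,
    ("creature", "bovine", "female"),
]
print(render(sentence))  # I have a cow
```

A renderer for another language would reuse the same token list with its own lexicon and its own article rules.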
« Last Edit: July 01, 2009, 07:12:43 am by Heph »