Bay 12 Games Forum

Pages: [1] 2

Author Topic: Language source code  (Read 1923 times)

wilsonns

  • Bay Watcher
  • Golden hair?Green eyes?Spiky beard?Super Dorf Jin!
    • View Profile
Language source code
« on: April 07, 2010, 04:06:50 pm »

The suggestion is for Toady to release source code in which we can just change the language. It would attract lots of players.
Logged

alfie275

  • Bay Watcher
    • View Profile
Re: Language source code
« Reply #1 on: April 07, 2010, 06:09:13 pm »

So he releases source code, then someone would just steal it and make more competition for Toady. Much easier would be to store text externally so we can mod that.
Logged
I do LP of videogames!
See here:
http://www.youtube.com/user/MrAlfie275

LeegleechN

  • Bay Watcher
    • View Profile
Re: Language source code
« Reply #2 on: April 07, 2010, 06:57:41 pm »

It would make sense for the text to be extracted into raws for localization. However, having had to do that for an application 1000 times smaller, I know how annoying it can be :)
Logged

Capntastic

  • Bay Watcher
  • Greetings, mortals!
    • View Profile
    • A review and literature weblog I never update
Re: Language source code
« Reply #3 on: April 07, 2010, 06:58:36 pm »

I think there's been mention of improved language support in the future, but it's not a priority at this time, unfortunately.
Logged

Urist McDepravity

  • Bay Watcher
    • View Profile
Re: Language source code
« Reply #4 on: April 09, 2010, 05:23:30 am »

Quote from: alfie275
So he releases source code, then someone would just steal it and make more competition for Toady. Much easier would be to store text externally so we can mod that.

He means raw strings, I believe, not the actual sources.
Like GNU gettext po/pot/mo files.
Logged

Mel_Vixen

  • Bay Watcher
  • Hobby: accidently thread derailment
    • View Profile
Re: Language source code
« Reply #5 on: April 09, 2010, 05:38:12 am »

I can only guess, but it looks like wilsonns wants some kind of open and exchangeable "language.dll" so that he or anyone else can have a DF version that isn't natively in US English.
Logged
[sarcasm] You know what? I love grammar Nazis! They give me that warm and fuzzy feeling. I am so ashamed of my bad english and that my first language is German. [/sarcasm]

Proud to be a Furry.

Urist McDepravity

  • Bay Watcher
    • View Profile
Re: Language source code
« Reply #6 on: April 09, 2010, 06:09:52 am »

Quote from: Mel_Vixen
I can only guess, but it looks like wilsonns wants some kind of open and exchangeable "language.dll" so that he or anyone else can have a DF version that isn't natively in US English.

As I said, you don't need to change source code to do localization. Gettext is a popular choice for that, and all strings and translations are kept as a separate set of files. You can just plug in a new mo file and it will work with the same binaries.
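
The "same binaries, different strings file" idea can be sketched in a few lines. This is only a stand-in for what gettext does, not gettext itself: the catalog is a plain Python dict instead of a compiled mo file, and the names `translate`, `announce` and `GERMAN` are made up for the illustration.

```python
# Hypothetical stand-in for a compiled .mo catalog: a plain dict mapping the
# English source string (the "msgid") to its translation. Real gettext loads
# such a table from a file at run time; the German string is illustrative.
GERMAN = {
    "Urist bled to death.": "Urist ist verblutet.",
}


def translate(catalog, msgid):
    """Return the translation for msgid, or msgid itself if none exists."""
    return catalog.get(msgid, msgid)


def announce(catalog):
    # The program logic is identical for every language; only the catalog
    # that gets passed in changes.
    return translate(catalog, "Urist bled to death.")


if __name__ == "__main__":
    print(announce({}))      # no catalog: falls back to the English text
    print(announce(GERMAN))  # same code, different catalog
```

The point of the design is that `announce` never changes: adding a language means shipping another table, not another build.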
Logged

zwei

  • Bay Watcher
  • [ECHO][MENDING]
    • View Profile
    • Fate of Heroes
Re: Language source code
« Reply #7 on: April 09, 2010, 07:00:10 am »

Quote from: Urist McDepravity
Quote from: Mel_Vixen
I can only guess, but it looks like wilsonns wants some kind of open and exchangeable "language.dll" so that he or anyone else can have a DF version that isn't natively in US English.
As I said, you don't need to change source code to do localization. Gettext is a popular choice for that, and all strings and translations are kept as a separate set of files. You can just plug in a new mo file and it will work with the same binaries.

That does not work very well when messages are composed and have context.

For example,

"Urist bled to death."

Now, the phrase "bled to death" can easily be translated, of course, but some languages mark the gender of the subject on the verb, which can result in two different messages being required.

Generally, Gettext is ideal mostly for software with static text that always has the same context (menus, labels, buttons ...)

Urist McDepravity

  • Bay Watcher
    • View Profile
Re: Language source code
« Reply #8 on: April 09, 2010, 07:23:05 am »

Quote from: zwei
Now, the phrase "bled to death" can easily be translated, of course, but some languages mark the gender of the subject on the verb, which can result in two different messages being required.

Generally, Gettext is ideal mostly for software with static text that always has the same context (menus, labels, buttons ...)

Masculine/feminine is handled by providing separate strings for the sentences that need them.
Plurality is handled natively by gettext, which knows which languages require additional forms. For example, Russian has 3 forms for counted nouns (1 singular and 2 plural), while English has 2.
It takes some good design work for the strings, but it definitely works for dynamically generated text. In some cases you just need to give gettext the complete sentence to make it work properly.
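
The plural handling mentioned above can be sketched as follows. The Russian rule is the one commonly written in a gettext "Plural-Forms" header; everything else (the function names, the `FORMS` table, the choice of the word "file") is an assumption made for the illustration, not how gettext is implemented internally.

```python
def plural_index_en(n):
    # English: one form for n == 1, another form for every other count.
    return 0 if n == 1 else 1


def plural_index_ru(n):
    # Russian rule as commonly given in a gettext "Plural-Forms" header:
    #   index 0: 1, 21, 31, ...   (but not 11)
    #   index 1: 2-4, 22-24, ...  (but not 12-14)
    #   index 2: everything else  (0, 5-20, 25-30, ...)
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and not (10 <= n % 100 < 20):
        return 1
    return 2


# Illustrative table: each language supplies its own rule and as many
# forms as that rule can select.
FORMS = {
    "en": (plural_index_en, ["%d file", "%d files"]),
    "ru": (plural_index_ru, ["%d файл", "%d файла", "%d файлов"]),
}


def count_files(lang, n):
    rule, forms = FORMS[lang]
    return forms[rule(n)] % n


if __name__ == "__main__":
    for n in (1, 2, 5, 11, 21, 22):
        print(count_files("en", n), "|", count_files("ru", n))
```

The calling code only passes the count; which form is chosen is decided entirely by the language's own rule.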
Logged

wilsonns

  • Bay Watcher
  • Golden hair?Green eyes?Spiky beard?Super Dorf Jin!
    • View Profile
Re: Language source code
« Reply #9 on: April 09, 2010, 12:57:06 pm »

Quote from: alfie275
So he releases source code, then someone would just steal it and make more competition for Toady. Much easier would be to store text externally so we can mod that.

That's what I meant... I was thinking about raws but said source code. OK, I was thinking about source code, but not of the entire game, just the language source.
Logged

UFLinguist

  • Escaped Lunatic
    • View Profile
Re: Language source code
« Reply #10 on: April 09, 2010, 02:29:13 pm »

I'm (as my name might imply) a linguist and would simply love to work on the language some. I know Toady doesn't want outsiders mucking about with his goodies, but just saying, I'm here if anyone wants to start playing around with the language(s). Devs and non-devs alike.
Logged

psyn

  • Bay Watcher
    • View Profile
Re: Language source code
« Reply #11 on: April 11, 2010, 05:16:30 am »

Quote from: Urist McDepravity
Quote from: zwei
Now, the phrase "bled to death" can easily be translated, of course, but some languages mark the gender of the subject on the verb, which can result in two different messages being required.

Generally, Gettext is ideal mostly for software with static text that always has the same context (menus, labels, buttons ...)
Masculine/feminine is handled by providing separate strings for the sentences that need them.
Plurality is handled natively by gettext, which knows which languages require additional forms. For example, Russian has 3 forms for counted nouns (1 singular and 2 plural), while English has 2.
It takes some good design work for the strings, but it definitely works for dynamically generated text. In some cases you just need to give gettext the complete sentence to make it work properly.

I don't see this as working. Changes in meaning caused by token (term/word/phrase) substitution extend beyond gender, and choosing among multiple sentences is possible only with embedded conditional logic: without a higher-level abstraction capable of detecting syntax issues, there is no way to choose and display the correct basic sentence format.

To my understanding, DF currently can't even do token-recognition substitution, let alone token-lookup substitution and subexpression handling.



Let's take this message: You pinch Elf 1's head with your left hand!

DF handles messages by using a simple token-order substitution on embedded strings (or it did in the 2D version at least). For example:
Code: [Select]
"%s %s %s's %s with %s %s!", 1.Name, 1.Action, 2.Name, 2.Part, 1.Possessive, 1.Part
but this is easily changed to recognition substitution form:
Code: [Select]
"%1% %2% %3%'s %4% with %5% %6%!" % 1.Name % 1.Action % 2.Name % 2.Part % 1.Possessive % 1.Part
Text handling using this method is incapable of supporting multiple languages. The format string is not exported. But even if it were, the parameter string references are still unchangeable. Additional contextual information is unavailable. If the language must know the elf's gender, that information is unavailable.
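
To make the difference between the two substitution styles concrete, here is a small sketch in Python rather than DF's own code. It assumes the format string could be supplied from outside the program; `ARGS`, the template names and the reordered template are invented for the example, and the reordered sentence is contrived English, not a real translation.

```python
# The six values the game would pass in, in a fixed order.
ARGS = ("You", "pinch", "Elf 1", "head", "your", "left hand")

# Order-based substitution: the n-th %s always receives the n-th argument,
# so the format string cannot change the order of the words.
ORDERED = "%s %s %s's %s with %s %s!"

# Numbered substitution: each placeholder names the argument it wants,
# so a different format string may use the arguments in any order.
NUMBERED = "{0} {1} {2}'s {3} with {4} {5}!"
REORDERED = "With {4} {5}, {0} {1} {2}'s {3}!"


def render_ordered(fmt, args):
    return fmt % args


def render_numbered(fmt, args):
    return fmt.format(*args)


if __name__ == "__main__":
    print(render_ordered(ORDERED, ARGS))
    print(render_numbered(NUMBERED, ARGS))
    # Reordering works, but "You" keeps its capital letter mid-sentence:
    # the template has no way to ask for a different form of the word.
    print(render_numbered(REORDERED, ARGS))
```

The last line shows both halves of the argument: numbering solves word order, but the arguments are still opaque strings with no grammatical context attached.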

DF could handle messages like this, using more expressive strings and token lookup:
Code: [Select]
message EN_ATTACK := "[1.Name] [1.Action] [2.Name]'s [2.Part] with [1.Possessive] [1.Part]!"
It will work for some languages, but likely only for very simple sentences. What if 2.Part ('head') works in this sentence, but in another it is interpreted as 'orange'? It is also limited in available contextual information; e.g., 2.Gender might be recognized, but 1.AttackForce is not.
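
A rough Python equivalent of token lookup, for illustration only: placeholders name a field on an object instead of a position. The objects `attacker` and `target`, their fields, and `render` are assumptions; nothing here reflects how DF stores its data.

```python
from types import SimpleNamespace

# Placeholders look up named fields on the two participants:
# "a" is the attacker (1 in the post), "t" is the target (2 in the post).
EN_ATTACK = "{a.name} {a.action} {t.name}'s {t.part} with {a.possessive} {a.part}!"

attacker = SimpleNamespace(name="You", action="pinch",
                           possessive="your", part="left hand")
target = SimpleNamespace(name="Elf 1", part="head")


def render(template, a, t):
    # str.format resolves "{a.name}" by attribute lookup on the object "a".
    return template.format(a=a, t=t)


if __name__ == "__main__":
    print(render(EN_ATTACK, attacker, target))
    try:
        # A template can only use fields the game chose to expose; asking
        # for one that is missing fails.
        render("{t.gender} {t.name}", attacker, target)
    except AttributeError as err:
        print("missing context:", err)
```

The failure case is the limitation described above: a translator who needs a field the game does not expose has no way to get it from the template side.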

I guess the next logical step is adding subexpressions:
Code: [Select]
message EN_ATTACK  := {!2.Part=head?EN_ATTACK2:EN_ATTACK3}
message EN_ATTACK2 := "[1.Name] [1.Action] [2.Name]'s [2.Part] with [1.Possessive] [1.Part]!"
message EN_ATTACK3 := "[1.Name] [1.Action] [2.Gender] [2.Name]'s [2.Part] with [1.Possessive] [1.Part]!"

It's verbose, ugly, semi-complex, lots of work, and could result in dozens of alternate definitions. At least it doesn't use inline subexpressions. But it still doesn't solve the problem of missing context. And I suspect it will fall apart with languages like Japanese or Chinese (can someone confirm?).
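
For illustration, the subexpression idea can be mimicked in Python as a function that picks a template before filling it in. The condition is read here as "use EN_ATTACK2 when the part is the head", which is only one possible reading of the pseudocode; the dictionary keys and function names are invented.

```python
TEMPLATES = {
    "EN_ATTACK2": "{a_name} {action} {t_name}'s {t_part} "
                  "with {a_poss} {a_part}!",
    "EN_ATTACK3": "{a_name} {action} {t_gender} {t_name}'s {t_part} "
                  "with {a_poss} {a_part}!",
}


def choose_template(ctx):
    # Assumed reading of the condition: EN_ATTACK2 for the head,
    # EN_ATTACK3 (which also mentions gender) for any other part.
    return "EN_ATTACK2" if ctx["t_part"] == "head" else "EN_ATTACK3"


def render_attack(ctx):
    # Unused keys in ctx (such as t_gender for EN_ATTACK2) are ignored.
    return TEMPLATES[choose_template(ctx)].format(**ctx)


if __name__ == "__main__":
    ctx = dict(a_name="You", action="pinch", t_name="Elf 1", t_gender="male",
               t_part="head", a_poss="your", a_part="left hand")
    print(render_attack(ctx))
    print(render_attack(dict(ctx, t_part="arm")))
```

Every extra distinction a language needs becomes another branch plus another template, which is where the "dozens of alternate definitions" would come from.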

I don't really know how to solve this. I'm not sure if anyone has. And I'm not sure if DF would need to. But I would probably try to do something like the snippet below. Call it, 'Experimental half-baked automatic language-sentence composition using a conceptual format.'

Spoiler (click to show/hide)

In short summary:
  • Line 1 is the game logic hooking into an action to generate a 'common' message format.
  • Line 2 is the common format; it uses inline tags as embedded markup for later grammar suggestions. It then passes this common format to a language-defined lexer responsible for generating the language syntax.
  • Line 3 is one of the language syntaxes; it uses simplified grammar rules determined by the common-format tags, and additional information retrieved from the game. It then passes the language format along to generate the message itself.
  • Lines 4-6 are a stunningly overoptimistic idea of what messages could now look like.



I understand if you want to slap me silly now. It's a painfully simplified process, horrendously complicated, naively hopeful, rife with logical errors, absurdly far-fetched, idealistic, and is likely full of problems that my ignorance is robbing me of the ability to see. I'm not a linguist; in fact, I rate my English ability as a 1 or 0.5 on a scale of 1 to 10. As well, Toady has probably considered all of this years ago.

...but I would appreciate insight. ;D I tend to find this topic interesting.
Logged

Urist McDepravity

  • Bay Watcher
    • View Profile
Re: Language source code
« Reply #12 on: April 11, 2010, 06:27:53 am »

Yes, obviously token substitution will not work this way.
Well, it will, but you need to do it like this: first substitute whatever is static, then translate the whole sentence, then substitute things like names.
This approach will produce a pretty large string table, though.
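
A sketch of that three-step order, under stated assumptions: body parts and hands are treated as the "static" vocabulary, the name is the only dynamic token, and the "translation" is just contrived reordered English standing in for a real language. All names (`msgid_for`, `CATALOG`, `message`) are made up.

```python
from itertools import product

ACTION_TEMPLATE = "You pinch {name}'s {part} with your {hand}!"
PARTS = ["head", "arm"]
HANDS = ["left hand", "right hand"]


def msgid_for(part, hand):
    # Step 1: fill in the static vocabulary but keep the name placeholder,
    # giving a complete sentence that can be looked up as a whole.
    return ACTION_TEMPLATE.format(name="{name}", part=part, hand=hand)


# Stand-in catalog: one complete "translated" sentence per combination.
CATALOG = {
    msgid_for(p, h): "With your {hand}, you pinch {{name}}'s {part}!".format(
        part=p, hand=h)
    for p, h in product(PARTS, HANDS)
}


def message(name, part, hand, catalog):
    msgid = msgid_for(part, hand)           # step 1: static substitution
    translated = catalog.get(msgid, msgid)  # step 2: whole-sentence lookup
    return translated.format(name=name)     # step 3: dynamic substitution


if __name__ == "__main__":
    print(message("Elf 1", "head", "left hand", {}))
    print(message("Elf 1", "head", "left hand", CATALOG))
    # The table needs one entry per combination of static tokens.
    print(len(CATALOG), "entries for", len(PARTS), "parts x", len(HANDS), "hands")
```

Two parts and two hands already need four entries for one action, so the table grows with the product of all static vocabularies.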
Logged

psyn

  • Bay Watcher
    • View Profile
Re: Language source code
« Reply #13 on: April 11, 2010, 06:55:12 am »

I don't understand what you mean. Unless you are suggesting automatic translation of the complete English sentence, that method falls apart. If you do mean full translation, do you really want to treat other languages as second class? Try translating a Chinese/Japanese website. It's horrid.

Babelizer result:
Quote
Original English Text:
You pinch Elf 1's head with your left hand!
Translated to Japanese:
エルフ1'をつまむ; 左手が付いているsヘッド!

Translated back to English:
The elf 1' is pinched; The s head where the left hand has been attached!
Logged

Urist McDepravity

  • Bay Watcher
    • View Profile
Re: Language source code
« Reply #14 on: April 11, 2010, 07:39:13 am »

Quote from: psyn
I don't understand what you mean.

I mean the po file will look like
msgid "You[MALE] pinch %s's head with your left hand!"
msgid "You[FEMALE] pinch %s's head with your left hand!"
msgid "You[MALE] pinch %s's head with your right hand!"
etc.
Obviously such an approach is the "worst case", and there could be some middle ground like
msgid "You[MALE] pinch %s's %s with your %s!" which would work for _enough_ languages.
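
The tagged-msgid idea can be sketched with the "bled to death" message from earlier in the thread, since that is the case where the verb depends on gender. This is an illustration, not gettext: the catalog is a dict, the `[MALE]`/`[FEMALE]` tag handling is invented, and the Russian strings are illustrative. (gettext has its own message-context field, msgctxt, for a similar disambiguating purpose.)

```python
import re

# Matches the gender tag that is embedded in the msgid.
GENDER_TAG = re.compile(r"\[(?:MALE|FEMALE)\]")

# Illustrative catalog: Russian marks the subject's gender on the
# past-tense verb, so each tagged msgid gets its own sentence.
RUSSIAN = {
    "{name}[MALE] bled to death.": "{name} истёк кровью.",
    "{name}[FEMALE] bled to death.": "{name} истекла кровью.",
}


def bled_to_death(name, gender, catalog):
    msgid = "{name}[%s] bled to death." % gender
    # Without a translation, fall back to the msgid with the tag stripped,
    # so English players never see "[MALE]".
    template = catalog.get(msgid, GENDER_TAG.sub("", msgid))
    return template.format(name=name)


if __name__ == "__main__":
    print(bled_to_death("Urist", "MALE", {}))
    print(bled_to_death("Urist", "MALE", RUSSIAN))
    print(bled_to_death("Urist", "FEMALE", RUSSIAN))
```

English ends up with two identical entries, which is the price of the approach: the source strings carry distinctions that only some target languages need.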
Logged