Topic: DFLang: generate realistic language raws, every time. (Read 75501 times)

Africa · « **Reply #90 on:** February 01, 2011, 05:09:25 am »

Very interesting. Can you make it derive words in different ways? Like the current way is basically agglutinative, but it would be interesting to see other grammatical forms.

I like the current Dwarven words and their structure (one or two syllables and never ending in a vowel) but it would be interesting to see the grammar fleshed out more or have roots common from one word to another. Still, with those structural limits that could be hard.

What I think would be more interesting would be to differentiate grammar for different races' languages. Right now they all use the same grammar for adjectives, possessives, etc, and have no function words.

Lysabild · « **Reply #91 on:** February 01, 2011, 05:29:27 am »

I wouldn't actually expect a response; the last post from the author in this thread was last summer.

Also, sorry for the necro. I managed to find out there was a lovely thing called DFFD where I could download it a little while after.

Igfig · « **Reply #92 on:** February 03, 2011, 01:33:53 am »

I haven't posted here in a while only because nobody else has posted here in a while. I'm not sure why you might have had problems downloading; DFFD is the only place DFLang can be found.

Africa, are you asking about different algorithms for word construction or for completely different grammars? Grammar is hard-coded into DF, so no, there's nothing I can do about that. As far as word construction, there already is a system in place for divvying up words by root, but I've been meaning to improve on it with a Variable-Order Markov Model and some other improvements that would allow the user more control over how roots are assigned.

Chromasphere · « **Reply #93 on:** February 03, 2011, 11:06:41 am »

'...and some other improvements that would allow the user more control over how roots are assigned.'

I would like to see that.

This program is VERY useful and I always use it to create my modded race's language. There are a few progs out there that are 'must haves' but because they are already well fleshed out, there is not much need for talking about them... just using them. So don't let the lack of current talk stop you from enhancing this program. I for one would love any updates you deem helpful.

Captain Mayday · « **Reply #94 on:** February 03, 2011, 12:32:29 pm »

Yeah, I'll agree with Chromasphere on this one.

612DwarfAvenue · « **Reply #95 on:** February 03, 2011, 10:30:04 pm »

What Mayday said.

Lysabild · « **Reply #96 on:** February 04, 2011, 05:35:01 pm »

This program gave me a little orgasm :b I'm a huge conlang fan and spent hours mixing Japanese and Old Norse, with great interest and huge entertainment. Thank you <3 Any more improvements and I might stop playing Dwarf Fortress and play DFLang instead!

Igfig · « **Reply #97 on:** February 05, 2011, 11:51:38 pm »

Okay, here's what I'm going to do. I'm going to describe some of the changes I'm thinking of making, and you tell me which ones you're interested in.

First, though, here's how DFLang works right now:

Spoiler (click to show/hide)

This isn't terribly efficient or effective. My first change will be to make it start with a generated word, look for which roots it contains, and select a meaning that belongs to as many of those roots' symbols as possible. Change the order in which it searches, essentially. This will be quite simple to do, and it should have a small but noticeable effect on how well a word matches its associated symbols.
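A minimal sketch of that reversed search, assuming roots are plain substrings and each root belongs to exactly one symbol. The dictionaries, root strings, and function names here are made up for illustration and are not taken from DFLang itself:

```python
from collections import Counter

# Hypothetical data: each root maps to one symbol, each symbol to its meanings.
ROOTS = {"kar": "FIRE", "ul": "DEATH", "mer": "WATER"}
SYMBOLS = {
    "FIRE": {"FLAME", "ASH", "PYRE"},
    "DEATH": {"GRAVE", "PYRE", "ASH"},
    "WATER": {"RIVER", "RAIN"},
}

def roots_in(word, roots=ROOTS):
    """Return every known root that occurs somewhere in the generated word."""
    return [root for root in roots if root in word]

def best_meaning(word, unused, roots=ROOTS, symbols=SYMBOLS):
    """Pick the unused meaning shared by the most symbols whose roots appear in word."""
    score = Counter()
    for root in roots_in(word, roots):
        for meaning in symbols[roots[root]]:
            if meaning in unused:
                score[meaning] += 1
    if not score:
        return None  # no root matched; the caller would fall back to a random meaning
    # Highest score wins; ties are broken alphabetically so results are repeatable.
    return min(score, key=lambda m: (-score[m], m))
```

With this toy data, a word containing both "kar" and "ul" would prefer ASH or PYRE, since those meanings sit in both FIRE and DEATH.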

Another change that wouldn't be too difficult would be to allow roots to be expressed as regular expressions. DFLang wouldn't generate them on its own, but you the user could add them to the roots file by hand. This should be pretty easy to implement--in fact, you might already be able to do it in a limited way--and it'll make it easier to find roots in a word, which will affect how many words match their associated symbols.
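To give a rough idea of what regex roots could look like, here is a sketch using Python's standard `re` module. The patterns and the symbol-to-pattern layout are assumptions for the example; the real roots file format may differ:

```python
import re

# Hypothetical hand-written roots: one regular expression per symbol.
ROOT_PATTERNS = {
    "FIRE": r"k[ae]r",        # matches "kar" or "ker" anywhere in the word
    "WATER": r"^mer|mer$",    # matches "mer" only at the start or end of the word
}

# Compile once so repeated lookups during generation stay cheap.
COMPILED = {symbol: re.compile(pattern) for symbol, pattern in ROOT_PATTERNS.items()}

def symbols_for(word):
    """Return the symbols whose root pattern is found in the word, sorted by name."""
    return sorted(symbol for symbol, rx in COMPILED.items() if rx.search(word))
```

A single pattern like `k[ae]r` covers spelling variants that would otherwise need separate plain-text roots, which is where the improved match rate would come from.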

In a similar vein, I could change the format of the roots files to allow more than one root per symbol.

An idea I've been throwing around a bit recently would be to change roots so that they represent not just symbols, but specific meanings within those symbols. DFLang would choose one or more meanings from a symbol's list and decide that those are vital concepts in this culture's understanding of that symbol. Roots, in other words. Those meanings would automatically get those roots as their entire words. For example, the symbol FIRE might receive GLOW, FIRE, and CONFLAGRATION as its roots. All the other words in FIRE would translate to something-GLOW or CONFLAGRATION-whatever. (You would of course be able to choose these words yourself in the roots file.) This wouldn't improve DFLang's accuracy at all, but it would be kinda neat to know that APPLE literally means CANDLEBERRY in the dwarven tongue.

To that end, I could have DFLang write literal translations of words in the margins of the language files. If a word could be literally translated as something, it'll tell you what that translation is.

I could take that concept even further with something I like to call "dynamic false roots". What that means is if, in the course of language generation, DFLang generates a short word that's unrelated to any of the existing roots, it'll count that word as a root of its own when the time comes to write out the literal translations. This will make the individual translations more complete, and also more wacky.
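The literal translations and dynamic false roots could be sketched roughly like this. It assumes roots are whole words, splits greedily with the longest match first (so it can miss a split that backtracking would find), and uses invented words and names throughout:

```python
def literal_translation(word, root_words):
    """Split word into known root words, longest match first.

    Returns the root meanings joined by '-', or None if some part is not covered.
    """
    parts = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in root_words:
                parts.append(root_words[word[i:j]])
                i = j
                break
        else:
            return None  # no root starts at position i
    return "-".join(parts)

def annotate(lexicon, root_words, max_false_root_len=3):
    """lexicon is a list of (meaning, word) pairs in generation order.

    Returns {meaning: literal translation} for words built from other roots.
    Short words that match no root are adopted as "dynamic false roots".
    """
    roots = dict(root_words)  # copy so the caller's roots are left untouched
    notes = {}
    for meaning, word in lexicon:
        gloss = literal_translation(word, roots)
        if gloss is not None and word not in roots:
            notes[meaning] = gloss
        elif gloss is None and len(word) <= max_false_root_len:
            roots[word] = meaning  # dynamic false root
    return notes
```

For example, with the root word "kan" for CANDLE and a lexicon in which BERRY comes out as "ot" before APPLE comes out as "kanot", "ot" is adopted as a false root and APPLE is annotated as CANDLE-BERRY.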

If you really like the idea of roots-as-words, I could try to make DFLang better at selecting relevant meanings to be a symbol's roots. I'm not sure of the exact details, but it would work by looking at language_words.txt to see how each word can be used in a name. If it can only be used in a couple of positions, then it's probably not general enough to be a good root.
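A very rough sketch of that filter follows. It assumes each entry in language_words.txt starts with a `[WORD:...]` token and lists its allowed name positions as bracketed flags without arguments. The sample text is made up in the style of the raws rather than copied from the game, and the threshold is an arbitrary guess:

```python
import re

TOKEN = re.compile(r"\[([^\]]+)\]")

# Made-up sample in the style of language_words.txt (not copied from the game).
SAMPLE = """
[WORD:ABBEY]
    [NOUN:abbey:abbeys]
        [FRONT_COMPOUND_NOUN_SING]
        [REAR_COMPOUND_NOUN_SING]
        [THE_COMPOUND_NOUN_SING]
        [THE_NOUN_SING]
        [OF_NOUN_PLUR]
[WORD:ABSOLUTE]
    [ADJ:absolute]
        [ADJ_DIST:4]
        [FRONT_COMPOUND_ADJ]
        [THE_COMPOUND_ADJ]
"""

def usage_counts(raw_text):
    """Count, per word, the argument-less flags that follow its [WORD:...] token."""
    counts = {}
    current = None
    for token in TOKEN.findall(raw_text):
        if token.startswith("WORD:"):
            current = token.split(":", 1)[1]
            counts[current] = 0
        elif current is not None and ":" not in token:
            counts[current] += 1
    return counts

def root_candidates(raw_text, min_positions=4):
    """Words usable in at least min_positions name positions, sorted by name."""
    return sorted(w for w, n in usage_counts(raw_text).items() if n >= min_positions)
```

Counting flags is only a stand-in for "how general is this word"; a real version would probably weight the positions differently.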

Lastly, the biggest change I could make would be the rewrite of the generation script that I keep talking about. Moving from a regular Markov Model to a Variable-Order Markov Model would make DFLang better at generating realistic words (although I have no idea by how much), but would take a shitload of work to get right. Still, it would open up a lot of new possibilities: prefixes and suffixes, detection of natural roots (roots that already exist in the real-world language), generating words directly from roots instead of looking for roots in pregenerated words... maybe even some kind of context-free-grammar-based generation. It'd be pretty cool. But hard.
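For anyone unfamiliar with the idea, here is a small sketch of one simple variable-order scheme: count next-letter frequencies for contexts of every length up to a maximum, then at generation time use the longest context that has been seen often enough and back off to shorter ones otherwise. This is a generic illustration with hypothetical names and thresholds, not the planned DFLang rewrite:

```python
import random
from collections import Counter, defaultdict

START, END = "^", "$"

def train(words, max_order=3):
    """Map every context of length 0..max_order to counts of the letter that follows."""
    model = defaultdict(Counter)
    for word in words:
        chars = [START] + list(word) + [END]
        for i in range(1, len(chars)):
            for k in range(max_order + 1):
                if i - k < 0:
                    break
                model["".join(chars[i - k:i])][chars[i]] += 1
    return model

def next_char(model, history, max_order=3, min_count=2, rng=random):
    """Sample from the longest context seen at least min_count times, backing off as needed."""
    for k in range(min(max_order, len(history)), -1, -1):
        counts = model.get(history[len(history) - k:])
        if counts and (sum(counts.values()) >= min_count or k == 0):
            letters = list(counts)
            return rng.choices(letters, weights=[counts[c] for c in letters])[0]
    return END  # untrained model

def generate(model, max_order=3, min_count=2, max_len=12, rng=random):
    """Generate one word of at most max_len letters."""
    history = START
    while len(history) <= max_len:
        c = next_char(model, history, max_order, min_count, rng)
        if c == END:
            break
        history += c
    return history[1:]
```

Something like `generate(train(["urist", "urdim", "ustuth"]))` would then produce words built from the letter patterns of the training list. Raising `min_count` makes the generator back off to shorter contexts more often, which trades fidelity to the source words for variety.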

So, those are the ideas I've had. Your thoughts? Any ideas of your own?

Draco18s · « **Reply #98 on:** February 06, 2011, 12:15:46 am »

I like the idea of the Variable-Order Markov Model. Naturalistic words would be awesome.

Captain Mayday · « **Reply #99 on:** February 06, 2011, 12:41:45 am »

Agreed.

Lysabild · « **Reply #100 on:** February 06, 2011, 09:25:04 am »

I want it all! Even though the last and sadly hardest thing sounds coolest.

Chromasphere · « **Reply #101 on:** February 11, 2011, 09:37:30 am »

Yup, your first two easier ideas were interesting, and of course so was the more difficult last idea of prefixes and suffixes, detection of natural roots and, particularly, generating words directly from roots. (Hmm, I think that's a proper sentence too.)

But any improvement, addition, or enhancement would be appreciated. I would like to be able to see some generated words that have similar meanings and see the common connection between them within the makeup of each word. I'm not a linguist by any stretch, but I think I'm talking about words derived from roots. This would reduce the number of generated words that have similar meanings but completely different makeup.

Deadly Lamarr · « **Reply #102 on:** February 11, 2011, 06:34:55 pm »

I just finished making a Scottish Gaelic wordlist and the results of putting it through DFLang were exactly what I wanted!

DFLang is already a very useful tool, but all of the changes you suggest sound interesting.

Max White · « **Reply #103 on:** February 11, 2011, 06:40:16 pm »

As much as this is a very cool and helpful tool for modders, wouldn't this go into the modders forum?

Oh and... Thank you! I hate making languages, just hate it so much.

Igfig · « **Reply #104 on:** March 09, 2011, 03:52:09 pm »

Okay then. Algorithm rewrite it is.

Since that'll be kind of a big job, though, I'm probably not going to get around to it anytime soon.

And yeah, this probably should be in Modders. Not sure why it isn't. Moving.
