Bay 12 Games Forum


Author Topic: Expanded Dictionary - masterwork edition please discuss.  (Read 1733 times)

Amostubal

  • Bay Watcher
Expanded Dictionary - masterwork edition please discuss.
« on: April 11, 2017, 10:17:27 am »

Okay, so I've been working on my expanded dictionary for the last 2 weeks... it was a complete restart.  This time around I took a new approach.  I grabbed the original vanilla word list, generated an item/animal/plant/inorganic list from the vanilla DF scripts, and then culled all the unnecessary words (the ones already in the original vanilla word list).  To that I added a list of professions-to-actions (e.g. weaver->weave, clothier->sew) from the vanilla game.  Then I did a massive search of the Oxford dictionary file I had used for the original work, looking for verbs and adjectives not in the pack... from that list I culled a large portion (do we really need 8 words for laugh or humiliate?) and I came up with a list of ~1050 word entries to add to the original (including everything I mentioned above).

That would bring the word total to ~3,250 from the original vanilla DF list of 2,198.  Total nouns, verbs, prefixes, and adjectives will be around ~5,500.  For comparison, the last version of the extended dictionary (vanilla edition) was ~12,000 words and ~28,000 nouns, verbs, prefixes, and adjectives, so this is down to roughly a quarter of the old word count and a fifth of the old count of forms.  Now then, I had set this up in stages:

A. Vanilla DF version first.
1. create addition list for vanilla DF - almost complete.
2. create symbol tables for the addition list - need help.
3. create word files and new translation files based off the above. - prepared to do after I grab up symbols and confirm the addition list.
4. release the vanilla DF edition.
B. Masterwork DF version second.
1. create addition list for masterwork - will use the vanilla version, do a re-grab from a set of masterwork files, and cull it as needed.
2. create symbols for masterwork additions - need help.
3. create MWDF version word files and add MW word translations / additional race translations to the pack.
4. release the masterwork DF edition.

I really am trying to get it done... but what I really need is community input.

I have posted 2 lists over on my mod forum page (here).  I need people to look over them and see if there are any words you really want in the game that aren't there, and any words you think I should have removed.  Basically, criticism.  Here's the thing that I know people are going to say:
Where are all my curse/offensive/vulgar words?  Yeah, I had to make decisions on the fly to cull the list, and really, do you know how many vulgar words were in there?  How often do we need to hear the embark of the flaming vaginas joke?  I left some standard ones in and some "risque" words that weren't so obvious (I have some risque work to do as I erect this monolithic stone here in front of that curvaceous statue of the nude burlesque dancers.... damned bastard next door's bitch of a dog won't quit barking all night long...)

So here are the same two lists from my own mod forum, for review here.  Understand this is only the first step to the vanilla edition.

post 1: words created based on item names and jobs, etc.  things that do exist inside of DF:
Spoiler (click to show/hide)
Legendary Dwarf Fortress
Legendary Discord Group
"...peering into the darkness behind the curtains, evokes visions of pixies being chased by dragons while eating cupcakes made of coral iced with liquid fire while their hearts burn out with unknown plant substances..." - a quote from the diaries of Amostubal

Amostubal

  • Bay Watcher
Re: Expanded Dictionary - masterwork edition please discuss.
« Reply #1 on: April 11, 2017, 10:19:41 am »

second post: list of additional words in consideration to the previous list:
Spoiler (click to show/hide)

90%s are words you really need to talk me out of adding.  50%s are words you don't have to talk too hard for me to take out.  10%s are words that I'll keep if you tell me they are good.  The numbers at the end I need a consensus on.  Lastly, if there is a conflict issue or a particular word (or words...) that you see missing and want added, talk to me about it.  I'm concerned I may have missed a DF item; if I did, let me know what it was and I'll create it.

I'll let this stew for a week and see the response.  Otherwise, I'm running with the additional words, the 90%s and the 50%s, and culling the 10%s and the numbers.  That makes the pack as a whole a lot lighter (close to 1000 words added to the original, ~3200 total; nouns, verbs, adjectives, etc. close to 5500 total).

Whatever pack I come up with, once it's finished I'll make an addition pack for masterwork, adding probably 1500 more items, creatures, etc. (lol masterwork adds a lot of stuff).  After that I'll build upward as people request particular words to be added in.

All thoughts, questions, etc. welcomed.


-Nihil-

  • Bay Watcher
Re: Expanded Dictionary - masterwork edition please discuss.
« Reply #2 on: April 11, 2017, 12:42:18 pm »

It's not entirely clear to me what the end result of this would be. I understand that you are adding more words, I am guessing that this will be entirely flavour? Or does this affect the simulations in any way? I am really not certain how the languages even work. All the words you have listed are English, will you also create corresponding dwarf/elf/human words? Are those generated with some tool that I don't know about?

One thing that I might suggest is that flavour additions are always welcome but removing content might not be. You should probably run at least a poll to see how people feel about the different types of words you have removed if you want to include this in Masterwork. Otherwise you would probably want a way to toggle this on/off for Masterwork in the way Meph has everything else optional.

Amostubal

  • Bay Watcher
Re: Expanded Dictionary - masterwork edition please discuss.
« Reply #3 on: April 11, 2017, 02:15:06 pm »

Quote
It's not entirely clear to me what the end result of this would be. I understand that you are adding more words, I am guessing that this will be entirely flavour? Or does this affect the simulations in any way? I am really not certain how the languages even work. All the words you have listed are English, will you also create corresponding dwarf/elf/human words? Are those generated with some tool that I don't know about?

One thing that I might suggest is that flavour additions are always welcome but removing content might not be. You should probably run at least a poll to see how people feel about the different types of words you have removed if you want to include this in Masterwork. Otherwise you would probably want a way to toggle this on/off for Masterwork in the way Meph has everything else optional.

Ah okay, I think a full overview is needed, but I'll give the short version as I digress too much (I ramble  :D).

Alright, back in August or September of last year, Meph and I were discussing ideas to further distinguish Masterwork from vanilla DF.  After tossing several ideas around, one thing that stuck was that Meph really wanted more options for the language files.  The language files are basically the building blocks for all the names of everything, from map regions and civilizations to unit names and hero titles: basically every time you see a weird word that is a name.  Anyways, I digress.  The endgame idea was to eventually lead up to an "Oxford dictionary" sized set of words to be added to the game, a truly Dwarfy game of go big or go home.

Personally I thought it was an awesome idea, and I took it and ran.  I scoured the web and found a copy of the Oxford dictionary... 35,000+ words with all the corresponding entry info.  I quickly culled out all the words that were too specific (terminology, nationalities, etc...), whittled it down to 14,000+ words, and then through a brute force effort forced what was left into a word file for Dwarf Fortress: basically all the nouns, adjectives, prefixes, and verbs.  I then pushed all the new words through another little language creator from the web to get the individual race languages.  Meph liked the initial version so much he pushed it in version 1.21, I believe.

After that I began to refine the list further: culling, combining, chopping, hacking, etc., trying to get a decent symbol list based on it (virtually impossible), cutting words people got upset over, and everything else.  The second and third versions were pushed in 1.22 and 1.23.  My 4th version never made it, as it was supposed to go into 1.24... but at that point it was the biggest mess I had ever seen and really... I couldn't look at it.

Well, I let it sit from December to March, worked out the kinks of DF 43.05 x64 for Meph, and now I've returned to recreate the language files in a better way.

So to answer your questions, in order:
Yes, this will be flavour, but it does affect how you see the game as names become more flexible.

Some would say it does affect the simulation, since currently the names of a race, civ, etc. are based on the symbol lists that attach to the word list.  So if you think you've seen too many of one name in your fort, it's because most of the original symbol lists are only 50 to 100 words long.  Units are named {single_word compound_word}, making for approximately 50 first names (a lot of symbols are adjectives, which can't be a first name) and approximately 250 last names (other restrictions stop some words from functioning in a compound, or allow them only in the first or second half... so you get a lot of the same...).
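
To put rough numbers on that, here is a back-of-the-envelope sketch in Python.  The 50 and 250 are the approximate counts from this post; the fort size of 200 is just an assumed figure for illustration, and it assumes first names are handed out evenly.

```python
# Rough estimate of name variety, using the approximate counts from the post.
first_names = 50     # ~50 usable first names (estimate from the post)
last_names = 250     # ~250 usable compound last names (estimate from the post)

# Every first name can pair with every last name.
full_names = first_names * last_names

fort_size = 200      # assumed fort population, for illustration only
avg_sharing_first = fort_size / first_names

print(full_names)          # 12500 possible full names
print(avg_sharing_first)   # 4.0 units per first name on average
```

So full names rarely collide, but with so few first names you would expect several units in a big fort to share one.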

I listed the English only because that's the point in the restart of the mod I'm at.  I'll generate the language versions through a utility app that another mod writer built for the purpose of adding/creating languages.



This post was to discuss what people wanted to see added that isn't in the current lists above.  Posts over the last 4 months about the last version indicated the following:
1.  It needs to be smaller - it affects load time, at the point where it says it's loading languages and the 2 steps after that, primarily due to the last version being 10,000+ words.

2.  It needs to be cleaned - the brute force approach caused some verb forms and plurals to be incorrectly expanded (i.e. misspelled words).  I started cleanup... but its problems multiplied.

3.  Translations needed shorter words - due to a shortage of "syllables" in the available language files, the words of certain languages (nagas, orcs, kobolds primarily) were increased in length to accommodate the total number of words.  Really I should have just added new syllables to their languages, which would have increased the number of potential words (100 3-letter syllables = 10,000 words at 6 letters (2 syllables) in length).  If I remember right, orcs only had 72 syllables averaging 3 letters in length, so when I let it choose 3 to 4 syllables per word to reach 12,000+ words, it created a bunch of 9 to 12 letter words... a real cause for alarm when your orc immigrant comes in and its entire name is 36 letters long.  BASICALLY: problems kept multiplying, and the players noticed.  Nagas were WAY worse... but it wasn't as big of a deal as no one is able to play nagas.
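
The syllable arithmetic above can be sketched in a few lines of Python.  This is only an illustration under simplifying assumptions (every word has the same number of syllables, every syllable is 3 letters, and every combination is allowed and unique); min_syllables is a made-up helper name, not part of any tool.

```python
def min_syllables(num_syllables: int, words_needed: int) -> int:
    """Smallest k such that num_syllables ** k >= words_needed."""
    k = 1
    while num_syllables ** k < words_needed:
        k += 1
    return k


LETTERS_PER_SYLLABLE = 3  # assumed average, as in the post

for syllables in (72, 100, 110):
    k = min_syllables(syllables, 12_000)
    # Print: syllable count, syllables per word, letters per word.
    print(syllables, k, k * LETTERS_PER_SYLLABLE)
```

By this estimate, 72 syllables cannot cover 12,000 words at two syllables each (72 x 72 = 5,184), so words get pushed to three syllables and 9 letters.  The same 72 syllables would cover a ~3,250 word list at two syllables, and it takes roughly 110 syllables to cover 12,000 words that way.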

4.  Its symbol table is corrupted, mismatched, etc.  Yeah, it's a problem of man hours; really you have to manually write this part, which sucks: going through thousands of entries trying to find a dozen or so that match a current symbol group, etc.



I just removed the expanded mod with the 1.24c upload.  It may upset some to see it go... but considering that it's had mixed reviews in its current format, mostly negative, that shouldn't be a problem.  I think out of all the posts, comments, messages, and emails I've received over the mod, I've had maybe 5 positives and the rest were from the complaints department.  I keep moving forward.  I know the old version was bad, is crap, and will always be horrible.  I'm better than that.  Which brings me to today....

It needed an overhaul and a lot of body work... fixing the original would have cost more than it was worth.  If I had kept plugging at it like I had, it would only have gotten worse.  Really it was like looking at an old Ford Pinto from the 70s sitting in the yard covered in rust.  Could it have been fixed? Yes.  Could it have been cleaned up? Yes.  But the real question is why would you want to?  To fix the old version would have taken a line by line proofreading of 80,000+ lines of text, chasing down every error.  I'd still be working on it at Christmas 2017, if I hadn't given up by then.

So I traded it in, and went to work with new parts that I can custom fit now: faster, sleeker, better quality, and more befitting what the community deserves.  Basically I combed the raws and the dictionary to build a list of words not in the current language files that would be beneficial, and produced the above list: a short list of ~1100 words missing from the original vanilla DF that should have been in there.  Once I finish this version, I'll move on to an additional list for masterwork, and be done.

As for your toggle idea... well, until Meph returns and makes changes to the MW GUI, that's going to be impossible.  Really, if I put it in the 1.24, it will only be as a complete work that I could and would be proud of.

Further note: if the community really isn't interested in this, well then... I have other projects I'll move on to.  I have 2 races I want to work on, including a variation of the orcs and a zergling-esque race I haven't named.  I also would like to fully flesh out the masterwork clay reactions that fell apart somewhere between 42.06 and 43.03.  Or chase down other "raw" changes that would improve gameplay.

-Nihil-

  • Bay Watcher
Re: Expanded Dictionary - masterwork edition please discuss.
« Reply #4 on: April 11, 2017, 02:48:31 pm »

Seems worthwhile to me. Sounds like you know what you should do too. Errors need to be fixed obviously, much easier to start anew than try to find them as you suggested. If you can make shorter words(and therefore names) that still sound like they fit then that would help a lot with being able to recognize individuals in the fort. Both those things seem to provide pure benefits. Once we have that in then it's a question of balance between more words and the loading times that come with it.

I would be interested in seeing this utility app to translate in to new languages. I have been working on my own race mod and have been dreading going through the 18 000 line language files to create a unique language.

In terms of feedback I would much rather have the words in your second list than the first. Would rather have places and people named for concepts than animals. Obviously we would want words like bird, rodent, lizard and what not but I think specific animals(like blue jay, cockatiel, and rainbow trout) would be appreciated less than words like arcane, grace, and massacre. Basically I guess I am saying you could cull a lot of specific animals names out of the dictionary.

grotball

  • Bay Watcher
Re: Expanded Dictionary - masterwork edition please discuss.
« Reply #5 on: April 11, 2017, 08:08:12 pm »

I guess I haven't seen the dictionary as an issue to be fixed, or big benefits versus the effort, but I don't tend to play other races either. Swear/vulgar words have popped up a few times in my fortresses with MW, and I find it immersion-breaking and not particularly hilarious (coming from someone with a pretty foul mouth), but I can see how some people find humour in it/youtube lets plays etc. I'd at least make those opt-in.

-Nihil-

  • Bay Watcher
Re: Expanded Dictionary - masterwork edition please discuss.
« Reply #6 on: April 11, 2017, 08:16:48 pm »

Quote
I guess I haven't seen the dictionary as an issue to be fixed, or big benefits versus the effort, but I don't tend to play other races either. Swear/vulgar words have popped up a few times in my fortresses with MW, and I find it immersion-breaking and not particularly hilarious (coming from someone with a pretty foul mouth), but I can see how some people find humour in it/youtube lets plays etc. I'd at least make those opt-in.

I could go either way with some of those words. I don't really notice them myself but that might be because I don't really play adventure mode. Maybe it would make sense to have them included, assign them to a "vulgar" symbol and then cull that symbol in the civilized races? I think that should keep races from using those in place names.

As to the benefits, I would say the largest one is more names being available. It can be really hard to remember which orc is which when there are so many with the same first name.

Amostubal

  • Bay Watcher
Re: Expanded Dictionary - masterwork edition please discuss.
« Reply #7 on: April 11, 2017, 08:53:42 pm »

Quote
Seems worthwhile to me. Sounds like you know what you should do too. Errors need to be fixed obviously, much easier to start anew than try to find them as you suggested. If you can make shorter words(and therefore names) that still sound like they fit then that would help a lot with being able to recognize individuals in the fort. Both those things seem to provide pure benefits. Once we have that in then it's a question of balance between more words and the loading times that come with it.
Yep, all reasons I wanted to do this rewrite sooooooo bad.... really it was my first DF mod and I hacked and slashed my way through a lot of stuff that I really should have slowed down and read.

Quote
I would be interested in seeing this utility app to translate in to new languages. I have been working on my own race mod and have been dreading going through the 18 000 line language files to create a unique language.
Igfig's DFLang - basically you give it the already existing language files, and it builds the new language translation file off those.  There are options for it to use a partial list of translation words.  That's what I would do for dwarves: it takes the larger word file, sees which words are missing from the current translation file, and adds the new translations to the dwarf file, based on the words already present.  It will also remove translations if a word is removed from the language_word.txt file.  It can also produce an entire language from scratch from a base of 100+ words you enter into a list.  So just start typing what I would call a ramble, ignore punctuation, and put in what you want the words to sound like.  Boltgun used this one to produce the succubus language by entering a couple hundred demon names from history into it.  Can't decide?  Get a random list of words from some known language (I'm thinking French for a saucy sexy sounding language, or Creole for a saucy sexy rugged sounding language) and shove it into the program.  It works great, but it has its limits: it will crash if there aren't enough syllables to cover the number of words in the word list within the designated length... which is what I was talking about above.
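
To illustrate the "keep what exists, add what's missing, drop what was removed" idea, here is a minimal Python sketch.  This is NOT DFLang's code or API; sync_translations, its parameters, and the sample words and syllables are all made up for the example, and real translation files would need parsing first.

```python
from itertools import product
import random


def sync_translations(words, translations, syllables, max_syllables=2, seed=0):
    """Return a translation dict covering exactly `words` (hypothetical helper).

    Existing translations are kept, entries for words no longer in the word
    list are dropped, and missing words get an unused syllable combination.
    """
    kept = {w: translations[w] for w in words if w in translations}
    missing = [w for w in words if w not in kept]
    used = set(kept.values())

    # Every distinct string of 1..max_syllables syllables not already in use.
    candidates = sorted(
        {
            "".join(combo)
            for n in range(1, max_syllables + 1)
            for combo in product(syllables, repeat=n)
        }
        - used
    )
    if len(candidates) < len(missing):
        # Mirrors the limit described above: too few syllables for the list.
        raise ValueError("not enough syllables for the word list at this length")

    random.Random(seed).shuffle(candidates)
    kept.update(zip(missing, candidates))
    return kept


words = ["AXE", "BEER", "RAINBOW"]                   # current word list
old = {"AXE": "cog", "BEER": "kal", "TROUT": "zol"}  # TROUT was removed
new = sync_translations(words, old, ["ur", "ist", "kol"])
print(new)
```

AXE and BEER keep their old translations, TROUT is dropped, and RAINBOW gets a fresh word built from the sample syllables.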

Quote
In terms of feedback I would much rather have the words in your second list than the first. Would rather have places and people named for concepts than animals. Obviously we would want words like bird, rodent, lizard and what not but I think specific animals(like blue jay, cockatiel, and rainbow trout) would be appreciated less than words like arcane, grace, and massacre. Basically I guess I am saying you could cull a lot of specific animals names out of the dictionary.

I have a friend named bluejay.... her mom is flower, and her dad sunbeam.... hippy children from the american 60s.... lol.

Interesting thought.  In the original word list from vanilla DF there were several animals and plants, even some that don't actually occur in game.  Of course dwarves wouldn't be named after plants and animals; their symbols don't match.  You have to assign the animal to a symbol group, then select the symbol in the entity file with these tags:

[SELECT_SYMBOL:{target}:{symbol}] - prefer these symbols for target
[SUBSELECT_SYMBOL:{target}:{symbol}] - prefer these symbols as adjectives for target
[CULL_SYMBOL:{target}:{symbol}] - refuse to use these on these targets.
where {target} is one of ALL, REMAINING, BATTLE, BRIDGE, CIV, LIBRARY, MILITARY_UNIT, RELIGION, ROAD, SIEGE, SITE, TEMPLE, TUNNEL, VESSEL, WALL, WAR

so like with dwarves in masterwork, all these symbol tags are used in this order:
   [SELECT_SYMBOL:WAR:NAME_WAR]    - so war gets to be named with NAME_WAR symbols.
   [SUBSELECT_SYMBOL:WAR:VIOLENT] - adjective for violent wars
   [SELECT_SYMBOL:BATTLE:NAME_BATTLE]
   [SUBSELECT_SYMBOL:BATTLE:VIOLENT]
   [SELECT_SYMBOL:SIEGE:NAME_SIEGE]
   [SUBSELECT_SYMBOL:SIEGE:VIOLENT]
   [SELECT_SYMBOL:ROAD:NAME_ROAD]
   [SELECT_SYMBOL:TUNNEL:NAME_TUNNEL]
   [SELECT_SYMBOL:BRIDGE:NAME_BRIDGE]
   [SELECT_SYMBOL:WALL:NAME_WALL]
   [SELECT_SYMBOL:REMAINING:ARTIFICE] - all remaining targets get artifice (tools weapons some archaic stuff)
   [SELECT_SYMBOL:REMAINING:EARTH] - same but with earth stuff (rocks, stones, mountains, hills, etc)
   [CULL_SYMBOL:ALL:DOMESTIC] - get rid of all domestic terms
   [CULL_SYMBOL:ALL:SUBORDINATE] - dwarves aren't subordinates
   [CULL_SYMBOL:ALL:EVIL] - dwarves don't use evil symbols etc
   [CULL_SYMBOL:ALL:FLOWERY]
   [CULL_SYMBOL:ALL:NEGATIVE]
   [CULL_SYMBOL:ALL:UGLY]
   [CULL_SYMBOL:ALL:NEGATOR]
   [SELECT_SYMBOL:TEMPLE:NAME_BUILDING_TEMPLE] - earlier artifice was added then all the stuff after removed... now add name_building_temple to temples
   [SELECT_SYMBOL:LIBRARY:NAME_BUILDING_LIBRARY] - same for libraries.

So let's say OGRE has the symbols VIOLENT and UGLY.  At first it's added to the adjective lists for wars etc... then because of [CULL_SYMBOL:ALL:UGLY] it's removed from all those lists, even though it is violent.  Plants are generally tossed in FLOWERY, animals in NATURE or DOMESTIC, food in DOMESTIC, some animals are NATURE and VIOLENT (ape, lion, etc.), and a lot of aquatic stuff is marked with the symbol AQUATIC... well, we never added that group, so unless the animal has some other symbol it won't show up for dwarves.  What this does do is that elves, who don't cull on DOMESTIC and do select on NATURE and FLOWERY, will end up with more flowery names.  The first list also includes items, food, weapons, and tools that would end up in the VIOLENT categories, ARTIFICE categories, etc.  It also includes all the inorganic rocks, which go into the EARTH category, so you will have dwarves named gneiss and granite, jet and adamantine.  It also contains career names, so like humans do today, you will have dwarves who show up with names like smith and carpenter, mechanic and engineer, as these will be included in certain symbol groups.

Anyways, until symbols are laid, new words won't even show up often in most names; randomly yes, but once symbols are laid, dwarves being called trout would be rare, as long as trout aren't put in the EARTH symbol.  Giving the word a symbol makes it more likely to get selected or culled from word selection, and random prefers not to select outside the preferred symbols for the civ.

Another cool feature of the symbol system is you can create your own symbol groups.  Let's say we do want the professions to be under one symbol group called DWARF_PROFESSIONS; we can do that in the language_SYM file.  Then in the dwarf entity file, toss in a [SELECT_SYMBOL:REMAINING:DWARF_PROFESSIONS].  Or we could just call it PROFESSIONS, and then also mark some things, say cheesemaker, as DOMESTIC; then after we select the professions symbol, the cull of all domestic words afterwards would remove the option for cheesemaker.  The main issue is order: if you cull before you add a symbol, it won't block cheesemaker...
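
The select-then-cull behaviour described in this post can be modelled with a toy Python sketch.  This is a simplified model of what the post describes, not how the game engine actually processes the tags: it ignores targets and SUBSELECT, build_pool is a made-up name, and the symbol assignments for each word are assumed for illustration.

```python
def build_pool(tags, word_symbols):
    """Apply (action, symbol) tags in order and return the resulting word pool."""
    pool = set()
    for action, symbol in tags:
        matching = {w for w, syms in word_symbols.items() if symbol in syms}
        if action == "SELECT":
            pool |= matching   # add every word carrying this symbol
        elif action == "CULL":
            pool -= matching   # drop words added so far that carry it
        else:
            raise ValueError(f"unknown action: {action}")
    return pool


# Symbol assignments below are assumed for illustration only.
word_symbols = {
    "OGRE": {"VIOLENT", "UGLY"},
    "AXE": {"VIOLENT", "ARTIFICE"},
    "GRANITE": {"EARTH"},
    "TROUT": {"AQUATIC"},
    "SMITH": {"PROFESSIONS"},
    "CHEESEMAKER": {"PROFESSIONS", "DOMESTIC"},
}

dwarf_tags = [
    ("SELECT", "VIOLENT"),
    ("SELECT", "EARTH"),
    ("SELECT", "PROFESSIONS"),
    ("CULL", "DOMESTIC"),
    ("CULL", "UGLY"),
]
print(sorted(build_pool(dwarf_tags, word_symbols)))  # ['AXE', 'GRANITE', 'SMITH']

# Culling before selecting removes nothing, so CHEESEMAKER slips through.
early_cull = [("CULL", "DOMESTIC"), ("SELECT", "PROFESSIONS")]
print(sorted(build_pool(early_cull, word_symbols)))  # ['CHEESEMAKER', 'SMITH']
```

In this model OGRE is selected as VIOLENT and then culled as UGLY, TROUT never gets in because AQUATIC is never selected, and the second run shows why the cull has to come after the select.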

the main thing is to add more options to current symbols and potential new symbols for future additions and deletions.  we could group numerals as numerals, then cull or add them (add them to a race of drones. so that everything is numbered....).

So the animals and plants would increase for the hippy... I mean elf children names.  (The dwarven kingdom has reached a peace agreement with King Rainbow Trout Flowerchild.)

Quote
I guess I haven't seen the dictionary as an issue to be fixed, or big benefits versus the effort, but I don't tend to play other races either. Swear/vulgar words have popped up a few times in my fortresses with MW, and I find it immersion-breaking and not particularly hilarious (coming from someone with a pretty foul mouth), but I can see how some people find humour in it/youtube lets plays etc. I'd at least make those opt-in.
Really, I see it as a break in the monotony when odd names pop up (industrialfreak once showed up on a randomly selected mechanic...).  It makes MW different, as we don't end up with hundreds of urists etc.

Yeah, most of the vulgar words and anatomy words are out.  Really, the fortress of fiery vaginas is not a name I want showing up...

now double entendre words?  yeah those make for interesting names.  Words that have one meaning, but also a subtle meaning...

Quote
I could go either way with some of those words. I don't really notice them myself but that might be because I don't really play adventure mode. Maybe it would make sense to have them included, assign them to a "vulgar" symbol and then cull that symbol in the civilized races? I think that should keep races from using those in place names.

As to the benefits, I would say the largest one is more names being available. It can be really hard to remember which orc is which when there are so many with the same first name.

More of what I was talking about above.  We can always cull the words from different groups.  The biggest problem with the huge dictionary was that getting things into symbol groups was impossible.  New symbol groups can be created, and then a switch to turn off those symbol groups will be possible later.  Say, a [CULL_SYMBOL:ALL:EXPANDED_DICTIONARY] could turn off the use of the additional words altogether, with a switch adding the line to all the entity files.  Similar for "vulgar" etc.  Also, symbol groups such as SUCCUBUS_PREFER and SUCCUBUS_DETEST can be added to cull or select specific words for a particular civilization.
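
As a rough idea of what such a switch could do, here is a hedged Python sketch that adds a cull line to entity raw text.  It is hypothetical: add_cull_line is a made-up name, this is not how the Masterwork GUI works, and it assumes the line should go right after the last symbol tag of each entity, since culls need to come after selects.

```python
import re

# Matches the three symbol tags discussed in this thread.
SYMBOL_TAG = re.compile(r"\[(?:SELECT|SUBSELECT|CULL)_SYMBOL:")


def add_cull_line(raw_text, symbol):
    """Insert [CULL_SYMBOL:ALL:<symbol>] after each entity's last symbol tag."""
    cull = f"\t[CULL_SYMBOL:ALL:{symbol}]"
    if cull.strip() in raw_text:
        return raw_text  # crude check: assume the switch is already on

    lines = raw_text.splitlines()
    insert_after = []        # line indices that get a cull line after them
    last_symbol_line = None
    for i, line in enumerate(lines):
        if line.lstrip().startswith("[ENTITY:"):
            # A new entity starts; close out the previous one.
            if last_symbol_line is not None:
                insert_after.append(last_symbol_line)
            last_symbol_line = None
        elif SYMBOL_TAG.search(line):
            last_symbol_line = i
    if last_symbol_line is not None:
        insert_after.append(last_symbol_line)

    # Insert from the bottom up so earlier indices stay valid.
    for i in reversed(insert_after):
        lines.insert(i + 1, cull)
    return "\n".join(lines) + "\n"


sample = (
    "[ENTITY:MOUNTAIN]\n"
    "\t[SELECT_SYMBOL:WAR:NAME_WAR]\n"
    "\t[CULL_SYMBOL:ALL:UGLY]\n"
    "\t[CREATURE:DWARF]\n"
    "[ENTITY:FOREST]\n"
    "\t[SELECT_SYMBOL:ALL:NATURE]\n"
)
print(add_cull_line(sample, "EXPANDED_DICTIONARY"))
```

Entities with no symbol tags at all are left untouched in this sketch, so a real switch would need to decide what to do with those.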

« Last Edit: April 12, 2017, 06:09:06 am by Amostubal »

-Nihil-

  • Bay Watcher
Re: Expanded Dictionary - masterwork edition please discuss.
« Reply #8 on: April 11, 2017, 11:30:50 pm »

Quote
so the animals and plants would increase for the hippy... I mean elf children names.  (the dwarven kingdom has reached a peace agreement with king rainbow trout flowerchild).

Not against all the animals names as I am definitely considering this for all races not just dwarves, I just think that I would rather have less specific animals in favour of some of the words in your other list. A lot of them, like rainbow trout and grey parrot could be constructed anyway with their individual words. You could at least remove those ones.

Amostubal

  • Bay Watcher
Re: Expanded Dictionary - masterwork edition please discuss.
« Reply #9 on: April 12, 2017, 06:21:28 am »

Ah yeah, I now understand what you mean.. I did chop certain ones and pushed the adjectives into separate words.  Like amaranth: vanilla had 7 varieties of amaranth, all different colors, so I chopped them.  Bears too... I chopped all the bear varieties out.  I'll scan back through there and post a quick switch over.  Some of those adjectives I want to keep.  There isn't a word for rainbow in vanilla DF, and elves need rainbows...