Topic: discuss: Future Proofing of (memory hacking)Utilities (Read 1743 times)

sphr · « **on:** December 04, 2007, 11:21:00 pm »

(edit: reformatted some stuff due to code section not being wrapped and causing a wider-than screen-width page)
Not really a modding post; more of a tools/utilities post. But since there is no tools/utilities subforum, and tool/utility topics seem to end up here, this is where I'll drop it.

BEFORE YOU CARRY ON:
This is meant for present/potential tool/utility developers only. I think nobody else will be interested, so save yourself the pain of reading a long post with gibberish-like code.

==(Limited) Future Proofing of (memory hacking)Utilities==

I am just getting started with re-learning the memory hacking all over again.

One problem I find with trying to create a utility is the rapidly changing version and the need to keep things updated, recompile and upload.

On the user end, they have to deal with multiple versions of the same tool, each of which works for only one version of dwarf fortress (I don't know about others, but I keep my previous versions around in case there are insurmountable bugs in the latest version, etc.).

What I'm thinking of is, instead of hard-coding/static-compiling all the memory offsets into the binary, why not add an optional means of loading a new "Memory Map" from a text file? Whenever the memory map changes in a new version, the updated "config" file could be posted on the wiki, and people could just copy and paste the config data to make their tools work with the latest version, provided the attributes the tools use are defined.
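A rough sketch of the loading side, assuming a made-up `name = offset` line format with `#` comments. The format, `OffsetTable` and `ParseOffsetTable` are all hypothetical names for illustration, not something any existing tool uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical format: one "name = 0xOFFSET" pair per line, '#' starts a comment.
using OffsetTable = std::map<std::string, std::uint32_t>;

static std::string Trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    assert(last >= first);
    return s.substr(first, last - first + 1);
}

// Returns every well-formed entry in the text; malformed lines are skipped.
OffsetTable ParseOffsetTable(const std::string& text)
{
    OffsetTable table;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line))
    {
        const std::size_t hash = line.find('#');
        if (hash != std::string::npos) line.erase(hash);
        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;
        const std::string name = Trim(line.substr(0, eq));
        const std::string value = Trim(line.substr(eq + 1));
        if (name.empty() || value.empty()) continue;
        try
        {
            std::size_t used = 0;
            // Base 0 lets std::stoul accept both "28" and "0x1C".
            const unsigned long offset = std::stoul(value, &used, 0);
            if (used == value.size() && offset <= 0xFFFFFFFFul)
                table[name] = static_cast<std::uint32_t>(offset);
        }
        catch (const std::exception&)
        {
            // Not a number: skip the line.
        }
    }
    return table;
}
```

A tool would call `ParseOffsetTable` on the text pasted from the wiki and fall back to its compiled-in offsets for anything the file does not define.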

To make the same config work across different tools, perhaps we can develop a library that accepts well-known attribute names defined by the community. E.g. instead of assuming that the nickname of a creature is at 0x001C from the starting address of a known creature struct, users of the library use the defined attribute name "nick_name" instead. Resolving the name to the actual address offset is handled by the library. In the ideal case, the library should also provide wrappers to work with these offsets, so that the tool developer never has to deal with addresses after binding to the main dwarf fortress process. The process binding itself could even be part of the library.
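A minimal sketch of the name-to-offset idea, assuming 32-bit addresses. `MemMap`, `Define`, `Resolve` and `NickNameAddress` are hypothetical names, not an existing library.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical: one map per struct, holding attribute name -> offset.
class MemMap
{
public:
    void Define(const std::string& attribute, std::uint32_t offset)
    {
        assert(!attribute.empty());
        offsets_[attribute] = offset;
    }

    // Writes the absolute address of `attribute` for a struct at `base`.
    // Returns false when the attribute is not defined for this version.
    bool Resolve(std::uint32_t base, const std::string& attribute,
                 std::uint32_t& address) const
    {
        const auto it = offsets_.find(attribute);
        if (it == offsets_.end()) return false;
        address = base + it->second;
        return true;
    }

private:
    std::map<std::string, std::uint32_t> offsets_;
};

// Tool code asks for "nick_name" and never sees the raw offset.
bool NickNameAddress(const MemMap& creature_map, std::uint32_t creature_base,
                     std::uint32_t& address)
{
    return creature_map.Resolve(creature_base, "nick_name", address);
}
```

The tool only breaks when an attribute disappears from the map, and in that case `Resolve` reports it instead of silently reading the wrong bytes.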

====A hopefully motivating example====
This is a mockup of a partial input file that helps define the memory maps for a particular dwarf fortress version

code:


<!--
  The following defines a memory map called  "creature_ptr" which "specializes"
  from a primitive type "pointer" and value_type denotes the memory map
  name of pointed data.
-->
<defmap id="creature_ptr" type="pointer" value_type="creature"/>
<!--  
    The following defines a memory map called  "creature_vector" which 
    "specializes" from "vector" type.
    A type-specific attribute "value_type" optionally denotes the contained 
    type.  If this type does not have a size which is statically known, the 
    vector cannot use indexed retrieval and falls back to just being a block 
    of bytes
-->
<defmap id="creature_vector" type="vector"/>
<!--  
    The following defines a memory map called  "creature" which "specializes" 
    from a primitive "complex" type.   Complex maps are the only ones that 
    can have submaps.
-->
<defmap id="creature" type="complex">
  <submap/>
  <submap/>
  <submap/>
  <submap/>
  <other/>
</defmap>
<!-- 
  this defines a memory map called  "main" which "specializes" from "complex" 
  type.  Specialized data are contained as children. 
-->
<defmap id="main" type="complex">
  <submap/>
  <other/>
</defmap>

Code-wise, say after the memory maps are loaded and placed in a registry (and can be retrieved by name), we can then create "memory objects", each of which is just a binding of a specific address/process handle to a memory map. The following is just rough work that doesn't fully make use of all the metadata available in the mock data earlier.

e.g. (c++ on windows platform assumed)

code:


  HANDLE handleDF;
  // proceed to find and retrieve handle for dwarf fortress process, detect
  // version, load correct memory map definitions, etc.
  /* for the rest of mock code, assume:
    suffix of Sptr to be shared/smart ptr.
    ??MemMap to be a memory map instance
    ??Object is like an io formatter for a memory address and a particular map
    In short, ??MemMap just denotes where the offsets to certain known
    attributes are, ??Object is the one that tries to interpret a memory
    location as the specified ??.
  */
    // Optional phase to verify that all things we are going to use are
    // defined.  If not, should probably terminate the application with
    // error messages where appropriate.
    // start of testing.
      MemMapBaseConstSptr mmap;
      ComplexMemoryObjectSptr temp;
      // **** verify main process map ****
      // verify required memmap exists
      mmap = MemMapRegistry::GetMap("main");
      ASSERT(mmap);
      // verify required memmap attributes exist
      ASSERT( mmap->GetMapping("main_creature_vector_loc") );

      // **** verify creature struct map ****
      // verify required memmap exists
      mmap = MemMapRegistry::GetMap("creature");
      ASSERT(mmap);
      // verify required memmap attributes exist
      ASSERT( mmap->GetMapping("first_name") );
      ASSERT( mmap->GetMapping("nick_name") );
    // end of testing
  /*
      Note that each tool application does not need to verify the whole memory
      structure.  It only has to verify the parts that it uses.  For example,
      foreman.exe uses labour settings and professions.  It does not need to
      verify things like inventory or health.
  */

  // call to registry to return shared ptr to a memory map called "main"
  MemMapBaseConstSptr spDFMemMap = MemMapRegistry::GetMap("main");
  if( spDFMemMap == NULL )
  {
    // report error.  memory map not found.
    // redundant if testing is done before
  }
  else
  {
    ComplexMemoryObjectSptr  spMainDF =
      ComplexMemoryObject::New(   0     // no offset address
                                , handleDF  // process handle of dwarfort.exe
                                , spDFMemMap
                              );
    ASSERT( spMainDF != NULL ); // assume no prob
    VectorObjectSptr creature_vector = VectorObject::New(
          spMainDF->GetSubObject("main_creature_vector_loc")
        , MemMapRegistry::GetMap("pointer") ) ;
        // need to specify a map with known size to make vector iterable.
    ASSERT( creature_vector  != NULL ); // assume no prob
    // iterate through creature vector
    for( VectorObject::Iterator it = creature_vector->Begin()
        ; it != creature_vector->End()
        ; ++ it )
    {
      ComplexMemoryObjectSptr creature =
             ComplexMemoryObject::New( PtrObject(*it).Dereference()
                                    ,  MemMapRegistry::GetMap("creature")
                                    );
      ASSERT( creature  != NULL ); // assume no prob
      // retrieve first and nick names
      std::string first_name =
        StringObject(creature->GetSubObject("first_name")).GetValue();
      // declare var for nick name instead to allow easier write back
      StringObject nick =
        StringObject(creature->GetSubObject("nick_name"));
      std::string nick_name = nick.GetValue();
      // change nickname and rewrite it back.  Note that if new string is
      // longer than existing string capacity, it gets truncated.  I don't
      // think it is feasible to try to do cross-process string reallocation.
      nick_name = "Newbie";
      nick.SetValue(nick_name);
      // note: should throw an exception on failure if exceptions are enabled.
    }
  }

The above mockup end-coder code is entirely independent of any addresses/offsets peculiar to a process.

It does still have this shortcoming:
The library user has to know the type of each attribute, e.g. that first_name and nick_name are strings. ComplexMemoryObject::GetSubObject just retrieves the address. Currently, the application has to interpret that (e.g. as a string) to get the data properly. If, say, ToadyOne decides that first names and nicknames are now stored using ICU's UnicodeString instead of std::string... Bam... broken app. As a possible extension, notice that "type" is actually placed into the mockup xml data, which is unused atm. It could be used to switch between ways of interpreting the offset address instead of simply assuming one, as the code does right now...
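A minimal sketch of that extension, assuming each map entry carries a type tag next to its offset. The two encodings here (NUL-terminated and length-prefixed) are stand-ins chosen for illustration; they are not a claim about how DF stores its strings, and `FieldType`, `Field` and `ReadText` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical type tags that a memory map entry could carry.
enum class FieldType { CString, LengthPrefixed };

struct Field
{
    std::size_t offset;
    FieldType type;
};

// Decodes a text field from a snapshot of the struct's bytes,
// choosing the decoding from the field's type tag.
std::string ReadText(const std::vector<std::uint8_t>& bytes, const Field& field)
{
    if (field.offset >= bytes.size())
        throw std::out_of_range("field offset outside snapshot");
    switch (field.type)
    {
    case FieldType::CString:
    {
        std::string out;
        for (std::size_t i = field.offset; i < bytes.size() && bytes[i] != 0; ++i)
            out.push_back(static_cast<char>(bytes[i]));
        return out;
    }
    case FieldType::LengthPrefixed:
    {
        const std::size_t length = bytes[field.offset];
        const std::size_t start = field.offset + 1;
        if (length > bytes.size() - start)
            throw std::out_of_range("length prefix runs past snapshot");
        std::string out;
        for (std::size_t i = 0; i < length; ++i)
            out.push_back(static_cast<char>(bytes[start + i]));
        return out;
    }
    }
    assert(false && "unknown field type");
    return std::string();
}
```

With this, a change of string representation becomes an edit to the type tag in the config file, plus one new decoder in the library, rather than a change in every tool.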

Feasibility:
Coding-wise, not much of a problem. Although I use mockup code in the above, I think it can be turned into real code without many changes, as the concepts behind it are almost fully thought out. The most troublesome part, I find, will just be the serializing to and from xml/some other text file (labour-wise, rather than design-wise). The only thing left is that enough tool makers (eh.. at least 2?) have to support this to make the effort worthwhile, both to develop the library and to prepare a new "config file" for each new version.

Anybody game for it?

[ December 04, 2007: Message edited by: sphr ]

Sean Mirrsen · « **Reply #1 on:** December 05, 2007, 01:14:00 am »

I ain't much of a programmer, but as an "end user", I have this to say. If all this is used for is circumventing bugs and debugging mods, then it will be less work for Toady to just enable a wizard mode. Seeing through walls, altering status of creatures, etc, etc - it is all a lot more easily reachable from within the game, not to mention that Toady quite probably already uses a similar mode when debugging. As for "proofing", I think that as a game, DF should not be hackable, or at least that hacking it should not be supported.

sphr · « **Reply #2 on:** December 05, 2007, 01:26:00 am »

hmm.. I'm not sure about your definition of hacking here..

ToadyOne is working on what he's working on. Some of the stuff we want are not CHEATS, but rather helpers (e.g. Foreman, which allows you to enable/disable job types for all members of a profession) meant to save lots of pain. Maybe another one will watch for and warn the player about wounded/unhappy dwarves. No "hacking" of the game as you might have thought, although the very process of reverse engineering the memory structure in order to provide those things is already a form of hacking.

This is just a suggestion of something that will make TOOLS more future proof... NOT DWARF FORTRESS ITSELF. As in, say I use foreman.exe, I don't have to keep a different exe for every single DF version. And when a new one comes out, I don't have to wait for the tool maker, who has to in turn wait for somebody to document the changes in memory offsets in order to compile a new binary, which then has to be redistributed. Instead, we have one binary compatible with all supported versions of DF through the use of hardcoded and config file defined data.

But I think I should stop. This is more for people who want to improve the whole game experience (at least temporarily until Toady catches up with the interface). If you are one to happily wait for all that, well, good for you. All in all, if you are just talking about the morals and the shoulds and should-nots of "hacking", this is not really the thread for you (no offence intended).

Dedas · « **Reply #3 on:** December 05, 2007, 09:30:00 am »

Sphr, you are doing a grand job, and despite what some people say I really like it. Keep it up!

sphr · « **Reply #6 on:** December 07, 2007, 05:56:00 am »

The config file part is the extra option. mmaps for known versions of DF can and should be inbuilt. Within the file, we can just give the main map a different name, e.g. instead of a generic "main", we could name it "df_??", where ?? is the version. The nice thing about a combined config file is that structures that don't change can be referred to by subsequent maps.
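A minimal sketch of per-version maps that reuse unchanged structures, assuming each map may name a parent to fall back on. `VersionMap`, `MapRegistry` and the `parent` field are hypothetical, and the map names and offsets in the usage note are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

struct VersionMap
{
    std::string parent;                            // empty when there is none
    std::map<std::string, std::uint32_t> offsets;  // only what differs from parent
};

class MapRegistry
{
public:
    void Add(const std::string& name, const VersionMap& map)
    {
        assert(!name.empty());
        maps_[name] = map;
    }

    // Looks up `attribute` in map `name`, then in its chain of parents.
    bool Find(const std::string& name, const std::string& attribute,
              std::uint32_t& offset) const
    {
        std::string current = name;
        // The hop limit guards against a config file with a parent cycle.
        for (std::size_t hops = 0; hops <= maps_.size() && !current.empty(); ++hops)
        {
            const auto map = maps_.find(current);
            if (map == maps_.end()) return false;
            const auto hit = map->second.offsets.find(attribute);
            if (hit != map->second.offsets.end())
            {
                offset = hit->second;
                return true;
            }
            current = map->second.parent;
        }
        return false;
    }

private:
    std::map<std::string, VersionMap> maps_;
};
```

Inbuilt maps would be added at startup and config-file maps afterwards, so a map such as "df_33b" only has to list the offsets that moved since "df_33a".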

But the long and short of it is, tool makers deal with conceptual "memory objects" and not specific memory offsets. Then the person who maintains the tool logic and the person who updates the memory map for the latest binary can work independently and have their final products work together.

I chose c++ because it seems more appropriate for memory and pointer stuff, with the added possibility of porting to python/lua or other scripting languages. Actually, a scripting language seems like a good idea for one-shot tools too, although running it for somebody usually means the person has to install the whole package (activeperl, pythonwin, etc.). Having largely c++ / windows api code means that we can have smaller independent exes, or ones that use the native api to create a ui, without the need to install the scripting packages. But I'm open to anything that will allow me to make some UI workaround tools to make playing DF more productive (spend more time playing than hurting your fingers). Btw, I don't really know lua atm, being more of a python guy. It's not because of lack of interest, but rather lack of opportunity.

Question: Currently, how does a tool know what version of the binary it is trying to read?

Jifodus · « **Reply #7 on:** December 07, 2007, 01:52:00 pm »

To identify a version, you'd just pick identifying strings of bytes in the DF executable image memory (except the ".data" section), and if all of them match the reference strings of bytes, then it's probably that version. This method obviously isn't as perfect as having the user (mis)identify which version it is, but it'll undoubtedly work 99.99% of the time. At the moment, I don't think any of the tools listed on the wiki actually try to identify automatically which version they are looking at.

Also I chose Lua primarily for its easy interface with C/++ in addition to its embeddability. I would have embedded perl, but I already tried to embed perl in another program without success. I actually hadn't thought about using perl/python though I do suppose now that those languages do have capabilities to use ReadProcessMemory/WriteProcessMemory. But then yeah, there is that other issue, users would have to install the whole package + extra libraries... and hope they work.

There really doesn't need to be native C/++ API beyond: finding a DF process, opening the DF process, reading from the DF process memory, and writing to the DF process memory. However, I am making my framework a bit more useful by including access to VirtualQueryEx, a memory scanner*, a memory allocator/deallocator in the DF process*, more? I will say that my framework is enough for creating basic tools.
* not implemented yet
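A minimal sketch of how small that native layer can be, assuming 32-bit addresses. `ProcessMemory`, `FakeProcessMemory` and `ReadValue` are hypothetical names; on Windows a real implementation would presumably forward to ReadProcessMemory/WriteProcessMemory, while the in-memory fake here only exists so that tool logic can be tried without a running game.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// The two memory operations every tool needs, behind one interface.
class ProcessMemory
{
public:
    virtual ~ProcessMemory() {}
    virtual bool Read(std::uint32_t address, void* out, std::size_t size) const = 0;
    virtual bool Write(std::uint32_t address, const void* data, std::size_t size) = 0;
};

// Stand-in for a real process: a zeroed byte buffer that starts at `base`.
class FakeProcessMemory : public ProcessMemory
{
public:
    FakeProcessMemory(std::uint32_t base, std::size_t size)
        : base_(base), bytes_(size, 0) {}

    bool Read(std::uint32_t address, void* out, std::size_t size) const override
    {
        assert(out != nullptr);
        if (!InRange(address, size)) return false;
        std::memcpy(out, bytes_.data() + (address - base_), size);
        return true;
    }

    bool Write(std::uint32_t address, const void* data, std::size_t size) override
    {
        assert(data != nullptr);
        if (!InRange(address, size)) return false;
        std::memcpy(bytes_.data() + (address - base_), data, size);
        return true;
    }

private:
    // True when [address, address + size) lies inside the buffer.
    bool InRange(std::uint32_t address, std::size_t size) const
    {
        if (address < base_) return false;
        const std::size_t start = address - base_;
        return start <= bytes_.size() && size <= bytes_.size() - start;
    }

    std::uint32_t base_;
    std::vector<std::uint8_t> bytes_;
};

// Typed helper built on the raw interface; T should be a plain data type.
template <typename T>
bool ReadValue(const ProcessMemory& memory, std::uint32_t address, T& value)
{
    return memory.Read(address, &value, sizeof(T));
}
```

Everything above this layer (maps, memory objects, scripting bindings) can be written against `ProcessMemory` alone.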

Heh, I've never really needed to learn python so I don't know it. The closest I've gotten to python is looking at modding Mount & Blade, but since I never made any hacks using the official mod-maker I didn't learn python.

sphr · « **Reply #8 on:** December 08, 2007, 10:52:00 am »

thanks for the info! As for the identifying strings of bytes, any idea what they are so far? The wiki records the data map, but not so much on identification. Even if not accurate, it could be useful.

But assuming that we only run one dwarf fortress executable at a time (multiprocessor owners can probably run as many instances as they have processors, at least until DF goes multi-processing), I guess a not too nice way is to let the user make sure that the version of the tool/data map used is compatible with the executable that is being run.

BTW, I did a memory scan for 33b. It seems that there is at least one location that stores the version as a plain ascii string. Not sure if these locations are static though. The safest seems to be the starting menu's text itself. Although different versions probably have this at different locations, I think it can serve well enough to identify a particular known version?

I found the following locations.
The version strings themselves are the data to be expected.
("v0.27.169.33a", 0x01BA29B0)
("v0.27.169.33b", 0x01B99FF0)
("v0.27.169.33c", 0x01BA29B0)
("v0.27.169.33d", 0x01B78670)
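A minimal sketch of using the table above for detection, assuming the addresses stay valid at runtime (which is not confirmed). `Reader`, `KnownVersion` and `DetectVersion` are hypothetical names, and the reader callback stands in for whatever actually reads the DF process memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Reads `size` bytes at `address` from the target; returns false on failure.
using Reader = std::function<bool(std::uint32_t address, char* out, std::size_t size)>;

struct KnownVersion
{
    const char* text;       // ASCII expected at the address
    std::uint32_t address;  // taken from the scan results listed above
};

// Returns the matching version string, or "" when nothing matches.
std::string DetectVersion(const Reader& read)
{
    assert(read != nullptr);
    static const KnownVersion known[] = {
        {"v0.27.169.33a", 0x01BA29B0},
        {"v0.27.169.33b", 0x01B99FF0},
        {"v0.27.169.33c", 0x01BA29B0},
        {"v0.27.169.33d", 0x01B78670},
    };
    for (const KnownVersion& candidate : known)
    {
        const std::string expected(candidate.text);
        std::vector<char> buffer(expected.size());
        if (read(candidate.address, buffer.data(), buffer.size()) &&
            std::string(buffer.begin(), buffer.end()) == expected)
            return expected;
    }
    return "";
}
```

33a and 33c share an address in the table, but since the text itself is compared they are still told apart.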

[ December 08, 2007: Message edited by: sphr ]

Jifodus · « **Reply #9 on:** December 08, 2007, 01:42:00 pm »

quote:
Originally posted by sphr:
BTW, I did a memory scan for 33b. It seems that there are at least one location that stores the version as plain ascii strings. Not sure if these locations are static though. The safest seems to be the starting menu's text itself. although different versions prob have this at different locations, I think it can serve well enough to identify a particularly known version?

I would recommend against using the "menu" version strings... since I happen to know that the menu screen is stored in the index file.

I would recommend using something like PEBrowse to dump the ".text" section of the exe, copying ~16 characters from each of 10 different locations, and using them as the identifying strings. You will also want to keep track of the address you got each one from, because you will want to read the memory at that address and do the comparison that way.

If all 10 of the strings match, then it is that version (assuming you've done your research and none of the other versions match against the same 10 strings).
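The matching rule above can be sketched as follows, assuming signatures are stored as (address, bytes) pairs per version. `ByteReader`, `Signature`, `MatchesAll` and `Identify` are hypothetical names; the sketch also treats two matching versions as a failure, which is one way of catching signatures that were not researched well enough.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Reads `size` bytes at `address` from the target; returns false on failure.
using ByteReader =
    std::function<bool(std::uint32_t address, std::uint8_t* out, std::size_t size)>;

struct Signature
{
    std::uint32_t address;            // where the bytes were copied from
    std::vector<std::uint8_t> bytes;  // the reference bytes (~16 in the post)
};

// A version matches only if every one of its signatures matches.
bool MatchesAll(const ByteReader& read, const std::vector<Signature>& signatures)
{
    assert(read != nullptr);
    if (signatures.empty()) return false;  // nothing to compare is not a match
    for (const Signature& sig : signatures)
    {
        std::vector<std::uint8_t> actual(sig.bytes.size());
        if (!read(sig.address, actual.data(), actual.size()) || actual != sig.bytes)
            return false;
    }
    return true;
}

// Returns the name of the only matching version, or "" when none or
// more than one version matches.
std::string Identify(const ByteReader& read,
                     const std::map<std::string, std::vector<Signature>>& versions)
{
    std::string found;
    for (const auto& entry : versions)
    {
        if (!MatchesAll(read, entry.second)) continue;
        if (!found.empty()) return "";  // ambiguous: signatures are too weak
        found = entry.first;
    }
    return found;
}
```

The signature table itself is exactly the kind of data that could live in the config file next to the memory map for each version.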

sphr · « **Reply #10 on:** December 09, 2007, 12:42:00 am »

good point. I didn't look deep enough to realise it is loaded from an external data source, which will probably change if DF goes multilingual. For the time being, I'm going to be lazy and use the risky method (meaning it doesn't work if language files are changed), as I'm devoting most of my effort towards actually making the memory mapping easier.

As for "finger-printing" the binary, I was wondering if it would help if known methods are recorded in the wiki, so that they can be maintained and updated. It may also help if most new tools are based on a common fingerprinting system. What I was really hoping for is something to help new tool developers concentrate on tools, and not on all the system/process/memory offset management. Hopefully, by lowering the entry bar, we can get more innovative helper utilities (not necessarily cheats) to make DF more fulfilling.
