Bay 12 Games Forum

Author Topic: [memory hacking]: detecting binary version (take 2)  (Read 2017 times)

sphr

« on: December 10, 2007, 03:00:00 am »

ok, the last attempt at trying to find identity block (searching for a block where the block itself is the data wanted) in DF process VM was not really working out.  This is take two at known version detection:

This time round, I'm using a mixture of binary filesize and crc32 checking.  Crc32 provides the file signature.  Filesize is just an extra safety switch in case two versions end up with the same crc32 by coincidence.

Here's what I gathered so far, from a quick program:

33a : (crc32:0x0d752a37,filesize:5226496)
33b : (crc32:0x54b3ed5d,filesize:5230592)
33c : (crc32:0x9acba447,filesize:5255168)
33d : (crc32:0xcec9022c,filesize:5259264)

The only shortcoming here, I guess, is that checksum computation could be expensive, but given the DF binary's small size, and that I'm using an assembly implementation I got off codeproject, I think it is acceptable (the delay is not really noticeable to a human) if it is done only once at the entry point and once again at every re-entrance.
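
For anyone who wants to try the lookup without the assembly routine, here is a minimal sketch. It assumes the values above come from the standard CRC-32 (polynomial 0xEDB88320), which I have not checked against the codeproject code; `crc32_bytes`, `KnownVersion` and `identify_version` are names made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Standard reflected CRC-32 (polynomial 0xEDB88320), computed bit by bit.
// Slower than a table-driven or assembly version, but easy to check by eye.
// Assumption: this is the same CRC variant that produced the listed values.
std::uint32_t crc32_bytes(const unsigned char *data, std::size_t len) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

// One known binary: version name plus its (crc32, filesize) signature.
struct KnownVersion {
    const char *name;
    std::uint32_t crc;
    std::uint32_t filesize;
};

// Signatures copied from the list in this post.
static const KnownVersion kKnownVersions[] = {
    {"33a", 0x0d752a37u, 5226496u},
    {"33b", 0x54b3ed5du, 5230592u},
    {"33c", 0x9acba447u, 5255168u},
    {"33d", 0xcec9022cu, 5259264u},
};

// Returns the version name, or an empty string if the signature is unknown.
// Both the crc and the filesize must match.
std::string identify_version(std::uint32_t crc, std::uint32_t filesize) {
    for (const KnownVersion &v : kKnownVersions)
        if (v.crc == crc && v.filesize == filesize)
            return v.name;
    return std::string();
}
```

An empty result is the cue to print the signature so somebody can add it to the list.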

Any suggestions/criticisms/loop-holes?

edit:added values for 33a

[ December 10, 2007: Message edited by: sphr ]

0x517A5D

« Reply #1 on: December 10, 2007, 05:40:00 pm »

I have two issues with this method.

First, this requires a tool to track down, or be given, the full path to the executable in the filesystem.  That's problematic for a memory-tweak tool.  The only reliable way I can see (without doing some research) to go from memory image to filesystem path involves code injection using a helper DLL.  That's a huge pain for very little gain.

Second, if we go the easy cheezy way of requiring the user to give us the path, what happens if we're given a path to a different version?  Bad things.

So I don't like this.

Jifodus

« Reply #2 on: December 10, 2007, 07:38:00 pm »

If you're going to do CRC32...
Something like this would be much better.
code:

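// needs <windows.h>, <vector>, <cstring>; crc32() is whatever CRC routine you already have
const BYTE *base = (const BYTE*)0x00400000; // default image base
DWORD begin = 0, size = 0, crc = 0;
BYTE buf[4096] = {0}; // 1 page, enough for the PE headers
ReadProcessMemory(hdfprocess, base, buf, sizeof(buf), NULL);
IMAGE_DOS_HEADER *dos = (IMAGE_DOS_HEADER*)buf;
if (dos->e_magic == IMAGE_DOS_SIGNATURE) { // 'MZ'
 // more checking (e_lfanew in range, "PE\0\0" signature) to get to...
 IMAGE_NT_HEADERS32 *nt = (IMAGE_NT_HEADERS32*)(buf + dos->e_lfanew);
 IMAGE_SECTION_HEADER *section_info = IMAGE_FIRST_SECTION(nt);
 for (int i = 0; i < nt->FileHeader.NumberOfSections; i++)
 {
   // section names are 8 bytes, padded with trailing NULs
   if (memcmp(section_info[i].Name, ".text\0\0\0", 8) == 0) {
      begin = section_info[i].VirtualAddress;
      size = section_info[i].Misc.VirtualSize;
      break;
   }
 }
}
if (size != 0) { // found the .text section
 std::vector<BYTE> buf2(size);
 ReadProcessMemory(hdfprocess, base + begin, &buf2[0], size, NULL);
 crc = crc32(&buf2[0], size);
}


Of course, the real thing needs more checking than this, but I've written it once so I can copy-paste code from there. The 0x00400000 is the base address of the DF executable image; in theory, it should be there in all versions, unless Toady decides to get creative with the linker and change the base address.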

sphr

« Reply #3 on: December 10, 2007, 09:11:00 pm »

The "tool" I'm using to do it is the "tool" developed for DF itself.  Basically, I'm trying to see if it is feasible to do a tool library that allows optional external specification of key memory offsets (crc32 computation/checking is all embedded into the lib).  The tool just handles the logic and is itself version neutral.  If it finds any binary that it can't identify via the signature, it will output the signature to prompt people to look for/make new memory mapping data that it can take in, and also to add the signature and version string somewhere it can be loaded, so that next time the binary is recognized (periodically, I guess all that external data for older versions can be packed into a new version of the library to make it more convenient).
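
To make the "external specification" idea concrete, a minimal sketch of loading such a mapping. The file format here (one `name=0xADDRESS` pair per line) and the function name `parse_offset_map` are invented for illustration; nothing is settled about the real format yet.

```cpp
#include <cstdint>
#include <cstdlib>
#include <map>
#include <sstream>
#include <string>

// Hypothetical format: one "name=0xADDRESS" pair per line.
// Lines without '=' (blank lines, comments) are skipped.
std::map<std::string, std::uint32_t> parse_offset_map(const std::string &text) {
    std::map<std::string, std::uint32_t> offsets;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        std::string::size_type eq = line.find('=');
        if (eq == std::string::npos || eq == 0)
            continue;
        std::string name = line.substr(0, eq);
        // Base 0 lets strtoul accept both "0x..." hex and plain decimal.
        unsigned long value = std::strtoul(line.c_str() + eq + 1, nullptr, 0);
        offsets[name] = static_cast<std::uint32_t>(value);
    }
    return offsets;
}
```

The tool would keep one such map per recognized signature and look names up at run time instead of hard-coding addresses.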

As for getting the full path name and stuff... no manual specification.  If you do memory hacking in Windows and already use stuff like GetModuleBaseName, then I don't see any problem.  The companion GetModuleFileNameEx(A/W) will net you the full path.  You don't need a helper DLL; the OS handles that.  I've already done a proof-of-concept tool that automatically finds the DF process, gets its module, checks its signature and outputs the version if it is recognized, or outputs the signature itself if not (that's how I got the above crc32 values in the first place).

@Jifodus
Using the PE header has just one dependency: the CRC value has to be correctly set in the first place.  I have not verified it, but somebody in the wiki discussion mentioned that the CRC for the DF binaries is unset, meaning all 0, meaning it can't be used to differentiate between versions.  They DID mention reading the timestamp though, which could work in most cases unless somebody deliberately messes with the timestamps, or Toady's machine suffers a case of broken clock (which should have rendered the machine unusable in most cases).  The crc32 is the "sure kill" way, but that doesn't necessarily mean it is the "minimal" way.  Other cheaper methods (like the timestamp) could work; crc32 just adds a perhaps-not-so-needed robustness.

My main reason for starting this thread isn't to push for the use of my method.  Rather, it's to throw this into open discussion so that memory hackers know what methods there are out there to identify versions (before this, there was almost no info on it; everybody just does mysterious constructions in his/her own forge).  By having this discussion thread, and perhaps later documenting all the useful stuff in the wiki, hopefully we can fill the version-identifying gap and make life easier for tool makers in the future.  Also, hopefully, all tools will be somewhat backward compatible, instead of having to keep a separate binary for each version.
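
For the timestamp route, a minimal sketch of pulling both fields out of the first page of the image (or of the file). The offsets follow the PE/COFF layout; `PeStamp` and `read_pe_stamp` are names made up here, and I have not run this against the DF binaries.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reads a little-endian 32-bit value at the given offset.
static std::uint32_t read_u32(const unsigned char *p, std::size_t off) {
    return static_cast<std::uint32_t>(p[off]) |
           (static_cast<std::uint32_t>(p[off + 1]) << 8) |
           (static_cast<std::uint32_t>(p[off + 2]) << 16) |
           (static_cast<std::uint32_t>(p[off + 3]) << 24);
}

struct PeStamp {
    bool valid;
    std::uint32_t timestamp; // COFF TimeDateStamp (seconds since 1970)
    std::uint32_t checksum;  // optional-header CheckSum, 0 if the linker left it unset
};

PeStamp read_pe_stamp(const unsigned char *image, std::size_t len) {
    PeStamp out = {false, 0, 0};
    if (len < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return out;
    std::size_t pe = read_u32(image, 0x3C); // e_lfanew: offset of "PE\0\0"
    // Need the signature (4), the COFF header (20) and 68 bytes of optional header.
    if (pe > len || len - pe < 4 + 20 + 68)
        return out;
    if (std::memcmp(image + pe, "PE\0\0", 4) != 0)
        return out;
    out.timestamp = read_u32(image, pe + 8);      // 4 (signature) + 4 into the COFF header
    out.checksum = read_u32(image, pe + 24 + 64); // CheckSum is at offset 64 of the optional header
    out.valid = true;
    return out;
}
```

If the checksum really is 0 in every DF build, only the timestamp is of any use from this.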

I'll see if I can make my testing program do something actually useful and release it, as a beta tool, to showcase crc32 versioning. Just not sure whether I want to wait until I put in the optional external memory maps.  Currently, although the logic of the tool is version independent, only the mapping for 33b, which I am using for testing, is internally added.  I guess I may have to either add in the mappings for later versions, or I add in the external maps so that people can specify new mappings via files before it will actually be tested.  Currently, when I am playing in between coding, I use the tool to list down dwarves who are unhappy or injured (show most serious injury level).  There is also a list all creature mode which is used more for exploring values in the creature structure.

edit: Btw, I'm using this guy's assembly implementation for the crc32 computation.

[ December 10, 2007: Message edited by: sphr ]

Jifodus

« Reply #4 on: December 10, 2007, 10:35:00 pm »

My code doesn't use the PE header, it uses the ".text" section. I just use the PE header to find the ".text" section. I did say that my code isn't very clear. But I tried to make it clear that I was using the ".text" section.

To use GetModuleFileNameEx(A/W), the psapi call that works on another process, the user has to have at least Windows 2000. Though I guess the number of people who don't use Windows 2000/XP/2003/Vista/2008 is very small, and they can probably get access to one of these.

Oh, sphr, do you ever go on IRC? Or check PMs?

sphr

« Reply #5 on: December 11, 2007, 12:26:00 am »

PM checking not that often coz this board is somewhat lacking in certain convenient features (like new PM indicator?)

As for IRC, haven't done that in ages coz it is very ... unplanned (sit around in a chatroom and hope that people you want to meet appear?).  but if you've got a good channel, I might drop by.

bartavelle

« Reply #6 on: December 11, 2007, 03:51:00 am »

#bay12games, irc.worldirc.com

I think the CRC is the most robust method. It would however be totally cool to find the offsets automagically, for example by using topological methods "à la" Halvar Flake. I did preliminary research on that field with no luck ...

sphr

« Reply #7 on: December 11, 2007, 04:59:00 am »

lol... topological means is only possible and feasible if coherence is guaranteed.  if, say, toady does something like rearranging the order of his variables for better reading......

Anyway, I've put together a tiny tool that can be made to list all dwarves/creatures which are wounded or unhappy.  Some options are changeable by parameters... (run with the "-help" parameter to see the inline help message)

I'd like to call it alpha, but it's actually less than that... it's just a program I wrote to test some concepts.

sample output

code:

total number of creatures = 324

kol "Forger.1", Armorer                (r:A6) aliv (h:209)(hlth: 8)
tosid "AxeMaster", Champion            (r:A6) aliv (h:486)(hlth:14)
èrith, Baby                            (r:A6) aliv (h: 78)

total number of creatures shown = 3


download at rapidshare

btw, I'll probably take a closer look at possibly using xml for external specification of memory offset mappings.  Have any ideas on that?

bartavelle

« Reply #8 on: December 11, 2007, 08:27:00 am »

Well, the whole concept of topological based tools is to be resistant to local changes, such as variable reordering, optimization flags changing ...

bartavelle

« Reply #9 on: December 11, 2007, 12:39:00 pm »

I wrote a POC script that gets the creature_vector_loc by:
* reading the dwarfort.exe file
* locating the string ' sculpture has melted!'
* find the only place where there is a 'push &" sculpture has melted"'
* disassemble back from this address
* count 5 "call" instructions
* the next instruction with an offset should be it!

Works with .33[abcde], and should keep working until a "call" is added between the two instructions, or the only conditional jump between them is reversed (cross fingers!).
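
The first three steps can be sketched without a disassembler. This is not the POC script itself, just a minimal illustration: it assumes the buffer is the image as mapped in memory (so offset + base = virtual address; a raw file would need file offsets translated to RVAs first) and that the reference is a plain `push imm32` (opcode 0x68). The function names are made up. Walking back and counting "call" instructions still needs a real disassembler.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Returns the offset of the first occurrence of needle at or after from, or -1.
long find_bytes(const std::vector<unsigned char> &image,
                const unsigned char *needle, std::size_t n, std::size_t from = 0) {
    if (n == 0 || image.size() < n)
        return -1;
    for (std::size_t i = from; i + n <= image.size(); ++i)
        if (std::memcmp(&image[i], needle, n) == 0)
            return static_cast<long>(i);
    return -1;
}

// Finds every "push imm32" whose operand is the address of the string text.
// image_base is the virtual address that image[0] corresponds to (assumption:
// the buffer is laid out as it is in memory).
std::vector<std::size_t> find_push_of_string(const std::vector<unsigned char> &image,
                                             std::uint32_t image_base, const char *text) {
    std::vector<std::size_t> hits;
    long str_off = find_bytes(image, reinterpret_cast<const unsigned char *>(text),
                              std::strlen(text));
    if (str_off < 0)
        return hits;
    std::uint32_t va = image_base + static_cast<std::uint32_t>(str_off);
    // 0x68 followed by the little-endian address of the string.
    unsigned char pattern[5] = {0x68,
                                static_cast<unsigned char>(va),
                                static_cast<unsigned char>(va >> 8),
                                static_cast<unsigned char>(va >> 16),
                                static_cast<unsigned char>(va >> 24)};
    for (long at = find_bytes(image, pattern, 5); at >= 0;
         at = find_bytes(image, pattern, 5, static_cast<std::size_t>(at) + 1))
        hits.push_back(static_cast<std::size_t>(at));
    return hits;
}
```

If this returns exactly one hit, that is the "only place" from the third step; more than one means the anchor string is no longer unique enough.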

A more robust implementation could be performed by graphing first all function calls, then identifying suitable functions and finally graphing them, finding nodes and offsets.

Yeah, just like bindiff :/

sphr

« Reply #10 on: December 11, 2007, 09:27:00 pm »

LOL. Haven't used bindiff (must be "new") since I haven't really had the need or opportunity to do a lot of hacking (in a long long time.  There was this thing I remember, a prehistoric creature known as segmented memory.....).  Think I should get my paws on it asap to check it out.  Used its text cousin Windiff though.  Diffing tools are truly indispensable (along with caffeine, internet, "coding music", "idea" board, caffeine, "coding slippers", did I mention caffeine?).  It belongs right up there with stuff like Visual Assist X (that made ahem..  *cough* buggy *cough* VS2005 actually usable after you kill intellisense) and other similar stuff that just makes it almost impossible to believe that a long time ago, I actually managed to get by without them  :)  One of my design philosophies (for systems) has always been "If the choice comes between making a human do something or making the machine do something, if the machine can do it, pick the machine."  Even if it just identifies and lists out "suspicious" regions so that they can be human-inspected later, it will serve greatly to make mapping a new binary easier and faster than a hacker + memory tool + calculator combo.  Can't match the latter combo in flexibility though.. LOL

BTW, was wondering if there is any memory tool that shows possible memory indirection in a more graphical form?  What I wish I had:
Say I select a region of memory to be "analyzed".  For every dword-aligned 4-byte value that looks like an address, the tool creates a "suspicious link" and shows it (graphically is best, but text will do).  Then, through inspection, the user can either mark each link as confirmed, discard it as a dummy, or leave it as suspicious for future discovery.  It would help greatly, since I still belong to the human + memory tool + calculator generation.
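
The scanning half of that wish is simple enough to sketch (the graphical half is the hard part). A minimal version, assuming a 32-bit little-endian target and taking "looks like an address" to mean "falls inside a range you pick"; `SuspiciousLink` and `find_suspicious_links` are names invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct SuspiciousLink {
    std::uint32_t from; // address of the 4-byte slot
    std::uint32_t to;   // value stored there, which looks like an address
};

// region holds bytes copied from the target process starting at region_base.
// A value "looks like an address" here if it falls inside [lo, hi).
std::vector<SuspiciousLink> find_suspicious_links(const std::vector<unsigned char> &region,
                                                  std::uint32_t region_base,
                                                  std::uint32_t lo, std::uint32_t hi) {
    std::vector<SuspiciousLink> links;
    // Start at the first offset whose address is 4-byte aligned.
    std::size_t start = (4u - (region_base & 3u)) & 3u;
    for (std::size_t off = start; off + 4 <= region.size(); off += 4) {
        std::uint32_t value = static_cast<std::uint32_t>(region[off]) |
                              (static_cast<std::uint32_t>(region[off + 1]) << 8) |
                              (static_cast<std::uint32_t>(region[off + 2]) << 16) |
                              (static_cast<std::uint32_t>(region[off + 3]) << 24);
        if (value >= lo && value < hi) {
            SuspiciousLink link = {region_base + static_cast<std::uint32_t>(off), value};
            links.push_back(link);
        }
    }
    return links;
}
```

Every hit starts out "suspicious"; the confirmed/dummy marking would be state kept by the tool on top of this list.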

Jifodus

« Reply #11 on: December 12, 2007, 12:15:00 am »

On the subject (sort of), we do know that the VS.net 2005 runtime library is statically linked with Dwarf Fortress. We also know that the library is "libcmt.lib". We can automatically identify and locate all the CRT functions, which I think could be a major benefit.
code:
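// Run malloc() inside DF on a remote thread; its return value becomes the thread's exit code.
// Only valid for a 32-bit target, since the exit code is a DWORD.
void *malloc_ep = MemoryMap->getAddress("malloc");
DWORD threadID;
HANDLE allocatorThread = CreateRemoteThread(dfprocess, NULL, 0,
    (LPTHREAD_START_ROUTINE)malloc_ep, (void*)amount_to_alloc, 0, &threadID);
if (allocatorThread == NULL)
 return NULL; // thread could not be created
if (WaitForSingleObject(allocatorThread, INFINITE) != WAIT_OBJECT_0) {
 CloseHandle(allocatorThread);
 return NULL;
}
DWORD addr = 0;
GetExitCodeThread(allocatorThread, &addr);
CloseHandle(allocatorThread);
return (void*)addr;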

code:
// Same trick with free(): the remote pointer is passed as the thread argument.
void *free_ep = MemoryMap->getAddress("free");
DWORD threadID;
HANDLE allocatorThread = CreateRemoteThread(dfprocess, NULL, 0,
    (LPTHREAD_START_ROUTINE)free_ep, remote_buffer, 0, &threadID);
if (allocatorThread == NULL)
 return; // thread could not be created
WaitForSingleObject(allocatorThread, INFINITE);
CloseHandle(allocatorThread);

bartavelle

« Reply #12 on: December 12, 2007, 02:21:00 am »

You might want to check IDA, if you have some money to invest, or can, like me, get a licence bought by your company:

You can define structures and declare them into the process memory, here the main creature vector : when the mouse goes over a suspected offset, it shows what it points to.

You can also declare long structures, such as the creature structure.

Finally, for some more money you can get the Hex-Rays plugin that produces quite good C code from assembly : a function that seems to be used to check whether a creature is "playable" (usually, it's not that spectacular  :).

It has TONS of other great features. IDA is just the best disassembler, and a must have tool for serious reverse engineering.

sphr

« Reply #13 on: December 12, 2007, 04:26:00 am »

Thanks! I'll search for more info.  But getting the company to pay for it is highly unlikely in my case.. LOL... they'll splurge on stuff like the latest 3D Studio Max and plugins or something, but how on earth am I going to justify purchasing a disassembler... LOL

0x517A5D

« Reply #14 on: December 12, 2007, 09:06:00 pm »

quote:
Originally posted by bartavelle:
It would however be totally cool to find the offsets automagically, for example by using topological methods "à la" Halvar Flake. I did preliminary research on that field with no luck ...

I did that with my enable magma buildings utility. http://www.dwarffortresswiki.net/index.php/Utilities#Enable_Magma_Buildings

Source is included in the zipfile, and I've also got sample search code on my wiki page. http://www.dwarffortresswiki.net/index.php/User:0x517A5D

I'd be happy to help you figure out how to use the code, or to find reliable search patterns.
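
To give an idea of what a search pattern is (this is not the code from the zipfile, just a minimal sketch with made-up names): a list of bytes where the parts that change between versions, such as call targets and absolute addresses, are wildcarded.

```cpp
#include <cstddef>
#include <vector>

// Pattern entries are byte values 0..255, or -1 for "any byte".
// Returns the offset of the first match, or -1 if there is none.
long find_pattern(const std::vector<unsigned char> &image, const std::vector<int> &pattern) {
    if (pattern.empty() || image.size() < pattern.size())
        return -1;
    for (std::size_t i = 0; i + pattern.size() <= image.size(); ++i) {
        bool match = true;
        for (std::size_t j = 0; j < pattern.size(); ++j) {
            // A negative entry is a wildcard and matches anything.
            if (pattern[j] >= 0 && image[i + j] != pattern[j]) {
                match = false;
                break;
            }
        }
        if (match)
            return static_cast<long>(i);
    }
    return -1;
}
```

A pattern is "reliable" when it matches exactly once in every version you care about, so it is worth checking for a second hit as well.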

I also recommend IDA Pro.  (I am a licensed user through my day job.)  It is sweet sweet.  Invaluable for this kind of thing.
