The underlying event system that I'm using to get all the data has some real problems. It fails to create some attack events, and in some cases it incorrectly repeats the previous attack event in place of the new one.
These are the biggest problems. In my sniffer Lua script, I just dump the information given by the event system (eventful/EventManager).
It is somewhat disappointing to know that no matter how much polish and hard work I put in here, I will not have a completely accurate tool.
The other big annoyance is that death strikes don't get wound information in the attack callback. My guess is that the game doesn't bother fully committing wounds to a body that has died this frame.
Regardless of these deficiencies, I should be able to give you a tool that is correct about 90% of the time. The longer the recorded battle, the more the errors accumulate.
I had to write a heuristic system to tie the combat report text to the actual strike events, since they aren't directly related in the Lua environment. This system works by scanning the combat log from beginning to end and removing entries that are found to be matches (each report line can only be used once). So if an event gets dropped or duplicated, it can throw off the entire stream of future events for that (attacker, defender) pair.
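To make the failure mode concrete, here is a minimal sketch in Python (not the actual Lua script). It assumes the simplest possible version of the heuristic: events and report lines are matched purely by (attacker, defender) pair and by order. The function name, the tuple layouts, and the unit names are all made up for illustration; the real matcher may well look at more than this.

```python
from collections import defaultdict, deque


def match_events_to_reports(events, reports):
    """Pair each strike event with the next unused report line for its pair.

    events  -- list of (attacker, defender) tuples, in callback order
    reports -- list of (attacker, defender, text) tuples, in combat-log order
    Returns one report text (or None if none is left) per event.
    Hypothetical data layout, assumed only for this sketch.
    """
    # Queue the report lines per (attacker, defender) pair, keeping log order.
    pending = defaultdict(deque)
    for attacker, defender, text in reports:
        pending[(attacker, defender)].append(text)

    matched = []
    for attacker, defender in events:
        queue = pending[(attacker, defender)]
        # Consume the oldest unused line so each line is used at most once.
        matched.append(queue.popleft() if queue else None)
    return matched


reports = [
    ("urist", "goblin", "slash"),
    ("urist", "goblin", "stab"),
    ("urist", "goblin", "punch"),
]

# All three events arrive: every strike gets its own report line.
print(match_events_to_reports([("urist", "goblin")] * 3, reports))

# The event for the "stab" is dropped: the event that was really the
# punch is now paired with the "stab" line, and everything after it
# for this pair stays shifted by one.
print(match_events_to_reports([("urist", "goblin")] * 2, reports))
```

In the second call the matcher has no way to know that an event went missing, so it hands out "slash" and "stab" even though the two strikes it saw were the slash and the punch. Other (attacker, defender) pairs are unaffected, because each pair has its own queue.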
I have split the log file up into sessions that are easy to control via the DFHack command line. Usage will involve pausing and starting/stopping a session around the combat actions that you want to analyze. Session scoping lets you still keep lots of data in one file, but separated out, so that these event system errors stay localized to the session in which they occur.
It is still very usable.