Gotta love SymInitialize()

Yesterday morning I started playing with a complete rewrite of mod_whatkilledus and mod_backtrace. I have a handful of goals for the rewrite, but the largest one is to have the functionality available on the Windows platform.

I started by trying to get a backtrace for a crashing module on Windows using the DbgHelp library. It didn’t take long to get something minimal working with my MinGW/MSYS build of httpd trunk:

Wed Aug 22 09:17:07 2012
mod_wbt report:
Exception code:    EXCEPTION_ACCESS_VIOLATION
Exception address: 0043C47C
0043C47C
0040C6E8
0040CA5D
00442913
00442A27
0043F57C
00427A28
0046414C
76CC339A
779D9EF2
779D9EC5

Of course, not many people would find a bare list of addresses helpful. An hour or so later I got another chance to play, and with a few more changes I had this output:

Wed Aug 22 10:50:45 2012
mod_wbt report:
Exception code:    EXCEPTION_ACCESS_VIOLATION
Exception address: 0043C56C
  43C56C Return address
Symbol: apreq_parse_headers
  40C6E8 Return address
Symbol: ap_run_handler
  40CA5D Return address
Symbol: ap_invoke_handler
  442A03 Return address
Symbol: ap_process_async_request
  442B17 Return address
Symbol: ap_process_request
  43F66C Return address
Symbol: apreq_parse_headers
  427A28 Return address
Symbol: ap_run_process_connection
  46423C Return address
Symbol: mpm_signal_service
  76CC339A Return address
Symbol: BaseThreadInitThunk
  779D9EF2 Return address
Symbol: RtlInitializeExceptionChain
  779D9EC5 Return address
Symbol: RtlInitializeExceptionChain

Some of the symbols are hosed, but it was starting to look meaningful.

I got back to work on the Windows side this evening. The result of my first test of the same crash with a Visual Studio build of httpd was pretty disappointing:

Thu Aug 23 18:14:51 2012
Exception code:    EXCEPTION_ACCESS_VIOLATION
Exception address: 714B14E2
  no-symbol [0x714B14E2]
  ap_run_handler [0x6FF02115]
  ap_invoke_handler [0x6FF0257C]
  ap_die [0x6FF0F7FC]

Some of that is actually correct, but there’s not much information available. This test was with an httpd 2.4.3 community build created with Visual Studio 10, and no .pdb files were present. I switched to an old build of 2.2.15-dev (What tools did I have back then???), which had .pdb files for everything, and got a longer but just as screwy backtrace:

Thu Aug 23 19:53:53 2012
Exception code:    EXCEPTION_ACCESS_VIOLATION
Exception address: 714B10AE
SymInitialize() failed with error 87
  no-symbol [0x714B10AE]
  no-symbol [0x714B106D]
  ap_run_handler [0x6FF02115]
  ap_invoke_handler [0x6FF0257C]
  ap_die [0x6FF0F7FC]

I had noticed by this point that SymInitialize() was failing. The only time I could get it to “succeed” was with FALSE for the fInvadeProcess parameter, and that yielded this lovely backtrace:

Thu Aug 23 19:57:00 2012
Exception code:    EXCEPTION_ACCESS_VIOLATION
Exception address: 714B10AE
  no-symbol [0x714B10AE]
  no-symbol [0x714B106D]
  no-symbol [0x6FF02115]
  no-symbol [0x29800F8]
  no-symbol [0x29800F8]
  no-symbol [0x29800F8]
  no-symbol [0x6EECCCC0]
  no-symbol [0x6FF0257C]
  no-symbol [0x29800F8]
  no-symbol [0x29800F8]
  no-symbol [0x29800F8]

(At least it didn’t lie.)

The trick I finally found was to override the default search path, even though I understand from the documentation that it should find the symbol files anyway:

if (!SymInitialize(process,
                   "C:\\Apache22\\bin;C:\\Apache22\\modules",
                   TRUE)) {
    fprintf(log, "SymInitialize() failed with error %lu\n",
            (unsigned long)GetLastError());
}
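
Hard-coding the install directory is obviously just for the experiment. A minimal sketch of building the same semicolon-separated search path from a server root might look like the following. The helper name build_symbol_search_path and the assumption that symbols live under bin and modules are mine for illustration; neither comes from httpd or DbgHelp. It is meant to run once at startup, not inside the crash handler.

```c
#include <stdio.h>

/* Hypothetical helper (not part of httpd or DbgHelp).
 * Writes "<root>\bin;<root>\modules" into buf, the same shape as the
 * search path passed to SymInitialize() above.
 * Returns 0 on success, -1 if buf is too small or formatting fails. */
static int build_symbol_search_path(const char *root,
                                    char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "%s\\bin;%s\\modules", root, root);

    /* snprintf reports the length it wanted to write; anything that
     * does not fit (including the terminating NUL) is an error here. */
    if (n < 0 || (size_t)n >= buflen) {
        return -1;
    }
    return 0;
}
```

With a root of C:\Apache22 this produces the string used in the call above, and the resulting buffer could be passed as the second argument to SymInitialize().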

(And no, I don’t intend to use stdio permanently in this context.)
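
For what it's worth, formatting one report line without stdio is not much code. Here is a rough sketch, assuming a caller-supplied buffer that would then be handed to a low-level write call. The name format_frame is hypothetical and this is not the code in the module; it only reproduces the "  symbol [0xADDRESS]" shape of the reports in this post.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes "  <name> [0x<ADDR>]\n" into buf without
 * using stdio. A NULL name is reported as "no-symbol".
 * Returns the number of bytes written (not counting the terminating
 * NUL), or 0 if buf is too small. */
static size_t format_frame(char *buf, size_t buflen,
                           const char *name, uintptr_t addr)
{
    static const char digits[] = "0123456789ABCDEF";
    char hex[2 * sizeof(uintptr_t)];
    size_t nhex = 0, namelen, need, pos = 0;

    if (name == NULL) {
        name = "no-symbol";
    }
    namelen = strlen(name);

    /* Collect hex digits, least significant first. */
    do {
        hex[nhex++] = digits[addr & 0xF];
        addr >>= 4;
    } while (addr != 0);

    /* "  " + name + " [0x" + hex + "]\n", plus one byte for the NUL. */
    need = 2 + namelen + 4 + nhex + 2;
    if (need + 1 > buflen) {
        return 0;
    }

    buf[pos++] = ' ';
    buf[pos++] = ' ';
    memcpy(buf + pos, name, namelen);
    pos += namelen;
    memcpy(buf + pos, " [0x", 4);
    pos += 4;
    while (nhex > 0) {
        buf[pos++] = hex[--nhex];   /* emit most significant digit first */
    }
    buf[pos++] = ']';
    buf[pos++] = '\n';
    buf[pos] = '\0';
    return pos;
}
```

The returned length is what a raw write to the log handle would need, so no further strlen() is required at crash time.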

That yields the following:

Thu Aug 23 19:59:46 2012
Exception code:    EXCEPTION_ACCESS_VIOLATION
Exception address: 714B10AE
SymInitialize() failed with error 87
  no-symbol [0x714B10AE]
  no-symbol [0x714B106D]
  ap_run_handler [0x6FF02115]
  ap_invoke_handler [0x6FF0257C]
  ap_process_request [0x6FF0F7FC]
  ap_process_http_connection [0x6FF0AA82]
  ap_run_process_connection [0x6FF04F25]
  ap_process_connection [0x6FF051B2]
  worker_main [0x6FF1E130]
  endthreadex [0x74F23433]
  endthreadex [0x74F234C7]
  BaseThreadInitThunk [0x76CC339A]
  RtlInitializeExceptionChain [0x779D9EF2]
  RtlInitializeExceptionChain [0x779D9EC5]

(And yes, SymInitialize() still returns an invalid parameter error.)

There’s still some bogosity (the C runtime frames need symbols), and for some reason it doesn’t grok the symbols in my DSO (which has a .pdb file), but this is pretty good.