Yesterday morning I started playing with a complete rewrite of mod_whatkilledus and mod_backtrace. I have a handful of goals for the rewrite, but the largest one is to have the functionality available on the Windows platform.
I started by trying to get a backtrace for a crashing module on Windows using the DbgHelp library. It didn’t take long to get something minimal working with my MinGW/MSYS build of httpd trunk:
Wed Aug 22 09:17:07 2012 mod_wbt report:
Exception code: EXCEPTION_ACCESS_VIOLATION
Exception address: 0043C47C
0043C47C
0040C6E8
0040CA5D
00442913
00442A27
0043F57C
00427A28
0046414C
76CC339A
779D9EF2
779D9EC5
Of course, not many people would find that helpful. An hour or so later I got another chance to play and with a few more changes I had this output:
Wed Aug 22 10:50:45 2012 mod_wbt report:
Exception code: EXCEPTION_ACCESS_VIOLATION
Exception address: 0043C56C
43C56C Return address Symbol: apreq_parse_headers
40C6E8 Return address Symbol: ap_run_handler
40CA5D Return address Symbol: ap_invoke_handler
442A03 Return address Symbol: ap_process_async_request
442B17 Return address Symbol: ap_process_request
43F66C Return address Symbol: apreq_parse_headers
427A28 Return address Symbol: ap_run_process_connection
46423C Return address Symbol: mpm_signal_service
76CC339A Return address Symbol: BaseThreadInitThunk
779D9EF2 Return address Symbol: RtlInitializeExceptionChain
779D9EC5 Return address Symbol: RtlInitializeExceptionChain
Some of the symbols are hosed, but it was starting to look meaningful.
I got back to work on the Windows side this evening. The result of my first test of the same crash with a Visual Studio build of httpd was pretty disappointing:
Thu Aug 23 18:14:51 2012
Exception code: EXCEPTION_ACCESS_VIOLATION
Exception address: 714B14E2
no-symbol [0x714B14E2]
ap_run_handler [0x6FF02115]
ap_invoke_handler [0x6FF0257C]
ap_die [0x6FF0F7FC]
Some of that is actually correct, but there’s not much information available. This test was with an httpd 2.4.3 community build created with Visual Studio 10, and no .pdb files were present. I switched to an old build of 2.2.15-dev (What tools did I have back then???), which had .pdb files for everything, and got a longer but just as screwy backtrace:
Thu Aug 23 19:53:53 2012
Exception code: EXCEPTION_ACCESS_VIOLATION
Exception address: 714B10AE
SymInitialize() failed with error 87
no-symbol [0x714B10AE]
no-symbol [0x714B106D]
ap_run_handler [0x6FF02115]
ap_invoke_handler [0x6FF0257C]
ap_die [0x6FF0F7FC]
I had noticed by this point that SymInitialize() was failing. The only time I could get it to “succeed” was with FALSE for the fInvadeProcess parameter, and that yielded this lovely backtrace:
Thu Aug 23 19:57:00 2012
Exception code: EXCEPTION_ACCESS_VIOLATION
Exception address: 714B10AE
no-symbol [0x714B10AE]
no-symbol [0x714B106D]
no-symbol [0x6FF02115]
no-symbol [0x29800F8]
no-symbol [0x29800F8]
no-symbol [0x29800F8]
no-symbol [0x6EECCCC0]
no-symbol [0x6FF0257C]
no-symbol [0x29800F8]
no-symbol [0x29800F8]
no-symbol [0x29800F8]
(At least it didn’t lie.)
The trick I finally found was to override the default search path, even though I understand from the documentation that it should find the symbol files anyway:
    if (!SymInitialize(process,
                       "C:\\Apache22\\bin;C:\\Apache22\\modules",
                       TRUE)) {
        fprintf(log, "SymInitialize() failed with error %lu\n",
                (unsigned long)GetLastError());
    }
(And no, I don’t intend to use stdio permanently in this context.)
That yields the following:
Thu Aug 23 19:59:46 2012
Exception code: EXCEPTION_ACCESS_VIOLATION
Exception address: 714B10AE
SymInitialize() failed with error 87
no-symbol [0x714B10AE]
no-symbol [0x714B106D]
ap_run_handler [0x6FF02115]
ap_invoke_handler [0x6FF0257C]
ap_process_request [0x6FF0F7FC]
ap_process_http_connection [0x6FF0AA82]
ap_run_process_connection [0x6FF04F25]
ap_process_connection [0x6FF051B2]
worker_main [0x6FF1E130]
endthreadex [0x74F23433]
endthreadex [0x74F234C7]
BaseThreadInitThunk [0x76CC339A]
RtlInitializeExceptionChain [0x779D9EF2]
RtlInitializeExceptionChain [0x779D9EC5]
(And yes, SymInitialize() still returns an invalid parameter error.)
There’s still some bogosity (libc symbols needed), and for some reason it doesn’t grok the symbols in my DSO (which has a .pdb file), but this is pretty good.
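The search path passed to SymInitialize() above is hardcoded to one install location. A minimal sketch of building the same "bin;modules" string from an install root instead might look like the following. The helper name build_sym_search_path() and the idea of deriving both directories from a single root are my assumptions for illustration, not something the module does today, and a real install could keep its modules elsewhere.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of mod_backtrace or DbgHelp): formats a
 * semicolon-separated symbol search path "<root>\bin;<root>\modules".
 * Assumes both directories live directly under server_root.
 * Returns 0 on success, -1 if buf is too small or formatting fails. */
static int build_sym_search_path(const char *server_root,
                                 char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "%s\\bin;%s\\modules",
                     server_root, server_root);

    /* snprintf reports the length it wanted; treat truncation as failure
     * so a clipped path is never handed to the symbol handler. */
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With a root of "C:\\Apache22" this produces the same string used in the call above, and the resulting buffer would be passed as the second argument to SymInitialize().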