home  
technical articles
 
work

Another Tale of Real World Debugging

Fixing Memory Leaks, Problems with debugging error reporting in VS.Net ...

Roshan James

This article is a tale of how one guy decided to clean his code of memory leaks and the problems he ran into as a consequence. This will show you how to enable memory leak detection in your old c++ code, how to ensure that the debugger reports leaks in the correct locations in your source code rather than in inbuilt header files and finally a case of how the c++ compiler and studio can go out of sync with each other in the way they report errors on line numbers.


(This is the first: A Tale of Real World Debugging - Matt Pietrek 
http://msdn.microsoft.com/msdnmag/issues/0900/hood/
)


    Today I had some time on my hands so I thought I'd cleanup some of my old code. I had this large block of old code on my hands that was rather tightly written. It however, leaked memory. Rewriting a few thousand lines would be a lot of work but more importantly - it was a hard piece of algo to redo.

Enabling Memory Leak Detection:

    So the best thing I could do was to clean up this code. After mucking around a bit I figured that if I add the following to my code I could enable memory leak detection. I could get memory leak information with Visual Studio when I ran the program under the debugger.

 
       	
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>

This has to be added before any of my other includes. 
I also needed to add the call:

 

       	
_CrtDumpMemoryLeaks();


at the place where my application exits (or at the point where I wanted to examine the memory - this would typically be when the application exits).

So very cheerfully I wrote up a test program. This is generally a nice thing to do, you write a little program to test any new feature you want to use and avoid mixing any issues of the new feature with any existing problems you may have in the existing mass of code where you want to apply the new feature.

So my sample cpp file looked like this:

       	
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>

struct teststruct
{
	int a;
};

int main(int argc, char* argv[])
{
	int * a = new int[1];
	int * a1 = new int[5];
	teststruct * aa = new teststruct;

	_CrtDumpMemoryLeaks();
	return 0;
}

Now when the program is run under the debugger I would see a list of leaked memory locations. I was pleased to see 3 memory blocks shown as leaked.

The Issue with Memory Leak Detection:

But this is where the trouble started. 
MSDN promises you debug information about the leaked memory objects that look like this:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vsdebug/html/vxconenablingmemoryleakdetection.asp

 

       	
Detected memory leaks!
Dumping objects ->
C:\PROGRAM FILES\VISUAL STUDIO\MyProjects\leaktest\leaktest.cpp(20) : {18} 
normal block at 0x00780E80, 64 bytes long.
 Data: <                > CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD
Object dump complete.

But I was getting leak information that looked like so :

       	
'memleaks.exe': Loaded 'D:\vsprojs\memleaks\Debug\memleaks.exe', Symbols loaded.
'memleaks.exe': Loaded 'C:\WINNT\system32\NTDLL.DLL', Cannot find or open a required DBG file.
'memleaks.exe': Loaded 'C:\WINNT\system32\KERNEL32.DLL', Cannot find or open a required DBG file.
Detected memory leaks!
Dumping objects ->
d:\program files\microsoft visual studio .net\vc7\include\crtdbg.h(674) : {48} normal block at 0x002F2C78, 4 bytes long.
 Data: <    > CD CD CD CD 
d:\program files\microsoft visual studio .net\vc7\include\crtdbg.h(677) : {47} normal block at 0x002F2C38, 4 bytes long.
 Data: <    > CD CD CD CD 
d:\program files\microsoft visual studio .net\vc7\include\crtdbg.h(677) : {46} normal block at 0x002F2BF8, 4 bytes long.
 Data: <    > CD CD CD CD 
Object dump complete.
The program '[2520] memleaks.exe: Native' has exited with code 0 (0x0).

If you notice, in the MSDN sample the point of the leak is shown as the cpp where the allocation took place. In my test file the point of the leak id always shown as crtdbg.h!

    Now while that seemed a little unfair, there was nothing I could think of to do, so I decided to live with it. Now the leak detection mechanism indicates each leak with a number in curly braces {}. So I could set a break point on the nth allocation by adding a watch on the variable _crtBreakAlloc with value as the emeory allocation value given in {}. This is described here:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vsdebug/html/vxconSettingBreakpointOnMemoryAllocationNumber.asp

    So I set my break point and started debugging. The break would, as a rule, be caught in file dbgheap.c and not in my source code. While this looked a little confusing, in minute I figured out that if I clicked 'Step Out' (Shift F11 in VS.Net) enough times it would finally return to the line of code in my program where I had used my 'new' operator.

    Fair enough, it doesn't give everything me, but it gives me the list of the memory leaks and a number indicating the allocation. I could set a breakpoint using that number and whenever my code executed that allocation I could step through it with a few 'Step Out's to locate the place where I had made my allocation. In short I could debug memory leaks.

    While this would not directly fix my problem, which is that my program leaked memory, if I could understand the locations and conditions under which I fail to de-allocate memory I could start fixing the issue. Quite happy with the exploits, I started off. But I was in for bad weather.

A Lost Debugger and Phantom code:

    I fixed an obvious de-allocation that I had missed and moved onto the next one when my luck ran out on me. When I hit a break point and step out into my source code I reached this file called lang.c that looked a little like this:

       	
//#line 1480 "lang.c"
        break;
      case 17:
        YYTRACE("statementlist ::=")
//#line 166 "lang.lem"		<-- the break point was pointing here
{
	yygotominor.yy52=0;
}

!!! Now this is rather surprising as my the debugger was pointing to location of the source that was commented. It was expected to point to the line of code that allocated the memory... something was wrong here.

    Now this lang.c see had a nice history of issues during the development of the source code. This file lang.c is a parser. It is generated by a parser generator tool called 'lemon' that I had used for generating the parser. The parser's input file was called lang.lem so you would see #line directives that would direct the compiler to show me errors in lang.lem instead of lang.c. Now due to some other problems I had I ended up commenting out all the #line directives. As a consequence lang.c would behave like any other c file indicating a compiler errors within itself rather than in lang.lem.
Lemon: http://www.hwaci.com/sw/lemon/

But why was the debugger pointing to a line that was an invalid piece of source code while I was stepping through it?

    As I step though the code I could see the debugger step cheerfully over break statements closing } etc and continue as though none of them existed. It was as though it was debugging some code that wasn't there and it was casually ignoring all the code that was there. Some kind of phantom code …

    This really didn't help my cause at all and try as I might I could not figure out what was causing the problem. Now there were memory allocations both above and below this line of code (where the debugger was pointing), so how could I know which allocation was the troublesome one?

    Quickly giving up with convincing the debugger to show me where the problem was, I decided to revert to trying to coax Visual Studio to indicate the correct line number and file name instead of always showing crtdbg.h. Since MSDN showed me that that kind of output is possible, I decided to give it a serious try.

Fixing the Memory Leak Message:

After some looking around you realize what is happening and how VS generates the messages 


'memleaks.exe': Loaded 'D:\vsprojs\memleaks\Debug\memleaks.exe', Symbols loaded.
'memleaks.exe': Loaded 'C:\WINNT\system32\NTDLL.DLL', Cannot find or open a required DBG file.
'memleaks.exe': Loaded 'C:\WINNT\system32\KERNEL32.DLL', Cannot find or open a required DBG file.
Detected memory leaks!
Dumping objects ->
d:\program files\microsoft visual studio .net\vc7\include\crtdbg.h(674) : {48} normal block at 0x002F2C78, 4 bytes long.
 Data: <    > CD CD CD CD 
d:\program files\microsoft visual studio .net\vc7\include\crtdbg.h(677) : {47} normal block at 0x002F2C38, 4 bytes long.
 Data: <    > CD CD CD CD 
d:\program files\microsoft visual studio .net\vc7\include\crtdbg.h(677) : {46} normal block at 0x002F2BF8, 4 bytes long.
 Data: <    > CD CD CD CD 
Object dump complete.
The program '[2520] memleaks.exe: Native' has exited with code 0 (0x0).

The file crtdbg.h looks like this 

       	
#ifdef _CRTDBG_MAP_ALLOC

inline void * __cdecl operator new(size_t s)  line 674
        { return ::operator new(s, _NORMAL_BLOCK, __FILE__, __LINE__); }

inline void* __cdecl operator new[](size_t s)
        { return ::operator new[](s, _NORMAL_BLOCK, __FILE__, __LINE__); }

    The values __FILE__ and __LINE_ indicate the current file and line for any file during compilation. The documentation about crtdbg.h says that it binds malloc calls to a debug version of malloc. It would be rerouting the new operator to the malloc code through this snippet of code. 
So always the __FILE__ would indicate the crtdbg.h and the __LINE__ would be in this file. Now If I could simply ensure that the __FILE__ and __LINE__ values correspond to the place where the new statement actually is in my source code then probably it would be reported correctly when the program is run in debug mode. How could I do that? Simple, I write a little stub for my own new function ;)

       	
void * operator new(size_t nSize, const char * lpszFileName, int nLine)
		{ return ::operator new(nSize, 1, lpszFileName, nLine); }

    Now if I pass this new my file name and line number for the location where the memory allocation is taking place (each time) then I would get my error reporting right.
That was simple I simply add another line that says:


#define new new(__FILE__,__LINE__)

Thus changing all the 'new' usages in my code. So the earlier test program would look like:


#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>

#ifdef _DEBUG
void * operator new(size_t nSize, const char * lpszFileName, int nLine)
		{ return ::operator new(nSize, 1, lpszFileName, nLine); }
#define new new(__FILE__,__LINE__)
#endif 


struct teststruct
{
	int a;
};

int main(int argc, char* argv[])
{
	int * a = new int[1];
	int * a1 = new int[1];
	teststruct * aa = new teststruct;

	_CrtDumpMemoryLeaks();
	return 0;
}

And this would output the messages:


'memleaks.exe': Loaded 'D:\vsprojs\memleaks\Debug\memleaks.exe', Symbols loaded.
'memleaks.exe': Loaded 'C:\WINNT\system32\NTDLL.DLL', Cannot find or open a required DBG file.
'memleaks.exe': Loaded 'C:\WINNT\system32\KERNEL32.DLL', Cannot find or open a required DBG file.
Detected memory leaks!
Dumping objects ->
d:\vsprojs\memleaks\memleaks.cpp(20) : {48} normal block at 0x002F2C78, 4 bytes long.
 Data: <    > CD CD CD CD 
d:\vsprojs\memleaks\memleaks.cpp(19) : {47} normal block at 0x002F2C38, 4 bytes long.
 Data: <    > CD CD CD CD 
d:\vsprojs\memleaks\memleaks.cpp(18) : {46} normal block at 0x002F2BF8, 4 bytes long.
 Data: <    > CD CD CD CD 
Object dump complete.
The program '[288] memleaks.exe: Native' has exited with code 0 (0x0).


So that is one hurdle crossed. I could report error messages with alloc in the file and line that caused them. So if I were to include the following block in each of the files in my source code then I could be assured that all leaks would be reported in correct location.


#ifdef _DEBUG
void * operator new(size_t nSize, const char * lpszFileName, int nLine)
		{ return ::operator new(nSize, 1, lpszFileName, nLine); }
#define new new(__FILE__,__LINE__)
#endif 

I wonder why there is no note of this anywhere that is easy to find.
Simple so far? So I added the magic snippet to my code and recompiled it.

Phantom of the Source Code:

Now as fate would have it about things such as rabbit holes, operas and source code, the phantom came back. All the allocation in the file lang.c was shown in some invalid location or the other. Lines that contained comments, case statements, breaks etc showed as lines that allocate memory.

For some reason I refused to think, and blamed the debugger and try as I might I could not fix the issue. 
It then happened that tinkering with my earlier block of code (the one where I redefined the 'new' keyword, I made a simple mistake and I got a whole list of compilation errors. 
And as fate would have it all the compilation errors in lang.c were pointing to totally wrong locations. But this time there was difference. These were compilation errors and they pointed to the wrong place! There was no debugger in action here… the phantom had showed itself in an odd place.

Scooby Doo and Phantom of the Source Code:

    Somehow the system was making a mistake of the way it tracked line number for this file. I couldn't think of anything odd about this file other than that it had a lot of commented #line statements and that it was generated by lemon.

The commented #lines were harmless, they are just comments. So what could be the problem unless lemon …
    Lemon is written for unix like systems and the one difference between that world and the Windows world is CRLF. In the windows world a new line is indicated by a CRLF, ie by a CR (carriage return) and a LF (line feed) character. In the unix world a new line is simple a LF. My input to lemon was my lang.lem file a windows/dos text file. Was lemon expecting a unix style text file instead?

    Compilers and such would report errors in source file by indicating the line numbers. Line numbers are calculated by the compiler by counting the number of CRLF pairs that it finds. Could there be anything wrong in the CRLF sequences in the lang.c file?

    I opened up lang.c in hex mode in visual studio and sure enough there sequences like 0x0D 0x0D 0x0A in the file. Now 0x0D is CR and 0x0A is LF. CRLFs should occur in pairs, so CRCRLF is an invalid combination.

    The compiler was obviously counting lines in a method different from that used by studio for rendering the source code. The compiler was probably counting each line as CRLF pair while Visual Studio probably displayed a new line at every CR it found. So when the debugger uses the information that the compiler put into the compiled file and says that there is a problem in line 10 the actual display of line 10 probably could much lower that the line 10 rendered on screen.
Probably the teams for devenv and cl at Redmond hadn't talked to each other about their line counting and rendering routines … hello Microsoft!

All I had to do to solve my problem to convert lang.c to a dos/windows style text file and the issue would be solved.

    After a bit of tinkering around and being the true lazy programmer I am, I tried to do the job without writing code. It struck me that wordpad can do such changes. So I loaded lang.c in wordpad and saved as msdos text file. 

    Bang! compilation errors, wordpad threw out invalid CRLF combinations totally, putting some of the lines on the same line. This caused some blocks within case statements to get realigned and the compiler screamed.

I finally wrote a little snippet to clean the source file for me:


#include <fstream.h>
#include <stdio.h>

void main(int argn,char *argv[])
{
	if(argn<2)
		cout<<"CRLFpatch <filename>\n";

	//i am not building this thing for security 
	//so dont buffer-overflow exploit me.
	char foutname[128]; 
	sprintf(foutname,"%s-t.bin",argv[1]);

	fstream fin(argv[1],ios::in|ios::binary);
	fstream fout(foutname,ios::out|ios::binary|ios::trunc);

	char ch;
	fin.read(&ch,1);
	while(!fin.eof())
	{
		if(ch == 0x0D);
		else if(ch == 0x0A)
		{
			ch = 0x0D;
			fout.write(&ch,1);
			ch = 0x0A;
			fout.write(&ch,1);
		}
		else
			fout.write(&ch,1);
		fin.read(&ch,1);
	}

	fin.close();
	fout.close();
}

Ran this on my lang.c source and finally, things worked!

    For a while I was telling myself that no one should have to go through this one, this is simply misguided effort. I wonder how someone else would have gone through with this. Maybe the first time I saw the line being reported wrongly I should have guessed the hand of some unix-style formatting.
Could MS have prevented this running around, due to one application treating new lines in one way and another one in another? Especially apps that should be well coupled like the compiler and the IDE? But then this is such a rare case...Also could Studio have had the option to detect incorrect text file formats and have complained? Could I have done anything differently here to save myself the time?

Now that I have debugged the reason I couldn't debug my application, I can get down to debugging my application.

Cheers
Roshan James

Additions to this article would be reflected at 
http://www27.brinkster.com/sparksite/articles/tech/torwd/another_tale_of_real_world_debug.htm

Suggested Reading:

Detecting and Isolating Memory Leaks Using Microsoft Visual C++
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnvc60/html/memleaks.asp