Home arrow C++ arrow Page 7 - A Simple Garbage Collector for C++
C++

A Simple Garbage Collector for C++


The use of dynamically allocated memory must be managed, because it has a tremendous effect on the performance of your programs. The current trend in handling dynamic memory seems to be shifting toward an automated approach. While C++ uses the manual approach for managing dynamic memory, this does not mean that it can't be automated in that language -- thus giving the C++ programmer the best of both worlds. This article explains how to do it. It is excerpted from chapter two of The Art of C++, written by Herbert Schildt (McGraw-Hill/Osborne, 2004; ISBN: 0072255129).

Author Info:
By: McGraw-Hill/Osborne
Rating: 4 stars4 stars4 stars4 stars4 stars / 73
June 21, 2005
TABLE OF CONTENTS:
  1. · A Simple Garbage Collector for C++
  2. · Comparing the Two Approaches to Memory Management
  3. · Choosing a Garbage Collection Algorithm
  4. · What About auto_ptr?
  5. · An Overview of the Garbage Collector Classes
  6. · GCPtr In Detail
  7. · The Overloaded Assignment Operators
  8. · GCInfo
  9. · How to Use GCPtr
  10. · Allocating Arrays
  11. · A Larger Demonstration Program
  12. · Load Testing

print this article
SEARCH DEVARTICLES

A Simple Garbage Collector for C++ - The Overloaded Assignment Operators
(Page 7 of 12 )

GCPtr overloads operator=( ) twice: once for the assignment of a new address to a GCPtr pointer, and once for the assignment of one GCPtr pointer to another. Both versions are shown here:

// Overload assignment of pointer to GCPtr.
template <class T, int size>
T * GCPtr<T, size>::operator=(T *t) {
 
list<GCInfo<T> >::iterator p;
 
// First, decrement the reference count
 
// for the memory currently being pointed to.
 
p = findPtrInfo(addr);
 
p->refcount--;
  // Next, if the new address is already
  // existent in the system, increment its
  // count. Otherwise, create a new entry
  // for gclist.
  p = findPtrInfo(t);
  if(p != gclist.end())
   
p->refcount++;
 
else {
    // Create and store this entry.
    GCInfo<T> gcObj(t, size);
    gclist.push_front(gcObj);
 
}
 
addr = t; // store the address.
 
return t;
}
// Overload assignment of GCPtr to GCPtr.
template <class T, int size>
GCPtr<T, size> & GCPtr<T, size>::operator=(GCPtr &rv) {
 
list<GCInfo<T> >::iterator p;
 
// First, decrement the reference count
  // for the memory currently being pointed to.
  p = findPtrInfo(addr);
  p->refcount--;
 
// Next, increment the reference count of
  // the new address.
  p = findPtrInfo(rv.addr);
  p->refcount++; // increment ref count
 
addr = rv.addr;// store the address.
 
return rv;
}

The first overload of operator=( ) handles assignments in which a GCPtr pointer is on the left and an address is on the right. For example, it handles cases like the one shown here:

GCPtr<int> p;
// ...
p = new int(18);

Here, the address returned by new is assigned to p. When this happens, operator=(T *t) is called with the new address passed to t. First,the entry in gclist for the memory that is currently being pointed to is found, and its reference count is decremented. Next, a search of gclist is made for the new address. If it is found, its reference count is incremented. Otherwise, a new GCInfo object is created for the new address, and this object is added to gclist. Finally, the new address is stored in the invoking object’s addr, and this address is returned.

The second overload of the assignment operator, operator=(GCPtr &rv), handles the following type of assignment:

GCPtr<int> p;
GCPtr<int> q;
// ...
p = new int(88);
q = p;

Here, both p and q are GCPtr pointers, and p is assigned to q. This version of the assignment operator works much like the other. First, the entry in gclist for the memory that is currently being pointed to is found, and its reference count is decremented. Next, a search of gclist is made for the new address, which is contained in rv.addr, and its reference count is incremented. Then the invoking object’s addr field is set to the address contained in rv.addr. Finally, the right-hand object is returned. This allows a chain of assignments to take place, such as:

p = q = w = z;

There is one other important point to make about the way that the assignment operators work. As mentioned earlier in this chapter, it is technically possible to recycle memory as soon as its reference count drops to zero, but doing so puts an extra overhead on each pointer operation. This is why, in the overloaded assignment operators, the reference count for the memory previously pointed to by the left-hand operand is simply decremented, and no further action is taken. Thus, the management overhead associated with actually releasing memory and performing maintenance on gclist is avoided. These actions are deferred to later, when the garbage collector runs. This approach makes for faster runtimes for the code that uses a GCPtr. It also lets garbage collection take place at times that are (potentially) more convenient in terms of runtime performance.

The GCPtr Copy Constructor

Because of the need to keep track of each pointer to allocated memory, the default copy constructor (which makes a bitwise identical copy) cannot be used. Instead, GCPtr must define is own copy constructor, which is shown here:

// Copy constructor.
GCPtr(const GCPtr &ob) {
 
list<GCInfo<T> >::iterator p;
 
p = findPtrInfo(ob.addr);
  p->refcount++; // increment ref count
  addr = ob.addr;
  arraySize = ob.arraySize;
  if(arraySize > 0) isArray = true;
  else isArray = false;
  #ifdef DISPLAY
   
cout << "Constructing copy.";
    if(isArray)
      cout << " Size is " << arraySize << endl;
    else
      cout << endl;
  #endif
}

Recall that a class’ copy constructor is invoked when a copy of an object is required, such as when an object is passed as an argument to a function, when an object is returned from a function, or when one object is used to initialize another. GCPtr’s copy constructor duplicates the information contained in the original object. It also increments the reference count associated with the memory pointed to by the original object. When the copy goes out of scope, this reference count will be decremented.

Actually, the extra work performed by the copy constructor is not usually necessary because the overloaded assignment operators properly maintain the garbage collection list in most cases. However, there are a small number of cases in which the copy constructor is needed, such as when memory is allocated inside a function and a GCPtr to that memory is returned.

The Pointer Operators and Conversion Function

Because GCPtr is a pointer type, it must overload the pointer operators * and –>, plus the indexing operator [ ]. This is done by the functions shown here. Given that it is the ability to overload these operators that makes it possible to create a new pointer type, it is amazing how simple they are.

// Return a reference to the object pointed
// to by this GCPtr.
T &operator*() {
 
return *addr;
}
// Return the address being pointed to.
T *operator->() { return addr; }
// Return a reference to the object at the
// index specified by i.
T &operator[](int i) {
  return addr[i];
}

The operator*( ) function returns a reference to the object pointed to by the addr field of the invoking GCPtr, operator–>( ) returns the address contained in addr, and operator[ ] returns a reference to the element specified by the index. operator[ ] should be used only on GCPtrs that point to allocated arrays.

As mentioned earlier, no pointer arithmetic is supported. For example, neither the ++ nor the –– operator is overloaded for GCPtr. The reason is that the garbage collection mechanism assumes that a GCPtr is pointing to the start of allocated memory. If a GCPtr could be incremented, for example, then when that pointer was garbage collected, the address used with delete would be invalid.

You have two options if you need to perform operations that involve pointer arithmetic. First, if the GCPtr is pointing to an allocated array, you can create an Iter that will let you cycle through that array. This approach is described later. Second, you can convert a GCPtr into a normal pointer by use of the T * conversion function defined by GCPtr. This function is shown here:

// Conversion function to T *.
operator T*() { return addr; }

This function returns a normal pointer that points to the address stored in addr. It can be used like this.

GCPtr<double> gcp = new double(99.2);
double *p;
p = gcp; // now, p points to same memory as gcp
p++; // because p is a normal pointer, it can be incremented

In the preceding example, because p is a normal pointer, it can be manipulated in any way that any other pointer can. Of course, whether such manipulations yield meaningful results is dependent upon your application.

The main advantage of the conversion to T * is that it lets you use GCPtrs in place of normal C++ pointers when working with preexisting code that requires such pointers. For example, consider this sequence:

GCPtr<char> str = new char[80];
strcpy(str, "this is a test");
cout << str << endl;

Here, str is a GCPtr pointer to char that is used in a call to strcpy( ). Because strcpy( ) is expecting its arguments to be of type char *, the conversion to T* inside GCPtr is automatically invoked because, in this case, T is char. The same conversion is automatically invoked when str is used in the cout statement. Thus, the conversion function enables GCPtrs to seamlessly integrate with existing C++ functions and classes.

Keep in mind that the T * pointer returned by this conversion does not participate in or affect garbage collection. Thus, it is possible for the allocated memory to be freed even if a regular C++ pointer is still pointing to it. So, use the conversion to T* wisely––and infrequently.

The begin( ) and end( ) Functions

The begin( ) and end( ) functions, shown next, are similar to their counterparts in the STL:

// Return an Iter to the start of the allocated memory. Iter<T> begin() {
 
int size;
 
if(isArray) size = arraySize;
 
else size = 1;
 
return Iter<T>(addr, addr, addr + size);
}
// Return an Iter to one past the end of an allocated array.
Iter<T> end() {
 
int size;
 
if(isArray) size = arraySize;
 
else size = 1;
  return Iter<T>(addr + size, addr, addr + size);
}

The begin( ) function returns an Iter to the start of the allocated array pointed to by addr. The end( ) function returns an Iter to one past the end of the array. Although there is nothing that stops these functions from being called on a GCPtr that points to a single object, their purpose is to support operations on allocated arrays. (Obtaining an Iter to a single object is not harmful, just pointless.)

The shutdown( ) Function

If a program creates a circular reference of GCPtrs, then when the program ends there will still be dynamically allocated objects that need to be released. This is important because these objects might have destructors that need to be called. The shutdown( ) function handles this task. This function is registered by atexit( ) when the first GCPtr is created, as described earlier. This means that it will be called when the program terminates.

The shutdown( ) function is shown here:

// Clear gclist when program exits.
template <class T, int size>
void GCPtr<T, size>::shutdown() {
 
if(gclistSize() == 0) return; // list is empty
 
list<GCInfo<T> >::iterator p;
 
for(p = gclist.begin(); p != gclist.end(); p++) {
    // Set all reference counts to zero
    p->refcount = 0;
  }
 
#ifdef DISPLAY
    cout << "Before collecting for shutdown() for "
         << typeid(T).name() << "\n";
  #endif
  collect();
 
#ifdef DISPLAY
    cout << "After collecting for shutdown() for "
         << typeid(T).name() << "\n";
  #endif
}

First, if the list is empty, which it will normally be, then shutdown( ) simply returns. Otherwise, it sets to zero the reference counts of the entries that still exist in gclist, and then it calls collect( ). Recall that collect( ) releases any object that has a reference count of zero. Thus, setting the reference counts to zero ensures that all objects will be freed.

Two Utility Functions

GCPtr ends by defining two utility functions. The first is gclistSize( ), which returns the number of entries currently held in gclist. The second is showlist( ), which displays the contents of gclist. Neither of these are necessary for the implementation of a garbage collection pointer type, but they are useful when you want to watch the operation of the garbage collector.

------------------------------------------------------------------

Next: GCInfo >>

blog comments powered by Disqus
C++ ARTICLES

- Intel Threading Building Blocks
- Threading Building Blocks with C++
- Video Memory Programming in Text Mode
- More Tricks to Gain Speed in Programming Con...
- Easy and Efficient Programming for Contests
- Preparing For Programming Contests
- Programming Contests: Why Bother?
- Polymorphism in C++
- Overview of Virtual Functions
- Inheritance in C++
- Extending the Basic Streams in C++
- Using Stringstreams in C++
- Custom Stream Manipulation in C++
- General Stream Manipulation in C++
- Serialize Your Class into Streams in C++

Watch our Tech Videos 
Dev Articles Forums 
 RSS  Articles
 RSS  Forums
 RSS  All Feeds
Write For Us 
Weekly Newsletter
 
Developer Updates  
Free Website Content 
Contact Us 
Site Map 
Privacy Policy 
Support 

Developer Shed Affiliates

 




© 2003-2017 by Developer Shed. All rights reserved. DS Cluster - Follow our Sitemap
Popular Web Development Topics
All Web Development Tutorials