Home arrow C++ arrow Page 6 - A Simple Garbage Collector for C++
C++

A Simple Garbage Collector for C++


The use of dynamically allocated memory must be managed, because it has a tremendous effect on the performance of your programs. The current trend in handling dynamic memory seems to be shifting toward an automated approach. While C++ uses the manual approach for managing dynamic memory, this does not mean that it can't be automated in that language -- thus giving the C++ programmer the best of both worlds. This article explains how to do it. It is excerpted from chapter two of The Art of C++, written by Herbert Schildt (McGraw-Hill/Osborne, 2004; ISBN: 0072255129).

Author Info:
By: McGraw-Hill/Osborne
Rating: 4 stars4 stars4 stars4 stars4 stars / 73
June 21, 2005
TABLE OF CONTENTS:
  1. · A Simple Garbage Collector for C++
  2. · Comparing the Two Approaches to Memory Management
  3. · Choosing a Garbage Collection Algorithm
  4. · What About auto_ptr?
  5. · An Overview of the Garbage Collector Classes
  6. · GCPtr In Detail
  7. · The Overloaded Assignment Operators
  8. · GCInfo
  9. · How to Use GCPtr
  10. · Allocating Arrays
  11. · A Larger Demonstration Program
  12. · Load Testing

print this article
SEARCH DEVARTICLES

A Simple Garbage Collector for C++ - GCPtr In Detail
(Page 6 of 12 )

GCPtr is the heart of the garbage collector. It implements a new pointer type that keeps a reference count for objects allocated on the heap. It also provides the garbage collection functionality that recycles unused memory.

GCptr is a template class with this declaration:

template <class T, int size=0> class GCPtr {

GCPtr requires that you specify the type of the data that will be pointed to, which will be substituted for the generic type T. If an array is being allocated, you must specify the size of the array in the size parameter. Otherwise, size defaults to zero, which indicates that a single object is being pointed to. Here are two examples.

GCPtr<int> p; // declare a pointer to a single integer GCPtr<int, 5> ap; // declare a pointer to an array of 5 integers

Here, p can point to single objects of type int, and ap can point to an array of 5 ints.

In the preceding examples, notice that you do not use the * operator when specifying the name of the GCPtr object. That is, to create a GCPtr to int, you do not use a statement like this:

GCPtr<int> *p; // this creates a pointer to a GCPtr<int> object

This declaration creates a normal C++ pointer called p that can point to a GCPtr<int> object. It does not create a GCPtr object that can point to int. Remember, GCPtr defines a pointer type by itself.

Be careful when specifying the type parameter to GCPtr. It specifies the type of the object to which the GCPtr object can point. Therefore, if you write a declaration like this:

GCPtr<int *> p; // this creates a GCPtr to pointers to ints

you are creating a GCPtr object that points to int * pointers, not a GCPtr to ints.

Because of its importance, each member of GCPtr is examined in detail in the following sections.

GCPtr Data Members

GCPtr declares the following data members:

// gclist maintains the garbage collection list.
static list<GCInfo<T> > gclist;
// addr points to the allocated memory to which
// this GCPtr pointer currently points.
T *addr;
/* isArray is true if this GCPtr points
  
to an allocated array. It is false
  
otherwise. */
bool isArray; // true if pointing to array
// If this GCPtr is pointing to an allocated
// array, then arraySize contains its size.
unsigned arraySize; // size of the array
static bool first; // true when first GCPtr is created

The gclist field contains a list of GCInfo objects. (Recall that GCInfo links a reference count with a piece of allocated memory.) This list is used by the garbage collector to determine when allocated memory is unused. Notice that gclist is a static member of GCPtr. This means that for each specific pointer type, there is only one gclist. For example, all pointers of type GCPtr<int> share one list, and all pointers of type GCPtr<double> share a different list. gclist is an instance of the STL list class. Using the STL substantially simplifies the code for GCPtr because there is no need for it to create its own set of list-handling functions.

GCPtr stores the address of the memory to which it points in addr. If addr points to an allocated array, then isArray will be true, and the length of the array will be stored in arraySize.

The first field is a static variable that is initially set to true. It is a flag that the GCPtr constructor uses to know when the first GCPtr object is created. After the first GCPtr object is constructed, first is set to false. It is used to register a termination function that will be called to shut down the garbage collector when the program ends.

The findPtrInfo( ) Function

GCPtr declares one private function: findPtrInfo( ). This function searches gclist for a specified address and returns an iterator to its entry. If the address is not found, an iterator to the end of gclist is returned. This function is used internally by GCPtr to update the reference counts of the objects in gclist. It is implemented as shown here:

// Find a pointer in gclist.
template <class T, int size>
typename list<GCInfo<T> >::iterator
 
GCPtr<T, size>::findPtrInfo(T *ptr) {
 
list<GCInfo<T> >::iterator p;
 
// Find ptr in gclist.
  for(p = gclist.begin(); p != gclist.end(); p++)
    if(p->memPtr == ptr)
      return p;
 
return p;
}

The GCIterator typedef

At the start of the public section of GCPtr is the typedef of Iter<T> to GCiterator. This typedef is bound to each instance of GCPtr, thus eliminating the need to specify the type parameter each time an Iter is needed for a specific version of GCPtr. This simplifies the declaration of an iterator. For example, to obtain an iterator to the memory pointed to by a specific GCPtr, you can use a statement like this:

GCPtr<int>::GCiterator itr;

The GCPtr Constructor

The constructor for GCPtr is shown here:

// Construct both initialized and uninitialized objects. GCPtr(T *t=NULL) {
 
// Register shutdown() as an exit function.
 
if(first) atexit(shutdown);
 
first = false;
  list<GCInfo<T> >::iterator p;
 
p = findPtrInfo(t);
  // If t is already in gclist, then
 
// increment its reference count.
 
// Otherwise, add it to the list.
 
if(p != gclist.end())
   
p->refcount++; // increment ref count
  else {
    // Create and store this entry.
   
GCInfo<T> gcObj(t, size);
   
gclist.push_front(gcObj);
  }
 
addr = t;
 
arraySize = size;
 
if(size > 0) isArray = true;
 
else isArray = false;
 
#ifdef DISPLAY
   
cout << "Constructing GCPtr. ";
    if(isArray)
      cout << " Size is " << arraySize << endl;
    else
      cout << endl;
  #endif
}

GCPtr( ) allows both initialized and uninitialized instances to be created. If an initialized instance is declared, then the memory to which this GCPtr will point is passed through t. Otherwise, t will be null. Letís examine the operation of GCPtr( ) in detail.

First, if first is true, it means that this is the first GCPtr object to be created. If this is the case, then shutdown( ) is registered as a termination function by calling atexit( ).The atexit( ) function is part of the standard C++ function library, and it registers a function that will be called when a program terminates. In this case, shutdown( ) releases any memory that was prevented from being released because of a circular reference.

Next, a search of gclist is made, looking for any preexisting entry that matches the address in t. If one is found, then its reference count is incremented. If no preexising entry matches t, a new GCInfo object is created that contains this address, and this object is added to gclist.

GCPtr( ) then sets addr to the address specified by t and sets the values of isArray and arraySize appropriately. Remember, if you are allocating an array, you must explicitly specify the size of the array when you declare the GCPtr pointer that will point to it. If you donít, the memory wonít be released correctly, and, in the case of an array of class objects, the destructors wonít be called properly.

The GCPtr Destructor

The destructor for GCPtr is shown here:

// Destructor for GCPtr.
template <class T, int size>
GCPtr<T, size>::~GCPtr() {
 
list<GCInfo<T> >::iterator p;
 
p = findPtrInfo(addr);
  if(p->refcount) p->refcount--; // decrement ref count
 
#ifdef DISPLAY
    cout << "GCPtr going out of scope.\n";
  #endif
 
// Collect garbage when a pointer goes out of scope. 
  collect();
 
// For real use, you might want to collect
 
// unused memory less frequently, such as after
 
// gclist has reached a certain size, after a
 
// certain number of GCPtrs have gone out of scope,
 
// or when memory is low.
}

Garbage collection takes place each time a GCPtr goes out of scope. This is handled by ~GCPtr( ). First, a search of gclist is made, looking for the entry that corresponds to the address pointed to by the GCPtr being destroyed. Once found, its refcount field is decremented. Next, ~GCptr( ) calls collect( ) to release any unused memory (that is, memory whose reference count is zero).

As the comment at the end of ~GCPtr( ) states, for real applications, it is probably better to collect garbage less often than each time a single GCPtr goes out of scope. Collecting less frequently will usually be more efficient. As explained earlier, collecting each time a GCPtr is destroyed is useful for demonstrating the garbage collector because it clearly illustrates the garbage collectorís operation.

Collect Garbage with collect( )

The collect( ) function is where garbage collection takes place. It is shown here:

// Collect garbage. Returns true if at least
// one object was freed.
template <class T, int size>
bool GCPtr<T, size>::collect() {
bool memfreed = false;
#ifdef DISPLAY
  cout << "Before garbage collection for ";
  showlist();
#endif
list<GCInfo<T> >::iterator p;
do {
  // Scan gclist looking for unreferenced pointers.
 
for(p = gclist.begin(); p != gclist.end(); p++) {
    // If in-use, skip.
    if(p->refcount > 0) continue;
   
memfreed = true;
   
// Remove unused entry from gclist.
    gclist.remove(*p);
   
// Free memory unless the GCPtr is null.
    if(p->memPtr) {
      if(p->isArray) {
        #ifdef DISPLAY
          cout << "Deleting array of size "
              
<< p->arraySize << endl;
        #endif
        delete[] p->memPtr; // delete array
     
}
      else {
        #ifdef DISPLAY
          cout << "Deleting: "
               << *(T *) p->memPtr << "\n";
        #endif
        delete p->memPtr; // delete single element
     
}
    }
   
// Restart the search.
    break;
  }
} while(p != gclist.end());
#ifdef DISPLAY
    cout << "After garbage collection for ";
    showlist();
  #endif
  return memfreed;
}

The collect( ) function works by scanning the contents of gclist, looking for entries that have a refcount of zero. When such an entry is found, it is removed from gclist by calling the remove( ) function, which is a member of the STL list class. Then the memory associated with that entry is freed.

Recall that in C++, single objects are freed by delete, but arrays of objects are freed via delete[ ]. Thus, the value of the entryís isArray field determines whether delete or delete[ ] is used to free the memory. This is one reason you must specify the size of an allocated array for any GCPtr that will point to one: it causes isArray to be set to true. If isArray is not set correctly, it is impossible to properly release the allocated memory.

Although the point of garbage collection is to recycle unused memory automatically, you can take a measure of manual control if necessary. The collect( ) function can be called directly by user code to request that garbage collection take place. Notice that it is declared as a static function within GCPtr. This means that it can be invoked without reference to any object. For example:

GCPtr<int>::collect(); // collect all unused int pointers

This causes gclist<int> to be collected. Because there is a different gclist for each type of pointer, you will need to call collect( ) for each list that you want to collect. Frankly, if you need to closely manage the release of dynamically allocated objects, you are better off using the manual allocation system provided by C++. Directly calling collect( ) is best reserved for specialized situations, such as when free memory is running unexpectedly low.


blog comments powered by Disqus
C++ ARTICLES

- Intel Threading Building Blocks
- Threading Building Blocks with C++
- Video Memory Programming in Text Mode
- More Tricks to Gain Speed in Programming Con...
- Easy and Efficient Programming for Contests
- Preparing For Programming Contests
- Programming Contests: Why Bother?
- Polymorphism in C++
- Overview of Virtual Functions
- Inheritance in C++
- Extending the Basic Streams in C++
- Using Stringstreams in C++
- Custom Stream Manipulation in C++
- General Stream Manipulation in C++
- Serialize Your Class into Streams in C++

Watch our Tech Videos 
Dev Articles Forums 
 RSS  Articles
 RSS  Forums
 RSS  All Feeds
Write For Us 
Weekly Newsletter
 
Developer Updates  
Free Website Content 
Contact Us 
Site Map 
Privacy Policy 
Support 

Developer Shed Affiliates

 




© 2003-2017 by Developer Shed. All rights reserved. DS Cluster - Follow our Sitemap
Popular Web Development Topics
All Web Development Tutorials