FAQs in section [16]:
[16.1] Does delete p delete the pointer p, or the
pointed-to-data *p?
The pointed-to-data.
The keyword should really be delete_the_thing_pointed_to_by. The same
abuse of English occurs when freeing the memory pointed to by a pointer in C:
free(p) really means free_the_stuff_pointed_to_by(p).
[16.2] Can I free() pointers allocated with new?
Can I delete pointers allocated with malloc()?
No!
It is perfectly legal, moral, and wholesome to use malloc() and delete in
the same program, or to use new and free() in the same program.
But it is illegal, immoral, and despicable to call free() with a
pointer allocated via new, or to call delete on a pointer allocated via
malloc().
Beware! I occasionally get e-mail from people telling me that it works
OK for them on machine X and compiler Y. That does not make it right!
Sometimes people say, "But I'm just working with an array of char."
Nonetheless do not mix malloc() and delete on the same pointer, or
new and free() on the same pointer! If you allocated via p = new char[n], you must use delete[] p; you must not use
free(p). Or if you allocated via p = malloc(n), you
must use free(p); you must not use delete[] p
or delete p! Mixing these up could cause a catastrophic failure at
runtime if the code was ported to a new machine, a new compiler, or even a new
version of the same compiler.
You have been warned.
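As a minimal sketch of the pairing rule above, the following function releases each allocation with the primitive that matches it. The function name matchedPairsDemo is hypothetical and exists only for this illustration.

```cpp
#include <cassert>
#include <cstddef>   // NULL
#include <cstdlib>   // std::malloc, std::free
#include <cstring>   // std::strcpy, std::strcmp

// Hypothetical demo: every allocation is released by its matching primitive.
// Returns the number of allocate/release pairs it exercised.
int matchedPairsDemo()
{
    int pairs = 0;

    char* a = new char[6];                          // new[] ...
    std::strcpy(a, "hello");
    assert(std::strcmp(a, "hello") == 0);
    delete[] a;                                     // ... pairs with delete[]
    ++pairs;

    char* b = static_cast<char*>(std::malloc(6));   // malloc() ...
    if (b != NULL) {                                // malloc() CAN return NULL
        std::strcpy(b, "world");
        assert(std::strcmp(b, "world") == 0);
        std::free(b);                               // ... pairs with free()
    }
    ++pairs;

    int* c = new int(42);                           // new ...
    assert(*c == 42);
    delete c;                                       // ... pairs with delete
    ++pairs;

    return pairs;
}
```

Note that even the "array of char" case uses delete[], never free().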
[16.3] Why should I use new instead of trustworthy old
malloc()?
Constructors/destructors, type safety, overridability.
- Constructors/destructors: unlike malloc(sizeof(Fred)),
new Fred() calls Fred's constructor. Similarly, delete p calls
*p's destructor.
- Type safety: malloc() returns a void* which isn't type safe.
new Fred() returns a pointer of the right type (a Fred*).
- Overridability: new is an operator that can be overridden by a
class, while malloc() is not overridable on a per-class basis.
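The three points above can be seen in one small sketch. The class Widget, its counters, and the function newVersusMallocDemo are hypothetical names invented for this illustration; the class-specific allocator simply forwards to malloc(), which is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t, NULL
#include <cstdlib>   // std::malloc, std::free
#include <new>       // std::bad_alloc

// Hypothetical class used only for this illustration.
class Widget {
public:
    Widget() : value_(7) { ++liveCount; }
    ~Widget()            { --liveCount; }
    int value() const    { return value_; }

    // Overridability: a class-specific allocator used by "new Widget()".
    static void* operator new(std::size_t nbytes)
    {
        ++customAllocs;
        void* raw = std::malloc(nbytes);
        if (raw == NULL) throw std::bad_alloc();
        return raw;
    }
    static void operator delete(void* raw) { std::free(raw); }

    static int liveCount;      // constructed objects minus destructed objects
    static int customAllocs;   // number of calls to Widget::operator new
private:
    int value_;
};
int Widget::liveCount = 0;
int Widget::customAllocs = 0;

// Returns the number of times Widget's own operator new was used.
int newVersusMallocDemo()
{
    // malloc() returns untyped raw bytes: no constructor runs.
    void* raw = std::malloc(sizeof(Widget));
    assert(Widget::liveCount == 0);
    std::free(raw);                      // free(NULL) is harmless too

    // new returns a Widget* and runs the constructor.
    Widget* p = new Widget();
    assert(Widget::liveCount == 1);
    assert(p->value() == 7);

    delete p;                            // runs the destructor first
    assert(Widget::liveCount == 0);
    return Widget::customAllocs;
}
```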
[16.4] Can I use realloc() on pointers allocated via new?
No!
When realloc() has to copy the allocation, it uses a bitwise copy
operation, which will tear many C++ objects to shreds. C++ objects should be
allowed to copy themselves. They use their own copy constructor or assignment
operator.
Besides all that, the heap that new uses may not be the same as the
heap that malloc() and realloc() use!
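If you need a bigger array of objects, one safe alternative to realloc() is to allocate a new array and let each element copy itself. The sketch below does this by hand; growArray and growArrayDemo are hypothetical names, and the helper assumes newSize is at least oldSize. In practice std::vector does the same job for you.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns a new array of newSize elements whose first
// oldSize elements are copies of the old ones, then releases the old array.
// Assumes newSize >= oldSize.
std::string* growArray(std::string* old, std::size_t oldSize, std::size_t newSize)
{
    std::string* bigger = new std::string[newSize];
    try {
        for (std::size_t i = 0; i < oldSize; ++i)
            bigger[i] = old[i];      // element-wise assignment, not a bitwise copy
    } catch (...) {
        delete[] bigger;             // the old array is still intact
        throw;
    }
    delete[] old;
    return bigger;
}

// Returns the length of the string stored in the newly available slot.
std::size_t growArrayDemo()
{
    std::string* names = new std::string[2];
    names[0] = "Fred";
    names[1] = "Wilma";

    names = growArray(names, 2, 4);
    names[2] = "Barney";

    assert(names[0] == "Fred" && names[1] == "Wilma");
    std::size_t length = names[2].size();
    delete[] names;
    return length;
}
```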
[16.5] Do I need to check for NULL after p = new Fred()?
[Recently changed endl to std::endl (in 8/01).]
No! (But if you have an old compiler, you may have to
force the new operator to throw an
exception if it runs out of memory.)
It turns out to be a real pain to always write explicit NULL tests after
every new allocation. Code like the following is very tedious:
Fred* p = new Fred();
if (p == NULL)
    throw std::bad_alloc();
If your compiler doesn't support (or if you refuse to use)
exceptions, your code might be even more tedious:
Fred* p = new Fred();
if (p == NULL) {
    std::cerr << "Couldn't allocate memory for a Fred" << std::endl;
    abort();
}
Take heart. In C++, if the runtime system cannot allocate sizeof(Fred) bytes
of memory during p = new Fred(), a std::bad_alloc exception will be
thrown. Unlike malloc(), new never returns NULL!
Therefore you should simply write:
Fred* p = new Fred(); // No need to check if p is NULL
However, if your compiler is old, it may not yet support this. Find out by
checking your compiler's documentation under "new". If you have an old
compiler, you may have to force the compiler to
have this behavior.
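Running out of memory on demand is hard to arrange, so the following sketch simulates it: the hypothetical class NeverFits has a class-specific operator new that always throws std::bad_alloc. The second function shows the standard new (std::nothrow) form, which is the one place where a NULL test is still meaningful. All names here are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t, NULL
#include <new>       // std::bad_alloc, std::nothrow

// Hypothetical class whose allocator always fails, to simulate an
// out-of-memory condition deterministically.
class NeverFits {
public:
    static void* operator new(std::size_t) { throw std::bad_alloc(); }
    static void  operator delete(void*)    { }   // nothing was ever allocated
};

// Returns true if the failed allocation was reported via std::bad_alloc.
bool allocationFailureThrows()
{
    try {
        NeverFits* p = new NeverFits();   // no NULL test needed here
        delete p;                         // never reached
        return false;
    } catch (const std::bad_alloc&) {
        return true;
    }
}

// The nothrow form reports failure by returning NULL instead of throwing,
// so this is the one form that does need an explicit test.
bool nothrowFormReturnsPointer()
{
    int* q = new (std::nothrow) int(5);
    if (q == NULL)
        return false;
    assert(*q == 5);
    delete q;
    return true;
}
```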
[16.6] How can I convince my (older) compiler to
automatically check new to see if it returns NULL?
Eventually your compiler will.
If you have an old compiler that doesn't automagically perform
the NULL test, you can force the runtime
system to do the test by installing a "new handler" function. Your "new
handler" function can do anything you want, such as throw an exception,
delete some objects and return (in which case operator new will
retry the allocation), print a message and abort() the program, etc.
Here's a sample "new handler" that throws an exception.
The handler is installed using std::set_new_handler():
#include <new> // To get std::set_new_handler and std::bad_alloc
#include <cstdlib> // To get abort()
#include <iostream> // To get std::cerr
// A new handler that throws must throw std::bad_alloc or something
// derived from it, so alloc_error inherits from std::bad_alloc:
class alloc_error : public std::bad_alloc {
public:
    alloc_error() : std::bad_alloc() { }
};
void myNewHandler()
{
    // This is your own handler. It can do anything you want.
    throw alloc_error();
}
int main()
{
    std::set_new_handler(myNewHandler); // Install your "new handler"
    // ...
}
After the std::set_new_handler() line is executed, operator new will
call your myNewHandler() if/when it runs out of memory. This means
that new will never return NULL:
Fred* p = new Fred(); // No need to check if p is NULL
Note: If your compiler doesn't support exception
handling, you can, as a last resort, change the line throw ...;
to:
std::cerr << "Attempt to allocate memory failed!" << std::endl;
abort();
Note: If some global/static object's constructor uses new, it won't use
the myNewHandler() function since that constructor will get called
before main() begins. Unfortunately there's no convenient way to
guarantee that the std::set_new_handler() will be called before the first
use of new. For example, even if you put the std::set_new_handler()
call in the constructor of a global object, you still don't know if the module
("compilation unit") that contains that global object will be elaborated first
or last or somewhere in between. Therefore you still don't have any guarantee
that your call of std::set_new_handler() will happen before any other
global's constructor gets invoked.
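To make the "return and retry" behavior concrete, here is a sketch of the kind of loop an allocation function runs around the new handler. It is not any particular compiler's implementation: flakyAlloc is a hypothetical stand-in for the real allocator that fails a set number of times, and countingHandler only pretends to free memory. The set/restore trick is used to read the current handler so the code also works on compilers that predate std::get_new_handler.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t, NULL
#include <cstdlib>   // std::malloc, std::free
#include <new>       // std::set_new_handler, std::new_handler, std::bad_alloc

int failuresLeft = 0;   // how many more times the fake allocator will fail
int handlerCalls = 0;   // how many times the new handler has been invoked

// Hypothetical allocator that fails failuresLeft times, then uses malloc().
void* flakyAlloc(std::size_t nbytes)
{
    if (failuresLeft > 0) {
        --failuresLeft;
        return NULL;
    }
    return std::malloc(nbytes);
}

// A handler that "frees something" and returns, which requests a retry.
void countingHandler() { ++handlerCalls; }

// Sketch of the retry loop: allocate, and on failure consult the handler.
void* allocateWithRetry(std::size_t nbytes)
{
    for (;;) {
        void* raw = flakyAlloc(nbytes);
        if (raw != NULL)
            return raw;
        // Peek at the installed handler by swapping it out and back in.
        std::new_handler handler = std::set_new_handler(0);
        std::set_new_handler(handler);
        if (handler == 0)
            throw std::bad_alloc();   // no handler installed: give up
        handler();                    // may free memory and return, or throw
    }
}

// Returns the number of handler calls needed before allocation succeeded.
int retryDemo()
{
    failuresLeft = 2;
    handlerCalls = 0;
    std::new_handler previous = std::set_new_handler(countingHandler);
    void* raw = allocateWithRetry(16);
    std::set_new_handler(previous);   // restore whatever was installed before
    assert(raw != NULL);
    std::free(raw);
    return handlerCalls;
}
```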
[16.7] Do I need to check for NULL before delete p?
No!
The C++ language guarantees that delete p will do nothing if p is equal to
NULL. Since you might get the test backwards, and since most testing
methodologies force you to explicitly test every branch point, you should
not put in the redundant if test.
Wrong:
if (p != NULL)
    delete p;
Right:
delete p;
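A small sketch of why the test is unnecessary: the hypothetical helper destroy() below can be called any number of times on the same pointer because it resets the pointer to NULL after deleting. Tracked and deleteNullDemo are likewise names invented for this example.

```cpp
#include <cassert>
#include <cstddef>   // NULL

// Hypothetical class that counts how many instances are alive.
class Tracked {
public:
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
    static int alive;
};
int Tracked::alive = 0;

// Deletes the object (if any) and resets the pointer. No NULL test required.
void destroy(Tracked*& p)
{
    delete p;
    p = NULL;
}

// Returns the number of Tracked objects still alive at the end.
int deleteNullDemo()
{
    Tracked* p = new Tracked();
    assert(Tracked::alive == 1);

    destroy(p);          // destroys the object and sets p to NULL
    destroy(p);          // p is NULL now, so delete p does nothing

    Tracked* q = NULL;
    delete q;            // also fine: deleting NULL is a no-op
    return Tracked::alive;
}
```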
[16.8] What are the two steps that happen when I say delete p?
delete p is a two-step process: it calls the destructor, then releases the
memory. The code generated for delete p looks something like this (assuming
p is of type Fred*):
// Original code: delete p;
if (p != NULL) {
    p->~Fred();
    operator delete(p);
}
The statement p->~Fred() calls the destructor for the Fred object
pointed to by p.
The statement operator delete(p) calls the memory deallocation
primitive, void operator delete(void* p). This primitive is similar in
spirit to free(void* p). (Note, however, that these two are
not interchangeable; e.g., there is no guarantee that the two memory
deallocation primitives even use the same heap!).
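The order of the two steps can be observed with a class that supplies its own operator new and operator delete and logs each event. This is only a sketch: Logged, eventLog, and deleteStepsDemo are hypothetical names, and forwarding to malloc()/free() is an assumption made to keep the class-specific allocator short.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t, NULL
#include <cstdlib>   // std::malloc, std::free
#include <new>       // std::bad_alloc
#include <string>

std::string eventLog;   // records the order in which things happen

// Hypothetical class that logs construction, destruction, and (de)allocation.
class Logged {
public:
    Logged()  { eventLog += "ctor;"; }
    ~Logged() { eventLog += "dtor;"; }

    static void* operator new(std::size_t nbytes)
    {
        eventLog += "alloc;";
        void* raw = std::malloc(nbytes);
        if (raw == NULL) throw std::bad_alloc();
        return raw;
    }
    static void operator delete(void* raw)
    {
        eventLog += "dealloc;";
        std::free(raw);
    }
};

// Returns the sequence of events produced by one new/delete pair.
std::string deleteStepsDemo()
{
    eventLog.clear();
    Logged* p = new Logged();
    assert(eventLog == "alloc;ctor;");
    delete p;            // step 1: destructor, step 2: operator delete
    return eventLog;
}
```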
[16.9] In p = new Fred(), does the Fred
memory "leak" if the Fred constructor throws an exception?
No.
If an exception occurs during the Fred constructor of p = new Fred(),
the C++ language guarantees that the sizeof(Fred) bytes of memory that were
allocated will automagically be released back to the heap.
Here are the details: new Fred() is a two-step process:
- sizeof(Fred) bytes of memory are allocated using the primitive
void* operator new(size_t nbytes). This primitive is similar in spirit
to malloc(size_t nbytes). (Note, however, that these two are
not interchangeable; e.g., there is no guarantee that the two memory
allocation primitives even use the same heap!).
- It constructs an object in that memory by calling the Fred
constructor. The pointer returned from the first step is passed as the this
parameter to the constructor. This step is wrapped in a try ... catch
block to handle the case when an exception is thrown during this
step.
Thus the actual generated code looks something like:
// Original code: Fred* p = new Fred();
Fred* p = (Fred*) operator new(sizeof(Fred));
try {
    new(p) Fred(); // Placement new
} catch (...) {
    operator delete(p); // Deallocate the memory
    throw; // Re-throw the exception
}
The statement marked "Placement new" calls the
Fred constructor. The pointer p becomes the this pointer inside the
constructor, Fred::Fred().
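The guarantee can be checked with a class that counts its outstanding allocations and whose constructor always throws. This is a sketch under stated assumptions: Fragile, its outstanding counter, and constructorThrowDemo are hypothetical, and the class-specific allocator just wraps malloc() and free().

```cpp
#include <cassert>
#include <cstddef>     // std::size_t, NULL
#include <cstdlib>     // std::malloc, std::free
#include <new>         // std::bad_alloc
#include <stdexcept>   // std::runtime_error

// Hypothetical class whose constructor always fails.
class Fragile {
public:
    Fragile() { throw std::runtime_error("constructor failed"); }

    static void* operator new(std::size_t nbytes)
    {
        void* raw = std::malloc(nbytes);
        if (raw == NULL) throw std::bad_alloc();
        ++outstanding;
        return raw;
    }
    static void operator delete(void* raw)
    {
        if (raw != NULL) --outstanding;
        std::free(raw);
    }

    static int outstanding;   // allocations not yet released
};
int Fragile::outstanding = 0;

// Returns true if the memory was released when the constructor threw.
bool constructorThrowDemo()
{
    assert(Fragile::outstanding == 0);    // nothing allocated yet
    try {
        Fragile* p = new Fragile();       // allocation succeeds, constructor throws
        delete p;                         // never reached
        return false;
    } catch (const std::runtime_error&) {
        // operator delete has already been called on our behalf.
        return Fragile::outstanding == 0;
    }
}
```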
[16.10] How do I allocate / unallocate an array of things?
Use p = new T[n] and delete[] p:
Fred* p = new Fred[100];
// ...
delete[] p;
Any time you allocate an array of objects via new (usually with the
[n] in the new expression), you must use
[] in the delete statement. This syntax is necessary because there
is no syntactic difference between a pointer to a thing and a pointer to an
array of things (something we inherited from C).
[16.11] What if I forget the [] when deleting an array allocated
via new T[n]?
All life comes to a catastrophic end.
It is the programmer's responsibility, not the compiler's, to get the
connection between new T[n] and delete[] p correct. If you get
it wrong, the compiler will generate neither a compile-time nor a run-time
error message. Heap corruption is a likely result. Or worse. Your program
will probably die.
[16.12] Can I drop the [] when deleting an array of some
built-in type (char, int, etc)?
No!
Sometimes programmers think that the [] in the delete[] p only exists
so the compiler will call the appropriate destructors for all elements in the
array. Because of this reasoning, they assume that an array of some built-in
type such as char or int can be deleted without the []. E.g., they
assume the following is valid code:
void userCode(int n)
{
    char* p = new char[n];
    // ...
    delete p; // <-- ERROR! Should be delete[] p !
}
But the above code is wrong, and it can cause a disaster at runtime. In
particular, the code that's called for delete p is operator delete(void*), but the code that's called for delete[] p is
operator delete[](void*). The default behavior for the latter is to
call the former, but users are allowed to replace the latter with a different
behavior (in which case they would normally also replace the corresponding
new code in operator new[](size_t)). If they replaced the
delete[] code so it wasn't compatible with the delete code, and you
called the wrong one (i.e., if you said delete p rather than
delete[] p), you could end up with a disaster at runtime.
[16.13] After p = new Fred[n], how does the
compiler know there are n objects to be destructed during delete[] p?
Short answer: Magic.
Long answer: The run-time system stores the number of objects, n, somewhere
where it can be retrieved if you only know the pointer, p. There are two
popular techniques that do this. Both these techniques are in use by
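commercial grade compilers, both have tradeoffs, and neither is perfect. These
techniques are:
- Over-allocate the array and put n just to the left of the first Fred object.
- Use an associative array with p as the key and n as the value.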
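One way a run-time system can remember n is to over-allocate and store the count in a small header just before the first element. The following is a hand-written sketch of that idea, not what any particular compiler emits. The names Counted, ArrayHeader, allocateArray, and releaseArray are hypothetical, and the union-based padding assumes the element type needs no stricter alignment than the header's members.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t, NULL
#include <new>       // placement new, ::operator new

// Hypothetical element type that counts live instances.
class Counted {
public:
    Counted()  { ++alive; }
    ~Counted() { --alive; }
    static int alive;
};
int Counted::alive = 0;

// Header stored just before the first element. The extra members only pad
// the header so the elements after it stay aligned for ordinary types
// (an assumption of this sketch, not a general guarantee).
union ArrayHeader {
    std::size_t count;
    long double pad1;
    void*       pad2;
};

// Allocates header + n elements in one chunk and constructs the elements.
Counted* allocateArray(std::size_t n)
{
    void* raw = ::operator new(sizeof(ArrayHeader) + n * sizeof(Counted));
    ArrayHeader* header = new (raw) ArrayHeader;
    header->count = n;                                  // remember n
    Counted* first = reinterpret_cast<Counted*>(
        static_cast<char*>(raw) + sizeof(ArrayHeader));
    std::size_t built = 0;
    try {
        for (; built < n; ++built)
            new (static_cast<void*>(first + built)) Counted();
    } catch (...) {
        while (built > 0)
            first[--built].~Counted();                  // undo partial work
        ::operator delete(raw);
        throw;
    }
    return first;
}

// Recovers n from the header, destroys the elements, frees the chunk.
void releaseArray(Counted* first)
{
    if (first == NULL) return;
    char* raw = reinterpret_cast<char*>(first) - sizeof(ArrayHeader);
    std::size_t n = reinterpret_cast<ArrayHeader*>(raw)->count;
    while (n > 0)
        first[--n].~Counted();                          // reverse order
    ::operator delete(raw);
}

// Returns the number of elements still alive after release.
int overAllocationDemo()
{
    Counted* p = allocateArray(5);
    assert(Counted::alive == 5);
    releaseArray(p);            // only the pointer is passed, yet n is known
    return Counted::alive;
}
```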
[16.14] Is it legal (and moral) for a member function to say delete this?
As long as you're careful, it's OK for an object to commit suicide
(delete this).
Here's how I define "careful":
- You must be absolutely 100% positive sure that this object was
allocated via new (not by new[], nor by placement
new, nor a local object on the stack, nor a global, nor a
member of another object; but by plain ordinary new).
- You must be absolutely 100% positive sure that your member function
will be the last member function invoked on this object.
- You must be absolutely 100% positive sure that the rest of your
member function (after the delete this line) doesn't touch any piece of
this object (including calling any other member functions or touching any
data members).
- You must be absolutely 100% positive sure that no one even touches
the this pointer itself after the delete this line. In other words, you
must not examine it, compare it with another pointer, compare it with NULL,
print it, cast it, do anything with it.
Naturally the usual caveats apply in cases where your this pointer is a
pointer to a base class when you don't have a virtual
destructor.
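Here is a sketch of a class that follows the rules above. The names Job, create(), addRef(), release(), and deleteThisDemo are hypothetical. Making the constructor and destructor private is one way (an assumption of this example, not the only option) to ensure every Job was allocated with plain new.

```cpp
#include <cassert>

// Hypothetical manually reference-counted class.
class Job {
public:
    static Job* create() { return new Job(); }   // plain new is the only way in

    void addRef() { ++refs_; }

    void release()
    {
        if (--refs_ == 0) {
            delete this;
            return;      // nothing after this point may touch *this or this
        }
    }

    static int alive;    // number of Job objects currently alive

private:
    Job() : refs_(1) { ++alive; }
    ~Job()           { --alive; }   // private: no locals, globals, or members
    Job(const Job&);                // not copyable (declared, never defined)
    Job& operator=(const Job&);

    unsigned refs_;
};
int Job::alive = 0;

// Returns the number of Job objects alive after the last release().
int deleteThisDemo()
{
    Job* j = Job::create();     // refs == 1
    j->addRef();                // refs == 2
    j->release();               // refs == 1, object survives
    assert(Job::alive == 1);
    j->release();               // refs == 0, object deletes itself
    j = 0;                      // the old pointer value must not be used again
    return Job::alive;
}
```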
[16.15] How do I allocate multidimensional arrays using new?
There are many ways to do this, depending on how flexible you want the array
sizing to be. On one extreme, if you know all the dimensions at compile-time,
you can allocate multidimensional arrays statically (as in C):
class Fred { /*...*/ };
void someFunction(Fred& fred);
void manipulateArray()
{
    const unsigned nrows = 10; // Num rows is a compile-time constant
    const unsigned ncols = 20; // Num columns is a compile-time constant
    Fred matrix[nrows][ncols];
    for (unsigned i = 0; i < nrows; ++i) {
        for (unsigned j = 0; j < ncols; ++j) {
            // Here's the way you access the (i,j) element:
            someFunction( matrix[i][j] );
            // You can safely "return" without any special delete code:
            if (today == "Tuesday" && moon.isFull())
                return; // Quit early on Tuesdays when the moon is full
        }
    }
    // No explicit delete code at the end of the function either
}
More commonly, the size of the matrix isn't known until run-time but you know
that it will be rectangular. In this case you need to use the heap
("freestore"), but at least you are able to allocate all the elements in one
freestore chunk.
void manipulateArray(unsigned nrows, unsigned ncols)
{
    Fred* matrix = new Fred[nrows * ncols];
    // Since we used a simple pointer above, we need to be VERY
    // careful to avoid skipping over the delete code.
    // That's why we catch all exceptions:
    try {
        // Here's how to access the (i,j) element:
        for (unsigned i = 0; i < nrows; ++i) {
            for (unsigned j = 0; j < ncols; ++j) {
                someFunction( matrix[i*ncols + j] );
            }
        }
        // If you want to quit early on Tuesdays when the moon is full,
        // make sure to do the delete along ALL return paths:
        if (today == "Tuesday" && moon.isFull()) {
            delete[] matrix;
            return;
        }
        // ...
    }
    catch (...) {
        // Make sure to do the delete when an exception is thrown:
        delete[] matrix;
        throw; // Re-throw the current exception
    }
    // Make sure to do the delete at the end of the function too:
    delete[] matrix;
}
Finally at the other extreme, you may not even be guaranteed that the matrix is
rectangular. For example, if each row could have a different length, you'll
need to allocate each row individually. In the following function,
ncols[i] is the number of columns in row number i, where
i varies between 0 and nrows-1 inclusive.
void manipulateArray(unsigned nrows, unsigned ncols[])
{
    typedef Fred* FredPtr;
    // There will not be a leak if the following throws an exception:
    FredPtr* matrix = new FredPtr[nrows];
    // Set each element to NULL in case there is an exception later.
    // (See comments at the top of the try block for rationale.)
    for (unsigned i = 0; i < nrows; ++i)
        matrix[i] = NULL;
    // Since we used a simple pointer above, we need to be
    // VERY careful to avoid skipping over the delete code.
    // That's why we catch all exceptions:
    try {
        // Next we populate the array. If one of these throws, all
        // the allocated elements will be deleted (see catch below).
        for (unsigned i = 0; i < nrows; ++i)
            matrix[i] = new Fred[ ncols[i] ];
        // Here's how to access the (i,j) element:
        for (unsigned i = 0; i < nrows; ++i) {
            for (unsigned j = 0; j < ncols[i]; ++j) {
                someFunction( matrix[i][j] );
            }
        }
        // If you want to quit early on Tuesdays when the moon is full,
        // make sure to do the delete along ALL return paths:
        if (today == "Tuesday" && moon.isFull()) {
            for (unsigned i = nrows; i > 0; --i)
                delete[] matrix[i-1];
            delete[] matrix;
            return;
        }
        // ...
    }
    catch (...) {
        // Make sure to do the delete when an exception is thrown:
        // Note that some of these matrix[...] pointers might be
        // NULL, but that's okay since it's legal to delete NULL.
        for (unsigned i = nrows; i > 0; --i)
            delete[] matrix[i-1];
        delete[] matrix;
        throw; // Re-throw the current exception
    }
    // Make sure to do the delete at the end of the function too.
    // Note that deletion is the opposite order of allocation:
    for (unsigned i = nrows; i > 0; --i)
        delete[] matrix[i-1];
    delete[] matrix;
}
Note the funny use of matrix[i-1] in the deletion process. This
prevents wrap-around of the unsigned value when i goes one step
below zero.
Finally, note that pointers and arrays are evil.
It is normally much better to encapsulate your pointers in a class that has a
safe and simple interface. The following FAQ
shows how to do this.
[16.16] But the previous FAQ's code is SOOOO tricky and error
prone! Isn't there a simpler way?
[Recently fixed the Star Trek movie number thanks to Chris Sheppard (in 4/01) and wordsmithed last paragraph at the suggestion of prapp (in 4/01).]
Yep.
The reason the code in the previous FAQ was so
tricky and error prone was that it used pointers, and we know that
pointers and arrays are evil. The solution is to
encapsulate your pointers in a class that has a safe and simple interface. For
example, we can define a Matrix class that handles a rectangular
matrix so our user code will be vastly simplified when compared to
the rectangular matrix code from the previous FAQ:
// The code for class Matrix is shown below...
void someFunction(Fred& fred);
void manipulateArray(unsigned nrows, unsigned ncols)
{
    Matrix matrix(nrows, ncols); // Construct a Matrix called matrix
    for (unsigned i = 0; i < nrows; ++i) {
        for (unsigned j = 0; j < ncols; ++j) {
            // Here's the way you access the (i,j) element:
            someFunction( matrix(i,j) );
            // You can safely "return" without any special delete code:
            if (today == "Tuesday" && moon.isFull())
                return; // Quit early on Tuesdays when the moon is full
        }
    }
    // No explicit delete code at the end of the function either
}
The main thing to notice is the lack of clean-up code. For example, there
aren't any delete statements in the above code, yet there will be no
memory leaks, assuming only that the Matrix destructor does its job
correctly.
Here's the Matrix code that makes the above possible:
class Matrix {
public:
    Matrix(unsigned nrows, unsigned ncols);
    // Throws a BadSize object if either size is zero
    class BadSize { };
    // Based on the Law Of The Big Three:
    ~Matrix();
    Matrix(const Matrix& m);
    Matrix& operator= (const Matrix& m);
    // Access methods to get the (i,j) element:
    Fred& operator() (unsigned i, unsigned j);
    const Fred& operator() (unsigned i, unsigned j) const;
    // These throw a BoundsViolation object if i or j is too big
    class BoundsViolation { };
private:
    Fred* data_;
    unsigned nrows_, ncols_;
};
inline Fred& Matrix::operator() (unsigned row, unsigned col)
{
    if (row >= nrows_ || col >= ncols_) throw BoundsViolation();
    return data_[row*ncols_ + col];
}
inline const Fred& Matrix::operator() (unsigned row, unsigned col) const
{
    if (row >= nrows_ || col >= ncols_) throw BoundsViolation();
    return data_[row*ncols_ + col];
}
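Matrix::Matrix(unsigned nrows, unsigned ncols)
  : data_ (NULL),
    nrows_ (nrows),
    ncols_ (ncols)
{
    // Check the sizes BEFORE allocating. If a constructor throws, the
    // destructor never runs, so an earlier allocation would leak.
    if (nrows == 0 || ncols == 0)
        throw BadSize();
    data_ = new Fred[nrows * ncols];
}
Matrix::~Matrix()
{
    delete[] data_;
}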
Note that the above Matrix class accomplishes two things: it moves
some tricky memory management code from the user code (e.g., main())
to the class, and it reduces the overall bulk of the program. The latter point is
important. For example, assuming Matrix is even mildly reusable,
moving complexity from the users [plural] of Matrix into
Matrix itself [singular] is equivalent to moving complexity from the
many to the few. Anyone who's seen Star Trek 2 knows that the good of
the many outweighs the good of the few... or the one.
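The Matrix class above declares its copy constructor and assignment operator without showing their bodies. As a rough sketch of what the Big Three might look like, here is a cut-down hypothetical IntMatrix that stores int elements. The copy-and-swap assignment is one common choice, not necessarily the one the FAQ's Matrix would use, and bounds checking is omitted to keep the example short.

```cpp
#include <algorithm>   // std::copy, std::swap (C++98)
#include <cassert>
#include <utility>     // std::swap (C++11 and later)

// Hypothetical cut-down matrix of int, showing only the Big Three.
class IntMatrix {
public:
    IntMatrix(unsigned nrows, unsigned ncols)
      : data_(new int[nrows * ncols]()),     // elements start at zero
        nrows_(nrows),
        ncols_(ncols)
    { }

    ~IntMatrix() { delete[] data_; }

    // Deep copy: the new object gets its own array.
    IntMatrix(const IntMatrix& m)
      : data_(new int[m.nrows_ * m.ncols_]),
        nrows_(m.nrows_),
        ncols_(m.ncols_)
    {
        std::copy(m.data_, m.data_ + nrows_ * ncols_, data_);
    }

    // Copy-and-swap: safe for self-assignment and if the copy throws.
    IntMatrix& operator= (const IntMatrix& m)
    {
        IntMatrix tmp(m);     // may throw; *this is untouched if it does
        swap(tmp);
        return *this;         // tmp's destructor releases the old array
    }

    void swap(IntMatrix& other)
    {
        std::swap(data_,  other.data_);
        std::swap(nrows_, other.nrows_);
        std::swap(ncols_, other.ncols_);
    }

    int& operator() (unsigned i, unsigned j)       { return data_[i*ncols_ + j]; }
    int  operator() (unsigned i, unsigned j) const { return data_[i*ncols_ + j]; }

private:
    int* data_;
    unsigned nrows_, ncols_;
};

// Returns c(1,2) + b(1,2) after copying and assigning.
int bigThreeDemo()
{
    IntMatrix a(2, 3);
    a(1, 2) = 42;

    IntMatrix b(a);           // copy constructor: b has its own storage
    b(1, 2) = 7;
    assert(a(1, 2) == 42);    // changing b did not affect a

    IntMatrix c(1, 1);
    c = a;                    // assignment replaces c's old array
    IntMatrix& alias = c;
    c = alias;                // self-assignment is harmless
    return c(1, 2) + b(1, 2);
}
```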
[16.17] But the above Matrix class is specific to
Fred! Isn't there a way to make it generic?
Yep; just use templates:
Here's how this can be used:
#include "Fred.hpp" // To get the definition for class Fred
// The code for Matrix<T> is shown below...
void someFunction(Fred& fred);
void manipulateArray(unsigned nrows, unsigned ncols)
{
    Matrix<Fred> matrix(nrows, ncols); // Construct a Matrix<Fred> called matrix
    for (unsigned i = 0; i < nrows; ++i) {
        for (unsigned j = 0; j < ncols; ++j) {
            // Here's the way you access the (i,j) element:
            someFunction( matrix(i,j) );
            // You can safely "return" without any special delete code:
            if (today == "Tuesday" && moon.isFull())
                return; // Quit early on Tuesdays when the moon is full
        }
    }
    // No explicit delete code at the end of the function either
}
Now it's easy to use Matrix<T> for things other than Fred. For
example, the following uses a Matrix of std::string (where
std::string is the standard string class):
#include <string>
void someFunction(std::string& s);
void manipulateArray(unsigned nrows, unsigned ncols)
{
    Matrix<std::string> matrix(nrows, ncols); // Construct a Matrix<std::string>
    for (unsigned i = 0; i < nrows; ++i) {
        for (unsigned j = 0; j < ncols; ++j) {
            // Here's the way you access the (i,j) element:
            someFunction( matrix(i,j) );
            // You can safely "return" without any special delete code:
            if (today == "Tuesday" && moon.isFull())
                return; // Quit early on Tuesdays when the moon is full
        }
    }
    // No explicit delete code at the end of the function either
}
You can thus get an entire family of classes from a
template. For example,
Matrix<Fred>, Matrix<std::string>, Matrix< Matrix<std::string> >, etc.
Here's one way that the template can be implemented:
template<class T> // See section on templates for more
class Matrix {
public:
    Matrix(unsigned nrows, unsigned ncols);
    // Throws a BadSize object if either size is zero
    class BadSize { };
    // Based on the Law Of The Big Three:
    ~Matrix();
    Matrix(const Matrix<T>& m);
    Matrix<T>& operator= (const Matrix<T>& m);
    // Access methods to get the (i,j) element:
    T& operator() (unsigned i, unsigned j);
    const T& operator() (unsigned i, unsigned j) const;
    // These throw a BoundsViolation object if i or j is too big
    class BoundsViolation { };
private:
    T* data_;
    unsigned nrows_, ncols_;
};
template<class T>
inline T& Matrix<T>::operator() (unsigned row, unsigned col)
{
    if (row >= nrows_ || col >= ncols_) throw BoundsViolation();
    return data_[row*ncols_ + col];
}
template<class T>
inline const T& Matrix<T>::operator() (unsigned row, unsigned col) const
{
    if (row >= nrows_ || col >= ncols_) throw BoundsViolation();
    return data_[row*ncols_ + col];
}
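template<class T>
inline Matrix<T>::Matrix(unsigned nrows, unsigned ncols)
  : data_ (NULL)
  , nrows_ (nrows)
  , ncols_ (ncols)
{
    // Check the sizes BEFORE allocating. If a constructor throws, the
    // destructor never runs, so an earlier allocation would leak.
    if (nrows == 0 || ncols == 0)
        throw BadSize();
    data_ = new T[nrows * ncols];
}
template<class T>
inline Matrix<T>::~Matrix()
{
    delete[] data_;
}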
[16.18] What's another way to build a Matrix template?
[Recently created thanks to Jesper Rasmussen (in 4/01).]
Use the standard vector template, and make a vector of
vector.
The following uses a vector<vector<T> > (note the space between the
two > symbols).
#include <vector>
template<class T> // See section on templates for more
class Matrix {
public:
    Matrix(unsigned nrows, unsigned ncols);
    // Throws a BadSize object if either size is zero
    class BadSize { };
    // No need for any of The Big Three!
    // Access methods to get the (i,j) element:
    T& operator() (unsigned i, unsigned j);
    const T& operator() (unsigned i, unsigned j) const;
    // These throw a BoundsViolation object if i or j is too big
    class BoundsViolation { };
private:
    std::vector<std::vector<T> > data_;
};
template<class T>
inline T& Matrix<T>::operator() (unsigned row, unsigned col)
{
    // The vectors know their own sizes, so no nrows_/ncols_ members are needed
    if (row >= data_.size() || col >= data_[row].size()) throw BoundsViolation();
    return data_[row][col];
}
template<class T>
inline const T& Matrix<T>::operator() (unsigned row, unsigned col) const
{
    if (row >= data_.size() || col >= data_[row].size()) throw BoundsViolation();
    return data_[row][col];
}
template<class T>
Matrix<T>::Matrix(unsigned nrows, unsigned ncols)
  : data_ (nrows)
{
    if (nrows == 0 || ncols == 0)
        throw BadSize();
    for (unsigned i = 0; i < nrows; ++i)
        data_[i].resize(ncols);
}
[16.19] Does C++ have arrays whose length can be specified at
run-time?
Yes, in the sense that the standard library has a std::vector
template that provides this behavior.
No, in the sense that built-in array types need to have their length specified
at compile time.
Yes, in the sense that even built-in array types can specify the first index
bounds at run-time. E.g., comparing with the previous FAQ, if you only need
the first array dimension to vary then you can just ask new for an array of
arrays, rather than an array of pointers to arrays:
const unsigned ncols = 100; // ncols = number of columns in the array
class Fred { /*...*/ };
void manipulateArray(unsigned nrows) // nrows = number of rows in the array
{
    Fred (*matrix)[ncols] = new Fred[nrows][ncols];
    // ...
    delete[] matrix;
}
You can't do this if you need anything other than the first dimension of the
array to change at run-time.
But please, don't use arrays unless you have to. Arrays
are evil. Use some object of some class if you can. Use arrays only
when you have to.
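For the first "Yes" above, a minimal sketch of a run-time sized array using std::vector might look like this. The function names makeSquares and sumOfSquares are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Builds a vector whose length is chosen at run-time.
std::vector<int> makeSquares(unsigned n)
{
    std::vector<int> squares(n);              // n elements, all zero
    for (unsigned i = 0; i < n; ++i)
        squares[i] = static_cast<int>(i * i);
    return squares;                           // no delete[] anywhere
}

// Adds up the first n squares: 0*0 + 1*1 + ... + (n-1)*(n-1).
int sumOfSquares(unsigned n)
{
    std::vector<int> squares = makeSquares(n);
    assert(squares.size() == n);              // the vector knows its own length
    int sum = 0;
    for (std::vector<int>::size_type i = 0; i < squares.size(); ++i)
        sum += squares[i];
    return sum;
}
```

Unlike a built-in array, the vector carries its length with it and releases its storage automatically when it goes out of scope.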
[16.20] How can I force objects of my class to always be
created via new rather than as locals or global/static objects?
Use the Named Constructor Idiom.
As usual with the Named Constructor Idiom, the constructors are all private
or protected, and there are one or more public static create() methods
(the so-called "named constructors"), one per constructor. In this case the
create() methods allocate the objects via new. Since the constructors
themselves are not public, there is no other way to create objects of the
class.
class Fred {
public:
    // The create() methods are the "named constructors":
    static Fred* create() { return new Fred(); }
    static Fred* create(int i) { return new Fred(i); }
    static Fred* create(const Fred& fred) { return new Fred(fred); }
    // ...
private:
    // The constructors themselves are private or protected:
    Fred();
    Fred(int i);
    Fred(const Fred& fred);
    // ...
};
Now the only way to create Fred objects is via Fred::create():
int main()
{
    Fred* p = Fred::create(5);
    // ...
    delete p;
}
Make sure your constructors are in the protected section if you expect
Fred to have derived classes.
Note also that you can make another class Wilma a friend of Fred if you want to allow a Wilma to have a member object
of class Fred, but of course this is a softening of the original goal, namely
to force Fred objects to be allocated via new.
[16.21] How do I do simple reference counting?
[Recently moved definition of Fred::create() methods below the definition of class FredPtr (in 4/01).]
If all you want is the ability to pass around a bunch of pointers to the same
object, with the feature that the object will automagically get deleted when
the last pointer to it disappears, you can use something like the following
"smart pointer" class:
// Fred.h
class FredPtr;
class Fred {
public:
    Fred() : count_(0) /*...*/ { } // All ctors set count_ to 0 !
    // ...
private:
    friend class FredPtr; // A friend class
    unsigned count_;
    // count_ must be initialized to 0 by all constructors
    // count_ is the number of FredPtr objects that point at this
};
class FredPtr {
public:
    Fred* operator-> () { return p_; }
    Fred& operator* () { return *p_; }
    FredPtr(Fred* p) : p_(p) { ++p_->count_; } // p must not be NULL
    ~FredPtr() { if (--p_->count_ == 0) delete p_; }
    FredPtr(const FredPtr& p) : p_(p.p_) { ++p_->count_; }
    FredPtr& operator= (const FredPtr& p)
    {   // DO NOT CHANGE THE ORDER OF THESE STATEMENTS!
        // (This order properly handles self-assignment)
        ++p.p_->count_;
        if (--p_->count_ == 0) delete p_;
        p_ = p.p_;
        return *this;
    }
private:
    Fred* p_; // p_ is never NULL
};
Naturally you can use nested classes to rename FredPtr to
Fred::Ptr.
Note that you can soften the "never NULL" rule above with a little more
checking in the constructor, copy constructor, assignment operator, and
destructor. If you do that, you might as well put a p_ != NULL check
into the "*" and "->" operators (at least as an assert()). I
would recommend against an operator Fred*() method, since that would
let people accidentally get at the Fred*.
One of the implicit constraints on FredPtr is that it must only point
to Fred objects which have been allocated via new. If you want to be
really safe, you can enforce this constraint by making all of Fred's
constructors private, and for each constructor have a public (static)
create() method which allocates the Fred object via new and returns a
FredPtr (not a Fred*). That way the only way anyone could
create a Fred object would be to get a FredPtr ("Fred* p = new Fred()" would be replaced by "FredPtr p = Fred::create()"). Thus
no one could accidentally subvert the reference counted mechanism.
For example, if Fred had a Fred::Fred() and a Fred::Fred(int i, int j), the changes to class Fred would be:
class Fred {
public:
    static FredPtr create(); // Defined below class FredPtr {...}
    static FredPtr create(int i, int j); // Defined below class FredPtr {...}
    // ...
private:
    Fred();
    Fred(int i, int j);
    // ...
};
class FredPtr { /* ... */ };
inline FredPtr Fred::create() { return new Fred(); }
inline FredPtr Fred::create(int i, int j) { return new Fred(i,j); }
The end result is that you now have a way to use simple reference counting to
provide "pointer semantics" for a given object. Users of your Fred class
explicitly use FredPtr objects, which act more or less like Fred*
pointers. The benefit is that users can make as many copies of their
FredPtr "smart pointer" objects as they like, and the pointed-to Fred object will
automagically get deleted when the last such FredPtr object vanishes.
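To see the counting in action, the sketch below repeats the same scheme in a self-contained form and traces when the pointed-to objects die. Item and ItemPtr are hypothetical stand-ins for Fred and FredPtr, and the alive counter exists only so the example can observe deletions.

```cpp
#include <cassert>

class ItemPtr;

// Hypothetical stand-in for Fred: carries its own reference count.
class Item {
public:
    Item() : count_(0) { ++alive; }
    ~Item()            { --alive; }
    static int alive;          // number of Item objects currently alive
private:
    friend class ItemPtr;
    unsigned count_;           // number of ItemPtr objects pointing here
};
int Item::alive = 0;

// Hypothetical stand-in for FredPtr, with the same statement order.
class ItemPtr {
public:
    ItemPtr(Item* p) : p_(p) { ++p_->count_; }             // p must not be NULL
    ~ItemPtr() { if (--p_->count_ == 0) delete p_; }
    ItemPtr(const ItemPtr& o) : p_(o.p_) { ++p_->count_; }
    ItemPtr& operator= (const ItemPtr& o)
    {
        ++o.p_->count_;                        // increment first: handles a = a
        if (--p_->count_ == 0) delete p_;
        p_ = o.p_;
        return *this;
    }
    Item* operator-> () { return p_; }
private:
    Item* p_;                  // never NULL
};

// Returns the number of Item objects alive after all handles are gone.
int sharingDemo()
{
    {
        ItemPtr a(new Item());
        {
            ItemPtr b(a);                  // two handles, one Item
            ItemPtr c(new Item());         // a second Item
            assert(Item::alive == 2);
            c = b;                         // second Item loses its last handle
            assert(Item::alive == 1);
        }
        assert(Item::alive == 1);          // a still keeps the first Item alive
    }
    return Item::alive;
}
```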
If you'd rather give your users "reference semantics" rather than "pointer
semantics," you can use reference counting to provide
"copy on write".
[16.22] How do I provide reference counting with copy-on-write
semantics?
Reference counting can be done with either pointer semantics or reference
semantics. The previous FAQ shows how to do
reference counting with pointer semantics. This FAQ shows how to do reference
counting with reference semantics.
The basic idea is to allow users to think they're copying your Fred objects,
but in reality the underlying implementation doesn't actually do any copying
unless and until some user actually tries to modify the underlying Fred
object.
Class Fred::Data houses all the data that would normally go into the
Fred class. Fred::Data also has an extra data member,
count_, to manage the reference counting. Class Fred ends up being a
"smart reference" that (internally) points to a Fred::Data.
class Fred {
public:
    Fred(); // A default constructor
    Fred(int i, int j); // A normal constructor
    Fred(const Fred& f);
    Fred& operator= (const Fred& f);
    ~Fred();
    void sampleInspectorMethod() const; // No changes to this object
    void sampleMutatorMethod(); // Change this object
    // ...
private:
    class Data {
    public:
        Data();
        Data(int i, int j);
        Data(const Data& d);
        // Since only Fred can access a Fred::Data object,
        // you can make Fred::Data's data public if you want.
        // But if that makes you uncomfortable, make the data private
        // and make Fred a friend class via friend class Fred;
        // ...
        unsigned count_;
        // count_ is the number of Fred objects that point at this
        // count_ must be initialized to 1 by all constructors
        // (it starts as 1 since it is pointed to by the Fred object that created it)
    };
    Data* data_;
};
Fred::Data::Data() : count_(1) /*init other data*/ { }
Fred::Data::Data(int i, int j) : count_(1) /*init other data*/ { }
Fred::Data::Data(const Data& d) : count_(1) /*init other data*/ { }
Fred::Fred() : data_(new Data()) { }
Fred::Fred(int i, int j) : data_(new Data(i, j)) { }
Fred::Fred(const Fred& f)
: data_(f.data_)
{
++ data_->count_;
}
Fred& Fred::operator= (const Fred& f)
{
// DO NOT CHANGE THE ORDER OF THESE STATEMENTS!
// (This order properly handles self-assignment)
++ f.data_->count_;
if (--data_->count_ == 0) delete data_;
data_ = f.data_;
return *this;
}
Fred::~Fred()
{
if (--data_->count_ == 0) delete data_;
}
void Fred::sampleInspectorMethod() const
{
// This method promises ("const") not to change anything in *data_
// Other than that, any data access would simply use "data_->..."
}
void Fred::sampleMutatorMethod()
{
// This method might need to change things in *data_
// Thus it first checks if this is the only pointer to *data_
if (data_->count_ > 1) {
Data* d = new Data(*data_); // Invoke Fred::Data's copy ctor
-- data_->count_;
data_ = d;
}
assert(data_->count_ == 1);
// Now the method proceeds to access "data_->..." as normal
}
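To watch the copy-on-write behavior from the outside, here is a minimal sketch of the same scheme with a concrete payload. Everything that isn't in the code above is an assumption made for illustration: the single int payload, the get() and set() methods, the helpers shareCount() and sharesDataWith(), and the function copyOnWriteDemo().

```cpp
#include <cassert>

// Hypothetical stand-in for the FAQ's Fred: the payload is one int so that
// sharing and copying can be observed from outside the class.
class Fred {
private:
    struct Data {
        explicit Data(int value) : value_(value), count_(1) { }
        Data(const Data& d) : value_(d.value_), count_(1) { }  // Do not copy count_
        int value_;
        unsigned count_;   // Number of Fred objects pointing at this Data
    };
    Data* data_;

public:
    explicit Fred(int value = 0) : data_(new Data(value)) { }
    Fred(const Fred& f) : data_(f.data_) { ++data_->count_; }
    Fred& operator= (const Fred& f)
    {
        ++f.data_->count_;                        // Increment first: handles self-assignment
        if (--data_->count_ == 0) delete data_;
        data_ = f.data_;
        return *this;
    }
    ~Fred() { if (--data_->count_ == 0) delete data_; }

    int get() const { return data_->value_; }     // Inspector: never copies

    void set(int value)                           // Mutator: copies only if shared
    {
        if (data_->count_ > 1) {
            Data* d = new Data(*data_);
            --data_->count_;
            data_ = d;
        }
        assert(data_->count_ == 1);
        data_->value_ = value;
    }

    // Illustration-only helpers (not part of the FAQ's interface)
    unsigned shareCount() const { return data_->count_; }
    bool sharesDataWith(const Fred& f) const { return data_ == f.data_; }
};

// Usage sketch: returns true if copying was deferred until the first write.
bool copyOnWriteDemo()
{
    Fred a(1);
    Fred b = a;                                   // No Data copied yet
    bool sharedBeforeWrite = a.sharesDataWith(b);
    b.set(2);                                     // b gets its own Data here
    return sharedBeforeWrite && !a.sharesDataWith(b)
        && a.get() == 1 && b.get() == 2;
}
```

Note that copying a Fred only bumps a counter; the (potentially expensive) Data copy happens inside set(), and only when someone else still shares the Data.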
If it is fairly common to call Fred's default
constructor, you can avoid all those new calls by sharing
a common Fred::Data object for all Freds that are constructed via
Fred::Fred(). To avoid static initialization order problems, this
shared Fred::Data object is created "on first use" inside a function.
Here are the changes that would be made to the above code (note that the shared
Fred::Data object's destructor is never invoked; if that is a problem,
either hope you don't have any static initialization order problems, or drop
back to the approach described above):
class Fred {
public:
  // ...
private:
  // ...
  static Data* defaultData();
};

Fred::Fred()
  : data_(defaultData())
{
  ++ data_->count_;
}

Fred::Data* Fred::defaultData()
{
  static Data* p = NULL;
  if (p == NULL) {
    p = new Data();
    ++ p->count_;    // Make sure it never goes to zero
  }
  return p;
}
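As a rough check of the shared-default idea, the sketch below strips Fred down to just the reference counting. The helpers shareCount() and sharesDataWith() are hypothetical additions for illustration; the rest follows the changes shown above.

```cpp
#include <cassert>
#include <cstddef>

class Fred {
private:
    struct Data {
        Data() : count_(1) { }
        unsigned count_;
    };
    Data* data_;

    static Data* defaultData()
    {
        static Data* p = NULL;     // Created on first use; intentionally never deleted
        if (p == NULL) {
            p = new Data();
            ++p->count_;           // Extra count so it never goes to zero
        }
        return p;
    }

public:
    Fred() : data_(defaultData()) { ++data_->count_; }
    Fred(const Fred& f) : data_(f.data_) { ++data_->count_; }
    Fred& operator= (const Fred& f)
    {
        ++f.data_->count_;
        if (--data_->count_ == 0) delete data_;
        data_ = f.data_;
        return *this;
    }
    ~Fred() { if (--data_->count_ == 0) delete data_; }

    // Illustration-only helpers (not part of the FAQ's interface)
    unsigned shareCount() const { return data_->count_; }
    bool sharesDataWith(const Fred& f) const { return data_ == f.data_; }
};
```

With this arrangement, Fred a; Fred b; performs a single new between them. The count of the shared Data is always at least 2 (1 from its constructor plus the extra increment in defaultData()), so no Fred destructor ever deletes it.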
Note: You can also provide reference
counting for a hierarchy of classes if your Fred class would normally
have been a base class.
[16.23] How do I provide reference counting with
copy-on-write semantics for a hierarchy of classes?
The previous FAQ presented a reference counting
scheme that provided users with reference semantics, but did so for a single
class rather than for a hierarchy of classes. This FAQ extends the previous
technique to allow for a hierarchy of classes. The basic difference is that
Fred::Data is now the root of a hierarchy of classes, which will probably
cause it to have some virtual functions. Note
that class Fred itself will still not have any virtual functions.
The Virtual Constructor Idiom is used to make
copies of the Fred::Data objects. To select which derived class to
create, the sample code below uses the Named Constructor
Idiom, but other techniques are possible (a
switch statement in the constructor, etc). The sample code assumes two
derived classes: Der1 and Der2. Methods in the derived classes are unaware
of the reference counting.
#include <cassert>   // For assert()
#include <string>    // For std::string

class Fred {
public:
  static Fred create1(const std::string& s, int i);
  static Fred create2(float x, float y);
  Fred(const Fred& f);
  Fred& operator= (const Fred& f);
  ~Fred();
  void sampleInspectorMethod() const;   // No changes to this object
  void sampleMutatorMethod();           // Change this object
  // ...
private:
  class Data {
  public:
    Data() : count_(1) { }
    Data(const Data& d) : count_(1) { }              // Do NOT copy the 'count_' member!
    Data& operator= (const Data&) { return *this; }  // Do NOT copy the 'count_' member!
    virtual ~Data() { assert(count_ == 0); }         // A virtual destructor
    virtual Data* clone() const = 0;                 // A virtual constructor
    virtual void sampleInspectorMethod() const = 0;  // A pure virtual function
    virtual void sampleMutatorMethod() = 0;
  private:
    unsigned count_;     // count_ doesn't need to be protected
    friend class Fred;   // Allow Fred to access count_
  };

  class Der1 : public Data {
  public:
    Der1(const std::string& s, int i);
    virtual void sampleInspectorMethod() const;
    virtual void sampleMutatorMethod();
    virtual Data* clone() const;
    // ...
  };

  class Der2 : public Data {
  public:
    Der2(float x, float y);
    virtual void sampleInspectorMethod() const;
    virtual void sampleMutatorMethod();
    virtual Data* clone() const;
    // ...
  };

  Fred(Data* data);
  // Creates a Fred smart-reference that owns *data
  // It is private to force users to use a createXXX() method
  // Requirement: data must not be NULL

  Data* data_;   // Invariant: data_ is never NULL
};

Fred::Fred(Data* data) : data_(data)  { assert(data != NULL); }

Fred Fred::create1(const std::string& s, int i) { return Fred(new Der1(s, i)); }
Fred Fred::create2(float x, float y)            { return Fred(new Der2(x, y)); }

Fred::Data* Fred::Der1::clone() const { return new Der1(*this); }
Fred::Data* Fred::Der2::clone() const { return new Der2(*this); }

Fred::Fred(const Fred& f)
  : data_(f.data_)
{
  ++ data_->count_;
}

Fred& Fred::operator= (const Fred& f)
{
  // DO NOT CHANGE THE ORDER OF THESE STATEMENTS!
  // (This order properly handles self-assignment)
  ++ f.data_->count_;
  if (--data_->count_ == 0) delete data_;
  data_ = f.data_;
  return *this;
}

Fred::~Fred()
{
  if (--data_->count_ == 0) delete data_;
}

void Fred::sampleInspectorMethod() const
{
  // This method promises ("const") not to change anything in *data_
  // Therefore we simply "pass the method through" to *data_:
  data_->sampleInspectorMethod();
}

void Fred::sampleMutatorMethod()
{
  // This method might need to change things in *data_
  // Thus it first checks if this is the only pointer to *data_
  if (data_->count_ > 1) {
    Data* d = data_->clone();   // The Virtual Constructor Idiom
    -- data_->count_;
    data_ = d;
  }
  assert(data_->count_ == 1);
  // Now we "pass the method through" to *data_:
  data_->sampleMutatorMethod();
}
Naturally the constructors and sampleXXX methods for Fred::Der1
and Fred::Der2 will need to be implemented in whatever way is
appropriate.
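For a feel of how the pieces fit once the derived classes are filled in, here is one possible compact sketch. The members stored in Der1 and Der2, the inspector total(), and the mutator reset() are all hypothetical choices made only so the behavior can be observed; your real classes will obviously do something more interesting.

```cpp
#include <cassert>
#include <string>

class Fred {
private:
    class Data {
    public:
        Data() : count_(1) { }
        Data(const Data&) : count_(1) { }          // Do NOT copy count_
        virtual ~Data() { }
        virtual Data* clone() const = 0;           // Virtual Constructor Idiom
        virtual double total() const = 0;          // Hypothetical inspector
        virtual void reset() = 0;                  // Hypothetical mutator
    private:
        unsigned count_;
        friend class Fred;                         // Allow Fred to access count_
    };

    class Der1 : public Data {
    public:
        Der1(const std::string& s, int i) : s_(s), i_(i) { }
        virtual Data* clone() const { return new Der1(*this); }
        virtual double total() const { return static_cast<double>(s_.size()) + i_; }
        virtual void reset() { s_.clear(); i_ = 0; }
    private:
        std::string s_;
        int i_;
    };

    class Der2 : public Data {
    public:
        Der2(float x, float y) : x_(x), y_(y) { }
        virtual Data* clone() const { return new Der2(*this); }
        virtual double total() const { return static_cast<double>(x_) + y_; }
        virtual void reset() { x_ = 0; y_ = 0; }
    private:
        float x_, y_;
    };

    explicit Fred(Data* data) : data_(data) { assert(data != NULL); }
    Data* data_;                                   // Invariant: never NULL

public:
    static Fred create1(const std::string& s, int i) { return Fred(new Der1(s, i)); }
    static Fred create2(float x, float y)            { return Fred(new Der2(x, y)); }

    Fred(const Fred& f) : data_(f.data_) { ++data_->count_; }
    Fred& operator= (const Fred& f)
    {
        ++f.data_->count_;                         // Increment first: handles self-assignment
        if (--data_->count_ == 0) delete data_;
        data_ = f.data_;
        return *this;
    }
    ~Fred() { if (--data_->count_ == 0) delete data_; }

    double total() const { return data_->total(); }   // Pass through, no copy

    void reset()
    {
        if (data_->count_ > 1) {                   // Shared: clone before writing
            Data* d = data_->clone();
            --data_->count_;
            data_ = d;
        }
        data_->reset();                            // Pass through to the private copy
    }
};
```

The derived classes never touch count_; the implicit Der1 and Der2 copy constructors call Data's copy constructor, which is what resets the count to 1 in each clone.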
[16.24] Can you absolutely prevent people
from subverting the reference counting mechanism, and if so, should
you?
[Recently created (in 4/01) and wordsmithing thanks to Stan Brown (in 8/01).]
No, and (normally) no.
There are two basic approaches to subverting the reference counting mechanism:
- The scheme could be subverted if someone got a Fred*
(rather than being forced to use a FredPtr). Someone could get a
Fred* if class FredPtr has an operator*() that returns
a Fred&: FredPtr p = Fred::create(); Fred* p2 = &*p;. Yes,
it's bizarre and unexpected, but it could happen. This hole could be closed
in two ways: overload Fred::operator&() so it returns a
FredPtr, or change the return type of FredPtr::operator*() so
it returns a FredRef (FredRef would be a class that simulates
a reference; it would need to have all the methods that Fred has, and
it would need to forward all those method calls to the underlying Fred
object; there might be a performance penalty for this second choice depending
on how good the compiler is at inlining methods). Another way to fix this is
to eliminate FredPtr::operator*() and lose the corresponding
ability to get and use a Fred&. But even if you did all this, someone
could still generate a Fred* by explicitly calling
operator->(): FredPtr p = Fred::create(); Fred* p2 = p.operator->();.
- The scheme could be subverted if someone had a leak and/or dangling
pointer to a FredPtr. Basically what we're saying here is that
Fred is now safe, but we somehow want to prevent people from doing
stupid things with FredPtr objects. (And if we could solve that via
FredPtrPtr objects, we'd have the same problem again with them). One
hole here is if someone created a FredPtr using new, then
allowed the FredPtr to leak (worst case this is a leak, which is bad
but is usually a little better than a dangling pointer). This hole
could be plugged by declaring FredPtr::operator new() as private,
thus preventing someone from saying new FredPtr(). Another hole here
is if someone creates a local FredPtr object, then takes the address
of that FredPtr and passes around the FredPtr*. If that
FredPtr* lived longer than the FredPtr, you could have a
dangling pointer (shudder). This hole could be plugged by preventing people
from taking the address of a FredPtr (by overloading
FredPtr::operator&() as private), with the corresponding loss of
functionality. But even if you did all that, they could still create a
FredPtr& which is almost as dangerous as a FredPtr*, simply by
doing this: FredPtr p; ... FredPtr& q = p; (or by passing the
FredPtr& to someone else).
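The two escape routes in the first bullet can be reproduced with a bare-bones smart pointer. This is only a sketch under simplifying assumptions: the FredPtr below is a minimal hypothetical version (it is constructed directly from new Fred() rather than through Fred::create()), and rawPointerEscapes() is an illustration-only function.

```cpp
#include <cassert>
#include <cstddef>

class Fred {
public:
    Fred() : count_(0) { }
private:
    friend class FredPtr;      // Only FredPtr manipulates the count
    unsigned count_;           // Number of FredPtr objects pointing at this
};

// Minimal reference-counting smart pointer; p must not be NULL.
class FredPtr {
public:
    explicit FredPtr(Fred* p) : p_(p) { ++p_->count_; }
    FredPtr(const FredPtr& other) : p_(other.p_) { ++p_->count_; }
    FredPtr& operator= (const FredPtr& other)
    {
        ++other.p_->count_;                    // Increment first: handles self-assignment
        if (--p_->count_ == 0) delete p_;
        p_ = other.p_;
        return *this;
    }
    ~FredPtr() { if (--p_->count_ == 0) delete p_; }
    Fred& operator*()  const { return *p_; }
    Fred* operator->() const { return p_; }
private:
    Fred* p_;
};

// Returns true if both routes hand out the same uncounted Fred*.
bool rawPointerEscapes()
{
    FredPtr p(new Fred());
    Fred* viaDeref = &*p;              // Hole 1: address of the Fred& from operator*()
    Fred* viaArrow = p.operator->();   // Hole 2: explicit call to operator->()
    return viaDeref != NULL && viaDeref == viaArrow;
}
```

Neither viaDeref nor viaArrow is counted, so if either one outlives the last FredPtr it becomes a dangling pointer. Nothing in the compiler will complain.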
And even if we closed all those holes, C++ has those wonderful pieces
of syntax called pointer casts. Using a pointer cast or two, a sufficiently
motivated programmer can normally create a hole that's big enough to drive a
proverbial truck through. (By the way, pointer casts are
evil.)
So the lessons here seem to be: (a) you can't prevent espionage no matter how
hard you try, and (b) you can easily prevent mistakes.
So I recommend settling for the "low hanging fruit": use the easy-to-build and
easy-to-use mechanisms that prevent mistakes, and don't bother trying to
prevent espionage. You won't succeed, and even if you do, it'll (probably)
cost you more than it's worth.
So if we can't use the C++ language itself to prevent espionage, are there
other ways to do it? Yes. I personally use old fashioned code reviews for
that. And since the espionage techniques usually involve some bizarre syntax
and/or use of pointer-casts and unions, you can use a tool to point out most
of the "hot spots."
[16.25] Can I use a garbage collector in C++?
[Recently added cross-references thanks to Stan Brown (in 8/01).]
Yes.
Compared with the "smart pointer" techniques (see [16.21]),
the two kinds of garbage collector techniques (see
[16.26]) are:
- less portable
- usually more efficient (especially when the average object size is
small or in multithreaded environments)
- able to handle "cycles" in the data (reference counting techniques
normally "leak" if the data structures can form a cycle)
- sometimes prone to leaking other objects (since the garbage collectors are
necessarily conservative, they sometimes see a random bit pattern that appears
to be a pointer into an allocation, especially if the allocation is large;
this can allow the allocation to leak)
- better at working with existing libraries (since smart pointers need to
be used explicitly, they may be hard to integrate with existing
libraries)
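The "cycles" point is easy to reproduce. The sketch below uses the standard library's std::shared_ptr as a stand-in for a hand-written reference-counted pointer (that substitution is mine, as are the names Node, liveNodes, and nodesLeftAlive()); the same thing happens with any pure reference-counting scheme.

```cpp
#include <cassert>
#include <memory>

static int liveNodes = 0;      // Hypothetical counter of Node objects currently alive

struct Node {
    Node()  { ++liveNodes; }
    ~Node() { --liveNodes; }
    std::shared_ptr<Node> next;
};

// Builds two nodes; if makeCycle is true they point at each other.
// Returns how many of them are still alive after every local handle is gone.
int nodesLeftAlive(bool makeCycle)
{
    int before = liveNodes;
    std::weak_ptr<Node> probe;             // Non-owning; used only to clean up afterward
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->next = b;
        if (makeCycle) b->next = a;        // a -> b -> a: neither count can reach zero
        probe = a;
    }
    int stillAlive = liveNodes - before;

    // Break the cycle by hand so that this demo itself doesn't leak.
    if (std::shared_ptr<Node> n = probe.lock()) n->next.reset();
    return stillAlive;
}
```

Without the cycle, both nodes are destroyed as soon as the handles go out of scope. With the cycle, both survive until someone breaks the cycle by hand, which is precisely the bookkeeping a garbage collector doesn't need.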
[16.26] What are the two kinds of garbage collectors for
C++?
[Recently added a URL for Bartlett's collector thanks to Abhishek (in 4/01) and added a URL for Attardi and Flagella's CMM thanks to Markus Laker (in 8/01).]
In general, there seem to be two flavors of garbage collectors for C++:
- Conservative garbage collectors. These know little or nothing about
the layout of the stack or of C++ objects, and simply look for bit patterns
that appear to be pointers. In practice they seem to work with both C and C++
code, particularly when the average object size is small. Here are some
examples, in alphabetical order:
- Hybrid garbage collectors. These usually scan the stack
conservatively, but require the programmer to supply layout information for
heap objects. This requires more work on the programmer's part, but may
result in improved performance. Here are some examples, in alphabetical
order:
Since garbage collectors for C++ are normally conservative, they can sometimes
leak if a bit pattern "looks like" it might be a pointer to an otherwise
unused block. Also they sometimes get confused when pointers to a block
actually point outside the block's extent (which is illegal, but some
programmers simply must push the envelope; sigh) and (rarely) when a
pointer is hidden by a compiler optimization. In practice these problems are
not usually serious; however, providing the collector with hints about the
layout of the objects can sometimes ameliorate these issues.
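To make the "looks like a pointer" problem concrete, here is a toy version of the test a conservative collector might apply to each word it scans. It is a sketch only, not how any particular collector is implemented, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// A conservative collector can't tell integers from pointers, so any word
// whose value falls inside a block's address range keeps that block alive.
bool looksLikePointerInto(std::uintptr_t word, const void* blockStart, std::size_t blockSize)
{
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(blockStart);
    return word >= start && word - start < blockSize;
}

// Counts the words in a scanned region that would pin the given block.
std::size_t countApparentPointers(const std::uintptr_t* words, std::size_t n,
                                  const void* blockStart, std::size_t blockSize)
{
    std::size_t hits = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (looksLikePointerInto(words[i], blockStart, blockSize))
            ++hits;
    return hits;
}
```

The bigger blockSize is, the more integer values land inside the range, which is why large allocations are the ones most likely to be kept alive by accident. The test also shows the other failure mode mentioned above: a pointer that points one past the end of the block (word - start == blockSize) is not recognized as referring to it.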
[16.27] Where can I get more info on garbage
collectors for C++?
For more information, see the
Garbage Collector FAQ.
Revised Aug 15, 2001