Returning Pointers From Functions

How to do this safely was not entirely obvious to me. I wanted to return a character pointer to a string in a safe and easy way. The following listing is a simple C++ program that compiles and runs, but hides a couple of pitfalls.

#include <iostream>
#include <string.h>
 
char const * func1();
char const * func2();
char const * func3();
 
int main(int argc, char** argv)
{
   std::cout << func1() << std::endl;
   std::cout << func2() << std::endl;
   std::cout << func3() << std::endl;
 
   return 0;
}
 
char const * func1() {
   char const * string = "Hello from function 1";
   return string;
}
 
char const * func2() {
   char * string = new char[256];
   strcpy(string,"Hello from function 2");
   return string;
}
 
char const * func3() {
   static char const * string = "Hello from function 3";
   return string;
}

On my system I got the following output:

Hello from function 1
Hello from function 2
Hello from function 3

With these results, a naive programmer could conclude that all three functions are essentially the same. That conclusion is wrong: each function keeps its string in a different kind of storage, and each puts different obligations on the caller. Correct output from one run also proves little, since a program that misuses pointers can print the expected text on one system and fail on another. Let's investigate each function in more detail.

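In func1 we create a local pointer, string, that points to the string literal "Hello from function 1". Only the pointer itself lives on the stack. The literal has static storage duration: it exists for the entire run of the program, typically in a read-only data segment. When func1 returns, the local pointer goes out of scope, but its value (the address of the literal) has already been copied to the caller, so the returned pointer is valid. That is why it prints "Hello from function 1" every time, and not by luck. The real danger is a close cousin of this code. If the characters were stored in a local array, for example char string[] = "Hello from function 1";, the array itself would live on the stack and the returned pointer would dangle as soon as the function returned. Such code may still appear to work because the memory has not yet been reused, but using the pointer is undefined behavior. This is bad: never return a pointer to local data.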
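To make the distinction concrete, here is a minimal sketch. The function names literal_greeting and static_array_greeting are hypothetical and not part of the listing above. The dangerous local-array variant appears only in a comment, because most compilers warn about it and calling it would be undefined behavior.

```cpp
// Safe: the literal has static storage duration, so its address stays
// valid after the function returns. Only the pointer variable is local.
char const * literal_greeting() {
   char const * text = "Hello from a literal";
   return text;
}

// Also safe: a static array lives for the whole program, unlike a plain
// local array.
char const * static_array_greeting() {
   static char const text[] = "Hello from a static array";
   return text;
}

// Dangerous (left as a comment on purpose): here the characters are copied
// into an array on the stack, and that array is gone once the function
// returns.
//
// char const * dangling_greeting() {
//    char text[] = "Hello from a local array";
//    return text;   // pointer to dead stack memory
// }
```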

In func2 we allocate a 256-character buffer with new[]. The buffer lives on the heap (the free store), not the local stack, so it is not released when the local pointer goes out of scope, and the function returns a valid pointer to the string "Hello from function 2". The problem here is not in func2, but in main: the buffer is never deallocated, which results in a memory leak. func2 itself is completely valid code, but a developer using it needs to remember to release the returned pointer. Because the buffer was allocated with new[], it must be released with delete[] rather than plain delete.

char const * str = func2();
// do stuff with str
delete[] str;   // array form, matching new char[256]

This can be a pain and, in my opinion, is not acceptable in every situation, especially when all you are returning is a pointer to a simple data type.
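As a sketch of what that obligation looks like in practice, the caller below copies the text it needs and then releases the buffer with delete[]. The names make_greeting and read_and_release are hypothetical stand-ins for func2 and its caller.

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in for func2: the caller owns the returned buffer.
char const * make_greeting() {
   char * buffer = new char[256];
   std::strcpy(buffer, "Hello from function 2");
   return buffer;
}

// The caller's side of the contract: use the data, then release it with
// delete[] because it was allocated with new[].
std::string read_and_release() {
   char const * str = make_greeting();
   std::string copy(str);   // keep what we need before freeing
   delete[] str;            // array form matches new char[256]
   return copy;
}
```

Every caller has to repeat the last two steps correctly, which is the burden described above.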

The last function, func3, is something that I came across while searching for a better solution. It is a bit of a hack, but useful nonetheless. Here we declare a static local pointer to constant characters. A static local variable is initialized once and is not destroyed when we leave the scope of func3, so whatever it holds is still there after the function returns. Observe that func3 returns a pointer to const char, so main cannot modify the string through it. As written, the pointer refers to a string literal, which already lives for the whole program, so static adds little in this particular listing. The technique earns its keep when the static object is a character buffer that the function fills in on each call. That works well, but you will probably want to copy the data as soon as possible, because the buffer's contents are replaced the next time the function is called.

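Note that a function returning a pointer to static data is not thread safe once it modifies that data on each call, for example by filling in a static buffer. func3 as listed never changes its static pointer after initialization (and since C++11 that initialization is itself thread safe), so the listing is fine, but consider a version that does write. Thread 1 calls the function and writes its result into the static buffer. Before thread 1 uses the returned pointer, thread 2 calls the function and overwrites the same buffer. Both threads now hold a pointer to the same memory, which contains what thread 2 generated, and what thread 1 generated is lost. If the two writes overlap in time, that is also a data race.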
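The overwriting problem is easy to reproduce even without threads. The sketch below uses a hypothetical function, format_id, that fills a static buffer; it is not part of the original listing. The second call overwrites what the first call returned, which is the same effect two threads would see, minus the data race.

```cpp
#include <cstdio>
#include <string>

// Hypothetical example: formats into a static buffer and returns a pointer
// to it. Every call reuses the same storage.
char const * format_id(int id) {
   static char buffer[32];
   std::snprintf(buffer, sizeof buffer, "id-%d", id);
   return buffer;
}

// Shows why the result should be copied right away.
bool first_result_was_overwritten() {
   char const * first = format_id(1);
   std::string saved(first);            // copy taken immediately: "id-1"
   char const * second = format_id(2);  // rewrites the shared buffer
   // Both pointers refer to the same buffer, which now holds "id-2".
   return first == second && saved == "id-1" && std::string(first) == "id-2";
}
```

Only the copy held in saved still has the first result; the raw pointer silently changed meaning.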

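The last two functions work, but can cause unexpected results if not used properly. What I would really like is a way to safely return a pointer to a string without worrying about the returned data being misused. With a little extra overhead we can use a smart pointer from the standard C++ library. Older code used std::auto_ptr for this, but auto_ptr was deprecated in C++11 and removed in C++17, and it always releases its pointer with plain delete, which is undefined behavior for memory allocated with new[]. Its replacement is std::unique_ptr, whose array form std::unique_ptr<char[]> releases the buffer with delete[]. A smart pointer takes ownership of the pointer you give it. You still do the allocation, but the deallocation happens automatically when the smart pointer is destroyed.

#include <memory>
#include <cstring>
 
std::unique_ptr<char[]> func4() {
   std::unique_ptr<char[]> string(new char[256]);
   std::strcpy(string.get(), "Hello from function 4");
   return string;   // ownership moves to the caller
}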

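We can then call the function as follows. The temporary smart pointer lives until the end of the full statement, so the buffer is valid while it is printed and is released immediately afterward, with no leak. Do not keep the raw pointer from get() beyond that statement; store the smart pointer itself if you need the string for longer.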

std::cout << func4().get() << std::endl;

Pointers are tricky. Used correctly, they can be very powerful, but used incorrectly they can cause some pretty nasty bugs. As a last note, if all you really want is to return a string, it may be best to avoid pointers altogether and just return a std::string.
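A minimal sketch of that last suggestion follows. The name func5 is hypothetical, chosen only to continue the numbering used above.

```cpp
#include <string>

// Returns the string by value. The std::string owns its characters, so
// there is nothing for the caller to delete and nothing left dangling.
std::string func5() {
   return "Hello from function 5";
}
```

The result can be printed directly with std::cout << func5() << std::endl;. When an API requires a char const *, c_str() provides one that stays valid as long as the std::string it came from is alive and unmodified.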