Pass by Value vs. Pass by Reference

Programs written in imperative languages like C (and C++, when used without an object-oriented style) make extensive use of function calls. More precisely, they have a “standard” entry-point function (i.e., main()) which encapsulates the entire logic (i.e., computational flow) of the program. That flow may in turn be organized into several isolated, smaller units of work, each responsible for accomplishing an easier task through a dedicated function. This pattern can be repeated within each dedicated function, thereby generating nested chains of function calls.
A crucial aspect of implementing function calls is the way in which arguments are passed from the caller to the callee.
In this post, I’m gonna explore the two best-known ways of passing an argument to a function: pass-by-value and pass-by-reference.
In fact, when talking about “pass-by-value” and “pass-by-reference” there are some tricky issues which need to be clearly understood.

In a nutshell, passing an argument by value to a function means that the function gets its own copy of the argument, namely the value of the argument is copied from the caller to the callee. As a result, modifying that copy will not modify the original argument.
Conversely, when passing by reference, the parameter inside the function refers to the same object that was passed in. As a result, any change made to the object from inside the function will be seen outside as well.

Unfortunately, the phrases “pass by value” and “pass by reference” are used in two different ways, and this is a source of potential confusion.

C
Technically speaking, in C everything is passed by value. That is, whatever you give as an argument to a function is copied into the scope of that function. For instance, calling a function void foo(int) with foo(x) copies the value of x into the parameter of foo.
Let’s explain it better with the well-known example of a function that swaps the content of two variables:

#include <stdio.h>

void swap(int x, int y) {
    int tmp = x; // x and y are local copies of the caller's arguments
    x = y;
    y = tmp;
}

int main() {
    int m = 6;
    int n = 10;
    printf("Before swapping (m,n) evaluate to: (%d,%d)\n",m,n);
    swap(m,n);
    printf("After swapping (m,n) evaluate to: (%d,%d)\n",m,n); // m and n still equal to their original values
}

In the example above, the two arguments m and n are copied from main “into” the scope of the function swap. This means that the swap performed within that function “lives” only until swap returns, and it only ever touches the copies. In fact, m and n within main still preserve their original values.
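One way to convince yourself that the callee really works on a copy is to compare addresses. Here is a minimal sketch, written in C++ but using only by-value parameters and pointers; the helper names same_object and increment_copy are hypothetical, made up just for this illustration.

```cpp
// Hypothetical helper: receives an int by value plus the address of the
// caller's variable, and reports whether the parameter is that same object.
bool same_object(int param, const int* caller_var) {
    return &param == caller_var; // address of the copy vs. address of the original
}

// Hypothetical helper: modifies its own copy and returns the new value.
int increment_copy(int value) {
    value = value + 1; // only the local copy changes
    return value;
}
```

With int m = 6; in the caller, same_object(m, &m) yields false because the parameter is a distinct object, and after increment_copy(m) returns 7 the variable m is still 6.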

However, function arguments can be of any type (i.e., not only primitive types like int above).
In C a very useful type is a pointer to some other type. For instance, we could define the following:

    int x = 10;
    int* px = &x;

The first statement declares an integer variable x, whereas the second declares a variable px of type int*, that is “pointer to an integer”. Moreover, the pointer px is initialized with the address of the integer variable x, obtained via the address-of operator (&).
To get a grasp of the difference between a variable of a certain type and a pointer variable to that type, please refer to the following figure:

[Figure: pointer]

Now, when passing a pointer to a function, you are still passing it by value. Indeed, the value of the pointer variable is copied into the function. In the example above, this means that a copy of px, namely a copy of the address of x (e.g., 5678), is passed to the function.
It follows that modifying that pointer inside the function will not change the pointer outside the function (i.e., outside, the pointer will still contain the address of x). However, if you modify the object which the pointer points to (i.e., x) from within the function, then the object itself is modified outside the function too. But why?

Since two pointers (i.e., the original and the copied one) that have the same value always point at the same object (i.e., they contain the same address), the object that is being pointed to may be accessed and modified through both. This gives the semantics of having passed the pointed-to object (i.e., x) by reference, although no references ever actually existed: to put it simply, there are no references in C, as opposed to C++.
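The two behaviours described above can be sketched side by side. This is just an illustration under my own assumptions: the helper names retarget and set_to_zero are hypothetical, and the code is written in C++ although it only relies on pointers passed by value.

```cpp
// Hypothetical helper: reassigns its pointer parameter. Only the local copy
// of the address changes; the caller's pointer is left untouched.
// Returns what the local pointer refers to after the reassignment.
int retarget(int* p, int* other) {
    p = other;
    return *p;
}

// Hypothetical helper: writes through the pointer. The copied address still
// leads to the caller's object, so the caller sees the new value.
void set_to_zero(int* p) {
    *p = 0;
}
```

Given int x = 10, y = 20; and int* px = &x; in the caller, retarget(px, &y) returns 20 but px still holds the address of x afterwards, whereas set_to_zero(px) leaves x equal to 0.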

Let’s take a look at our swap function which now takes as input two pointers to integers instead of two integers directly:

#include <stdio.h>

void swap(int* x, int* y) {
    int tmp = *x; // the * operator is used to "de-reference" the pointer
                  // i.e., to get the actual object which the pointer points to
    *x = *y;
    *y = tmp;
}

int main() {
    int m = 6;
    int n = 10;
    printf("Before swapping (m,n) evaluate to: (%d,%d)\n",m,n);
    swap(&m,&n); // now swap takes two integer pointers as arguments
    printf("After swapping (m,n) evaluate to: (%d,%d)\n",m,n); // m and n now have swapped their original values
}

So, when passing one or more pointers into a function (i.e., int* x, int* y), we may state that the corresponding objects they point to (i.e., m and n, respectively) were “passed by reference”, but in truth the objects were never actually passed anywhere at all. This is just a side effect that results from copying their pointers into the function, and it gives us the colloquial meaning of “pass by value” and “pass by reference”.

The usage of this terminology is backed up by terms within the standard. When you have a pointer type, the type that it is pointing to is known as its referenced type. That is, the referenced type of int* is int.
In addition, while the unary operator * (as in *x and *y) is known as indirection in the standard, it is commonly also known as dereferencing a pointer. This further (misleadingly) promotes the notion of “passing by reference” in C.

C++
C++ adopted many of its original language features from C. Among them are pointers, so this colloquial form of “passing by reference” can still be used: the swap function defined above, which takes pointers and exchanges the values they point to through the dereference operator (*), is still valid.
However, using this terminology with C++ might be confusing, because C++ introduces a feature that C doesn’t have: the ability to truly pass references.

A type followed by an ampersand (&) is a reference type. For instance, int& is a reference to an int. When passing an argument to a function that takes a reference type, the object is truly passed by reference. At the language level there are no pointers involved and no copying of objects. The name bound inside the body of the function actually refers to exactly the same object that was passed in. The difference with pointers might appear subtle but is in fact fundamental.
Here’s what the swap function looks like when using reference types instead of pointers as input parameters:


#include <cstdio>

// the ampersand (&) in the parameter list declares x and y as references:
// inside swap they are bound to the arguments used during the function call
void swap(int& x, int& y) {
    int tmp = x;
    x = y;
    y = tmp;
}

int main() {
    int m = 6;
    int n = 10;
    printf("Before swapping (m,n) evaluate to: (%d,%d)\n",m,n);
    swap(m,n); // we now invoke the function on the two integer variables directly
               // but in fact x and y are bound to m and n themselves, not to copies
    printf("After swapping (m,n) evaluate to: (%d,%d)\n",m,n); // m and n now have swapped their original values
}

In the example above, nothing was passed by value and nothing was copied.
Unlike in C, where passing by reference was a “side-effect” of just passing a pointer by value, in C++ we can natively pass by reference.
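To see that a reference parameter really is the caller's object and not a copy, we can compare addresses once more. This is only a sketch: bound_to and add_one are hypothetical helper names introduced for illustration.

```cpp
// Hypothetical helper: true when the reference parameter is bound to the
// very object whose address is given.
bool bound_to(int& ref, int* caller_var) {
    return &ref == caller_var; // taking the address of a reference yields the address of its referent
}

// Hypothetical helper: modifies the caller's object through the reference.
void add_one(int& value) {
    value += 1;
}
```

With int m = 6; in the caller, bound_to(m, &m) yields true, and after add_one(m) the variable m holds 7. Compare this with a by-value parameter, whose address always differs from the caller's variable.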

Wrap Up
In the end, it is only a matter of how the concept of “passing by reference” is realized by a programming language: C implements it by passing pointers by value, whereas C++ provides two implementations. On the one hand, it reuses the mechanism inherited from C (i.e., pointers + pass by value). On the other hand, it also provides a native “pass by reference” solution based on reference types. Thus, even in C++, if you are passing a pointer à la C you are not truly passing by reference: you are passing a pointer by value (unless, of course, you are passing a reference to a pointer, e.g., int*&).
Because of this potential ambiguity in the term “pass by reference”, perhaps it’s best to only use it in the context of C++ when you are using a reference type.
You may, however, come across uses of “pass by reference” when pointers + “pass by value” are being used, but now at least you know what is really happening under the hood!
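As a last illustration, here is a minimal sketch of the int*& case mentioned above, where the pointer itself is passed by reference. The helper name retarget_by_reference is hypothetical.

```cpp
// Hypothetical helper: takes the pointer itself by reference (int*&), so
// reassigning it inside the function changes the caller's pointer.
void retarget_by_reference(int*& p, int* other) {
    p = other;
}
```

Given int x = 10, y = 20; and int* px = &x; in the caller, after retarget_by_reference(px, &y) the caller's px points to y, while x and y keep their values. Had the parameter been declared as a plain int*, px would still point to x.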

