Uniform reference semantics


We have chosen the phrase "uniform reference semantics" to refer to a common programming practice in which all objects are systematically manipulated through references (or pointers if you prefer).

In some languages (C, Pascal, C++), you need explicit pointers to accomplish uniform reference semantics, in others (Java, Lisp) it is implicit.

We argue that uniform reference semantics is good programming practice in nearly all programming languages.


Let us take a language such as C. Any language in which assignment cannot be redefined will do. The only wide-spread language in which assignment can be redefined is C++, so we cannot use C++ as an example. We therefore use C.

Suppose we want to manipulate objects of some sort, let us take for instance boats. In order to hide implementation details, we provide a library for manipulating boats, and a type boat that we can use to define variables of type boat. Let us look at some example of a piece of code:

	boat x = boat_create();
	boat y = x;
	paint(x, red);
	paint(y, blue);
It is clear that after the last statement, y is blue, but what color is x? There are two interpretations, depending on how you think of the assignment operator =.

Interpretation 1: Assignment copies the boat so that after the execution of the statement y = x, y and x are different boats that happen to have the exact same values of all their attributes, color, size, owner, whatever. We call this style copy semantics.

Interpretation 2: Assignment only copies a reference so that after the execution of the statement y = x, y and x are two references to the same boat. Any subsequent modification to that boat, whether done through the value of x or the value of y, is going to be visible both through x and y. We call this style reference semantics.

Choosing between copy semantics and reference semantics

So, which type of semantics should we choose? We argue that reference semantics is the only reasonable choice in a language such as C, and in most other languages except possibly C++.

Recall that we want our code to be independent of the implementation details. Thus, we cannot let the exact implementation of boats influence the meaning of the code fragment above. Instead, we must choose our favorite interpretation, and then make sure our implementation respects it.

Suppose for a moment that we choose copy semantics. We argue that this choice leads to absurdities. With copy semantics, we would have to make sure the boat was implemented as a C struct (or equivalent construct in other languages). That is the only way we can make assignment respect our choice of semantics. But suppose we also want our boats to contain an arbitrary list of objects, such as sails, radio equipment, etc. The best way to implement such lists, would be to have the boat structure contain a pointer to a list of these objects, like this:

	 struct sail_cell
	    sail s;
	    struct sail_cell *next;
	 struct radio_cell
	    radio r;
	    struct radio_cell *next;

	 struct boat
	   float length; /* in meters */
	   struct sail_cell *sails;
	   struct radio_cell *radios;

	 typedef struct boat boat;
Now we are in big trouble. Our assignment operation is shallow, so that it only copies the boat structure itself, not the list of sails nor the list of radio equipment, which continue to be shared after assignment. Now there is a risk that we sometimes have reference semantics and sometimes copy semantics. In our example above x will be red and y will be blue, but if we do this:
	 remove_sail(y, s);
there is a risk that the sail s will be removed both from x and y, which does not respect our choice of copy semantics. For reasons of modularity, we cannot let this happen. The meaning of the operation (in this case remove_sail), must be independent of the implementation. The only way that could be accomplished would be if the assignment operation also copied the two lists. But since we cannot redefine what assignment means, this is impossible.

We must conclude that copy semantics is not possible in languages such as C.

Implementing reference semantics

If the above example used reference semantics, we would have to make sure that the assignment operation only copied the reference. The best way to do that in a language such as C, is to use pointers.

But, if we want reference semantics to be the default case (which is why we call it uniform), we would always have to put in a * to indicate pointer. But since we would always use pointers, we can avoid the * by the use of typedefs.

Here is the same example using uniform reference semantics. Notice how we systematically use the pointer as the main abstraction:

         typedef struct sail_cell *sail_list;

	 struct sail_cell
	    sail s;
	    sail_list next;
	 typedef struct radio_cell *radio_list;

	 struct radio_cell
	    radio r;
	    radio_list next;

	 typedef struct boat *boat;

	 struct boat
	   float length; /* in meters */
	   sail_list sails;
	   radio_list radios;
Here the names sail_list, radio_list, and boat are all aliases for pointers. Thus, when we say declare a variable such as boat x, x will contains a pointer, but this fact is implicit, since we know that we always use reference semantics anyway.

Consequences of reference semantics

In any program, we sometimes need to copy references to the same object, and sometimes entire objects.

If we use copy semantics (assuming we are using a language such as C++ in which we can redefine assignment) we have to use special syntax to indicate that we mean for references to be copied. We can use pointers (in C) or references (in C++) to accomplish that.

If we use reference semantics, we instead have to use special syntax to indicate that we mean to copy the object. We accomplish that by using explicit copy operations, for instance:

	boat z = copy_boat(x);
which works fine, since the function copy_boat can copy enough structure to respect the semantics in this case.

Parameter passing

Passing parameters to a function is similar to assignment. When we use uniform reference semantics, no copying is ever made as a result of parameter passing. Only references are duplicated. It is therefore easy to write functions that have side effects on objects. For instance, if we use reference semantics, this code makes sense:
	paint(x, blue);
We pass (a reference to) x to the procedure paint, which modifies the object appropriately. If we used copy semantics, this code would not work. The object contained in x would be copied as a result of parameter passing. The procedure paint would receive a copy of the object. Any attempt to modify that copy will not have any effect on the original object. We would have to use special syntax such as:
	paint(&x, blue);

Copying is rare

As it turns out, it is rare to want to copy objects in most programs. The reason is simple: if the program manipulates computer versions of real objects, then copying does not make a lot of sense. It is rare in real life to have multiple copies of objects. An exception would be a plant for mass production of some sort of objects.

Thus, in most cases, you want to use references anyway. It is an advantage not to have to use special syntax for the most common case. Reference semantics makes this possible.

Good programmers use uniform reference semantics

Most good programmers use uniform reference semantics. In fact, most good programmers use programming languages that use uniform (or near-uniform) reference semantics, such as Lisp, Smalltalk, Java, and Eiffel.

Summary of advantages of uniform reference semantics

Uniform reference semantics gives cleaner and smaller programs. This is the major advantage, but there are a number of minor ones.

The programs consume less memory since usually only one copy of each object is present at all times, whereas copy semantics tend to generate multiple copies of objects.

The programs are faster since only references are manipulated, and references are usually smaller than the objects themselves. Assignment and parameter passing is faster since only one machine word needs to be copied, whereas with copy semantics, the entire object needs to be duplicated. Programs are also faster as a result of being smaller. Larger programs increase register spills to the stack, cache misses, and virtual memory paging.

The programs use much less special syntax for the common cases, making programs more readable and therefore more maintainable.

In most languages, copy semantics can only be partially implemented, making programs that attempt to use copy semantics sensitive to modifications (and therefore less maintainable) and less modular.

So why would anyone want to use copy semantics?

If uniform reference semantics has so many advantages, why would anyone want to use copy semantics?

Well, there is one main problem with reference semantics. When objects have multiple references scattered all over the data space, then it becomes hard to know when one can safely delete the object. Deleting (say by using free in C) an object to which references still remain can create core dumps, or worse, create mysterious bugs and incorrect behavior in general. Conversely, not deleting an object to which no more references exist generates so called memory leaks, eventually exhausting all of memory. Memory leaks are a big problem, especially for long-lived programs such as servers, and must be avoided.

The use of copy semantics avoids the problem by making sure that each object is referenced exactly once. If more references are needed, the object is copied as well. When a reference to an object is removed, one can simply delete the object as well.

Correctly managing multiple references to objects in large programs, requires automatic memory management (so called garbage collection). This is why the preferred languages for good programmers cited above (Lisp, Smalltalk, Java, Eiffel) all use automatic memory management.

So what can one do if one is stuck with a language such as C which does not have automatic memory management? Fortunately, there is a solution. There exists a freely distributable library for C and C++ that adds automatic memory management to these languages. The library is known as Boehm's conservative garbage collector, as it was created by Hans Boehm.

So what about C++?

We mentioned above that in C++, assignment can be redefined. Thus, C++ is one of the few languages in which copy semantics can be implemented uniformly. The way to do that is to make sure that the copy constructor for objects duplicates all of the internal structure that is considered part of the object.

But if what we argued above is true, i.e. if most of the time what you want is to duplicate references only, and not to copy the entire object, then why would you want to implement copy semantics at all?

The only explanation we can think of is that, since C++ does not have automatic memory management, copy semantics is sometimes the only way to obtain a correct program (with no premature attempts to free memory for objects in use, and no memory leaks). The price to pay for correct programs can be very high, however. Updating an object may require finding all of its copies and updating them as well. For instance, suppose a program contains two containers, one containing all the staff of the company, and another containing all the people with parking permits. Suppose a staff member moves to a different address. We need to update the object in our container of staff members to reflect the new address. But we also have to search the container of people with parking permits as well since there might be a copy of our staff member in it. The same procedure needs to be repeated for each container that might contain a copy of the staff member object. Had we use reference semantics, we would be sure that all containers would contain references to one unique object. The address would have to be updated only once.

How to implement reference semantics

Uniform reference semantics requires some care so that the implementation fully respects it.

Suppose, for instance, that we want to implement a stack as a linked list of elements. It may seem natural to represent the empty stack as a NULL pointer. Indeed, that is probably how it has been taught in algorithmics class. Unfortunately such an implementation does not respect the reference semantics. Example:

	stack s = stack_create();
	stack t = s;
	push(t, 10);
Notice the similarity between this example and the first one in which we painted boats. Here, if stack_create returns NULL, then already we are in big trouble. There is no way to implement the push operation in that case. Since parameter passing is by value, push cannot have any side effects on t, nor on s. We would have to resort to something like:
	push(&t, 10);
but that is unacceptable as it reveals the implementation of the stack. Furthermore, it has no effect on s. What we want is for the push operation to push the value 10 on both s and t. How can we solve this problem? We use a header object. Here is how we can implement the stack (if you don't know what void * means, ignore it and think of it as int for a moment):
	typedef struct cell *list;
	struct cell
	  void *element;
	  list next;

	typedef struct stack *stack;

	struct stack
	  list elements;


	  stack s = malloc(sizeof(struct stack));
	  s -> elements = NULL;
	  return s;
With this implementation, the empty stack is represented as a pointer to an instance of the struct stack. It serves no purpose other than being pointed to by all the shared references to it, in our example s and t. We call that instance a header object. Such header objects are often necessary in order to preserve uniform reference semantics. This is particularly the case for containers that can contain zero elements such as stacks, queues, trees, etc.

Once we have such a header object, we can use it to store useful information. In our examples, we can store the number of elements currently in the stack in case that information is needed frequently. In a queue, we can store pointers to the head and the tail for quick access, etc.