Designing and Coding Reusable C++
Martin Carroll and Margaret Ellis

Chapter 1. Introduction to reusability

Essential properties of reusable code:

It is easy to find and to understand
There is a reasonable assurance that it is correct
It requires no separation from any containing code
It requires no changes to be used in a new program

Myths of reuse:

Reuse will resolve the software crisis.
All code should be reusable.
Reusing code is always preferable to coding from scratch.
Object-oriented languages make writing reusable code easy.

Nontechnical obstacles:

The author of Widget must have suspected that a reusable version of Widget would be useful.
The author of Widget must have expected to be rewarded for writing Widget reusably and making it available.
Someone must maintain Widget.
The eventual user of Widget must suspect that Widget exists and must be able to find it.
The eventual user of Widget must be able to obtain Widget.
There must be no legal obstacles to reuse of Widget.
The user of Widget must be rewarded for reusing Widget.

Technical obstacles:

Reusable code must work in many contexts.
We almost never know all the contexts.
User requirements often conflict.
We cannot provide everything everyone wants.
The contexts change.

Chapter 2. Class design

Every C++ class (whether reusable or not) should represent some abstraction. Functions should represent abstract behavior. [p13]

Attempts to define a minimal standard interface for all classes, although well motivated, are misguided. No function should be provided by every class. The argument in support of this claim works as follows: for each function that might be proposed for the minimal standard interface, it is possible to describe a class that should not provide that function. [p18]

Two operations require special mention because they have a reputation for being generally useful in spite of their undesirable properties: the shallow and deep copy operations. For most real classes, neither shallow nor deep copy correctly implements the copy constructor. For nontrivial classes, shallow and deep copy operations usually have an undesirable property: they do not preserve program invariants. [p25]
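A minimal sketch of why neither shallow nor deep copy is the right copy constructor for a typical class. The classes Employee and Department are hypothetical and are not taken from the book. Employee owns its name but only refers to a Department owned elsewhere, so the correct copy copies the name's pointee and copies the department pointer itself.

```cpp
#include <string>

// Hypothetical types, for illustration only.
struct Department {
    std::string name;
};

class Employee {
public:
    Employee(const std::string& name, Department* dept)
        : name_(new std::string(name)), dept_(dept) {}

    // Neither purely shallow nor purely deep:
    // the owned string is duplicated, the non-owned department is shared.
    Employee(const Employee& other)
        : name_(new std::string(*other.name_)), dept_(other.dept_) {}

    Employee& operator=(const Employee& other) {
        if (this != &other) {
            *name_ = *other.name_;
            dept_ = other.dept_;
        }
        return *this;
    }

    ~Employee() { delete name_; }  // deletes only what this object owns

    const std::string& name() const { return *name_; }
    void rename(const std::string& n) { *name_ = n; }
    const Department* department() const { return dept_; }

private:
    std::string* name_;  // owned by this Employee
    Department* dept_;   // owned elsewhere
};
```

A shallow copy would leave two Employee objects deleting the same string; a deep copy would silently create a second Department that no one owns.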

Library designers must pay careful attention to conversions. "Fanout" can be defined as the number of other types that a type can be converted to implicitly. Large fanouts are undesirable because they are potential causes of ambiguity. [p35]

The interface of a C++ library should use "const" everywhere it applies - that is, everywhere that the use of "const" makes a promise that the library keeps. Failure to use "const" maximally can cause problems for library users. [p38]

The regular functions (functions whose semantics are the same in all well-designed classes - the copy constructor, the destructor, the principal assignment operator, and the equality and inequality operators [p15]) should implement the same semantics in all classes.

Although there is no minimal standard interface, the nice functions (the default constructor, the destructor, the copy constructor, the assignment operator, and the equality operator) should be provided by most classes. No function should be provided by all classes. The shallow and deep copy operations should be provided by almost no classes.

Careful thought should be given to uniformity of interface for classes within a library, but consistency should not be so rigidly adhered to that it renders the interface of a class inappropriate or counterintuitive.

When deciding what conversions to provide, library designers should provide sensible conversions while preventing multiple ownership, avoid nonsensible conversions when possible, and limit fanout.
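One way to limit fanout, sketched under the assumption that a C++11 compiler is available: make single-argument constructors explicit and offer a named accessor instead of a conversion operator. The class Meters is hypothetical.

```cpp
#include <type_traits>

// Hypothetical unit type with a fanout of zero: nothing converts implicitly.
class Meters {
public:
    explicit Meters(double v) : value_(v) {}   // no implicit double -> Meters
    double value() const { return value_; }    // named accessor, no operator double
private:
    double value_;
};

// Both directions require the user to spell out the conversion.
static_assert(!std::is_convertible<double, Meters>::value,
              "double must not convert to Meters implicitly");
static_assert(!std::is_convertible<Meters, double>::value,
              "Meters must not convert to double implicitly");
```

The cost is slightly more verbose user code; the benefit is that overloaded functions taking Meters and double cannot become ambiguous through this class.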

Use of const in libraries also requires attention. In general, libraries should implement abstract const in their interfaces, and they should use the const keyword every place it makes a promise that the library keeps.
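A small sketch of abstract const, assuming it means that a const member function leaves the abstract value unchanged even if the representation changes. The class Document and its word-counting rule (words separated by single spaces) are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical class, for illustration only.
class Document {
public:
    explicit Document(std::string text) : text_(std::move(text)) {}

    // Does not change the abstract value, so it is const,
    // even though it fills in a cache behind the scenes.
    std::size_t wordCount() const {
        if (!cacheValid_) {
            cachedCount_ = countWords(text_);
            cacheValid_ = true;
        }
        return cachedCount_;
    }

    // Changes the abstract value, so it is non-const.
    void append(const std::string& more) {
        text_ += more;
        cacheValid_ = false;
    }

private:
    static std::size_t countWords(const std::string& s) {
        std::size_t n = 0;
        bool inWord = false;
        for (char c : s) {
            if (c == ' ') {
                inWord = false;
            } else if (!inWord) {
                inWord = true;
                ++n;
            }
        }
        return n;
    }

    std::string text_;
    mutable std::size_t cachedCount_ = 0;  // representation detail only
    mutable bool cacheValid_ = false;
};
```

Because wordCount() is const, users can call it through a const Document& without casts, which is the promise the interface makes and keeps.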

Chapter 3. Extensibility

A user might want to inherit a class's implementation but not its interface. Private derivation accomplishes this kind of inheritance. [p49]
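For illustration, a sketch of private derivation with hypothetical classes IntList and IntStack. IntStack reuses the list's implementation but exposes only a stack interface.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical base class whose implementation is worth reusing.
class IntList {
public:
    void pushBack(int v) { items_.push_back(v); }
    void popBack() { items_.pop_back(); }
    int back() const { return items_.back(); }
    int at(std::size_t i) const { return items_[i]; }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<int> items_;
};

// Inherits the implementation, not the interface:
// at(), pushBack() and so on are not callable on an IntStack.
class IntStack : private IntList {
public:
    void push(int v) { pushBack(v); }
    void pop() { popBack(); }
    int top() const { return back(); }
    using IntList::size;  // selectively re-export one base member
};
```

An IntStack* does not convert to an IntList* outside the class, so users cannot bypass the stack discipline through the base interface.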

The ability to pass a pointer or reference to an object of type X to a function declared to take a pointer or reference to a type from which X directly or indirectly inherits is called "substitutability". [p50]
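Substitutability can be sketched as follows. The classes Shape, Rectangle and LabeledRectangle and the function totalArea are hypothetical names chosen for this example.

```cpp
#include <string>

class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
};

class Rectangle : public Shape {          // inherits directly from Shape
public:
    Rectangle(double w, double h) : w_(w), h_(h) {}
    double area() const override { return w_ * h_; }
private:
    double w_;
    double h_;
};

class LabeledRectangle : public Rectangle {  // inherits indirectly from Shape
public:
    LabeledRectangle(double w, double h, const std::string& label)
        : Rectangle(w, h), label_(label) {}
    const std::string& label() const { return label_; }
private:
    std::string label_;
};

// Declared in terms of Shape; accepts any publicly derived type.
double totalArea(const Shape& a, const Shape& b) {
    return a.area() + b.area();
}
```

Both a Rectangle and a LabeledRectangle can be passed to totalArea() because each publicly inherits from Shape.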

There are costs associated with providing extensibility. Occasionally, a reasonable alternative to designing a C++ library extensibly is to provide all the functionality users will ever want so that they do not need to extend the library's classes.

More often, users will want extensibility. Extensibility in C++ is provided primarily through inheritance. Properly defining the inheritance semantics of a class and assuming only those semantics throughout the library are essential to writing an extensible class. The burden for successful inheritance rests partly on the user - inheritance will not be successful if a publicly derived type does not adhere to the inheritance semantics of its intended base classes.

It can be difficult to derive from classes not written carefully to allow inheritance. The obstacles to inheritability are as follows:

Nonvirtual member functions
Overprotection of data and function members
Undermodularization of member functions
Use of friends
Excess data members
Nonvirtual derivations
Inheritance-preventing member functions

Because most of the obstacles to inheritability cannot occur in "interface" classes, libraries for which extensibility is important should "interface" all the classes whose interfaces users might wish to inherit. [An interface class is a class containing no data members, all of whose member functions are pure virtual, and all of whose base classes are interface classes. A class X is interfaced if either X is an interface class or each public member function of X is declared in at least one interface class from which X directly or indirectly inherits.]
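A sketch of an interface class and an interfaced class under the bracketed definition above. The names Named, Sensor, FakeSensor and describe are hypothetical. The destructor is declared pure virtual (and defined out of line) so that every member function of the interface class is pure virtual; constructors of the concrete class are assumed not to count as part of its inheritable interface.

```cpp
#include <string>

// Interface class: no data members, only pure virtual member functions.
class Named {
public:
    virtual ~Named() = 0;
    virtual std::string name() const = 0;
};

// A pure virtual destructor still needs a definition.
inline Named::~Named() {}

// Interfaced class: its public member function name() is declared in Named.
class Sensor : public Named {
public:
    explicit Sensor(const std::string& n) : name_(n) {}
    std::string name() const override { return name_; }
private:
    std::string name_;
};

// A user can inherit the interface without inheriting any implementation.
class FakeSensor : public Named {
public:
    std::string name() const override { return "fake"; }
};

// Library code written against the interface works with both.
std::string describe(const Named& n) {
    return "<" + n.name() + ">";
}
```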

Chapter 4. Efficiency

One well-known technique for reducing code size is never to put the definition of two large functions in the same library implementation file, if it is possible for a program to need one but not the other of them. We shall call this technique "source-file partitioning". [p86]

Explicit and implicit "inline" declarations are only a request to the compiler. All current [1995] C++ compilers have limits on the functions they can inline (goto statements, loops, a body of more than 15 statements, and recursion can all inhibit inlining). If a function f that is declared inline is not inline expanded at one or more call sites in a translation unit, then many compilers will generate in that translation unit an out-of-line copy of f with internal linkage. If such a copy is generated in n translation units, the executable file will contain n copies of f. The amount of code devoted to "outlined inlines" can be significant if programmers are not careful about which functions they declare inline. [p87]

Programmers tend to think that inlining: speeds execution, may cause code bloat, and should only be considered for small functions. All three of these thoughts may, or may not, be justified. [p89]

Returning references from functions has the advantage of being more efficient than returning by value. Returning references from functions has two disadvantages: it makes user code more error prone, and it restricts the ways a class can be implemented. [p94]
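To illustrate how returning a reference restricts implementation, here is a sketch with two hypothetical classes. PersonByRef must keep a stored string alive to refer to; PersonByValue is free to compute the result on demand.

```cpp
#include <string>

// Returns a reference: avoids a copy, but the class is now committed
// to storing the full name as a std::string member.
class PersonByRef {
public:
    PersonByRef(const std::string& first, const std::string& last)
        : full_(first + " " + last) {}
    const std::string& fullName() const { return full_; }
private:
    std::string full_;
};

// Returns by value: costs a copy, but the representation can change freely.
class PersonByValue {
public:
    PersonByValue(const std::string& first, const std::string& last)
        : first_(first), last_(last) {}
    std::string fullName() const { return first_ + " " + last_; }
private:
    std::string first_;
    std::string last_;
};
```

The error-prone side is on the user: a reference obtained from PersonByRef::fullName() dangles as soon as the PersonByRef object is destroyed, whereas the value returned by PersonByValue::fullName() stays valid.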

C++ libraries should generally free resources that they have acquired as soon as possible. [p99]

On many systems, the stack space that is available to a program is significantly less than the heap space available. For this reason, huge objects should be allocated on the heap, rather than declared on the stack. [p101]

Efficiency is a crucial property for reusable code.

Build time is particularly important for development teams. Minimizing the amount of code that a library includes, preinstantiating templates, defining function templates inline, hoisting template code, and using pointer containers can help keep down build times.
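One reading of "hoisting template code" combined with a pointer container, sketched with hypothetical names PtrStackBase and PtrStack: the type-independent work lives in a non-template base that is compiled once, and the template adds only thin, type-safe wrappers. In a real library the base's member functions would be defined in an implementation file; they are defined in the class here only to keep the example self-contained.

```cpp
#include <cstddef>
#include <vector>

// Type-independent code, hoisted out of the template.
class PtrStackBase {
public:
    std::size_t size() const { return items_.size(); }
protected:
    void pushRaw(void* p) { items_.push_back(p); }
    void* popRaw() {               // precondition: size() > 0
        void* p = items_.back();
        items_.pop_back();
        return p;
    }
private:
    std::vector<void*> items_;
};

// Thin template layer: each instantiation adds only casts.
template <typename T>
class PtrStack : private PtrStackBase {
public:
    void push(T* p) { pushRaw(p); }
    T* pop() { return static_cast<T*>(popRaw()); }
    using PtrStackBase::size;
};
```

Every PtrStack<T> shares the same underlying code, so adding a new element type costs little extra compile time or code size.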

Library implementors can reduce users' code size by partitioning the library's source files and by ensuring that functions declared inline are not laid down out of line. The library implementation itself should use as few templates as possible.

To many users, the most important measure of efficiency is run time. Run time can often be improved significantly through appropriate inlining. It is not always obvious, however, which functions to inline. Returning references is a technique for improving run time, but it can make user code more error prone and limit the ways a class can be implemented.

Free-store and stack space must also be used efficiently. Being careful to use efficient algorithms and freeing resources as soon as possible are two of the best ways to minimize use of space. Large objects usually should be created in the free store rather than on the stack.

Unfortunately, efficiency trades off with almost every other desirable property of a C++ library. In particular, designing a library to be as efficient as possible usually renders that library more difficult to implement and to use.

Chapter 5. Errors

In practice, checking two kinds of invariants, function preconditions and representation invariants, can detect many errors. [p113]
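A minimal sketch of both kinds of check, using the standard assert macro. The class SortedInts and its member repOk() are hypothetical; the representation invariant assumed here is that the stored values are always in nondecreasing order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

class SortedInts {
public:
    void insert(int v) {
        items_.insert(std::upper_bound(items_.begin(), items_.end(), v), v);
        assert(repOk());  // representation invariant holds on exit
    }

    int smallest() const {
        assert(!items_.empty());  // function precondition
        return items_.front();
    }

    std::size_t size() const { return items_.size(); }

    // Representation invariant: items_ is sorted.
    bool repOk() const {
        return std::is_sorted(items_.begin(), items_.end());
    }

private:
    std::vector<int> items_;
};
```

Whether such checks stay enabled in a released library is a separate decision; assert is compiled out when NDEBUG is defined.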

Writers of reusable code must ensure that their code is exception safe - that it behaves correctly even when an exception is thrown. [p126] A class X is exception safe if it is impossible for an exception thrown during execution of any of X's member functions to cause the user of X to be left with an inconsistent X object. [p128]
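One common way to keep an object consistent when a member function throws is to do all the work on a copy and commit with an operation that cannot throw. The sketch below uses a hypothetical class Roster whose addAll() either adds every name or leaves the roster unchanged.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

class Roster {
public:
    // Adds all names, or throws std::invalid_argument and changes nothing.
    void addAll(const std::vector<std::string>& names) {
        std::vector<std::string> updated(names_);  // work on a copy
        for (const std::string& n : names) {
            if (n.empty()) {
                throw std::invalid_argument("empty name");
            }
            updated.push_back(n);                  // may also throw
        }
        names_.swap(updated);                      // commit; does not throw
    }

    std::size_t size() const { return names_.size(); }

private:
    std::vector<std::string> names_;
};
```

If an exception escapes addAll(), only the local copy is discarded, so the user is never left with a partially updated Roster.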

Code intended for reuse must consider whether to detect and how to handle any error that might arise. Invariants can be used to detect many kinds of errors. Libraries should make good use of function preconditions and representation invariants.

Different variants of a library may handle errors differently. Here are the most common ways to handle an error:

Correct the problem and continue execution.
Exit or abort (not acceptable for many libraries).
Throw an exception.
Create a nil value.
Interpret invalid data as valid.
Do not detect the error (and therefore have undefined behavior).
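Two of the strategies above, throwing an exception and creating a nil value, can be contrasted in a short sketch. The class Temperature and the functions parseOrNil and parseOrThrow are hypothetical; std::stod is the standard C++11 conversion function.

```cpp
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>

class Temperature {
public:
    Temperature() : valid_(false), celsius_(0.0) {}           // the nil value
    explicit Temperature(double c) : valid_(true), celsius_(c) {}
    bool isNil() const { return !valid_; }
    double celsius() const { return celsius_; }
private:
    bool valid_;
    double celsius_;
};

// Variant 1: report the error by returning a nil value.
Temperature parseOrNil(const std::string& text) {
    try {
        std::size_t used = 0;
        double v = std::stod(text, &used);
        if (used != text.size()) {
            return Temperature();   // trailing garbage
        }
        return Temperature(v);
    } catch (const std::exception&) {
        return Temperature();       // not a number, or out of range
    }
}

// Variant 2: report the same error by throwing.
Temperature parseOrThrow(const std::string& text) {
    Temperature t = parseOrNil(text);
    if (t.isNil()) {
        throw std::invalid_argument("not a temperature: " + text);
    }
    return t;
}
```

The nil variant keeps control flow local but obliges every caller to test isNil(); the throwing variant cannot be ignored silently but requires exception-safe callers.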

Among the errors that library designers must consider is exhaustion of system resources. The stack might overflow, the free store might be exhausted, or some file system limit might be reached, to name three possibilities.

With the introduction of exceptions to the C++ language, special care must be taken to ensure that reusable code is exception safe. Classes must be designed so objects are not rendered inconsistent when an exception is thrown. Libraries must be designed to avoid other ill effects from nonlocal flow of control when an exception is thrown.

Chapter 6. Conflict

When two libraries conflict, use of both of them in a single program will be difficult, if not impossible. To maximize reusability, library designers should avoid conflicting with other code. Use of sound naming conventions and the namespace construct is essential for all global, public macro, and environmental names defined by a library unless the library can safely be unclean. Good-citizen libraries avoid another form of conflict: conflicting attempts to own global or application-specific resources.
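A sketch of how namespaces let two libraries define the same global name without conflict. The namespace names acme_geometry and acme_text are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <string>

// Hypothetical geometry library.
namespace acme_geometry {
struct Point {
    double x;
    double y;
};
// Distance of p from the origin.
double length(const Point& p) {
    return std::sqrt(p.x * p.x + p.y * p.y);
}
}  // namespace acme_geometry

// Hypothetical text library that also wants the name "length".
namespace acme_text {
std::size_t length(const std::string& s) {
    return s.size();
}
}  // namespace acme_text
```

A program can use both libraries together by qualifying each call, for example acme_geometry::length(p) and acme_text::length(s). Macros are not scoped by namespaces, which is why they still need a naming convention such as a library-specific prefix.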

Chapter 7. Compatibility

Almost every change to a C++ library is source incompatible in theory. [p159]

Library developers should be concerned with providing backward compatibility for their current users and with anticipating forward compatibility so that they can provide backward compatibility in future releases. A library should try to provide source compatibility, link compatibility, and run compatibility whenever possible. Some libraries will also try to provide process compatibility. Providing compatibility requires careful thought about changes to a library. Deprecating (discouraging the use of in the documentation), rather than removing, functionality provides source compatibility and allows users to change their code at their convenience.
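Deprecating rather than removing can be as simple as keeping the old name as a documented forwarder. The namespace textlib and the functions trimLeft and ltrim below are hypothetical.

```cpp
#include <cstddef>
#include <string>

namespace textlib {

// Preferred name in the new release: removes leading spaces.
std::string trimLeft(const std::string& s) {
    std::size_t i = s.find_first_not_of(' ');
    return i == std::string::npos ? std::string() : s.substr(i);
}

// DEPRECATED: use trimLeft(). Kept so that code written against the
// previous release still compiles and behaves the same.
std::string ltrim(const std::string& s) {
    return trimLeft(s);
}

}  // namespace textlib
```

With a C++14 or later compiler, the standard [[deprecated]] attribute could additionally be placed on ltrim() so that users get a compile-time warning; that attribute postdates the book and is mentioned here only as an option.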

Incompatibilities between releases of libraries should always be documented clearly, along with instructions on how to upgrade user programs. Library providers should also be aware of the possibility that users are relying on undocumented properties of a library.

Chapter 8. Inheritance hierarchies

There is some confusion among C++ programmers about whether to base a design for a class hierarchy on templates or on inheritance. They sometimes over-use inheritance. [p192]

Very popular (yet unfounded and contradictory) inheritance hierarchy design rules include:

Singly rooted hierarchies are best.
Multiply rooted hierarchies are best.
Shallow and wide hierarchies are best. The depth of a hierarchy should be no more than seven plus-or-minus two.
Deep and narrow hierarchies are best. The fanout of a hierarchy should be no more than seven plus-or-minus two.

The appropriate rootedness, depth, and fanout for an inheritance hierarchy depend on the domain the hierarchy is intended to model and on the desired properties of the hierarchy.

The design of a reusable library can be based on one of several inheritance hierarchy styles or on a combination of styles. These include:

Direct hierarchy
Interfaced hierarchy
Interfaced + Factory hierarchy
Handle hierarchy
Interfaced Handle hierarchy

Although a direct inheritance hierarchy is the easiest style to implement and understand as well as the most efficient, interfaced hierarchies, object factories, and handle hierarchies facilitate link compatibility between releases of a library. Further, interfaced hierarchies increase a library's extensibility. The table summarizes the most important differences among the hierarchy styles. As always, no single design is best for all libraries. Library designers must decide which is the best choice for their library and their users.

Hierarchy style	Complexity	Efficiency	Extensibility	Link compatibility
Direct	simple	good	mediocre	minimal
Interfaced	complex	reduced	good	partial
Interfaced + Factory	complex	reduced	good	total
Handle	simple	reduced	poor	total
Interfaced Handle	complex	reduced	good	total
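As a rough sketch of the interfaced + factory style (one possible reading of it, with hypothetical names Logger, CountingLogger and makeLogger): users see only an interface class and a factory function, so the concrete class can change between releases without user code ever naming it.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// What the library's header would expose: an interface and a factory.
class Logger {
public:
    virtual ~Logger() {}
    virtual void log(const std::string& message) = 0;
    virtual std::size_t count() const = 0;
};

std::unique_ptr<Logger> makeLogger();

// What would live in the library's implementation file.
namespace {
class CountingLogger : public Logger {
public:
    CountingLogger() : count_(0) {}
    void log(const std::string&) override { ++count_; }
    std::size_t count() const override { return count_; }
private:
    std::size_t count_;  // can be changed freely in a later release
};
}  // namespace

std::unique_ptr<Logger> makeLogger() {
    return std::unique_ptr<Logger>(new CountingLogger());
}
```

User code depends only on the layout of Logger's virtual functions, not on the size or members of CountingLogger; the price is a virtual call and a free-store allocation per object, in line with the "reduced" efficiency entry in the table.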

Library designers should be careful not to use inheritance when use of templates would produce a better design.
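A container is a typical case where a template tends to give a better design than inheritance. The sketch below is a hypothetical Stack template; the inheritance-based alternative would store pointers to a common base class and require every element type to derive from it.

```cpp
#include <vector>

// Works for any copyable T; no common base class is required.
template <typename T>
class Stack {
public:
    void push(const T& v) { items_.push_back(v); }
    void pop() { items_.pop_back(); }               // precondition: !empty()
    const T& top() const { return items_.back(); }  // precondition: !empty()
    bool empty() const { return items_.empty(); }
private:
    std::vector<T> items_;
};
```

The element type is checked at compile time, so top() needs no cast, and built-in types such as int can be stored directly.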

Chapter 9. Portability

The more portable code is, the more reusable it is. [p203]

Portability often trades off with efficiency and ease of implementation. Specifically, portable code that is easy to implement is often not efficient enough on one or more platforms. [p204]

Currently, writing highly portable C++ code is challenging. Part of the challenge comes from the continuing evolution of the C++ language. There is controversy over how an implementation should interpret certain constructs. Further, many implementations of the language are not complete.

Even after the ANSI/ISO C++ standard is finalized, the language will allow legal programs that will not be portable. C++ inherits from C many undefined, unspecified, and implementation-defined behaviors and adds a few new ones. Memory and object layout, in particular, need careful attention in code that must be portable.

Template instantiation mechanisms vary considerably among C++ implementations. Some automatic instantiation schemes require template code to be organized in specific ways, but the requirements vary from implementation to implementation. Manual instantiation schemes use a variety of directives to give users control over template instantiation. Thus, porting code that uses templates can involve some effort.

Finally, portability can be complicated for programs that depend on standard as well as nonstandard run-time libraries, system commands, file systems, and window systems.

Chapter 10. Using other libraries

We discuss drawbacks of using other libraries in code intended for reuse: requiring users to obtain the reused code, concerns about efficiency, the potential for name-space conflicts introduced by reusing other libraries, and the problem of synchronizing releases of libraries. [p233]

There are strong reasons to prefer reusing another library's collection class. First, why should we write yet another collection class if a suitable one already exists? Second, if we are the aspiring authors of a medical library, not a container class library, we might not have experience writing high-quality container classes. Third, because we are providers of a library intended for reuse, we would like to set a good example by practicing reuse ourselves. [p234]

Using other libraries eases implementation but brings the problems of acquisition, conflict, release synchronization, and possibly efficiency.

Self-contained libraries avoid these problems, but they trade off ease of implementation, ease of use, and efficiency. Some programmers also use the library that otherwise would have been reused. Self-contained libraries can reduce ease of use for those programmers by requiring them to learn multiple interfaces and to write and invoke conversion functions explicitly. Self-contained libraries can also cause such users' executables to be bloated. Finally, self-contained libraries isolate themselves from other libraries - with both desirable and undesirable effects.

Chapter 11. Documentation

Code that is not documented properly is not reusable. [p245]

Producing high-quality documentation often takes a significant fraction of the time required to design, implement, and test a library. The need for good documentation is one of the reasons developing reusable code is more expensive than producing single-use code. [p245]

A library should be documented while it is being designed and implemented. Documenting usually reveals ways to improve the code being documented. Postponing the effort of documenting, even for apparently simple library facilities, is a mistake. Donald Knuth's observations on designing the typesetting language TeX are germane:

The designer of a new system must not only be the implementor and the first large-scale user; the designer should also write the first user manual ... If I had not participated fully in all these activities, literally hundreds of improvements would never have been made, because I would never have thought of them or perceived why they were important. [p246]

Good documentation is crucial for reusable code. Every reusable C++ library should be accompanied by at least a design paper, a set of tutorials, and a reference manual.

The design paper for a library should discuss significant decisions made in the design of the library, why each was decided the way it was, for whom the library is intended, and what the library provides those users.

Tutorials should be written clearly and simply, and should be written appropriately for the background of the library's intended users. They should discuss the library functionality in terms of abstract values, not implementation. Examples in tutorials should show legal, correct code.

A library reference manual should define the abstraction of each library class, show the syntactic interface for each class, give the semantics of each function in the interface of each class, and present any restrictions on template arguments.

Chapter 12. Miscellaneous topics

If a C++ library defines and uses any nonsimple, nonlocal, static objects, it will be possible for a user of the library to build successfully a program that uses an object before it is constructed (unless the library implementors have taken precautions to prevent such uses; see Section 12.1.6). Such a program will have undefined behavior. [p266]

Two classes are coupled if either class's interface or implementation uses the other class ... Some programmers believe that coupling is generally undesirable; others believe that it is usually a good idea. Actually, coupling has both advantages and disadvantages. [p283]

Designers of C++ libraries need to be aware of the static initialization problem. We recommend that libraries not define and use nonsimple, nonlocal, static objects. Instead, libraries should use objects in the free store. To allocate and initialize such objects, libraries can use init functions, init checks, and init objects. Each of these approaches has disadvantages.
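A sketch in the spirit of an init check: instead of a nonlocal static std::string, the library keeps the object in the free store and creates it the first time it is needed. The namespace lib and the function defaultName are hypothetical, and this construct-on-first-use form is only one way to realize the recommendation.

```cpp
#include <string>

namespace lib {

// Instead of a namespace-scope object such as
//     std::string defaultName = "untitled";
// whose construction order relative to user statics is unspecified,
// hand out the object through a function.
std::string& defaultName() {
    // Allocated in the free store on first call and never destroyed,
    // so it is also safe to use during program shutdown.
    static std::string* name = new std::string("untitled");
    return *name;
}

}  // namespace lib
```

A user's static object whose constructor calls lib::defaultName() is guaranteed to see a constructed string. The disadvantages include a check on every call and an object that is intentionally leaked; in C++11 and later the initialization of the local static is thread safe, a guarantee that did not exist when the book was written.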

Localizing costs is an important consideration for any library design. If a program does not use a feature of a library, it should not incur costs associated with the presence of that feature.

Containers are an important kind of reusable class. Designers of container classes should be careful to make those classes either endogenous (contained values are stored directly in the underlying data structures) or exogenous (contained values are stored in separate objects), but not a hybrid. [A mistake some library designers make is to provide a class that for some operations models an endogenous container that contains things of type T*, and for other operations models an exogenous container that contains things of type T. (p278)] Designing iterators with the right semantics requires attention and care as well.

Usually, coupling of classes within a library simplifies implementation of the library. Coupling sometimes makes the library easier to use; at other times, it makes the library more difficult to use. Library designers must therefore weigh carefully the advantages and disadvantages of any proposed couplings.

Sometimes, making a difficult decision can be avoided by deferring it to users. A common technique for deferring a decision is to allow the user to specify a parameter for a template.
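A sketch of deferring a decision through a template parameter. The class BestSoFar is hypothetical: rather than deciding what "best" means, the library lets the user supply the ordering, with std::less as an assumed default. The element type must be default constructible in this simple version.

```cpp
#include <functional>

// Remembers the "best" value offered so far; the ordering is the user's choice.
template <typename T, typename Less = std::less<T> >
class BestSoFar {
public:
    BestSoFar() : has_(false), best_(), less_() {}

    void offer(const T& v) {
        if (!has_ || less_(best_, v)) {
            best_ = v;
            has_ = true;
        }
    }

    bool empty() const { return !has_; }
    const T& best() const { return best_; }  // precondition: !empty()

private:
    bool has_;
    T best_;
    Less less_;
};
```

With the default ordering the class tracks the largest value; instantiating it as BestSoFar<int, std::greater<int> > makes the same code track the smallest.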
