19. Vector, Templates, and Exceptions

"Success is never final." -- Winston Churchill

19.1 The problems

We want to be able to:

  • Change the size of vectors on the fly.
  • Make vectors that can store any type we wish.
  • Catch out-of-range errors.

We want these features to so can worry about our users' needs and not about low-level language details.

Why don't we want fixed-size vectors? We often don't know how much data is coming!

Why don't we use vector for all of our collection needs? Wel, consider searching a million-record vector for a customer record, versus searching a binary search tree. Or think of lookups in a hash table. Different data structures fit different uses! Example: the day's sales records at our shop.

19.2 Changing size
19.2.1 Representation

We could just allocate a new array every single time someone calls push_back(): but that is not very efficient! Better to allocate some extra memory for every chunk we use, like this:

class vector {
    int sz;
    double* elem;
    int space;  // number of elems + free space
public:
    // ...
};
                            

19.2.2 reserve and capacity

Here's what our key method, reserve, looks like:

void vector::reserve(int newalloc)
{
    if(newalloc <= space) return;
    double* p = new double[newalloc];
    for(int i = 0; i < sz; ++i) p[i] = elem[i];  // copy elements
    delete[] elem;  // deallocate old space
    elem = p;
    space = newalloc;
}
                            

19.2.3 resize

We need to handle the following cases:

  • The new size is larger than the old allocation.
  • The new size is larger than the old size, but smaller than or equal to the old allocation.
  • The new size is equal to the old size.
  • The new size is smaller than the old size.

resize() is actually very simple, given we've written reserve():

void vector::resize(int newsize)
{
    reserve(newsize);
    for(int i = sz; i < newsize; ++i) elem[i] = 0;
    sz = newsize;
}
                            

Can you verify we handle all of these cases correctly?

19.2.4 push_back

What do we do when we run out of space? Interestingly, most implementations of vector ask for double the amount of space the vector is already using. (The Lindy effect!) Here is the code for push_back():

void vector::push_back(double d)
    // increase vector size by one and initialize with d 
{
    if(space == 0) reserve(8);
    else if(sz == space) reserve(2*space);
    elem[sz] = d;  // put d at the end 
    ++sz;  // increase the size 
}
                            

19.2.5 Assignment

Assignment copies the elements to the new vector, but does not worry about the extra space: maybe the new vector won't need it! The code:

vector& vector::operator=(const vector& a)
{
    if(this == &a) return *this;  // self-assignment! 

    if(a.sz <= space) { // enough space! 
// so just copy  
        for(int i = 0; i < a.sz; ++i) p[i] = a.elem[i];
        sz = a.sz;
        return *this;
    }
}
                            

19.2.6 Our vector so far

See our textbook for source code here.

19.3 Templates

We have been coding things like vector<double > without really asking what that is doing. We want to freely specify the element type for our vectors. For example:

  • vector<double>
  • vector<int>
  • vector<Month>
  • vector<char>

"Basically, a template is a mechanism that allows the programmer to use types as parameters for a class or function." (Stroustrup) The compiler will generate code suitable to the specific class we pass. Importantly, not only does the C++ standard library use templates, but we can create our own!

19.3.1 Types as template parameters

We achieve that goal like this:

template<class T> class vector {
// implementation of vectors follows here
}
                            

One way to think of this is that the compiler uses the template and the type we give it to generate the proper code for that type, so that each element of a vector of char takes up 1 byte, and each element of a vector of int takes up 4 bytes.

For member functions, the compiler generates code to handle the appropriate type. So if our template has:

void push_back(const T& d);
                            

Then for our Token vector in Token_stream we get:

void push_back(const Token& d);
                            

19.3.2 Generic programming

"Generic programming: Writing code that works on a variety of types presented as arguments, as long as those argument types meet spcific syntatctic and semantic requirements." (p. 682)

Polymorphism means the "programmer presents many versions of a concept using a single interface." (p. 682) Generic programming relies on types offered at compile time, while object-oriented programming relies on class hierarchies and virtual functions. Templates resolve the concrete method to call at compile time (making them faster), while object-oriented programming resolves the concrete method to use at run-time, making it more flexible.

19.3.3 Concepts

C++ 14 supports concepts, which can be used to characterize what types can fill in the T slot in a template. A few examples:

  • Element can be an element in a container.
  • Container can hold elements.
  • Forward_iterator supports traversing a sequence, such as a linked list, a vector, or an array.

Thus we can write:

template<Element T> class vector { /* */ };
                            

19.3.4 Containers and inheritance

Templates do not support putting a subclass of a type T in a container expecting the parent type.

19.3.5 Integers as template parameters

Interesting but beyond what we can cover.

19.3.6 Template argument deduction

Interesting but beyond what we can cover.

19.3.7 Generalizing vector

Interesting but beyond what we can cover.

19.4 Range checking and exceptions

A discussion of the trade-offs in range-checking every vector access or not.

19.4.1 An aside: design considerations

19.4.2 A confession: macros

19.5 Resources and exceptions

The basic idea here is that when we use new and delete things can happen between them that prevents the delete from ever being called. It is better to use constructor/destructor pairs to manage resources, and keep your objects within a single scope where possible.

19.5.1 Potential resource management problems

19.5.2 Resource acquisition is initialization

19.5.3 Guarantees

19.5.4 unique_ptr

19.5.5 Return by moving

19.5.6 RAII for vector

Test Yourself!
  1. When writing an assignment operator overload, why do we return *this?
    1. Because in C++, assignment returns a value.
    2. Because we want to draw attention to the object that was assigned to.
    3. Because we need to tell the compiler what object we are assigning to.
  2. Why don't we add new memory to our vector's storage on every call to push_back()?
    1. Because the system won't let us make that many calls to malloc.
    2. Because it costs a lot of CPU cycles to do that.
    3. Because we are lazy, and don't want to write that much code.
  3. Generic programming means
    1. programming according to generic standards.
    2. writing code that works with a variety of types presented as arguments.
    3. writing just standard variety code with nothing outstanding about it.
  4. Programming using templates is called
    1. object-disoriented programming
    2. object-oriented programming
    3. generic programming
  5. Programming using class hierarchies is called
    1. object-oriented programming
    2. object-disoriented programming
    3. generic programming
  6. Resolving what type's method will be called at run-time is characteristic of
    1. object-disoriented programming
    2. generic programming
    3. object-oriented programming
Answers

1. a; 2. b; 3. b; 4. c; 5. a; 6. c;

Drill