Iterators

Abstraction is quite likely the most powerful tool in programming. We've seen it applied as "procedural abstraction" (i.e. functions) and in "abstract data types" (i.e. classes), and we'll add another today - abstracting the process of "iteration" or "traversal" over a sequence or a container.

To do this, we'll first define a common interface for iteration. But not all containers will naturally conform to this interface - traversing over an array looks a whole lot different than traversing over a linked list. So, we'll define custom objects called "iterators" for each different kind of sequence or container that act as the "tour guide" that conforms to our common interface but handles the container-specific details behind the scenes.


1: Warm Up Exercise
1.1 Not Started
1.1 Exercise: Warm Up

Let's take a look at two functions that traverse and print out different kinds of containers:

  // Array traversal by pointer
  void print(int *begin, int size) {
    int *end = begin + size;
    for (int *p = begin; p != end; ++p) {
      cout << *p;
    }
  }
  // Linked list traversal via next pointers
  void print (Node *begin) {
    for (Node *p = begin; p; p = p->next) {
       cout << p->datum;
    }
  }

Briefly answer the following questions. (A word or short phrase is sufficient!)

Array Traversal Linked List Traversal
Which variable tracks the current position in the container?
How is it initialized?
How do we move it to the next item in the container?
How do we know we have reached the last item in the container?
How do we access the value of the current item in the container?


2: Iterators: The Big Idea

It would be nicer if we could write a single version of print() that could operate on both arrays and linked lists (and vectors and sets and other containers!). As we alluded to above - we'll need a common abstraction for iteration, which can be used by print() regardless of the container type. As a preview, here's what that function will look like:

template <typename IterType>
void print(IterType begin, IterType end) {
  for (IterType it = begin; it != end; ++it) {
    cout << *it;
  }
}

What is this doing? At a high level, we've got a function template that can be flexible to accommodate different IterType types. When used with a specific container, IterType will match to the type of iterator that container provides. Recall that an iterator is supposed to act like a "tour guide" for a container. With that in mind, we can roughly interpret the rest of the code - we've peforming different operations on the iterator, expecting it to take us through the container's elements. *it should give us access to the current element. ++it should move the iterator onward to the next one.

The last piece of the puzzle is how we get the begin and end iterators to pass into the function. Basically, we ask the container to provide them for us by calling member functions. Here's an example using STL containers, which define these and iterator types, combined with the print() function defined above:

int main() {
  std::vector<int> vec;
  std::list<double> list;
  std::set<string> set;

  // Assume some elements are added to the containers above.
  // The code below will then print out the elements for each!
  print(vec.begin(), vec.end());
  print(list.begin(), list.end());
  print(set.begin(), set.end());
}

In essence, we presume that containers have objects called iterators that we can get by calling .begin() and .end() functions, and that those iterators will support operations like *, ++, etc. to take us on a tour through the element's containers.



3: Linked List Iterators
3.1 Not Started

Let's fill in some more details and work through an example of actually creating an iterator for our linked list class…


In the previous video, we implemented the prefix increment operator, i.e. ++it. What's the difference between this and the postfix increment with it++? Let's take a look and also define the latter for our linked list iterators.


Okay, what about the -- operators?


Now that we've covered the data representation and operator implmentation, we'll look at constructing iterators and how the linked list provides them via the list.begin() and list.end() member functions.


Finally, writing out the type of an iterator like List<int>::Iterator can be a bit obnoxious. But, the C++ auto keyword can make our lives easier here! Let's take a look…


3.1 Exercise: Linked List Memory Management

As we finish building the Iterator class, a reasonable question is whether we need to define custom versions of the "Big Three" functions (i.e. copy constructor, assignment operator, and destructor). After all, the iterator does contain a pointer to a dynamically allocated Node, which is one of the "hints" that a class might need custom Big Three implementations.

Let's work through a couple exercises to assess the situation. First, let's think about shallow vs. deep copies. Consider the code below, and draw out a memory diagram, tracing through the code to the final state of memory (assuming the built-in implementation of the copy constructor for Iterator, which will use a shallow copy).

int main() {
  List<int> list;
  list.push_back(1);
  list.push_back(2);
  list.push_back(3);

  List<int>::Iterator it1 = list.begin();
  ++it1;

  List<int>::Iterator it2 = it1;
  ++it2;

  // Draw memory at this point
}

Consider your diagram…does everything look as it should, even though the copy of the iterator did not also result in a deep copy of the node it was pointing to? Explain your reasoning.

We can also consider whether Iterator needs a custom implementation of the destructor, perhaps something like shown below:

class Iterator {
  friend class List;
public:
  // Public default constructor
  Iterator()
  : node_ptr(nullptr) { }

  // Potential custom destructor - should we add this???
  ~Iterator() {
    delete node_ptr;
  }
private:
  // private constructor
  Iterator(Node *np) : node_ptr(np) { }

  // Member variable - pointer to the current node
  Node *node_ptr;
};

Consider the same main() program from earlier, referring back to your diagram. If we were to add a custom destructor that also deletes the Node the iterator is pointing to, what would happen at the end of this main function when destructors run for it1, it2, and list? (i.e. Would we get any memory errors? If so, what kind?)


You're welcome to check your solution with this walkthrough video:




4: Generalizable Function Templates Using Iterators
4.1 Not Started

Finally, we'll take a look back at our original goal - write flexible functions that treat iteration via iterators as an abstraction so that they aren't fixed to work with only a single kind of container.


4.1 Exercise: Generic length() Function Template with Iterators

Consider a generic length() function that takes in begin/end iterators and computes the length of the sequence they point to (using traversal by iterator and counting the number of steps). We would like the length() function to be useable with any container that supports an iterator interface.

For example, we could use it like this:

int main() {
  List<int> list; // assume it's filled with some numbers
  cout << length(list.begin(), list.end()) << endl;
}

Or like this!

int main() {
  std::vector<Card> cards; // assume it's filled with some cards
  cout << length(cards.begin(), cards.end()) << endl;
}

Determine which of the following potential implementations of length() are correct. Write "correct" or "incorrect". If they are not correct, additionally describe what's wrong with them.

// Implementation A
template <typename Iter_type>
int length(Iter_type begin, Iter_type end) {
  int count = 0;
  List<int>::iterator it = begin;
  while(it != end) {
    ++count;
    ++it;
  }
  return count;
}
// Implementation B
template <typename Iter_type>
int length(Iter_type begin, Iter_type end) {
  int count = 0;
  for(Iter_type it = begin; it < end; ++it) {
    ++count;
  }
  return count;
}
// Implementation C
template <typename Iter_type>
int length(Iter_type begin, Iter_type end) {
  int count = 0;
  while(begin != end) {
    ++count;
    ++begin;
  }
  return count;
}
// Implementation D
template <typename Iter_type>
int length(Iter_type begin, Iter_type end) {
  return end - begin;
}


You're welcome to check your solution with this walkthrough video:




5: Iterator Validity

One last thing… Iterators are kind of like "fancy pointers", and we've got the concept of a "dangling pointer" (a pointer to an object that's no longer safe to use). We have a parallel concept for iterators, referred to as an "invalid", "invalidated", or "dangling" iterator.


You've reached the end of this lecture! Your work on any exercises will be saved if you re-open this page in the same web browser.

Participation Credit
Make sure to sign in to the page, complete each of the exercises, and double check the participation indicator at the top left of this page to ensure you've earned credit.