Array-Based Data Structures

In this lecture and the next, we'll implement a set (an associative container for storing unique elements) based on an array.

We'll be following our normal process for building an ADT - starting with our motivating use cases and the interface we want, followed by a fundamental data representation and invariants, and finally filling in the implementations for each member function.

We'll also introduce the notion of time complexity to formally analyze the efficiency of operations on data structures.

It turns out that arrays (and data structures built using them) provide very efficient access to data in a couple different ways:

  • Sequential Access: Iterating through a sequence of elements from start to end.

  • Random Access: Accessing an element at a particular index (i.e. position) in the sequence.

    On the other hand, arrays are less efficient in cases where we need to:

  • Shift Elements: If we need to preserve the order of elements after inserting/removing from the middle of an array, we have to shift them one-by-one.

Updated Winter 2025

1: Containers and Data Structures

First, a brief introduction to the big-picture idea of containers and data structures, including a roadmap to building one ourselves:




2: An Array-Based Unsorted Set

We'll start with an introduction to sets and their interfaces, as well as some motivation for why we would want to use them.


We haven't seen static member variables before. The basic idea is that the variable will be shared by all instances of the class, rather than each having their own in memory. Here's the details:


Our next step is to choose a data representation and invariants. This ends up as the foundation for the data structure, the implementation of its functions, and the efficiency we can achieve. Throughout the course, we'll end up looking at several different possibilities for an unsorted set. We'll start here with an unsorted array.




3: IntSet Implementation
3.1 Not Started 3.2 Not Started

Let's get into the implementation of a default constructor and a few member functions for the array-based unsorted set.


3.1 Exercise: IntSet::insert()

Implement the insert() member function for the IntSet class, which adds a given value v to the set.

First, your code should call contains() as a helper to check if v is already in the set:

  • If the given value is not already in the set, it should add the value to the next available position in the elts array and increase elts_size by 1.
  • If the value is already in the set, insert() does nothing.

The main() function provided includes testing code to verify your implementation. Note that you should not worry about implementing remove() yet… save that for the next exercise below.


You're welcome to check your solution with this walkthrough video:

Note: This walkthrough uses several different files for the code, which is different than the above, where we had everthing embedded into one file. (The solution for insert() is the same, though!)



3.2 Exercise: IntSet::remove()

Below are some potential implementations of the remove() function for IntSet. Which ones work correctly?

It may be helpful to trace through the code on this set, removing the 1, for example:

Or, you might also consider pasting them into the code for the exercise above and uncommenting the additional set of tests in main() for the remove() function.

Which of the implementations of remove() below are correct? Write "correct" or "incorrect". For each that is not correct, explain what's wrong with it.

// Potential Implementation 1
void remove(int v) {
  int i = indexOf(v);
  if (i == -1) { return; }
  elts[i] = elts[i+1];
  --elts_size;
}
// Potential Implementation 2
void remove(int v) {
  int i = indexOf(v);
  if (i == -1) { return; }
  elts[i] = elts[elts_size-1];
  --elts_size;
}
// Potential Implementation 3
void remove(int v) {
  int i = indexOf(v);
  if (i == -1) { return; }
  elts[i] = elts[0];
  ++elts;
  --elts_size;
}
// Potential Implementation 4
void remove(int v) {
  int i = indexOf(v);
  if (i == -1) { return; }
  for( ; i < elts_size-1 ; ++i) {
    elts[i] = elts[i+1];
  }
  --elts_size;
}

Two of the implementations above for remove() work correctly. Which one is the most efficient for sets with lots of elements? How does this fit in with what the representation invariants require (or rather, what they don't require)?


You're welcome to check your solution with this walkthrough video:


NOTE: For completeness, I'll mention here that ultimately both implementations that work correctly end up having similar runtime complexities in that their runtime scales linearly with the amount of elements in the set - that's because even though the one I describe in the walkthrough video does less work to remove the element, it still needs to call indexOf, which has a linear runtime. We'll talk more about time complexity in the next lecture.



4: Time Complexity
4.1 Not Started

When we're asessing the fitness of a data structure for a given task, it's helpful to determine its time complexity, which quantifies how well it performs as the size of the data we're working with scales up.


In particular, let's look at the efficiency of several common operations on arrays:


4.1 Exercise: Time Complexity

Determine the time complexity of operations on our unsorted IntSet. Recall the underlying data representation based on an (unsorted) array with no duplicates:

class IntSet {
private:
  int elts[ELTS_CAPACITY]; // INVARIANT: No duplicates
  int elts_size;
public:
  ...
};

Below are several (correct) implementations of IntSet functions. Using a worst-case analysis, determine whether each function has O(1) constant time complexity or O(n) linear time complexity. Explain your reasoning.

// IntSet constructor
IntSet::IntSet()
  : elts_size(0) { }
IntSet::size() {
  return elts_size;
}
int IntSet::indexOf(int v) const {
  for (int i = 0; i < elts_size; ++i) {
    if (elts[i] == v) {
        return i;
    }
  }
  return -1;
}

Hint: for the functions below, make sure to consider their calls other functions when determining their overall complexity.

bool IntSet::contains(int v) const {
  return indexOf(v) != -1;
}
void IntSet::insert(int v) {
  assert(size() < ELTS_CAPACITY);
  if (contains(v)) {
    return;
  }
  elts[elts_size] = v;
  ++elts_size;
}
void IntSet::remove(int v) {
  if (!contains(v)) {
    return;
  }
  elts[indexOf(v)] = elts[elts_size - 1];
  --elts_size;
}


You're welcome to check your solution with this walkthrough video:


You've reached the end of this lecture! Your work on any exercises will be saved if you re-open this page in the same web browser.

Participation Credit
Make sure to sign in to the page, complete each of the exercises, and double check the participation indicator at the top left of this page to ensure you've earned credit.