Algorithm problems - prep set 2

From NoskeWiki

About

NOTE: This page is a daughter page of: Algorithm problems



Prep set 2

Below is a very small set of questions to prepare for an interview.


QUESTION 1: Return a random subset of X integers from an array of integers (MED)

I was asked this in an interview and kind of bombed it. My mistake was thinking it was a "warm up question" because it sounded so easy, and I'd solved the same problem recently in Python by simply calling random.shuffle and then resizing. Instead I should have applied my Programming Interview Strategy Sheet, which would have prompted me to ask up front: "is the original array mutable?", "how big can this array be?", "am I memory bound?", "what should I return for an out-of-bounds X value?".

The quick solution is O(n) with "std::random_shuffle":

vector<int> randSubsetTheEasyWay(vector<int> v, const int x) {
  // Check X is sensible value.
  std::random_shuffle(v.begin(), v.end());
  v.resize(x);
  return v;
}
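
As a side note, std::random_shuffle was deprecated in C++14 and removed in C++17, so a present-day version of the same idea would use std::shuffle with an explicit random engine. A quick sketch (randSubsetModern is my own name for it, not from the interview):

#include <algorithm>
#include <random>
#include <vector>

std::vector<int> randSubsetModern(std::vector<int> v, const int x) {
  // Same approach as above, but with std::shuffle and a seeded Mersenne Twister.
  std::random_device rd;
  std::mt19937 gen(rd());
  std::shuffle(v.begin(), v.end(), gen);
  v.resize(x);
  return v;
}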

I suggested we could look at each element and give it an (x_remaining / n_remaining) chance of selection, but I second-guessed whether that method was truly random and abandoned it. Then I realized it could be a waste to shuffle/process all n elements if only x elements need shuffling... so I came up with:

vector<int> randSubset(vector<int> v, int x) {
  // Check X is a sensible value.
  x = std::min(x, (int)v.size());
  for (int i = 0; i < x; i++) {
    int randIdx = i + rand() % (v.size() - i);  // Pick among the remaining elements [i, size-1].
    std::swap(v[i], v[randIdx]);                // Swap the randomly picked element to the front.
  }
  v.resize(x);
  return v;
}

I also pointed out that if x is greater than n/2 then it is more efficient to randomly swap (n - x) elements to the end and resize to x as normal (a sketch of that variant appears after the next code block). Both, however, suffer from O(n) extra memory because they copy the whole vector. What if we had a million elements and only needed to pick two? We need a way to mark what's already been selected, and I very briefly mentioned a hash table, but my mistake was that I didn't actually code it. That was probably the answer my interviewer wanted, because it doesn't require O(n) extra space - only O(x) extra space and ~O(x) expected processing time. That would look like this:

vector<int> randSmallSubset(const vector<int>& v, const int x) {
  // Check X is a sensible value.
  std::unordered_set<int> subsetIndexes;
  for (int i = 0; i < x;) {
    int randIdx = rand() % v.size();                            // Pick among all elements in the vector.
    if (subsetIndexes.find(randIdx) == subsetIndexes.end()) {   // Check for collisions.
      subsetIndexes.insert(randIdx);
      i++;
    }
  }

  vector<int> returnV;
  returnV.reserve(x);                  // Reserve (not pre-size) so push_back fills exactly x slots.
  for (const int idx : subsetIndexes)
    returnV.push_back(v[idx]);
  return returnV;
}
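
For completeness, here is a rough sketch of the "swap (n - x) elements to the end" variant mentioned above for the case where x is more than half of n (randLargeSubset is a hypothetical name, not from the interview):

#include <algorithm>
#include <cstdlib>
#include <vector>

std::vector<int> randLargeSubset(std::vector<int> v, const int x) {
  // Shuffle the *back* (n - x) positions instead of the front x positions,
  // then truncate; the first x elements left behind form a uniform random subset.
  for (int i = (int)v.size() - 1; i >= x; i--) {
    int randIdx = rand() % (i + 1);   // Pick among elements [0, i].
    std::swap(v[i], v[randIdx]);      // Move the picked element to the back.
  }
  v.resize(x);
  return v;
}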

So now we have an O(x) expected-time solution, or O(x log x) if we used an ordered set (binary search tree) instead of a hash set. However, it all depends on the size of x relative to n... so really you'd want an if statement to help decide! Ultimately I believe there is no single best solution for this question - you have to discuss the trade-offs of each. A final solution might look something like this:

vector<int> randSubsetAdvanced(const vector<int>& v, const int x) {
  // Check X is a sensible value:
  if (x <= 0) { vector<int> empty_vector; return empty_vector; }
  if (x >= (int)v.size()) { vector<int> identical_vector(v); return identical_vector; }

  // Determine best subroutine:
  if (x < (int)v.size() / 2) {
    return randSmallSubset(v, x);
  } else {
    return randSubset(v, x);
  }
}
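
A minimal usage sketch, assuming the functions above are in the same file with the needed headers, and that rand() is seeded once:

#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <vector>

int main() {
  srand(time(nullptr));                                         // Seed rand() once per run.
  std::vector<int> nums = {10, 20, 30, 40, 50, 60, 70, 80};
  std::vector<int> subset = randSubsetAdvanced(nums, 3);        // Expect 3 distinct values from nums.
  for (int value : subset)
    printf("%d ", value);
  printf("\n");
  return 0;
}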


QUESTION 2: What are some of the changes in C++11?

This probably isn't something you memorize, but you can probably name a few (a small example snippet follows the list):

  • nullptr: replaces the bug-prone NULL macro.
  • defaulted & deleted functions:
    • struct A { A()=default; virtual ~A()=default; };
    • struct NoCopy { NoCopy & operator =( const NoCopy & ) = delete; NoCopy ( const NoCopy & ) = delete; }; NoCopy a; NoCopy b(a); //compilation error, copy ctor is deleted
  • automatic type deduction: auto (great for loops etc).
  • Standard Library changes:
    • smart pointers: shared_ptr, unique_ptr, weak_ptr (the old auto_ptr is deprecated).
    • new algorithms: all_of(), any_of(), none_of() and copy_n().
    • new standard data structures: unordered_set, unordered_map, unordered_multiset, and unordered_multimap.
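
A tiny snippet exercising a few of the features above (just an illustration, not from the original interview):

#include <algorithm>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

int main() {
  int* ptr = nullptr;                                       // nullptr instead of NULL.
  auto counts = std::unordered_map<std::string, int>();     // auto + new hash-map container.
  counts["apples"] = 3;

  std::unique_ptr<std::vector<int>> nums(new std::vector<int>{1, 2, 3});  // Smart pointer.
  bool allPositive = std::all_of(nums->begin(), nums->end(),
                                 [](int n) { return n > 0; });            // New algorithm + lambda.

  return (ptr == nullptr && allPositive && counts["apples"] == 3) ? 0 : 1;
}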

Here's a bigger list.


See Also