Algorithm problems: prep set 2
Prep set 2
Below is a very small set of questions to prepare for an interview.
QUESTION 1: Return a random subset of X integers from an array of integers (MED)
I was asked this in an interview and kind of bombed it. My mistake was thinking it was a "warm up question" because it sounded so easy, and I'd solved the same problem recently in Python by simply calling random.shuffle then resizing. Instead I should have applied my Programming Interview Strategy Sheet and asked up front: "is the original array mutable?", "how big can this array be?", "am I memory bound?", "what should I return for an out-of-bounds X value?".
The quick solution is O(n) with "std::random_shuffle":
vector<int> randSubsetTheEasyWay(vector<int> v, const int x) {
    // Check x is a sensible value.
    std::random_shuffle(v.begin(), v.end());
    v.resize(x);
    return v;
}
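As an aside, std::random_shuffle was deprecated in C++14 and removed in C++17, so a modern version of the same shuffle-and-truncate idea would use std::shuffle with an explicit engine. A minimal sketch (the function name is my own):

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Same idea as randSubsetTheEasyWay, but with the C++11 <random>
// facilities: shuffle the whole copy, then keep the first x elements.
std::vector<int> randSubsetModern(std::vector<int> v, int x) {
    // Clamp x to a sensible range instead of assuming the caller checked.
    if (x < 0) x = 0;
    if (x > static_cast<int>(v.size())) x = v.size();
    std::mt19937 gen(std::random_device{}());
    std::shuffle(v.begin(), v.end(), gen);
    v.resize(x);
    return v;
}
```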
I suggested we could look at each element and give it an (x_remaining / n_remaining) chance of selection, but in the interview I worried that method wasn't truly uniform, so I abandoned it (it is in fact uniform; it's known as selection sampling). Then I realized it could be a waste to shuffle/process all n elements if only x elements need shuffling... so I came up with:
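For the record, that one-pass (x_remaining / n_remaining) idea is Knuth's selection sampling, and it always returns exactly x elements with uniform probability. A hedged sketch, with a name of my own choosing:

```cpp
#include <cstdlib>
#include <vector>

// Selection sampling: examine each element once and accept it with
// probability (picks still needed) / (elements not yet examined).
// Always returns exactly min(x, v.size()) elements, in original order.
std::vector<int> selectionSample(const std::vector<int>& v, int x) {
    std::vector<int> out;
    for (size_t i = 0; i < v.size() && x > 0; ++i) {
        int remaining = v.size() - i;   // Elements not yet examined.
        if (rand() % remaining < x) {   // Accept with probability x / remaining.
            out.push_back(v[i]);
            --x;
        }
    }
    return out;
}
```

Note it needs no extra memory beyond the output and preserves the input order, but it always walks all n elements.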
vector<int> randSubset(vector<int> v, int x) {
    // Check x is a sensible value.
    x = std::min<int>(x, v.size());
    for (int i = 0; i < x; i++) {
        int randidx = i + rand() % (v.size() - i); // Pick among the remaining elements.
        std::swap(v[i], v[randidx]);               // Swap the picked element to the front.
    }
    v.resize(x);
    return v;
}
I also pointed out that if x is greater than half of n, it is more efficient to randomly swap (n - x) elements to the end and resize to x as before. Both, however, suffer from O(n) extra memory, since they copy the whole vector. What if we had a million elements and only needed to pick two? I knew that we needed a way to mark what had already been selected, and I very briefly mentioned a hash table, but my fault was that I didn't program it. That was probably the answer my interviewer wanted, because it doesn't require O(n) extra space, only O(x) extra space and ~O(x) expected processing time. That would look like this:
vector<int> randSmallSubset(const vector<int>& v, const int x) {
    // Check x is a sensible value.
    std::unordered_set<int> subsetIndexes;
    for (int i = 0; i < x;) {
        int randidx = rand() % v.size(); // Pick among all elements in the vector.
        if (subsetIndexes.find(randidx) == subsetIndexes.end()) { // Check for collisions.
            subsetIndexes.insert(randidx);
            i++;
        }
    }
    vector<int> returnV;
    returnV.reserve(x);
    for (const int idx : subsetIndexes) returnV.push_back(v[idx]);
    return returnV;
}
So now we have an ~O(x) expected-time solution, or O(x log x) if we used a set (binary search tree) instead. However, it all depends on the size of x relative to n... so really you'd want a big if statement to help decide! Ultimately I believe there is no single solution for this question; you just have to discuss the benefits of each. A final solution might look something like this then!:
vector<int> randSubsetAdvanced(const vector<int>& v, const int x) {
    // Check x is a sensible value:
    if (x <= 0) {
        return vector<int>();
    }
    if (x >= static_cast<int>(v.size())) {
        return v;
    }
    // Determine the best subroutine:
    if (x < static_cast<int>(v.size()) / 2) {
        return randSmallSubset(v, x);
    } else {
        return randSubset(v, x);
    }
}
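Putting the pieces together, here is a self-contained sketch of the whole dispatching approach; the bodies follow the versions above, slightly tidied, and assume x has already been clamped by the dispatcher:

```cpp
#include <algorithm>
#include <cstdlib>
#include <unordered_set>
#include <vector>
using std::vector;

// Partial Fisher-Yates: shuffle only the first x slots, then truncate.
// Copies the vector, so O(n) extra memory; good when x is close to n.
vector<int> randSubset(vector<int> v, int x) {
    x = std::min<int>(x, v.size());
    for (int i = 0; i < x; ++i) {
        int randidx = i + rand() % (v.size() - i);
        std::swap(v[i], v[randidx]);
    }
    v.resize(x);
    return v;
}

// Hash-set of chosen indexes: O(x) extra space, ~O(x) expected time.
// Assumes x <= v.size(); good when x is much smaller than n.
vector<int> randSmallSubset(const vector<int>& v, int x) {
    std::unordered_set<int> subsetIndexes;
    while (static_cast<int>(subsetIndexes.size()) < x)
        subsetIndexes.insert(rand() % v.size()); // Duplicates are ignored.
    vector<int> out;
    out.reserve(x);
    for (int idx : subsetIndexes) out.push_back(v[idx]);
    return out;
}

// Dispatcher: handle edge cases, then pick the cheaper subroutine.
vector<int> randSubsetAdvanced(const vector<int>& v, int x) {
    if (x <= 0) return {};
    if (x >= static_cast<int>(v.size())) return v;
    if (x < static_cast<int>(v.size()) / 2) return randSmallSubset(v, x);
    return randSubset(v, x);
}
```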
QUESTION 2: What are some of the changes in C++11?
This probably isn't something you memorize, but you can mention a few you actually use:
- nullptr: replaces the bug-prone NULL macro.
- defaulted & deleted functions:
  struct A { A() = default; virtual ~A() = default; };
  struct NoCopy {
      NoCopy& operator=(const NoCopy&) = delete;
      NoCopy(const NoCopy&) = delete;
  };
  NoCopy a;
  NoCopy b(a); // compilation error, copy ctor is deleted
- automatic type deduction: auto (great for loops etc.).
- Standard Library changes:
  - smart pointers: shared_ptr and unique_ptr (while auto_ptr becomes deprecated).
  - new algorithms: all_of(), any_of(), none_of() and copy_n().
  - new standard data structures: unordered_set, unordered_map, unordered_multiset, and unordered_multimap.
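A toy example of my own that exercises several of these features together (nullptr, defaulted/deleted members, auto, smart pointers, unordered_map, and all_of) might look like this:

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Defaulted and deleted special member functions: default-constructible,
// but copying is a compile-time error.
struct NoCopy {
    NoCopy() = default;
    virtual ~NoCopy() = default;
    NoCopy(const NoCopy&) = delete;
    NoCopy& operator=(const NoCopy&) = delete;
};

// auto type deduction in a range-for over an unordered_map.
int sumOfEvenAges(const std::unordered_map<std::string, int>& ages) {
    int sum = 0;
    for (const auto& kv : ages)
        if (kv.second % 2 == 0) sum += kv.second;
    return sum;
}

// nullptr, unique_ptr/shared_ptr, and the new all_of algorithm.
bool cpp11Demo() {
    int* raw = nullptr;                                  // nullptr replaces NULL
    std::unique_ptr<int> up(new int(42));                // sole ownership
    std::shared_ptr<int> sp = std::make_shared<int>(7);  // shared ownership
    std::vector<int> xs = {2, 4, 6};
    bool allEven = std::all_of(xs.begin(), xs.end(),
                               [](int n) { return n % 2 == 0; });
    return raw == nullptr && *up == 42 && *sp == 7 && allEven;
}
```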