# Algorithm problems - prep set 2

## Prep set 2

Below is a very small set of questions to prepare for an interview.

### QUESTION 1: Return a random subset of X integers from an array of integers (MED)

I was asked this in an interview and kind of bombed it. My mistake was thinking it was a "warm up question" because it sounded so easy, and I'd solved the same problem recently in Python by simply calling random shuffle then resize. Instead I should have applied my Programming Interview Strategy Sheet and asked up front: "is the original array mutable?", "how big can this array be?", "am I memory bound?", and "what should I return for an out-of-bounds x value?".

The quick solution is O(n) with `std::shuffle` (`std::random_shuffle` did the same job, but it was deprecated in C++14 and removed in C++17):

```cpp
vector<int> randSubsetTheEasyWay(vector<int> v, const int x) {
    // Check x is a sensible value first.
    std::shuffle(v.begin(), v.end(), std::mt19937{std::random_device{}()});
    v.resize(x);
    return v;
}
```

I suggested we could look at each element and give it an (x_remaining / n_remaining) chance of selection, but I worried that method wasn't totally random, so I abandoned it. Then I realized it could be a waste to shuffle/process all n elements if only x elements need shuffling... so I came up with:
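In hindsight, that per-element probability idea does work: taking each element with probability (still needed / still remaining) is exactly Knuth's selection-sampling technique (Algorithm S), and it produces a uniformly random subset in a single pass. A sketch, assuming a `std::mt19937` generator (the helper name is mine, not from the interview):

```cpp
#include <random>
#include <vector>
using std::vector;

// Selection sampling (Knuth's Algorithm S): walk the array once and take each
// element with probability needed/remaining. When needed == remaining every
// element left is taken, so exactly x elements come out, in original order.
vector<int> randSubsetOnePass(const vector<int>& v, const int x) {
    static std::mt19937 gen{std::random_device{}()};
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    vector<int> out;
    int needed = x;
    for (size_t i = 0; i < v.size() && needed > 0; i++) {
        int remaining = (int)v.size() - (int)i;
        if (coin(gen) * remaining < needed) {  // P(take) = needed / remaining.
            out.push_back(v[i]);
            needed--;
        }
    }
    return out;
}
```

It never mutates or copies the input, but it is always O(n) time, so it doesn't help the "million elements, pick two" case below.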

```cpp
vector<int> randSubset(vector<int> v, int x) {
    // Check x is a sensible value first.
    x = std::min<int>(x, v.size());
    static std::mt19937 gen{std::random_device{}()};
    for (int i = 0; i < x; i++) {
        // Pick among the remaining elements [i, n).
        std::uniform_int_distribution<int> dist(i, (int)v.size() - 1);
        std::swap(v[i], v[dist(gen)]);  // Swap the randomly picked element to the front.
    }
    v.resize(x);
    return v;
}
```

I also pointed out that if x is greater than half of n, it is more efficient to randomly swap (n - x) elements to the end and resize to x as normal. Both, however, suffer from O(n) extra memory because they copy the input. What if we had a million elements and only needed to pick two? I knew we needed a way to mark what had already been selected, and I very briefly mentioned a hash table, but my fault was that I didn't program it. That was probably the answer my interviewer wanted, because it doesn't require O(n) extra space, only O(x) extra space and ~O(x) processing time. That would look like this:

```cpp
vector<int> randSmallSubset(const vector<int>& v, const int x) {
    // Check x is a sensible value first.
    static std::mt19937 gen{std::random_device{}()};
    std::uniform_int_distribution<int> dist(0, (int)v.size() - 1);
    std::unordered_set<int> subsetIndexes;
    for (int i = 0; i < x;) {
        int randidx = dist(gen);  // Pick among all elements in the vector.
        if (subsetIndexes.find(randidx) == subsetIndexes.end()) {  // Check for collisions.
            subsetIndexes.insert(randidx);
            i++;
        }
    }

    vector<int> returnV;
    returnV.reserve(x);  // reserve, not resize, since we push_back below.
    for (const int idx : subsetIndexes)
        returnV.push_back(v[idx]);
    return returnV;
}
```

So now we have an O(x) solution, or O(x log x) if we used a `set` (binary search tree) instead. However, it all depends on the size of x relative to n... so really you'd want a big if statement to help decide! Ultimately I believe there is no single solution for this question - you have to discuss the trade-offs of each. A final solution might look something like this, then:

```cpp
vector<int> randSubsetAdvanced(const vector<int>& v, const int x) {
    // Check x is a sensible value:
    if (x <= 0) return vector<int>();
    if (x >= (int)v.size()) return v;

    // Determine the best subroutine:
    if (x < (int)v.size() / 2) {
        return randSmallSubset(v, x);
    } else {
        return randSubset(v, x);
    }
}
```
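As an aside I only learned afterwards: since C++17 the standard library solves this question directly with `std::sample`, which picks up to x elements without modifying the source range. A minimal sketch:

```cpp
#include <algorithm>
#include <iterator>
#include <random>
#include <vector>
using std::vector;

// std::sample (C++17) copies min(x, v.size()) uniformly chosen elements to the
// output iterator, preserving their relative order in v.
vector<int> randSubsetStd(const vector<int>& v, const int x) {
    vector<int> out;
    if (x > 0) {
        std::sample(v.begin(), v.end(), std::back_inserter(out),
                    x, std::mt19937{std::random_device{}()});
    }
    return out;
}
```

In a real interview you'd probably still be expected to implement it by hand, but mentioning `std::sample` shows you know the library.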

### QUESTION 2: What are some of the changes in C++11?

This probably isn't something you memorize, but maybe you use a few:

- `nullptr`: replaces the bug-prone `NULL` macro.
- defaulted and deleted functions:
  - `struct A { A() = default; virtual ~A() = default; };`
  - `struct NoCopy { NoCopy& operator=(const NoCopy&) = delete; NoCopy(const NoCopy&) = delete; }; NoCopy a; NoCopy b(a); // compilation error, copy ctor is deleted`
- automatic type deduction: `auto` (great for loops etc.).
- Standard Library changes:
  - smart pointers: `shared_ptr`, `unique_ptr`, `weak_ptr` (and `auto_ptr` was deprecated in their favor).
  - new algorithms: `all_of()`, `any_of()`, `none_of()`, and `copy_n()`.
  - new standard data structures: `unordered_set`, `unordered_map`, `unordered_multiset`, and `unordered_multimap`.
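A few of those features (plus `auto`, range-for, and lambdas) fit in one small snippet - a sketch I put together for practice, not from the interview:

```cpp
#include <algorithm>
#include <memory>
#include <unordered_set>
#include <vector>

struct NoCopy {
    NoCopy() = default;                          // Defaulted constructor.
    NoCopy(const NoCopy&) = delete;              // NoCopy b(a); won't compile.
    NoCopy& operator=(const NoCopy&) = delete;   // Assignment also deleted.
};

// Exercises auto, range-for, lambdas, all_of(), unique_ptr,
// unordered_set, and nullptr; returns true if everything behaved.
bool demoCpp11() {
    std::vector<int> v{1, 2, 3};                 // Brace initialization.
    for (auto& e : v) e *= 2;                    // Range-for with auto.
    bool allEven = std::all_of(v.begin(), v.end(),
                               [](int e) { return e % 2 == 0; });  // Lambda + all_of().
    std::unique_ptr<int> p(new int(42));         // unique_ptr (make_unique is C++14).
    std::unordered_set<int> s(v.begin(), v.end());  // Hash set: O(1) average lookup.
    int* raw = nullptr;                          // nullptr instead of the NULL macro.
    return allEven && *p == 42 && s.count(4) == 1 && raw == nullptr;
}
```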