## Monday, July 25, 2016

### google interview: compute the minimal coins - asymptotically constant version

I have cracked the asymptotically constant version of the min coin count algorithm..

// http://www.impactinterview.com/2009/10/140-google-interview-questions/#software_engineer
//
// Given a set of coin denominators, find the minimum number of coins to give a
// certain amount of change.

#include <cstdint>
#include <climits>
#include <vector>
#include <iostream>
#include <functional>
#include <chrono>
#include <memory>
#include <list>
#include <queue>

// I knew there was an asymptotically constant version of this algorithm, and I
// just cracked it using top down dynamic programming combined with the squeeze
// theorem (the same way that Archimedes found pi)
//
// https://en.wikipedia.org/wiki/Squeeze_theorem
// https://betterexplained.com/articles/prehistoric-calculus-discovering-pi/

// how it works:

// First off we are going to search with a dynamic programming version of
// depth first search.  We search by choosing a count for a coin denomination
// and then moving to the next denomination, searching all the way down each
// branch until we exhaust it.  Since the algorithm is an iterative version of
// DFS, it uses an explicit stack rather than recursion.

// Also note that since we are using the squeeze theorem we need to track an
// upperBound (the known best) and a lowerBound (the theoretical floor that an
// unknown branch might still achieve)

// So by using DFS we can get down to the last coin denomination and get a
// solid result quickly, and that potentially tightens the upperBound.  As the
// upperBound tightens we are able to trim the search tree more, and as a
// result speed up the overall search.

// 1) First we set the best and the initial upperBound to the max number of
// coins possible + 1.  The upperBound is the same as our best known min coin
// count, which at the start isn't anything yet.  Remember, when the upperBound
// on coin count is exceeded the search stops looking down that branch.

// 2) Next initialize the stack.  We only stack partial computation steps.
// The first item to add is the max number of coins we could take of the
// largest denomination; later we will decrement from this until we reach zero
//
// 3) Now start the iteration loop.  Pop off the current working item (note:
// the actual algorithm is optimized and only pops from the stack if it has
// really finished that denomination).  There are 2 updates to make to the
// stack:
//
// 3a) Make a copy of the current item and reduce its coin count for the
// current denomination, then compute the theoretical lowerBound for the
// remaining total (ie the max coins of the highest denomination that remains,
// excluding the current denomination and above).  If that theoretical
// lowerBound plus the coins already used exceeds the upperBound, prune the
// entire branch: it cannot possibly beat the current known min.  If it is
// better than or equal, push it back onto the stack as it may beat the
// current best
//
// 3b) Then make another copy of the current item as the next item, moving to
// the next highest coin denomination (excluding the ones we have already
// chosen coins for); compute the max coins that denomination can use and add
// it to the next item.  Then check for the terminal case (no change left to
// give).  If it isn't terminal, push it onto the stack (it will pop off as
// the next current); if it is terminal, compare it to the upperBound and, if
// it beats that, update the best min and the upperBound
//
// Repeat until the stack runs out, then return the best min.
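The pruning test in step 3a is the heart of the squeeze. It can be sketched in isolation like this (a minimal illustration; `branchCanBeat` and its parameter names are invented here, the real code below works on stack slots instead):

```cpp
#include <cstdint>

// Step 3a's prune: even if the rest of the change were paid entirely with the
// largest remaining denomination (the theoretical lowerBound), could this
// branch still match or beat the current upperBound?
bool branchCanBeat(uint32_t coinsUsedSoFar,
                   uint32_t remainingTotal,
                   uint32_t largestRemainingDenom,
                   uint32_t upperBound)
{
    // fewest coins the remaining total could possibly take
    uint32_t lowerBound = remainingTotal / largestRemainingDenom;
    return coinsUsedSoFar + lowerBound <= upperBound;
}
```

For total 23 with denominations {20,15,5,4,1} and an upperBound of 4, the branch that takes zero 20s survives (23/15 is 1 coin at best), while any branch that has already spent 5 coins with change still owed is cut.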

// So let's get started.  Note I have included the greedy and the bottom up
// with reach pruning versions for comparison
// A greedy implementation - a quick greedy way to the answer is to maximize
// the largest coin
//
// Note this can give the *wrong* answer
// Note expects reverse sorted coins (largest to smallest in denom)
std::vector<uint32_t> minCoins_greedy(uint32_t total,
                                      const std::vector<uint32_t>& denom)
{
    std::vector<uint32_t> res(denom.size()+1, 0);

    if (total == 0) return res;

    for (int i = 0; i < denom.size(); ++i)
    {
        uint32_t count = total / denom[i];
        total -= (count * denom[i]);

        res[i] = count;
        res[denom.size()] += count;
    }

    // could not make exact change with these denominations
    if (total != 0) res[denom.size()] = INT_MAX;

    return res;
}
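As a quick demonstration of the *wrong* answer warning, here is a count-only restatement of the same greedy rule (`greedyCount` is a name invented for this sketch): for 23 with {20,15,5,4,1} greedy pays 20+1+1+1 (4 coins) while the true minimum is 15+4+4 (3 coins).

```cpp
#include <cstdint>
#include <vector>

// Count-only version of the greedy rule above: take as many of each coin as
// fits, largest denomination first.
uint32_t greedyCount(uint32_t total, const std::vector<uint32_t>& denom)
{
    uint32_t count = 0;
    for (uint32_t d : denom)
    {
        count += total / d;  // take as many of this coin as possible
        total %= d;          // pay down the remainder
    }
    return count;  // assumes the denominations can reach 0 (e.g. they contain 1)
}
```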

// ok so we can improve the bottom up DP version and shave a bit more time by
// visiting items in the order of current coin count; the result is a
// BFS-style pruned search
//
// Note expects reverse sorted coins (largest to smallest in denom)
std::vector<uint32_t> minCoins_botUpReach(uint32_t total,
                                          const std::vector<uint32_t>& denom)
{
    if (total == 0) return std::vector<uint32_t>(denom.size()+1, 0);

    // setup memory
    std::vector< std::vector<uint32_t> > cache(total + 1, std::vector<uint32_t>(denom.size()+1, 0));
    std::list<uint32_t> queue;

    queue.push_back(0);
    while (not queue.empty())
    {
        uint32_t curTotal = queue.front();
        queue.pop_front();

        for (int i = 0; i < denom.size(); ++i)
        {
            uint32_t newTotal = curTotal + denom[i];

            if (newTotal <= total and
                cache[newTotal][denom.size()] == 0)
            {
                // first time this total has been reached
                cache[newTotal] = cache[curTotal];
                ++(cache[newTotal][i]);
                ++(cache[newTotal][denom.size()]);

                if (newTotal == total)
                    return cache[total];

                queue.push_back(newTotal);
            }
        }
    }

    // impossible case
    cache[0][denom.size()] = INT_MAX;
    return cache[0];
}

// ok another idea: we can reformulate this as top down upperBound/lowerBound
// squeeze pruning of the search tree
//
// Note expects reverse sorted coins (largest to smallest in denom)
std::vector<uint32_t> minCoins_topSqueeze(uint32_t total,
                                          const std::vector<uint32_t>& denom)
{
    const int LastIdx       = denom.size() - 1;
    const int TotalCoinsIdx = denom.size();
    const int RemTotalIdx   = denom.size() + 1;
    const int WorkingIdx    = denom.size() + 2;
    const int Size          = denom.size() + 3;

    // prune the total of 0 corner case
    if (total == 0) return std::vector<uint32_t>(denom.size()+1, 0);

    std::vector<uint32_t> best(Size);
    best[TotalCoinsIdx] = INT_MAX;

    // handle the 1 coin corner case
    if (denom.size() < 2)
    {
        if (denom.size() == 1)
        {
            best[0] = total / denom[0];
            best[TotalCoinsIdx] = (best[0] * denom[0]) == total ? best[0] : INT_MAX;
        }
        return best;
    }

    // what's my best guess min (upperBounds)?
    // don't use INT_MAX -- we are doing maths on it (make it overflowed max)
    uint32_t upperBounds = total + 1;

    // since we move through the denom vector, its size is also the upper
    // limit of the stack size
    std::vector< std::vector<uint32_t> > stack(denom.size(), std::vector<uint32_t>(Size));
    int stackTopIdx = 0;

    // compute the max coins for the first layer -- that's the starting point
    stack[0][0]             = total / denom[0];
    stack[0][TotalCoinsIdx] = stack[0][0];                     // total coin count
    stack[0][RemTotalIdx]   = total - (stack[0][0]*denom[0]);  // remaining total
    stack[0][WorkingIdx]    = 0;                               // denom working offset

    while (stackTopIdx >= 0)
    {
        if (stackTopIdx >= (int)stack.size()) std::cout << "Stack size assumption failed!\n";

        std::vector<uint32_t>& current = stack[stackTopIdx];

        uint32_t workingIdx = current[WorkingIdx];

        // first generate the current coin's reduction case for this level
        if (current[workingIdx] > 0)    // we have coins left in this layer
        {
            // compute the absolute lower bound for the next coin level..
            uint32_t nextCoinsLowerBounds
                = current[RemTotalIdx] / denom[workingIdx+1];

            // can this new count plus the lower bound of the next level of
            // coins possibly beat the current upper bound?
            if (current[TotalCoinsIdx]-1 + nextCoinsLowerBounds <= upperBounds)
            {
                // update the current top but first push a copy ahead of it
                // for the next level of coins computation.
                stack[stackTopIdx+1] = current;

                // ok so remove one of the current level's coins
                current[workingIdx]    -= 1;
                current[TotalCoinsIdx] -= 1;
                current[RemTotalIdx]   += denom[workingIdx];

                // move stack forward
                ++stackTopIdx;
            }
        }

        // now generate the max case for the next level
        ++workingIdx;

        std::vector<uint32_t>& next = stack[stackTopIdx];

        // compute the max coins we can use for it and queue that
        next[workingIdx]     = next[RemTotalIdx] / denom[workingIdx];
        next[TotalCoinsIdx] += next[workingIdx];
        next[RemTotalIdx]   -= denom[workingIdx] * next[workingIdx];
        next[WorkingIdx]     = workingIdx;

        // check if this result is a terminal of the search
        if (next[RemTotalIdx] == 0)
        {
            // it was the end and solves the problem -- remove it from the stack
            --stackTopIdx;

            // check if it beats the best
            if (upperBounds > next[TotalCoinsIdx])
            {
                best = next;
                upperBounds = best[TotalCoinsIdx];
            }
        }
        else if (workingIdx == LastIdx)  // because it can fail!
        {
            // it's on the last coin and didn't exactly match the total, so
            // it's a broken branch.  Remove it.
            --stackTopIdx;
        }
    }

    return best;
}

void test(const char* label,
          std::function<std::vector<uint32_t> (uint32_t, const std::vector<uint32_t>&)> func,
          uint32_t n,
          const std::vector<uint32_t>& denom)
{
    auto start = std::chrono::system_clock::now();
    std::vector<uint32_t> res = func(n, denom);
    auto end = std::chrono::system_clock::now();

    std::cout << label << ": result:";
    for (int i = 0; i < denom.size(); ++i)
        std::cout << denom[i] << "x" << res[i] << " ";
    std::cout << " count:" << res[denom.size()] << "\n";

    std::cout << " --  time:"
              << std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count()
              << " nSec\n";
}

void testSet(uint32_t n,
             const std::initializer_list<uint32_t>& dlist)
{
    std::vector<uint32_t> denom(dlist);

    std::cout << "\n*** test set for " << n << " *** \n";
    if (n < 2001)  test("botUpReach", minCoins_botUpReach, n, denom);
    else           std::cout << "skipped botUpReach\n";

    test("topSqueeze", minCoins_topSqueeze, n, denom);
}

int main()
{
    // breakers
    testSet(0,   {20,15,5,4,1});
    testSet(1,   {2});

    testSet(23,  {20,15,5,4,1}); // attack greedy
    testSet(31,  {20,15,5,4,1}); // attack greedy
    testSet(43,  {20,15,5,4,1}); // attack greedy
    testSet(46,  {20,15,5,4,1}); // attack greedy

    testSet(100, {20,15,5,4,1}); // attack bottom up
    testSet(1000, {20,15,5,4,1});

    testSet(1000, {13,7,1});
    testSet(2000, {20,15,5,4,1});

    testSet(10000, {20,15,5,4,1});

    testSet(100000000, {20,15,5,4,1});
    testSet(100000023, {20,15,5,4,1});
    testSet(100000031, {20,15,5,4,1});
    testSet(100000043, {20,15,5,4,1});
    testSet(100000046, {20,15,5,4,1});
    testSet(100000100, {20,15,5,4,1});
}


*** test set for 0 ***
botUpReach: result:20x0 15x0 5x0 4x0 1x0  count:0
--  time:4100 nSec
topSqueeze: result:20x0 15x0 5x0 4x0 1x0  count:0
--  time:1614 nSec

*** test set for 1 ***
botUpReach: result:2x0  count:2147483647
--  time:14201 nSec
topSqueeze: result:2x0  count:2147483647
--  time:2656 nSec

*** test set for 23 ***
botUpReach: result:20x0 15x1 5x0 4x2 1x0  count:3
--  time:49285 nSec
topSqueeze: result:20x0 15x1 5x0 4x2 1x0  count:3
--  time:17680 nSec

*** test set for 31 ***
botUpReach: result:20x0 15x2 5x0 4x0 1x1  count:3
--  time:60896 nSec
topSqueeze: result:20x0 15x2 5x0 4x0 1x1  count:3
--  time:21468 nSec

*** test set for 43 ***
botUpReach: result:20x1 15x1 5x0 4x2 1x0  count:4
--  time:96826 nSec
topSqueeze: result:20x1 15x1 5x0 4x2 1x0  count:4
--  time:21117 nSec

*** test set for 46 ***
botUpReach: result:20x2 15x0 5x1 4x0 1x1  count:4
--  time:99908 nSec
topSqueeze: result:20x2 15x0 5x1 4x0 1x1  count:4
--  time:26414 nSec

*** test set for 100 ***
botUpReach: result:20x5 15x0 5x0 4x0 1x0  count:5
--  time:187759 nSec
topSqueeze: result:20x5 15x0 5x0 4x0 1x0  count:5
--  time:15795 nSec

*** test set for 1000 ***
botUpReach: result:20x50 15x0 5x0 4x0 1x0  count:50
--  time:2350739 nSec
topSqueeze: result:20x50 15x0 5x0 4x0 1x0  count:50
--  time:18175 nSec

*** test set for 1000 ***
botUpReach: result:13x76 7x1 1x5  count:82
--  time:2124334 nSec
topSqueeze: result:13x76 7x1 1x5  count:82
--  time:19523 nSec

*** test set for 2000 ***
botUpReach: result:20x100 15x0 5x0 4x0 1x0  count:100
--  time:4793008 nSec
topSqueeze: result:20x100 15x0 5x0 4x0 1x0  count:100
--  time:21261 nSec

*** test set for 10000 ***
skipped botUpReach
topSqueeze: result:20x500 15x0 5x0 4x0 1x0  count:500
--  time:22749 nSec

*** test set for 100000000 ***
skipped botUpReach
topSqueeze: result:20x5000000 15x0 5x0 4x0 1x0  count:5000000
--  time:23337 nSec

*** test set for 100000023 ***
skipped botUpReach
topSqueeze: result:20x5000000 15x1 5x0 4x2 1x0  count:5000003
--  time:65017 nSec

*** test set for 100000031 ***
skipped botUpReach
topSqueeze: result:20x5000000 15x2 5x0 4x0 1x1  count:5000003
--  time:50749 nSec

*** test set for 100000043 ***
skipped botUpReach
topSqueeze: result:20x5000001 15x1 5x0 4x2 1x0  count:5000004
--  time:42286 nSec

*** test set for 100000046 ***
skipped botUpReach
topSqueeze: result:20x5000002 15x0 5x1 4x0 1x1  count:5000004
--  time:44130 nSec

*** test set for 100000100 ***
skipped botUpReach
topSqueeze: result:20x5000005 15x0 5x0 4x0 1x0  count:5000005
--  time:14530 nSec


## Sunday, July 24, 2016

### google interview: compute the minimal coins

Overall the result of this one is workable, but I expect there should be a way to equal the greedy's performance asymptotically, because the fine details of the least significant digits stop changing once you get far enough beyond the range of the largest coin.. hence it seems very plausible there is a better algorithm for larger numbers... I just don't have the time to work it out; maybe something that greedily guesses the largest coin's count up to some limit and then does the real search for the rest..

Update: I have just found some more time and added one more improvement that cuts off more time (bottom up restricted to reachable totals only).. I had another improvement in mind (reachable + astar distance-to-the-end via the max usable coin).. BUT this had 2 issues: the astar heuristic is greedy so not perfect, and the priority queue it needs added significant overhead, destroying all the benefit anyway..

Update2: I cracked the asymptotically constant version.. kicks these out of the room.. check out http://code-slim-jim.blogspot.com/2016/07/google-interview-compute-mininual-coins.html (next post after this one)

// http://www.impactinterview.com/2009/10/140-google-interview-questions/#software_engineer
//
// Given a set of coin denominators, find the minimum number of coins to give a
// certain amount of change.

#include <cstdint>
#include <climits>
#include <vector>
#include <iostream>
#include <functional>
#include <chrono>
#include <memory>
#include <list>

// This question is quite interesting; even though it is simple to ask, the
// range of potential solutions is wide.  As far as I see it you have 2 main
// choices:
// 1) you can treat it as a graph and perform BFS, astar, etc to search on it
// 2) you can treat it as a recursion and dynamic programming question

// Given that this forms a tree you can also use greedy search to find a
// solution.  BUT since you are searching for the *minimal* number of coins,
// the result from greedy is likely to be sub-optimal.

// Also note that we can have multiple solutions to the problem depending on
// the value and denominations: 6 => 1+5 = 2+4 = ...  etc
// It's also possible to have no solution (for example, exclude 1 from the
// denominations)
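Both notes are easy to check mechanically; a trivial helper (`pays` is invented here purely for illustration) confirms that 6 has two distinct 2-coin answers, while no pile of 2s pays an odd total:

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Does this multiset of coins pay exactly the given total?
bool pays(uint32_t total, const std::vector<uint32_t>& coins)
{
    return std::accumulate(coins.begin(), coins.end(), 0u) == total;
}
```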

// So let's get started
// A greedy implementation - a quick greedy way to the answer is to maximize
// the largest coin
//
// Note this can give the *wrong* answer
// Note expects reverse sorted coins (largest to smallest)
std::vector<uint32_t> minCoins_greedy(uint32_t total,
                                      const std::vector<uint32_t>& denom)
{
    std::vector<uint32_t> res(denom.size()+1, 0);

    if (total == 0) return res;

    for (int i = 0; i < denom.size(); ++i)
    {
        uint32_t count = total / denom[i];
        total -= (count * denom[i]);

        res[i] = count;
        res[denom.size()] += count;
    }

    // could not make exact change with these denominations
    if (total != 0) res[denom.size()] = INT_MAX;

    return res;
}

// The recursive implementation - simply walk the entire tree of possibilities
// First we need to formulate the induction.  We will add one coin each time
// and then recurse.  Hence the induction v[n] number of coins for total n is:
//   v[n] = 1 + v[n - c]
//
// Note denoms need to be in reverse order.. max to min!
std::vector<uint32_t> minCoins_naive(uint32_t total,
                                     const std::vector<uint32_t>& denom)
{
    std::vector<uint32_t> res(denom.size()+1, 0);

    if (total == 0) return res;

    res[denom.size()] = INT_MAX;

    for (int i = 0; i < denom.size(); ++i)
    {
        if (total >= denom[i])
        {
            std::vector<uint32_t> pass = minCoins_naive(total - denom[i],
                                                        denom);

            if ((pass[denom.size()] + 1) < res[denom.size()])
            {
                res = pass;
                ++(res[i]);
                ++(res[denom.size()]);
            }
        }
    }

    return res;
}

// Now the naive recursion is also computing items that we know are already
// larger than our current known best min, hence we can prune the tree here -
// simply walk the entire tree until we reach a depth greater than the current
// best min, then stop because the answer can't be down there anymore
//
// Note denoms need to be in reverse order.. max to min!
std::vector<uint32_t> minCoins_prunedImpl(uint32_t total,
                                          const std::vector<uint32_t>& denom,
                                          uint32_t depth,
                                          uint32_t maxDepth)
{
    // found trim point..
    if (depth > maxDepth)
    {
        // terminate this search path
        std::vector<uint32_t> res(denom.size()+1, 0);
        res[denom.size()] = INT_MAX;
        return res;
    }

    std::vector<uint32_t> res = minCoins_greedy(total, denom);

    // found loop termination..
    if (res[denom.size()] == 1)
        return res;

    for (int i = 0; i < denom.size(); ++i)
    {
        if (total >= denom[i])
        {
            uint32_t newMaxDepth = (maxDepth < depth + res[denom.size()])
                                 ? maxDepth : (depth + res[denom.size()]);

            std::vector<uint32_t> pass = minCoins_prunedImpl(total - denom[i],
                                                             denom,
                                                             depth+1,
                                                             newMaxDepth);

            if ((pass[denom.size()] + 1) < res[denom.size()])
            {
                res = pass;
                ++(res[i]);
                ++(res[denom.size()]);
            }
        }
    }

    return res;
}

std::vector<uint32_t> minCoins_pruned(uint32_t total,
                                      const std::vector<uint32_t>& denom)
{
    return minCoins_prunedImpl(total, denom, 0, INT_MAX);
}

// Ok so we know now that the naive isn't smart, and pruning improves this,
// but the approach is still wasting a lot of time...  what if we approach it
// as a divide and conquer problem
//
// To achieve this we first reformulate the induction slightly
//   v[n] = 1 + v[n - c]
// into
//   v[n] = v[c] + v[n - c]
//
// now we have moved from 1 coin at a time on the left and many on the right to
// a full divide and conquer with many coins on both left and right. Since we are
// now choosing multiple coins at one time we may as well pick all the coins from
// each denomination and then remove it from the list.
//
// Note denoms need to be in reverse order.. max to min!
std::vector<uint32_t> minCoins_divAndConqImpl(uint32_t total,
                                              const uint32_t* denomStart,
                                              const uint32_t* denomEnd)
{
    std::size_t denomSize = denomEnd - denomStart;
    uint32_t maxCount = total / *denomStart;

    std::vector<uint32_t> res(denomSize+1, 0);
    res[denomSize] = INT_MAX;

    if (denomSize == 1)
    {
        // last case can't search... so just give the only result possible
        uint32_t leftTotal = total - (maxCount * *denomStart);

        if (leftTotal == 0)
        {
            res[0] = maxCount;
            res[1] = maxCount;
        }

        return res;
    }

    uint32_t count = maxCount + 1;
    {
        // first case might be special... no need to check it every time
        if (total == (maxCount * *denomStart))
        {
            res[0] = maxCount;
            res[denomSize] = maxCount;
            count--;
        }
    }

    while (count > 0)
    {
        --count;
        uint32_t leftTotal = total - (count * *denomStart);

        std::vector<uint32_t> sub_res = minCoins_divAndConqImpl(leftTotal,
                                                                denomStart + 1,
                                                                denomEnd);

        if (sub_res[denomSize-1] != INT_MAX and
            (sub_res[denomSize-1] + count) < res[denomSize])
        {
            res[0] = count;
            for (int i = 0; i < denomSize; ++i)  // NOTE: copies the sub total count slot as well
                res[i+1] = sub_res[i];

            res[denomSize] += count;
        }
    }

    return res;
}

std::vector<uint32_t> minCoins_divAndConq(uint32_t total,
                                          const std::vector<uint32_t>& denom)
{
    // we are dividing by making the full decision for each coin and then
    // removing it from the list
    const uint32_t* denomStart = &(denom[0]);
    const uint32_t* denomEnd   = denomStart + denom.size();

    return minCoins_divAndConqImpl(total, denomStart, denomEnd);
}

// ok so the divide and conquer makes a big difference..  but what if we stop
// looking at this as a recursive problem and look at it from a dynamic
// programming perspective; we will start with the top down approach and
// simply cache the results of the prior recursive calls
// Note denoms need to be in reverse order.. max to min!
std::vector<uint32_t>& minCoins_topImpl(uint32_t total,
                                        const std::vector<uint32_t>& denom,
                                        std::vector<std::vector<uint32_t> >& cache)
{
    std::vector<uint32_t>& res = cache[total];

    // a non-zero count means this total has already been computed
    // (impossible totals are cached as INT_MAX)
    if (res[denom.size()] > 0 or total == 0)
    {
        return res;
    }

    res[denom.size()] = INT_MAX; // max and overflow the count..

    for (int i = 0; i < denom.size(); ++i)
    {
        if (total >= denom[i])
        {
            std::vector<uint32_t>& pass = minCoins_topImpl(total - denom[i],
                                                           denom,
                                                           cache);

            if ((pass[denom.size()] + 1) < res[denom.size()])
            {
                res = pass;
                ++(res[i]);
                ++(res[denom.size()]);
            }
        }
    }

    return res;
}

std::vector<uint32_t> minCoins_top(uint32_t total,
                                   const std::vector<uint32_t>& denom)
{
    // setup memory
    std::vector<std::vector<uint32_t> > cache(total + 1,
                                              std::vector<uint32_t>(denom.size()+1, 0));

    return minCoins_topImpl(total, denom, cache);
}

// ok but as always we can reverse the approach and produce a bottom-up
// dynamic programming implementation and save ourselves all the call
// overheads, since in this problem the tree search is not sparse and every
// case will be visited
//
// Note denoms need to be in reverse order.. max to min!
std::vector<uint32_t> minCoins_botImpl(uint32_t total,
                                       const std::vector<uint32_t>& denom,
                                       std::vector<std::vector<uint32_t> >& cache)
{
    // compute
    for (uint32_t target = 1; target <= total; ++target)
    {
        cache[target][denom.size()] = INT_MAX; // set count to max and overflowed

        for (int i = 0; i < denom.size(); ++i)
        {
            if (target >= denom[i])
            {
                uint32_t delta = target - denom[i];

                if (cache[delta][denom.size()] + 1 < cache[target][denom.size()])
                {
                    cache[target] = cache[delta];
                    ++(cache[target][i]);
                    ++(cache[target][denom.size()]);
                }
            }
        }
    }

    return cache[total];
}

std::vector<uint32_t> minCoins_bot(uint32_t total,
                                   const std::vector<uint32_t>& denom)
{
    // setup memory
    std::vector<std::vector<uint32_t> > cache(total + 1,
                                              std::vector<uint32_t>(denom.size()+1, 0));

    return minCoins_botImpl(total, denom, cache);
}

// ok so we can improve the bottom up DP version and shave a bit more time by
// visiting items in the order of current coin count; as a result we will
// create only the counts that reach towards the target
//
std::vector<uint32_t> minCoins_botUpReach(uint32_t total,
                                          const std::vector<uint32_t>& denom)
{
    if (total == 0) return std::vector<uint32_t>(denom.size()+1, 0);

    // setup memory
    std::vector< std::vector<uint32_t> > cache(total + 1, std::vector<uint32_t>(denom.size()+1, 0));
    std::list<uint32_t> queue;

    queue.push_back(0);
    while (not queue.empty())
    {
        uint32_t curTotal = queue.front();
        queue.pop_front();

        for (int i = 0; i < denom.size(); ++i)
        {
            uint32_t newTotal = curTotal + denom[i];

            if (newTotal <= total and
                cache[newTotal][denom.size()] == 0)
            {
                // first time this total has been reached
                cache[newTotal] = cache[curTotal];
                ++(cache[newTotal][i]);
                ++(cache[newTotal][denom.size()]);

                if (newTotal == total)
                    return cache[total];

                queue.push_back(newTotal);
            }
        }
    }

    // impossible case
    cache[0][denom.size()] = INT_MAX;
    return cache[0];
}

// over all the results are:
// * the greedy continues to kick butt but can be wrong..  this was expected
//   as it guesses and only looks at 1 path to the solution
// * the recursive and depth-pruned implementations work but are awful because
//   they repeat the same work over and over
// * the divide and conquer performs well and beats the dynamic programming
//   implementations when the target n is small...  this was a surprising
//   result to me as the top down implementation effectively tries to exhaust
//   the largest coins first, and does this by calling into the cached mid
//   points..  hence i expected it to beat the divide and conquer, because it
//   effectively does the same with caching
// * the dynamic programming versions work best in the long term; they:
//  + prune, because they use the cached results to effect early termination
//  + effectively implement the divide and conquer version by taking the
//    largest coin first, and should be faster than the div/conq because they
//    can use the cached result mid points and can trim the computations of
//    multiple coin draws with the same value (hence it's pointless to
//    DP-convert the div/conq version)
//  + the bottom up DP also optimizes further by removing the function call
//    overhead
void test(const char* label,
          std::function<std::vector<uint32_t> (uint32_t, const std::vector<uint32_t>&)> func,
          uint32_t n,
          const std::vector<uint32_t>& denom)
{
    auto start = std::chrono::system_clock::now();
    std::vector<uint32_t> res = func(n, denom);
    auto end = std::chrono::system_clock::now();

    std::cout << label << ": result:";
    for (int i = 0; i < denom.size(); ++i)
        std::cout << denom[i] << "x" << res[i] << " ";
    std::cout << " count:" << res[denom.size()] << "\n";

    std::cout << " --  time:"
              << std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count()
              << " nSec\n";
}

void testSet(uint32_t n,
             const std::initializer_list<uint32_t>& dlist)
{
    std::vector<uint32_t> denom(dlist);

    std::cout << "\n*** test set for " << n << " *** \n";

    test("greedy    ", minCoins_greedy, n, denom);
    if (n < 30)   test("naive     ", minCoins_naive, n, denom);
    else          std::cout << "skipping naive\n";
    if (n < 200)  test("pruned    ", minCoins_pruned, n, denom);
    else          std::cout << "skipping pruned\n";
    if (n < 2000) test("divAndConq", minCoins_divAndConq, n, denom);
    else          std::cout << "skipping div and conqurer\n";
    test("top       ", minCoins_top, n, denom);
    test("bot       ", minCoins_bot, n, denom);
    test("botUpReach", minCoins_botUpReach, n, denom);
}

int main()
{
    // breakers
    testSet(0,   {20,15,5,4,1});
    testSet(1,   {2});

    testSet(23,  {20,15,5,4,1}); // attack greedy
    testSet(46,  {20,15,5,4,1}); // attack greedy
    testSet(31,  {20,15,5,4,1}); // attack greedy
    testSet(43,  {20,15,5,4,1}); // attack greedy
    testSet(100, {20,15,5,4,1}); // attack bottom up
    testSet(1000, {20,15,5,4,1});

    testSet(1000, {13,7,1});

    testSet(10000, {20,15,5,4,1});
}

The output looks like:
*** test set for 0 ***
greedy    : result:20x0 15x0 5x0 4x0 1x0  count:0
--  time:1852 nSec
naive     : result:20x0 15x0 5x0 4x0 1x0  count:0
--  time:572 nSec
pruned    : result:20x0 15x0 5x0 4x0 1x0  count:0
--  time:987 nSec
divAndConq: result:20x0 15x0 5x0 4x0 1x0  count:0
--  time:443 nSec
top       : result:20x0 15x0 5x0 4x0 1x0  count:0
--  time:3814 nSec
bot       : result:20x0 15x0 5x0 4x0 1x0  count:0
--  time:1658 nSec
botUpReach: result:20x0 15x0 5x0 4x0 1x0  count:0
--  time:329 nSec

*** test set for 1 ***
greedy    : result:2x0  count:2147483647
--  time:548 nSec
naive     : result:2x0  count:2147483647
--  time:388 nSec
pruned    : result:2x0  count:2147483647
--  time:563 nSec
divAndConq: result:2x0  count:2147483647
--  time:420 nSec
top       : result:2x0  count:2147483647
--  time:2380 nSec
bot       : result:2x0  count:2147483647
--  time:1881 nSec
botUpReach: result:2x0  count:2147483647
--  time:3839 nSec

*** test set for 23 ***
greedy    : result:20x1 15x0 5x0 4x0 1x3  count:4
--  time:529 nSec
naive     : result:20x0 15x1 5x0 4x2 1x0  count:3
--  time:5941762 nSec
pruned    : result:20x0 15x1 5x0 4x2 1x0  count:3
--  time:69194 nSec
divAndConq: result:20x0 15x1 5x0 4x2 1x0  count:3
--  time:9601 nSec
top       : result:20x0 15x1 5x0 4x2 1x0  count:3
--  time:26959 nSec
bot       : result:20x0 15x1 5x0 4x2 1x0  count:3
--  time:19781 nSec
botUpReach: result:20x0 15x1 5x0 4x2 1x0  count:3
--  time:14906 nSec

*** test set for 46 ***
greedy    : result:20x2 15x0 5x1 4x0 1x1  count:4
--  time:534 nSec
skipping naive
pruned    : result:20x2 15x0 5x1 4x0 1x1  count:4
--  time:798560 nSec
divAndConq: result:20x2 15x0 5x1 4x0 1x1  count:4
--  time:46203 nSec
top       : result:20x2 15x0 5x1 4x0 1x1  count:4
--  time:56899 nSec
bot       : result:20x2 15x0 5x1 4x0 1x1  count:4
--  time:37923 nSec
botUpReach: result:20x2 15x0 5x1 4x0 1x1  count:4
--  time:32182 nSec

*** test set for 31 ***
greedy    : result:20x1 15x0 5x2 4x0 1x1  count:4
--  time:561 nSec
skipping naive
pruned    : result:20x0 15x2 5x0 4x0 1x1  count:3
--  time:134656 nSec
divAndConq: result:20x0 15x2 5x0 4x0 1x1  count:3
--  time:16872 nSec
top       : result:20x0 15x2 5x0 4x0 1x1  count:3
--  time:38468 nSec
bot       : result:20x0 15x2 5x0 4x0 1x1  count:3
--  time:25553 nSec
botUpReach: result:20x0 15x2 5x0 4x0 1x1  count:3
--  time:19032 nSec

*** test set for 43 ***
greedy    : result:20x2 15x0 5x0 4x0 1x3  count:5
--  time:480 nSec
skipping naive
pruned    : result:20x1 15x1 5x0 4x2 1x0  count:4
--  time:731592 nSec
divAndConq: result:20x1 15x1 5x0 4x2 1x0  count:4
--  time:34070 nSec
top       : result:20x1 15x1 5x0 4x2 1x0  count:4
--  time:54569 nSec
bot       : result:20x1 15x1 5x0 4x2 1x0  count:4
--  time:35912 nSec
botUpReach: result:20x1 15x1 5x0 4x2 1x0  count:4
--  time:30253 nSec

*** test set for 100 ***
greedy    : result:20x5 15x0 5x0 4x0 1x0  count:5
--  time:567 nSec
skipping naive
pruned    : result:20x5 15x0 5x0 4x0 1x0  count:5
--  time:7856557 nSec
divAndConq: result:20x5 15x0 5x0 4x0 1x0  count:5
--  time:371883 nSec
top       : result:20x5 15x0 5x0 4x0 1x0  count:5
--  time:107881 nSec
bot       : result:20x5 15x0 5x0 4x0 1x0  count:5
--  time:79875 nSec
botUpReach: result:20x5 15x0 5x0 4x0 1x0  count:5
--  time:60365 nSec

*** test set for 1000 ***
greedy    : result:20x50 15x0 5x0 4x0 1x0  count:50
--  time:728 nSec
skipping naive
skipping pruned
divAndConq: result:20x50 15x0 5x0 4x0 1x0  count:50
--  time:1654224850 nSec
top       : result:20x50 15x0 5x0 4x0 1x0  count:50
--  time:838802 nSec
bot       : result:20x50 15x0 5x0 4x0 1x0  count:50
--  time:766986 nSec
botUpReach: result:20x50 15x0 5x0 4x0 1x0  count:50
--  time:736739 nSec

*** test set for 1000 ***
greedy    : result:13x76 7x1 1x5  count:82
--  time:651 nSec
skipping naive
skipping pruned
divAndConq: result:13x76 7x1 1x5  count:82
--  time:1220761 nSec
top       : result:13x76 7x1 1x5  count:82
--  time:704132 nSec
bot       : result:13x76 7x1 1x5  count:82
--  time:685573 nSec
botUpReach: result:13x76 7x1 1x5  count:82
--  time:686982 nSec

*** test set for 10000 ***
greedy    : result:20x500 15x0 5x0 4x0 1x0  count:500
--  time:559 nSec
skipping naive
skipping pruned
skipping div and conqurer
top       : result:20x500 15x0 5x0 4x0 1x0  count:500
--  time:8379256 nSec
bot       : result:20x500 15x0 5x0 4x0 1x0  count:500
--  time:7830593 nSec
botUpReach: result:20x500 15x0 5x0 4x0 1x0  count:500
--  time:7763225 nSec


### Google interview: Compute the Fibonacci sequence

// Write code that computes the Fibonacci sequence up to n

#include <cstdint>
#include <vector>
#include <iostream>
#include <functional>
#include <chrono>

// Recursive implementations of Fibonacci are all the rage in a teaching
// environment as they demonstrate recursion nicely.  The problem is that a
// recursive implementation is a completely inefficient way to code the task
// in practice.  Each call of the function calls 2 more, and those 2 call 2
// more each, resulting in 4, and those 4 call 4 more, resulting in 8; so the
// time complexity is 1 + 2 + 4 + 8 + ... which is exponential growth and just
// awful

uint32_t fib_naive(uint32_t n)
{
    if (n == 0) return 0;
    if (n == 1) return 1;

    return fib_naive(n-1) + fib_naive(n-2);
}

// So what google is really testing here is the "dynamic programming" paradigm.
// https://en.wikipedia.org/wiki/Dynamic_programming.  Unfortunately the term
// "dynamic programming" has had its meaning diluted and corrupted a bit.  In
// this context what it means is the breaking of a larger problem into sub
// parts and solving each sub part to solve the whole.  This sounds exactly
// like recursion and it is; the distinction is that recursive implementations
// tend to stop at the raw mathematical conversion whereas DP implementations
// lazily cache results to cut off repeated computations

// There are two textbook approaches to this; top-down and bottom-up

// in the top-down approach you start working from the answer point and compute
// the values of the breakdowns, you deploy memory as needed to cache the
// results of prior computations so that if they are used again you use the
// cached value rather than recomputing it from scratch again

uint32_t fib_top_core(uint32_t n, std::vector<uint32_t>& cache)
{
if (n > 0 and cache[n] == 0)
{
cache[n] = fib_top_core(n-2,cache) + fib_top_core(n-1,cache);
}

return cache[n];
}

uint32_t fib_top(uint32_t n)
{
if (n == 0) return 0;

std::vector<uint32_t> cache(n+1, 0);
cache[1] = 1;

return fib_top_core(n, cache);
}

// in the bottom-up approach you start working from the known terminators
// and work your way up to the answer point.  You do this by computing
// the values of all the breakdowns you could possibly need and memoize
// the results for use in the next computations.  Often you can optimize
// the memory use by forgetting values that you know you will not revisit

uint32_t fib_bot(uint32_t n)
{
if (n == 0) return 0;

uint32_t fib_minus_2 = 0;
uint32_t fib_minus_1 = 0;
uint32_t fib_current = 1;

while (n > 1)
{
fib_minus_2 = fib_minus_1;
fib_minus_1 = fib_current;
fib_current = fib_minus_2 + fib_minus_1;
--n;
}

return fib_current;
}

void test(std::function<uint32_t(uint32_t)> func, uint32_t n)
{
auto start = std::chrono::system_clock::now();
std::cout << "result:" << func(n);
auto end = std::chrono::system_clock::now();
std::cout << " --  time:"
<< std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count()
<< "nSec\n";
}

int main()
{
int n = 40;

test(fib_naive,n);
test(fib_top  ,n);
test(fib_bot  ,n);
}


and the output looks like:
result:102334155 --  time:1282786361 nSec
result:102334155 --  time:51012 nSec
result:102334155 --  time:690 nSec
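As an aside beyond the two DP versions compared above, Fibonacci can also be computed in O(log n) with the fast-doubling identities. A sketch (fib_pair and fib_fast are my own names, not part of the original benchmark):

```cpp
#include <cstdint>
#include <utility>

// fast doubling identities:
//   F(2k)   = F(k) * (2*F(k+1) - F(k))
//   F(2k+1) = F(k)^2 + F(k+1)^2
// fib_pair(n) returns {F(n), F(n+1)} so both values needed by the
// identities are always on hand
std::pair<uint64_t, uint64_t> fib_pair(uint32_t n)
{
    if (n == 0) return {0, 1};

    std::pair<uint64_t, uint64_t> half = fib_pair(n / 2);
    uint64_t a = half.first;   // F(k)
    uint64_t b = half.second;  // F(k+1)

    uint64_t c = a * (2 * b - a); // F(2k)
    uint64_t d = a * a + b * b;   // F(2k+1)

    if (n % 2 == 0) return {c, d};
    return {d, c + d};
}

uint64_t fib_fast(uint32_t n)
{
    return fib_pair(n).first;
}
```

Only O(log n) levels of recursion are touched because n halves at each step.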


## Wednesday, July 20, 2016

### google interview: find number in multi disk list better than O(logn)

// http://www.impactinterview.com/2009/10/140-google-interview-questions/#software_engineer
//
// Find or determine non existence of a number in a sorted list
// of N numbers where the numbers range over M, M>> N and N large
// enough to span multiple disks. Algorithm to beat O(log n) bonus
// points for constant time algorithm.

// the constant time algo is called a "bloom filter"; if you work with large
// databases you (hopefully) learn this..  what a bloom filter does on insert
// is hash your key in 3 ways and then light up flags in 3 indexes..  on lookup
// it hashes again and checks the 3 places..  if any one of the flags is off
// then you have an item that definitely doesn't exist..  otherwise commence
// your normal logN search

#include <iostream>
#include <vector>
#include <stdint.h>
#include <memory>
#include <cstring>
#include <cstdlib>
#include <chrono>

#define BLOOMFILTER

class PagedVector
{
public:
typedef uint64_t                Key;         // ideally this would be a template param
typedef std::vector<Key>        Keys;
typedef std::shared_ptr<Keys>   KeysHandle;
typedef std::vector<KeysHandle> PagedKeys;

const uint32_t filterSize = 256;             // ideally this would be tunable to reduce misses/mem use
const uint32_t filterMask = filterSize - 1;

private:
std::size_t threashold_;

PagedKeys pages_;

std::vector<bool> presence_[3]; // the bloom filter

// the divide part of a divide and conquer search
template <typename Iterator,
typename Compare,
typename Splitter>
struct Divider
{
Compare&  comparer_;
Splitter& splitter_;

Divider(Compare&  comparer,
Splitter& splitter) :
comparer_(comparer),
splitter_(splitter)
{}

void operator()(Iterator& a,
Iterator& b)
{
// should never happen unless we are empty..
if (b==a) return;

Iterator n = splitter_(a,b);

if (comparer_(n))
{
b = n;
return;
}
a = n + 1;
}
};

// All divide and conquer searches do 1 thing..  they split the data in
// some way and repeat until the desired location is found..  In this
// system we have multiple uses of the search: to search, and to find insertion points

template <typename Iterator,
typename Compare,
typename Splitter>
Iterator search(Iterator begin,
Iterator end,
Compare  comparer,
Splitter splitter)
{
// points at the match, or at the point to insert if there is no match
Divider<Iterator, Compare, Splitter> Divider(comparer, splitter);

// narrows the sub range in which to keep searching; when the range is empty we are done..
while (1)
{
Divider(begin,end);
if (end == begin)
return begin;
}
}

struct FindInPage
{
Key target_;

FindInPage(Key target) :
target_(target)
{}

bool operator()(const PagedKeys::iterator& a)
{
// ok slight strangeness here.. pages are a range of numbers
//  t < [b,e] then true
//  t > [b,e] then false
// but
// t > b, t < e means it belongs in this page..  so it has to be less-than-or-equal!

const KeysHandle& pageHandle = *a;
if (pageHandle->size() > 0)
{
return target_ <= (*pageHandle)[pageHandle->size()-1];

}
return true; // empty... should be the last if it exists..
}
};

struct SplitPage
{
Key target_;

SplitPage()
{}

PagedKeys::iterator operator()(const PagedKeys::iterator& a,
const PagedKeys::iterator& b)
{
std::size_t offset = (b-a)/2;
return a + offset;
}
};

struct FindInVector
{
Key target_;

FindInVector(Key target) :
target_(target)
{}

bool operator()(const Keys::iterator& a)
{
return target_ <= (*a);
}
};

struct SplitVector
{
Key target_;

SplitVector(Key target) :
target_(target)
{}

Keys::iterator operator()(const Keys::iterator& a,
const Keys::iterator& b)
{
// std::size_t offset = (b-a)/2;
// return a + offset;

if ((b-a) <= 2)
{
std::size_t offset = (b-a)/2;
return a + offset;
}

Key high = *(b-1);
Key low  = *a;

// since we have the values...
if (target_ <= low)  return a;
if (target_ >= high) return b-1;

double splitPercent;
splitPercent = static_cast<double>(target_ - low) / static_cast<double>(high-low);
splitPercent = (splitPercent < 0.0) ? 0.0 : splitPercent;

std::size_t range  = (b-a);
std::size_t offset = range * splitPercent;

return  a + offset;
}
};

// return a pair of iterators to the page and page offset where the item is or should be
std::pair<PagedKeys::iterator, Keys::iterator> find(Key target)
{
std::pair<PagedKeys::iterator, Keys::iterator> iter;

iter.first = search(pages_.begin(),
pages_.end(),
FindInPage(target),
SplitPage());

if (iter.first == pages_.end())
iter.first = pages_.end() - 1;

KeysHandle& page = *(iter.first);

// find the location to add into the list
iter.second = search(page->begin(),
page->end(),
FindInVector(target),
SplitVector(target));
return iter;
}

public:
PagedVector(std::size_t threashold) :
threashold_(threashold),
pages_(),
presence_()
{
for (int i = 0; i < 3; ++i)
presence_[i].resize(filterSize);

// make the border conditions a bit easier
KeysHandle newPage(new Keys);
pages_.push_back(newPage);
}

// add the target to the paged data structure
void insert(Key target)
{
#ifdef BLOOMFILTER
// not the best way but assume we are doing an item by item insert
uint64_t hash = std::hash<Key>()(target);

// light the bloomfilters presence bits according to the hash
presence_[0][ hash        & filterMask] = true;
presence_[1][(hash >> 8)  & filterMask] = true;
presence_[2][(hash >> 16) & filterMask] = true;
#endif

// find the relevant part of the list
std::pair<PagedKeys::iterator, Keys::iterator> iter = find(target);

KeysHandle& page = *(iter.first);

page->insert(iter.second, target);

// then threshold check.
if (page->size() > threashold_)
{
// ok over the threshold.. so divide the list in half
KeysHandle newPage(new Keys);
uint32_t spliceAt = page->size() / 2;
uint32_t sizeAfter = page->size() - spliceAt;
newPage->resize(sizeAfter);
std::memcpy(&((*newPage)[0]), &((*page)[spliceAt]), sizeAfter * sizeof(Key));
page->resize(spliceAt);

pages_.insert(iter.first + 1, newPage);
}
}

bool exists(Key target)
{
#ifdef BLOOMFILTER
uint64_t hash = std::hash<Key>()(target);

// a quick O(1) pre check before the more costly real check
if (not (presence_[0][ hash        & filterMask] and
presence_[1][(hash >> 8)  & filterMask] and
presence_[2][(hash >> 16) & filterMask]))
return false;
#endif

// basically have to implement a better than log(N) search
std::pair<PagedKeys::iterator, Keys::iterator> iter = find(target);

// guard against an iterator pointing just past the end of the page
if (iter.second == (*(iter.first))->end())
return false;

return *(iter.second) == target;
}

const PagedKeys& pages() const
{
return pages_;
}

void print(std::ostream& os) const
{
for ( KeysHandle page : pages_)
{
os << "\n ########################################## \n";
for ( Key item : *page)
{
os << " " << item;
}
}
}
};

std::ostream& operator<<(std::ostream& os, const PagedVector& list)
{
list.print(os);
return os;
}

void validate_sorted(PagedVector& theList)
{
bool failed = false;
bool first = true;
PagedVector::Key prev;

for ( PagedVector::KeysHandle page : theList.pages() )
{
for ( PagedVector::Key item : *page)
{
if (first)
{
first = false;
prev = item;
}
else
{
if (prev > item)
failed = true;
prev = item;
}
}
}
std::cout << (failed ? "FAILED" : "passed")
<< "\n";
}

void test_random()
{
PagedVector theList(5000);

auto istart = std::chrono::system_clock::now();
for (int i = 0; i < 1000000;  ++i)
{
PagedVector::Key item = (std::rand() % 100000)*2;

// std::cout << "adding:" << item << "\n";
theList.insert(item);
}
auto iend = std::chrono::system_clock::now();

std::cout << "inserted time:"
<< std::chrono::duration_cast<std::chrono::milliseconds>(iend - istart).count()
<< "mSec \n";
// std::cout << theList << "\n";

int count = 0;
int exist = 0;

auto cstart = std::chrono::system_clock::now();
for (int i = 0; i < 1000000;  ++i)
{
PagedVector::Key item = std::rand() % 200000;
if (theList.exists(item))
++exist;
++count;

// std::cout << " " << item << " "
//           << (theList.exists(item) ? "exists" : "missing")
//           << "\n";
}
auto cend = std::chrono::system_clock::now();

std::cout << "checking time:"
<< std::chrono::duration_cast<std::chrono::milliseconds>(cend - cstart).count()
<< "mSec\n\n";

// expecting a 50% hit rate
std::cout << "checked:"
<< count
<< " exist:" << exist
<< "\n";

validate_sorted(theList);
}

int main()
{
test_random();
}


Output with a bloomfilter is:
inserted time:2613mSec
checking time:735mSec

checked:1000000 exist:499965
passed


The output without the filter is
inserted time:2584mSec
checking time:1292mSec

checked:1000000 exist:499965
passed


### Google interview: implement sort X and discuss its space/time complexity

The problem with sort algorithms is that there are many tricks to making them quicker. These tricks also make them more complex to do in an interview.

I have failed a google interview in the past because of this. At that time my interviewer asked me to code a *stable* quick sort on the white board. The moment it occurred to me just how tricky that code would be to do on a white board I simply froze up and promptly forgot everything. I went down the rabbit hole, so to speak. Worst interview ever.

So I have tried to keep the algorithms as simple as possible. My tactic here is to break them into an easy-to-remember set of sub operations. They are not optimal versions.

#include <iostream>
#include <vector>
#include <functional>

// ***********************************************
// ************** ITERATIVE SORTS ****************
// ***********************************************

void insertionSort(std::vector<int>& list)
{
// ok insertion sort takes the current object and inserts it in the correct place
//
// inplace (excluding swap memory)
// stable ordering
//  -- because the item only moves if less than.. equals stop the inner loop
// average case: O(n^2)
//  -- because it has to do some percent of the inner loop, let's say 0.5..
//     hence 0.5*N*N is still O(N^2)
// best case: O(n)
//  -- sorted and nothing moves
// worst case: O(n^2)
//  -- reversed and everything moves

for (int i = 1 ; i < list.size(); ++i)
{
// now swap item down until its in place
int j = i;
while (j > 0 and list[j] < list[j-1])
{
int temp = list[j];
list[j] = list[j-1];
list[j-1] = temp;
--j;
}
}
}

void selectionSort(std::vector<int>& list)
{
// selection sort finds the correct object for the current place
//
// inplace
// unstable ordering
//  -- the long range swap can move an item past its equal twin, e.g. in
//     [2a, 2b, 1] the first pass swaps 2a to the back, past 2b
// average case: O(n^2)
//  -- same reason as best
// best case: O(n^2)
//  -- even sorted you have to check all the items left over
// worst case: O(n^2)
//  -- everything moves

for (int i = 0 ; i < list.size(); ++i)
{
int selection = i;
for (int j = i; j < list.size(); ++j)
{
if (list[selection] > list[j])
selection = j;
}

if (selection != i)
{
int temp = list[i];
list[i] = list[selection];
list[selection] = temp;
}
}
}

// ***********************************************
// *********** DIVIDE AND CONQUER SORTS **********
// ***********************************************

void swap(std::vector<int>::iterator a,
std::vector<int>::iterator b)
{
int temp = *a;
*a = *b;
*b = temp;
}

void quickSort(std::vector<int>::iterator begin,
std::vector<int>::iterator end)
{
// ok divide and conquer by partition on a pivot
// inplace
// unstable
//  -- because the pivot moves out of order and then we swap the left end with the right end
// average case O(nlogn)
//  -- we linear scan the array and swap, which hopefully leaves a 50/50
//     split, and recurse for log(n) deep
// best case O(nlogn)
//  -- still have to scan for the swaps..
// worst case O(n^2)
//  -- if the pivot value sucks it will split everything to one side

// the range is [begin, end)..
if ((end - begin) < 2) return;

// choose pivot.. lets say midway...
std::vector<int>::iterator p = begin + (end - begin)/2;
int pivot = *(p);

std::vector<int>::iterator tail = end - 1;

// swap partition out of the way - If you dont remove a value then *when*
// partitioning fails you end up stuck in an infinite loop
swap(p, tail);
--tail;

// partition
std::vector<int>::iterator head = begin;
while (head < tail)
{
if (*head < pivot)
{
// head is on the correct side of the pivot so move it up
++head;
}
else
{
// swivel head to tail and move tail down
swap(head, tail);
--tail;
}
}

// check the last item against the pivot
if (*head < pivot) ++head;

// swap the pivot back to the middle.
swap(head, end - 1);

// note carefully this is infinite loop avoidance...  the head now points
// at the pivot so exclude that from recursion
quickSort(begin, head);
quickSort(head + 1, end);
}

std::vector<int>::iterator rotate(std::vector<int>::iterator begin,
std::vector<int>::iterator mid,
std::vector<int>::iterator end)
{
std::vector<int>::iterator p1 = begin;
std::vector<int>::iterator p2 = mid;

while (1)
{
swap(p1,p2);

++p1;
++p2;

if (p1 >= mid and p2 == end) break;
if (p2 == end) p2 = mid;
}

return begin + (end - mid);
}

std::vector<int>::iterator stablePartition(std::vector<int>::iterator cursor,
std::vector<int>::iterator end,
std::function<bool (int)>  op)
{
// Note "op" will be true if its in the left side

// move through the data until we find the end of the *first* left
// section
while (cursor < end and op(*cursor)) ++cursor;

// record the start of the right section
std::vector<int>::iterator rightStart = cursor;

// find the end of the right section and the start of out of order
// left section
while (cursor < end and not op(*cursor)) ++cursor;

while (cursor < end)
{
// if we are not at the end of the entire range then the op-false section
// has to be *out of place*

// So record the end of the out of place right section
std::vector<int>::iterator rightEnd = cursor;

// find end of the out of place left section
while (cursor < end and op(*cursor)) ++cursor;

// then rotate the out of order left section in front of the right
// section, and update the rightStart to the new divide point
rightStart = rotate(rightStart, rightEnd, cursor);

// then move the cursor to the new end of the right section, starting
// from the end of the prior searched region
while (cursor < end and not op(*cursor)) ++cursor;
}

// return the divide point between the left and right sections
return rightStart;
}

void stableQuickSort(std::vector<int>::iterator begin,
std::vector<int>::iterator end)
{
// divide and conquer using *stable* partitioning of data
// the trick with stable quick sort is that you rotate the sequences of
// data, instead of swapping the start and end, to keep the data in order.
//
// It is a real pain..  and tricky to get right...  and would be very
// difficult to do in an interview because partition failure avoidance is
// much harder..  you need to break the sort range into 3 parts so you can
// leave out the middle and avoid the infinite loop when partitioning fails
//
// AND NOTE WELL i have been asked this in an interview with google before.. (and i failed)
//
// inplace
// stable
//  -- the use of rotates moves the entire sequence keeping order
//  -- however care must be taken so that the pivot is also stable.  The way I
//     see it there is only 1 real option: you have to rotate and form 3
//     sections, the lessThan, equalTo and greaterThan sections. You can
//     then avoid infinite loops on partition failures by excluding the
//     equalTo section from the recurse
// average case O(log(n) log(n) n)
//  -- http://www.codeproject.com/Articles/26048/Fastest-In-Place-Stable-Sort
// worst case O(n^3) (i think)
//  -- if the pivot value sucks it will split everything to one side and do N
//     recursions; if that partition is N, and at worst it repeatedly calls
//     rotate which is worst case N, then we have N*N*N or N^3

// a list of less than 2 items is sorted by default
if ((end - begin) < 2) return;

// choose pivot.. mid point...
std::vector<int>::iterator p = begin + (end - begin)/2;

// record the pivot value..
int pivot = *(p);

// form the lessThan and greaterThanEqualTo sections
std::vector<int>::iterator greaterThanEqualToStart
= stablePartition(begin,
end,
[pivot](int v) { return v < pivot; });

// form the equalTo and greaterThan sections
std::vector<int>::iterator greaterThanStart
= stablePartition(greaterThanEqualToStart,
end,
[pivot](int v) { return not (pivot < v); });

// now recurse on the unhandled lessThan and greaterThan partitions *only*
stableQuickSort(begin,            greaterThanEqualToStart);
stableQuickSort(greaterThanStart, end);
}

void mergeSort(std::vector<int>::iterator begin,
std::vector<int>::iterator end,
std::vector<int>& tmp)
{
// divide and conquer sort using *merging* of data
//
// stable
//  -- because the left side merges in before the right if the values are the same
// average case O(nlogn)
//  -- we always divide in 2 and recurse for log(n) deep, then do 2n merge actions
// best case O(nlogn)
//  -- same as average..
// worst case O(nlogn)
//  -- same as best and average
// worst space complexity O(n)
//  -- in theory you can grow tmp as you need to minimize the space.. but
//     practically this may cause issues because resizing a large vector
//     temporarily takes 2x the space as it has to copy the old buffer to
//     the new one; you can avoid that by allocating new pages.. but most
//     of the time that's not what i see in practice

if ((end - begin) < 2) return;

// split
std::vector<int>::iterator n = begin + (end - begin)/2;

// recurse
mergeSort(begin,n,tmp);
mergeSort(n,end,tmp);

//merge
std::vector<int>::iterator left   = begin;
std::vector<int>::iterator right  = n;
std::vector<int>::iterator wcursor = tmp.begin();

// stop the moment left runs out there is no need to copy all of right to tmp
while ( left < n )
{
if      (right == end)   *(wcursor++) = *(left++ );
else if (*right < *left) *(wcursor++) = *(right++);
else                     *(wcursor++) = *(left++ );
}

// copy the section we merged into tmp back out (note this may not be the full length)
std::vector<int>::iterator ccursor = tmp.begin();
while(ccursor != wcursor)
{
*(begin++) = *(ccursor++);
}
}

void inplaceMergeSort(std::vector<int>::iterator begin,
std::vector<int>::iterator end)
{
// divide and conquer sort using *inplace* merging of data
// the trick with inplace merge sort is that you rotate the sequences of
// data instead of copying them into a new buffer
//
// inplace
//  -- the use of rotates swaps data multiple times to keep it in the same storage
// stable
// average case O(log(n) log(n) n)
//  -- http://www.codeproject.com/Articles/26048/Fastest-In-Place-Stable-Sort

if ((end - begin) < 2) return;

// split
std::vector<int>::iterator n = begin + (end - begin)/2;

// recurse
inplaceMergeSort(begin,n);
inplaceMergeSort(n,end);

//merge
std::vector<int>::iterator left  = begin;
std::vector<int>::iterator right = n;
while (left < right and right != end)
{
// find rotate start ..
while (left < right and not (*right < *left)) ++left;
if (not (left < right)) break;

// note rotate mid
std::vector<int>::iterator mid = right;

// find rotate end
while (right < end and *right < *left) ++right;
if (mid == right) break;

// swap the sections of the array
left = rotate(left,mid,right);
}
}

// ***********************************************
// ****************** HEAP SORT ******************
// ***********************************************

std::vector<int>::iterator parent(std::vector<int>::iterator begin,
std::vector<int>::iterator i)
{
return begin + (i - begin - 1)/2;
}

std::vector<int>::iterator child(std::vector<int>::iterator begin,
std::vector<int>::iterator i)
{
return begin + 2 * (i - begin) + 1;
}

void upShift(std::vector<int>::iterator begin,
std::vector<int>::iterator i)
{
while(begin != i)
{
std::vector<int>::iterator p = parent(begin, i);
if (not (*p < *i)) return; // parent >= child, the heap property already holds
swap(p,i);

i = p;
}
}

void makeHeap(std::vector<int>::iterator begin,
std::vector<int>::iterator end)
{
// version 1: top down O(nlogn)..
if((end - begin) < 2) return;

for(std::vector<int>::iterator i = begin + 1;
i != end;
++i)
{
upShift(begin, i);
}
}
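makeHeap above is the version 1 top-down O(nlogn) build. For completeness, a sketch of a version 2 bottom-up O(n) build (makeHeapBottomUp and siftDown are my own names, written with plain indexes rather than the iterator helpers):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// sift a node down until the max-heap property holds below it
void siftDown(std::vector<int>& v, std::size_t i)
{
    while (true)
    {
        std::size_t kid1 = 2*i + 1;
        if (kid1 >= v.size()) return;

        std::size_t kid2 = kid1 + 1;
        std::size_t kidmax = (kid2 < v.size() and v[kid1] < v[kid2]) ? kid2 : kid1;

        if (not (v[i] < v[kidmax])) return;

        std::swap(v[i], v[kidmax]);
        i = kidmax;
    }
}

// version 2: bottom up O(n).. sift each internal node down, starting from
// the last parent and working back to the root.  The leaves (half the
// nodes) cost nothing, which is where the O(n) total comes from.
void makeHeapBottomUp(std::vector<int>& v)
{
    if (v.size() < 2) return;

    // the last parent lives at (size-2)/2; everything after it is a leaf
    for (std::size_t i = (v.size() - 2)/2 + 1; i-- > 0; )
        siftDown(v, i);
}
```

This is the classic Floyd heap construction; the heapSort below would work unchanged on top of it.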

void reheap(std::vector<int>::iterator begin,
std::vector<int>::iterator end)
{
std::vector<int>::iterator i = begin;
while(1)
{
std::vector<int>::iterator kid1 = child(begin, i);
std::vector<int>::iterator kid2 = kid1 + 1;

if (not (kid1 < end)) return;

std::vector<int>::iterator kidmax
= ((kid2 < end) and (*kid1 < *kid2)) ? kid2 : kid1;

if (not (*kidmax > *i)) return;

swap(kidmax, i);
i = kidmax;
}
}

void heapSort(std::vector<int>::iterator begin,
std::vector<int>::iterator end)
{
// Heap sort is the same idea as selection sort, ie: find the correct item
// for the current location.  But there is a critical difference: heap sort
// prepares the unsorted data as a tree that forms a priority index with the
// larger numbers the parents of the children nodes
//
// inplace
// unstable
//  -- the heap operations dont preserve stable order
// best case: O(nlogn)  ...
//  -- building a heap can be done in O(n) but this implementation doesn't use it

if ((end - begin) < 2) return;

// make the heap
makeHeap(begin,end);

//then starting at the back
std::vector<int>::iterator i = end - 1;
while (i != begin) // dont bother with the last it has to be the largest
{
// move the largest to the end
swap(i, begin);

// and repair the heap
reheap(begin,i);
--i;
}
}

// ***********************************************
// ***********************************************

// these should be real overloads, but std::function cant seem to match an
// overloaded name, so wrap each sort

void quickSortWrap(std::vector<int>& list)
{
quickSort(list.begin(), list.end());
}

void stableQuickSortWrap(std::vector<int>& list)
{
stableQuickSort(list.begin(), list.end());
}

void mergeSortWrap(std::vector<int>& list)
{
std::vector<int> tmp;
tmp.resize(list.size());

mergeSort(list.begin(), list.end(), tmp);
}

void inplaceMergeSortWrap(std::vector<int>& list)
{
inplaceMergeSort(list.begin(), list.end());
}

void heapSortWrap(std::vector<int>& list)
{
heapSort(list.begin(), list.end());
}

// ***********************************************
// ******************** TESTING ******************
// ***********************************************

bool check(std::vector<int>& list)
{
bool failed = false;
bool first = true;
int prev;

for ( int i : list)
{
if (first)
{
prev = i;
first = false;
}
else if (i < prev)
{
std::cout << "*";
failed = true;
}
prev = i;
std::cout << i << " ";
}
std::cout << " :" << (failed ? "FAILED" : "passed") << "\n";
return not failed;
}

void testHeap(std::vector<int> data)
{
// check indexing
if (data.size() > 1)
{
// some hard checks
// index 1 -> parent 0
if (data.size() > 1)
std::cout << " parent1: "
<< ((parent(data.begin(), data.begin()+1) == data.begin()) ?
"passed" : "FAILED")
<< "\n";

// index 2 -> parent 0
if (data.size() > 2)
std::cout << " parent2: "
<< ((parent(data.begin(), data.begin()+2) == data.begin()) ?
"passed" : "FAILED")
<< "\n";

// index 3 -> parent 1
if (data.size() > 3)
std::cout << " parent3: "
<< ((parent(data.begin(), data.begin()+3) == (data.begin()+1)) ?
"passed" : "FAILED")
<< "\n";

// index 4 -> parent 1
if (data.size() > 4)
std::cout << " parent4: "
<< ((parent(data.begin(), data.begin()+4) == (data.begin()+1)) ?
"passed" : "FAILED")
<< "\n";

// the soft general check
bool failed = false;
for (std::vector<int>::iterator i = data.begin() + 1;
i != data.end();
++i)
{
std::vector<int>::iterator p    = parent(data.begin(),i);
std::vector<int>::iterator kid1 = child(data.begin(),p);
std::vector<int>::iterator kid2 = kid1 + 1;

bool bad = (kid1 != i) and (kid2 != i);
if (bad)
{
failed = true;
std::cout << " indexing bad at"
<< " i:" << (i-data.begin())
<< " p:" << (p-data.begin())
<< " k1:" << (kid1-data.begin())
<< " k2:" << (kid2-data.begin())
<< "\n";
}
std::cout << " general heap indexing:" << (failed ? "FAILED" : "passed") << "\n";
}

// check heap making
makeHeap(data.begin(),
data.end());

bool failed = false;

if (data.size() > 0)
std::cout << " # " << data[0];

for (int i = 1; i < data.size(); ++i)
{
int p = (i-1)/2;
bool bad = (data[i] > data[p]);
if (bad) failed = true;
std::cout << (bad ? "|" : " ") << data[i];
}

std::cout <<" " << (failed ? "FAILED" : "passed")
<< "\n";
}

void testRotate()
{
std::vector<int> data({1,2,3,4,5,6,7,8});

std::vector<int> data1(data);
rotate(data1.begin(),
data1.begin()+2,
data1.end());

for (std::vector<int>::iterator a = data1.begin();
a != data1.end();
++a)
std::cout << " " << *a;
std::cout << "\n";

std::cout << (data1 == std::vector<int>({3,4,5,6,7,8,1,2}) ? "passed" : "FAILED")
<< "\n";

std::vector<int> data2(data);
rotate(data2.begin(),
data2.begin()+6,
data2.end());

for (std::vector<int>::iterator a = data2.begin();
a != data2.end();
++a)
std::cout << " " << *a;
std::cout << "\n";

std::cout << (data2 == std::vector<int>({7,8,1,2,3,4,5,6}) ? "passed" : "FAILED")
<< "\n";
}

void test_one(std::vector<int> data,
std::function<void (std::vector<int>&)> sort)
{
std::vector<int> list(data);
sort(list);
check(list);
}

void test(std::initializer_list<int> data)
{
std::cout << "testing heap parts\n";
testHeap(data);

std::cout << "testing sorts\n";
test_one(data,insertionSort);
test_one(data,selectionSort);
test_one(data,mergeSortWrap);
test_one(data,inplaceMergeSortWrap);
test_one(data,quickSortWrap);
test_one(data,stableQuickSortWrap);
test_one(data,heapSortWrap);
}

int main()
{
std::cout << "testing rotate\n";
testRotate();

std::cout << "test checker\n";
std::vector<int> list1({6,7,8,8,6,5});
check(list1);

std::cout << "testing sets\n";
test({});
test({1});
test({1,1}); // sort breaker

test({1,2});
test({2,1});

test({6,7,8,8,6,5});

test({2,2,2,2,2,2,2,2,2,2,2,2,2}); // sort breaker...

test({3,5,2,6,8,3,2,6,5,3,4,8,7});
}



## Saturday, July 2, 2016

### google interview: uniform sampling from an infinite stream

//http://www.impactinterview.com/2009/10/140-google-interview-questions/#software_engineer
// You have a stream of infinite queries (ie: real time Google search queries
// that people are entering).  Describe how you would go about finding a good
// estimate of 1000 samples from this never ending set of data and then write
// code for it.

#include <iostream>
#include <iomanip>
#include <vector>
#include <ctime>

// Reservoir Sampling
// did it here http://code-slim-jim.blogspot.jp/2010/06/reservoir-sampling.html
// but wow 2010 is so long ago..

// ok so the explanation:
// you have a stream and you need to sample N "good" items from it..  presumably
// "good" means uniform here, so until you have the first N items you just take
// everything
//
// now the tricky part: when you get the (N+1)-th item, each item is supposed
// to have an N/(N+1) probability of being in the output, so the new item has
// an N/(N+1) chance of selection, and if it is selected it replaces a
// uniformly chosen existing item (each has a 1/N chance of being rejected)
//
// then we can generalize to the (N+x)-th sample.  when you get the (N+x)-th
// item, each item is supposed to have an N/(N+x) probability of being in the
// output, so the new item has an N/(N+x) chance of selection, and if selected
// it replaces a uniformly chosen existing item
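That claim can be sanity checked with a little induction arithmetic, writing n = N+x for the current step and assuming each earlier item was present with probability N/(n-1) after the previous step:

```
P(keep the new item at step n)  = N/n
P(an old item survives step n)  = P(present after step n-1)
                                  * (1 - P(keep new) * P(its slot is picked))
                                = N/(n-1) * (1 - (N/n)*(1/N))
                                = N/(n-1) * ((n-1)/n)
                                = N/n
```

So every item seen so far, old or new, sits in the reservoir with the same probability N/n.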

template <typename Type>
class ReservoirSample
{
public:
typedef std::vector<Type> Samples;

private:
std::size_t       size_;
std::size_t       count_;
std::vector<Type> samples_;

public:
ReservoirSample(unsigned int size) :
size_(size),
count_(0),
samples_()
{}

float rand()
{
return static_cast<float>(std::rand() % 1000) / 1000.0;
}

void sample(Type& item)
{
count_ += 1;
if (samples_.size() < size_)
{
samples_.push_back(item);
}
else
{
float chance = static_cast<float>(size_)/static_cast<float>(count_);

if (rand() < chance)
{
std::size_t idx = size_ * rand();
samples_[idx] = item;
}
}
}

std::vector<Type>& samples()
{
return samples_;
}
};

void test(int selection,
int streamSize)
{
// testing random stuff can be a pain.. you either de-random it
// or you repeat and confirm the expected distribution (uniform...)
const int cycles = 100000;
std::vector<int> freq(streamSize,0);

for (int j = 0; j < cycles; ++j)
{
ReservoirSample<int> sampler(selection);

for (int i = 0; i < streamSize; ++i)
{
sampler.sample(i);
}

const std::vector<int>& samples = sampler.samples();
for ( std::vector<int>::const_iterator sit = samples.begin();
sit != samples.end();
++sit)
{
++(freq[*sit]);
}
}

// each number will come up once in a cycle and "selection" are chosen out of "streamSize"
float expected
= (static_cast<float>(cycles)
* static_cast<float>(selection))
/ static_cast<float>(streamSize);

// and lets give it +/- 5% .. (which will fail now and then..)
float expectMin = 0.95 * expected;
float expectMax = 1.05 * expected;

for (int i = 0; i < streamSize; ++i)
{
bool passed = freq[i] > expectMin and freq[i] < expectMax;
std::cout << std::setw(3) << i << ":" << freq[i]
<< " " << (passed ? "passed" : "FAILED")
<< "\n";
}
std::cout << "\n";
}

int main()
{
std::srand(std::time(0));

test(1,30);
test(3,30);
test(7,30);
test(15,30);
}


0:3307 passed
1:3302 passed
2:3237 passed
3:3344 passed
4:3308 passed
5:3233 passed
6:3317 passed
7:3257 passed
8:3230 passed
9:3407 passed
10:3310 passed
11:3267 passed
12:3303 passed
13:3414 passed
14:3267 passed
15:3327 passed
16:3333 passed
17:3371 passed
18:3351 passed
19:3364 passed
20:3306 passed
21:3364 passed
22:3402 passed
23:3440 passed
24:3326 passed
25:3390 passed
26:3351 passed
27:3451 passed
28:3289 passed
29:3432 passed

0:10056 passed
1:9997 passed
2:10060 passed
3:10012 passed
4:9919 passed
5:10021 passed
6:9984 passed
7:9945 passed
8:9987 passed
9:10040 passed
10:9919 passed
11:9965 passed
12:9918 passed
13:9904 passed
14:9864 passed
15:9876 passed
16:10043 passed
17:9945 passed
18:10064 passed
19:10025 passed
20:9878 passed
21:10021 passed
22:10108 passed
23:9796 passed
24:10100 passed
25:10025 passed
26:10206 passed
27:10003 passed
28:10172 passed
29:10147 passed

0:23484 passed
1:23126 passed
2:23259 passed
3:23345 passed
4:23290 passed
5:23339 passed
6:23497 passed
7:23348 passed
8:23290 passed
9:23179 passed
10:23603 passed
11:23154 passed
12:23223 passed
13:23333 passed
14:23394 passed
15:23193 passed
16:23546 passed
17:23219 passed
18:23401 passed
19:23201 passed
20:23246 passed
21:23240 passed
22:23621 passed
23:23224 passed
24:23410 passed
25:23376 passed
26:23518 passed
27:23268 passed
28:23348 passed
29:23325 passed

0:49663 passed
1:49911 passed
2:50401 passed
3:49617 passed
4:49691 passed
5:50278 passed
6:49804 passed
7:49940 passed
8:50497 passed
9:49687 passed
10:49582 passed
11:50552 passed
12:49871 passed
13:49670 passed
14:50304 passed
15:50184 passed
16:49988 passed
17:49965 passed
18:50077 passed
19:49945 passed
20:50290 passed
21:50173 passed
22:49972 passed
23:50051 passed
24:49943 passed
25:50053 passed
26:50288 passed
27:49852 passed
28:50009 passed
29:49742 passed



### Google interview question: write an Excel col label to integer converter

// http://www.impactinterview.com/2009/10/140-google-interview-questions/#software_engineer
// Write a function (with helper functions if needed) called to Excel that takes an excel column value
// (A,B,C,D…AA,AB,AC,… AAA..) and returns a corresponding integer value (A=1,B=2,… AA=26..).

#include <iostream>
#include <stdexcept>
#include <string>

// the question names the function "toExcel" but it actually converts *from* Excel format to integers... ignoring that naming confusion, it is "fromExcel" here...
int fromExcel(std::string val)
{
// there is an oddity here (this is bijective base-26):
// "0A" would just be "A", but "A0" doesn't exist! note the +1 in the conversion line as a result

int out = 0;
for (std::string::iterator vit = val.begin();
vit != val.end();
++vit)
{
if (*vit < 'A' or *vit > 'Z')
throw std::runtime_error("doesn't look like an Excel col");
out = (out*26) + (*vit - 'A' + 1);
}

return out;
}

void test(std::string in, int exp)
{
int out = fromExcel(in);

std::cout << (out == exp ? "passed" : "FAILED")
<< " in:" << in
<< " out:" << out
<< " exp:" << exp
<< "\n";
}

int main()
{
try
{
test("012", 0);
std::cout << "FAILED: did not detect bad inputs\n";
}
catch (std::exception& e)
{
std::cout << "pass: throw check worked\n";
}

test(  "A",                     1);
test(  "Z",                    26);
test( "AA",             1*26 +  1);
test( "AZ",             1*26 + 26);
test( "BA",             2*26 +  1);
test( "ZZ",            26*26 + 26);
test("AAA", 1*26*26 +   1*26 +  1);
test("AZZ", 1*26*26 +  26*26 + 26);
test("ZAA",26*26*26 +   1*26 +  1);
test("ZZZ",26*26*26 +  26*26 + 26);
}


Output

pass: throw check worked
passed in:A out:1 exp:1
passed in:Z out:26 exp:26
passed in:AA out:27 exp:27
passed in:AZ out:52 exp:52
passed in:BA out:53 exp:53
passed in:ZZ out:702 exp:702
passed in:AAA out:703 exp:703
passed in:AZZ out:1378 exp:1378
passed in:ZAA out:17603 exp:17603
passed in:ZZZ out:18278 exp:18278


### Google interview questions: print matching chars in order of first string

// http://www.impactinterview.com/2009/10/140-google-interview-questions/#software_engineer
// Write a function f(a, b) which takes two character string arguments and returns a string containing
// only the characters found in both strings in the order of a. Write a version which is order N-squared
// and one which is order N.

#include <iostream>
#include <string>
#include <functional>
#include <vector>
#include <cstring>

std::string match_the_stupid_way(std::string a, std::string b)
{
// the silly way: for each char in string a O(n),
// scan through string b O(n); a double loop, hence O(n^2)

std::string res;

for (std::string::iterator ait = a.begin();
ait != a.end();
++ait)
{
// scan b for the first occurrence of this char
std::string::iterator bit = b.begin();
while (bit != b.end() and
*bit != *ait)
{
++bit;
}

if (bit != b.end())
res += *ait;
}
return res;
}

std::string match_the_fast_way(std::string a, std::string b)
{
// O(N) .. this implies reading over one string (which takes O(N)),
// digesting it into an O(1)-lookup data structure, and then checking over the other...

// well, an O(1)-lookup data structure is a hash or a table..
std::vector<bool> found_char(256,false);

for (std::string::iterator bit = b.begin();
bit != b.end();
++bit)
{
found_char[static_cast<unsigned char>(*bit)] = true; // cast: plain char may be signed
}

// and the second O(N) check is
std::string res;
for (std::string::iterator ait = a.begin();
ait != a.end();
++ait)
{
if (found_char[static_cast<unsigned char>(*ait)])
res += *ait;
}

return res;
}

void test(std::function<std::string (std::string,std::string)> func,
std::string a,
std::string b,
std::string exp)
{
std::string out = func(a,b);

std::cout << (out == exp ? "passed" : "FAILED")
<< " a:\""     << a
<< "\" b:\""   << b
<< "\" out:\"" << out
<< "\" exp:\"" << exp
<< "\"\n";
}

void test_set(std::function<std::string (std::string,std::string)> func)
{
test(func, "a", "cba", "a");
test(func, "aaa", "cba", "aaa");  // hmm question isn't clear on repeats.. assume it does repeats
test(func, "abcdefg", "gfedcba", "abcdefg");
test(func, "the quick brown fox", "peter piper picked a pepper", "te ick r ");
std::cout << "*** SET DONE ***\n";
}

int main()
{
test_set(match_the_stupid_way);
test_set(match_the_fast_way);
}


Output is

passed a:"a" b:"cba" out:"a" exp:"a"
passed a:"aaa" b:"cba" out:"aaa" exp:"aaa"
passed a:"abcdefg" b:"gfedcba" out:"abcdefg" exp:"abcdefg"
passed a:"the quick brown fox" b:"peter piper picked a pepper" out:"te ick r " exp:"te ick r "
*** SET DONE ***
passed a:"a" b:"cba" out:"a" exp:"a"
passed a:"aaa" b:"cba" out:"aaa" exp:"aaa"
passed a:"abcdefg" b:"gfedcba" out:"abcdefg" exp:"abcdefg"
passed a:"the quick brown fox" b:"peter piper picked a pepper" out:"te ick r " exp:"te ick r "
*** SET DONE ***


### google interview question: build a random generator

// http://www.impactinterview.com/2009/10/140-google-interview-questions/
// Given a function which produces a random integer in the
// range 1 to 5, write a function which produces a random
// integer in the range 1 to 7.

#include <cstring>
#include <cstdlib>
#include <iostream>
#include <functional>
#include <chrono>

// Note: forget the [1,5] range, that's just +1/-1 everywhere and a waste of time; let's go [0,4]

int rand5()
{
return std::rand() % 5;
}

// ok so the problem is that you need to combine rand5 calls to make rand7
// the question doesn't say anything about a *uniform* distribution so in theory
// you just call rand5 2 times and mod 7.. let's call this the weasel solution
// cause the question was badly written

int rand7_weasel()
{
return (rand5() + rand5()) % 7;
}

// ok so now the questioner has realized the mistake and corrected it to ask for
// a *uniform* rand7() generator
//
// The above is not uniform for the same reason that if you roll two 6-sided dice and add
// the rolls, the most common total is 7. Ie 7 has the most combos that add up to it.
//
// So let's simplify and assume for the moment we have a more natural problem:
// we can generate digits [0,9] uniformly and need to create a uniform number
// in [0,999].. well the answer is much clearer now: you generate 3 digits
// and use them as the digits of the final number
//
// The same idea works for [0,4]: we have 5 numbers so we simply use base 5
// instead of base 10. This way we can make a very large uniformly distributed number
//
// once we have a very large range of uniformly distributed numbers we can take it
// modulo a much smaller number and the binning error will be negligible

// let's minimize the binning error by making the number as large as possible (let it overflow once)
//
// 2^32 = 5^N
// 32 log 2 = N log 5
// N = 13.78 .. so round up to 14 base-5 digits

int rand7_overkill()
{
unsigned int sum = 0;
for (int i = 0; i < 14; i++)
sum = (sum*5) + rand5();

return sum % 7;
}

// of course if you know your maths the central limit theorem says that for most distributions,
// when you take many independent samples and sum them, the resulting distribution will
// approach another stable one.. for uniform distributions that's the normal distribution.
//  refer to: http://demonstrations.wolfram.com/CentralLimitTheoremForTheContinuousUniformDistribution/
//
// Then add to this the fact that a small enough modulus on a normal distribution (much
// much less than its variance) will make the resulting distribution look somewhat uniform
//  ie the wrapped normal distribution as sigma^2 approaches infinity
//  refer to: https://en.wikipedia.org/wiki/Wrapped_normal_distribution
//
// So technically this works.. but a smart interviewer is going to ask you why it works.. and
// you're likely to choke on it.. (i know the maths would kill me during an interview)

int rand7_central_limit_weasel()
{
unsigned int sum = 0;
for (int i = 0; i < 14; i++)
sum += rand5();

return sum % 7;
}

// ok so now we have a working solution.. but guess what, you're interviewing for google..
// and they want to curve ball it.. so they ask the following.. can you improve the
// performance?.. looping 14 times is too much.. a very "innocent" question
//
// now you might think ok, well just reduce the number of iterations and that does it..
// but that is not really going to work well
//
// So here is the trick.. the real question is about testing your knowledge of pseudo
// random generators.. so just build one.. now a pseudo random generator isn't anything
// special but you have to hack one out in an interview and i don't have primes, pi and
// what not memorized by heart...
//
// What i do know is how to generate a uniform distribution that ranges over [0,infinity)..
// so all you have to do is make one big continuous rolling uniform number from the above code
// and then sample from it as needed..
//
// now this does raise some questions about the independence of the samples... but it's faster...

int rand7_pseudo()
{
// inject a random sample to seed the state
static unsigned int sum = rand5();

// roll the new sample into the prior uniform sum...
sum = (sum*5) + rand5();

return sum % 7;
}

void test(std::function<int()> randy)
{
int freq[7] = {0};
std::memset(freq, 0, sizeof(freq));

std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();

for (int i = 0; i < 100000; ++i)
{
++freq[randy()];
}

std::chrono::steady_clock::time_point finish = std::chrono::steady_clock::now();

double elapsed_seconds = std::chrono::duration_cast<
std::chrono::duration<double> >(finish - start).count();

for (int i=0;i<7;++i)
{
std::cout << "\n " << i << ":" << freq[i];
}
std::cout << "\n speed:" << elapsed_seconds
<< "\n";
}

int main()
{
test(rand7_weasel);
test(rand7_overkill);

test(rand7_central_limit_weasel);
test(rand7_pseudo);
}


and output looks like this:

0:11949
1:12073
2:11885
3:16060
4:19993
5:16024
6:12016
speed:0.0190048

0:14130
1:14541
2:14177
3:14296
4:14157
5:14400
6:14299
speed:0.056561

0:14264
1:14538
2:14425
3:14064
4:14476
5:14076
6:14157
speed:0.0409873

0:14272
1:14230
2:14308
3:14110
4:14214
5:14422
6:14444
speed:0.0131068