The Coding Slim Jim: 2016

Wednesday, October 26, 2016

analytic regression in python using a parameterized family of functions

Im have been studying ML(machine learning). So here is the deal.. ML is just applied statics. You take a bunch of data and you fit a model to it. The data affords you a certain upper limit of accuracy compared to the real world. Your chosen model then subtracts a certain percentage off the top. The cost of abstraction is an imperfect fit. Simple.

So one of the primary rules of thumb when working with this stuff is start simple. You need to digest the dataset and then make incremental steps forward... its easy however to be making steps backwards without knowing it.

Now regression is just curve fitting. The simplest way to do this is just a direct analytic solution. no gradient descent, nothing more complex, but this approach has limits that quickly become clear when dealing with larger datasets.

So lets solve analytic regression for a parameterized family of functions and then codify it... you'll note i have some quirks to the way i do things.. i try to avoid matrix maths since i think it hides too much and fails to readily when you get beyond 2d axis of data

Define the model:

\hat y_i := \sum_k w_k f(x_i;k)

ReferenceError: katex is not defined

Define the loss:

MSE = \sum_i(\hat y_i - y_i)^2

ReferenceError: katex is not defined

Set target (when mse gradient is 0 the error is at its min for w of 0 to n):

\frac{\partial MSE}{\partial w} = 0

ReferenceError: katex is not defined

Sub the model into the target equation for each "n" derviate term:

= \frac{\partial}{\partial w_n} \Big( \sum_i( \sum_k w_k f(x_i;k) - y_i)^2 \Big)

ReferenceError: katex is not defined

Expand square (note not expanding the sum of k treat it as a single term):

= \frac{\partial}{\partial w_n} \Big( \sum_i\big( (\sum_k w_k f(x_i;k))^2 -2 y_i \sum_k w_k f(x_i;k) + y_i^2 \big) \Big)

ReferenceError: katex is not defined

Transfer outer sum to terms:

= \frac{\partial}{\partial w_n} \Big( \sum_i(\sum_k w_k f(x_i;k))^2 - \sum_i 2 y_i \sum_k w_k f(x_i;k) + \sum_iy_i^2 \Big)

ReferenceError: katex is not defined

Transfer derivative to terms and expand/rebase squares:

= \frac{\partial}{\partial w_n} \sum_i(\sum_k w_k f(x_i;k))(\sum_j w_j f(x_i;j)) - \frac{\partial}{\partial w_n} \sum_i \sum_k 2 y_i w_k f(x_i;k) + \frac{\partial}{\partial w_n} \sum_iy_i^2

ReferenceError: katex is not defined

Do simple derivatives to the 2 right most terms:
* Note that the partial derivative of a sum drops all terms unless n=k
* Note that sum of y's looks like a constant and the derivate of a constant is 0

= \sum_i\frac{\partial}{\partial w_n} (\sum_k w_k f(x_i;k))(\sum_j w_j f(x_i;j)) - \sum_i 2 y_i f(x_i;n)

ReferenceError: katex is not defined

Start applying the derivative product rule:

= \sum_i\sum_k w_k f(x_i;k) \frac{\partial}{\partial w_n}(\sum_j w_j f(x_j;j)) + \sum_i\sum_j w_j f(x_j;j) \frac{\partial}{\partial w_n}(\sum_k w_k f(x_i;k)) - \sum_i 2 y_i f(x_i;n)

ReferenceError: katex is not defined

Continue the derivative product rules:

= \sum_i\sum_k w_k f(x_i;k) f(x_i;n) + \sum_i\sum_j w_j f(x_i;j) f(x_i;n)) - \sum_i 2 y_i f(x_i;n)

ReferenceError: katex is not defined

Rebase vars into common ones:

= \sum_i\sum_j w_j f(x_i;j) f(x_i;n) + \sum_i\sum_j w_j f(x_i;j) f(x_i;n)) - \sum_i 2 y_i f(x_i;n)

ReferenceError: katex is not defined

Simply:

= 2 \sum_i\sum_j w_j f(x_i;j) f(x_i;n) - 2 \sum_i y_i f(x_i;n)

ReferenceError: katex is not defined

Equate for a linear solver implementation (recall there are now "n" of these equations):

\sum_i\sum_j w_j f(x_i;j) f(x_i;n) = \sum_i y_i f(x_i;n)

ReferenceError: katex is not defined

Now the above maths when codified produces the following code:

#!/usr/bin/env python

# analytic regression for a general linear combination of functions

import random
import numpy as np
import matplotlib.pyplot as plt

# convert the maths into a function
# note this is likely weak in respect to:
#  - overflow (because we are summing numbers)
#  - large amounts of data (no batching etc)
#   etc
def analytic_regress(x,y,params,func):
    size = len(params)
    left  = np.zeros((size,size))
    right = np.zeros(size)

    for i in range(size):
        right[i] = np.sum(y*func(x,params[i]))

        for j in range(size):
            left[i,j] = np.sum(func(x,params[i])*func(x,params[j]))

    w = np.linalg.solve(left, right)

    model = lambda x: (np.sum(w * np.array([func(x,i) for i in params])))
    return w, model

### testing ###

def generate(func,
             x_min, x_max,
             yerr,
             count):
    x = np.random.uniform(x_min, x_max, count)
    y = np.array([func(xi) for xi in x]) + np.random.uniform(0,yerr,count)

    return x,y

# basis functions to trial
def ploy(x,n):
    return x**n

def cossin(x,n):
    if (n < 0):
        return np.sin(-n * x)
    return np.cos(n * x)

def sigmoid(x,n):
    return 1.0/(1.0 + np.exp(n-x))

def guassian(x,n):
    return np.exp(-0.5*(x-n)**2)

def square(x,n,w):
    return (1.0*np.logical_and(x >= n, x < n+w))

# a general function to trial a few regressions
def test_set(func,
             x_min, x_max, y_error, count,
             basis, params,
             x_test):

    x,y = generate(func, x_min, x_max, y_error, count)

    models = {}
    for p in params:
        w, models[p]  = analytic_regress(x,y,range(p),basis)
        print "model basis up to:", p, " has w:"
        print w

    y_test = {}
    for p in params:
        model = models[p]
        y_test[p]  = np.array([model(i) for i in x_test])

    plt.plot(x , y , "o")
    for p in params:
        plt.plot(x_test, y_test[p])

    plt.show()

# a general funtion to trial a single regression
def test_one(func,
             x_min, x_max, y_error, count,
             basis, params,
             x_test):
    x,y = generate(func, x_min, x_max, y_error, count)

    w, model = analytic_regress(x,y, params, basis)
    print w

    y_test = np.array([model(i) for i in x_test])

    plt.plot(x , y , "o")
    plt.plot(x_test, y_test)
    plt.show()

print " ploy regress for function y = 78 - 2x"
test_set(lambda x: (78.0 - 2.0*x),
         0.0, 6.0, 6, 100,
         ploy, [2,3,5,9,11],
         np.array(range(-1, 70))/10.0)

print " ploy regress for function y = 78 + 60x - 13x^2"
test_set(lambda x: (78.0 + 60.0*x - 13.0*x**2),
         0.0, 6.0, 6, 100,
         ploy, [2,3,5,9,11],
         np.array(range(-2, 72))/10.0)

print "square regress for function y = 1 iff 1 < x < 6 othewise 0"
test_one(lambda x: (0 if (x < 1 or x > 6) else 1),
         0.0, 7.0, 0.4, 100,
         lambda x,n: (square(x,n,0.5)), np.array(range(14))/2.0,
         np.array(range(-50, 150))/10.0)

print "square regress for function y = 10*cos(2x)"
test_one(lambda x: 10*np.cos(2.0*x),
         0.0, 7.0, 0.4, 100,
         lambda x,n: (square(x,n,0.5)), np.array(range(14))/2.0,
         np.array(range(-50, 150))/10.0)

print "sigmod regress for function y = 10*cos(2x)"
test_one(lambda x: 10*np.cos(2.0*x),
         0.0, 7.0, 0.4, 100,
         sigmoid, np.array(range(14))/2.0,
         np.array(range(-50, 150))/10.0)

print "guassian regress for function y = 10*cos(2x)"
test_one(lambda x: 10*np.cos(2.0*x),
         0.0, 7.0, 0.4, 100,
         guassian, np.array(range(14))/2.0,
         np.array(range(-50, 150))/10.0)

print "cos/sin regress for function y = 10*cos(2x)"
test_one(lambda x: 10*np.cos(2.0*x),
         0.0, 7.0, 0.4, 100,
         cossin, np.array(range(-6,8))/2.0,
         np.array(range(-50, 150))/10.0)

And then the various outputs...

The regression using using a family of polynomials on a linear equation:

The regression using using a family of polynomials on a quadratic:

The regression using square bars on a square bar:

The regression using square bars on a cos:

The regression using guassians on a cos:

The regression using sigmoids on a cos:

The regression using sin and cos on a cos:

Find the n shortest paths from a to b

// compile with: g++ --std=c++11 multipath.cpp 
//
// Problem Statement:
//  find the n shortest paths from a to b

// Answer:
//  The proposer of the question seemed to think this was hard
//  and even stated many more limitations to the question(no cycles etc)..
//  but i took it as dead easy. Since this inst a challenge of efficiency
//  ill stop at the first pass naive recursive answer..

#include <utility>
#include <stdint.h>
#include <vector>
#include <map>
#include <memory>
#include <iostream>

struct Node
{
    uint32_t id_;
    std::vector<std::pair<Node*,uint32_t> > kids_;

    Node(uint32_t id) :
        id_(id)
    {}
};

struct Graph
{
    // meh should be better..
    std::map<uint32_t, std::shared_ptr<Node> > nodes_;

    Node* get(uint32_t id)
    {
        Node* n = nodes_[id].get();

        if (n == NULL)
        {
            nodes_[id] = std::make_shared<Node>(id);
            n = nodes_[id].get();
        }

        return n;
    }

    void link(uint32_t fromId,
              uint32_t toId,
              uint32_t weight)
    {
        Node* from = nodes_[fromId].get();
        if (from == NULL)
        {
            nodes_[fromId] = std::make_shared<Node>(fromId);
            from = nodes_[fromId].get();
        }

        Node* to = nodes_[toId].get();
        if (to == NULL)
        {
            nodes_[toId] = std::make_shared<Node>(toId);
            to = nodes_[toId].get();
        }

        from->kids_.emplace_back(std::make_pair(to,weight));
    }
};

struct PathNode
{
    std::shared_ptr<PathNode> parent_;
    Node*      node_;
    uint32_t   cost_;

    PathNode(std::shared_ptr<PathNode> parent,
             Node*      node,
             uint32_t   cost) :
        parent_(parent),
        node_(node),
        cost_(cost)
    {}
};

std::ostream& operator<<(std::ostream& os,
                         const PathNode& path)
{
    // inverted.. but it meets my needs
    if (path.parent_.get() != NULL)
    {
        const PathNode& p = *(path.parent_);
        os << p;
        os << "-";
    }

    if (path.node_ != NULL)
        os << path.node_->id_;

    return os;
}


// ************ NAIVE APPROACH ***********
// WARNING do be careful here.. there are some large weaknesses
// in the Naive version:
//  - negative weights will break the shortest first ordering
//  - cycles that trap the search in disconnected areas will never terminate
//
// A stronger algo would fix this by checking reachability from the node and
// pruning the impossible branches from the tree (in fact that would be my next
// step in optimization as it also saves cycles.. but im here for a different
// target so not going there.. a more simple fix would be to add a cycle
// limit and terminate on that..)

std::vector<std::shared_ptr<PathNode> > findPaths(Node*    from,
                                                  Node*    to,
                                                  uint32_t limit)
{
    std::vector<std::shared_ptr<PathNode> > results;
    auto comp = [] (const std::shared_ptr<PathNode>& a,
                    const std::shared_ptr<PathNode>& b) -> bool
        { return a->cost_ > b->cost_; };

    // I was using a std::priority_queue.. problem is i
    // cant print it to demonstrate the stack action to
    // the question proposer
    std::vector<std::shared_ptr<PathNode> > stack;

    // ok start the algo by adding the *from* node into a path.. and stacking it
    stack.push_back(std::make_shared<PathNode>(nullptr,
                                               from,
                                               0));

    while (results.size() < limit and not stack.empty())
    {
        // show the stack...
        // std::cout << "  stack: \n";
        // for (const std::shared_ptr<PathNode>& path : stack)
        //     std::cout << " -- c:" << path->cost_ << " p:" << *path << "\n";

        std::pop_heap(stack.begin(), stack.end(), comp);
        std::shared_ptr<PathNode> current = stack.back();
        stack.pop_back();

        // check if node was at the terminal
        if (current->node_ == to)
        {
            // register it as the next result
            results.push_back(current);

            // and end if we have our "limit" of solutions
            if (results.size() > limit)
                return results;
        }

        // and then expand search tree using the current node
        for (const std::pair<Node*,uint32_t>& edge : current->node_->kids_)
        {
            std::shared_ptr<PathNode> step =
                std::make_shared<PathNode>(current,
                                           edge.first,
                                           current->cost_ + edge.second);

            // now here is a tricky part.. the next shortest route may
            // include the current one even if it is a *terminal* (ie self loop)
            // hence we *always* push even if its the *end*!
            stack.push_back(step);
            std::push_heap(stack.begin(), stack.end(), comp);
        }
    }

    // ok we fell of the end of the stack that means there are not "limit" solutions
    return results;
}

void printPaths(std::ostream& os,
                const std::vector<std::shared_ptr<PathNode> >& paths)
{
    os << "Results...\n";
    for (const std::shared_ptr<PathNode>& path : paths)
    {
        os << "d:" << path->cost_ << " p:" << *path << "\n";
    }
}

void test()
{
    {
        // warning naive algo death - dont do this with a cycle it will never terminate
        Graph disconnected;

        Node* to   = disconnected.get(0);
        Node* from = disconnected.get(1);

        printPaths(std::cout, findPaths(from,to,3));
    }

    {
        // self
        Graph self;
        self.link(0,0,1);

        Node* zero = self.get(0);

        printPaths(std::cout, findPaths(zero,zero,3));
    }

    {
        // cycle
        Graph cycle;
        cycle.link(0,1,1);
        cycle.link(1,0,1);

        Node* from = cycle.get(0);
        Node* to   = cycle.get(1);

        printPaths(std::cout, findPaths(from,to,3));
    }

    {
        // a line of nodes ( asking for 3 anwers.. impossible)
        Graph oneLine;
        oneLine.link(0,1,1);
        oneLine.link(1,2,1);
        oneLine.link(2,3,1);

        Node* from = oneLine.get(0);
        Node* to   = oneLine.get(3);

        printPaths(std::cout, findPaths(from,to,3));
    }

    {
        // 2 lines of nodes ( asking for 3 anwers.. impossible)
        Graph twoLine;
        twoLine.link(0,1,1);
        twoLine.link(1,2,1);
        twoLine.link(2,3,1);
        twoLine.link(0,4,2);
        twoLine.link(4,3,2);

        Node* from = twoLine.get(0);
        Node* to   = twoLine.get(3);

        printPaths(std::cout, findPaths(from,to,3));
    }

    {
        // the questioners challenge graph
        Graph triangle;
        triangle.link(1,2,1);
        triangle.link(1,3,2);

        triangle.link(2,1,1);
        triangle.link(2,3,3);
        triangle.link(2,4,2);
        triangle.link(2,5,1);

        triangle.link(3,1,2);
        triangle.link(3,2,3);
        triangle.link(3,5,2);
        triangle.link(3,6,1);

        triangle.link(4,2,2);
        triangle.link(4,5,3);

        triangle.link(5,2,1);
        triangle.link(5,3,2);
        triangle.link(5,4,3);
        triangle.link(5,6,3);

        triangle.link(6,5,3);
        triangle.link(6,3,1);

        Node* from = triangle.get(1);
        Node* to   = triangle.get(6);

        printPaths(std::cout, findPaths(from,to,10));
    }
}

int main()
{
    test();
}

And output will look like this:

Results...
Results...
d:0 p:0
d:1 p:0-0
d:2 p:0-0-0
Results...
d:1 p:0-1
d:3 p:0-1-0-1
d:5 p:0-1-0-1-0-1
Results...
d:3 p:0-1-2-3
Results...
d:3 p:0-1-2-3
d:4 p:0-4-3
Results...
d:3 p:1-3-6
d:5 p:1-2-3-6
d:5 p:1-2-1-3-6
d:5 p:1-2-5-6
d:5 p:1-3-6-3-6
d:5 p:1-2-5-3-6
d:7 p:1-2-1-2-5-6
d:7 p:1-2-1-2-1-3-6
d:7 p:1-2-3-6-3-6
d:7 p:1-2-5-2-1-3-6

Wednesday, October 5, 2016

Adding math rendering via katex to blogger

First edit the Html version of the blogs template and add the following code before the end of the body

<!--KATEX RENDER BEGIN -->
<link href="https://khan.github.io/KaTeX/bower_components/katex/dist/katex.min.css" type="text/css" rel="stylesheet"/>
<script src="https://khan.github.io/KaTeX/bower_components/katex/dist/katex.min.js" type="text/javascript"></script>
<script language='javascript'>
//<![CDATA[ 
function renderLatex()
{
    var locs = document.getElementsByTagName("pre");
    for (var x = 0; x < locs.length; x++)
    {
        if (locs[x].className == "render:latex")
        {
            var eq =  locs[x].innerHTML;
            var div=document.createElement("div");
            div.className = "latexRendered";
            locs[x].parentNode.insertBefore(div,locs[x].nextSibling);

            try
            {
                katex.render("\\displaystyle{" + eq + "}", div);
                locs[x].className = "done:latex";
                locs[x].style = "display:none";
            }
            catch (err)
            {
                div.innerHTML = err;
                locs[x].className = "error:latex";
            }
        }
    }
}

renderLatex();
//]]>
</script>
<!-- KATEX RENDER END -->

You then blog using html tags like the following

<pre class="render:latex">y_i = \sum_iw_ix_i</pre>

And it should render like this

y_i = \sum_iw_ix_i

ReferenceError: katex is not defined

Thursday, September 22, 2016

intel drm driver issue and disk slowness fix in ubuntu / xbuntu 16

If you are running xubuntu of ubuntu and have a intel graphics card then the drivers are a bit broken.. This causes general slowness and the apperance of constant disk accesses and it seems to show most often after suspension wakeup.

You can confirm the issue by tailing dmesg:

dmesg -w

Your watching for this kind of message:

[20474.916101] ------------[ cut here ]------------
[20474.916133] WARNING: CPU: 3 PID: 999 at /build/linux-a2WvEb/linux-4.4.0/drivers/gpu/drm/drm_irq.c:1326 drm_wait_one_vblank+0x1b5/0x1c0 [drm]()
[20474.916135] vblank wait timed out on crtc 0
[20474.916136] Modules linked in: uas usb_storage nls_iso8859_1 mmc_block udf crc_itu_t drbg ansi_cprng ctr ccm i2400m_usb i2400m wimax intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp arc4 iwldvm uvcvideo kvm_intel kvm videobuf2_vmalloc videobuf2_memops mac80211 videobuf2_v4l2 videobuf2_core irqbypass v4l2_common videodev crct10dif_pclmul crc32_pclmul iwlwifi aesni_intel media snd_hda_codec_hdmi aes_x86_64 snd_hda_codec_realtek lrw snd_hda_codec_generic gf128mul snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq joydev glue_helper ablk_helper cryptd input_leds serio_raw snd_seq_device toshiba_acpi snd_timer sparse_keymap cfg80211 snd soundcore toshiba_bluetooth wmi toshiba_haps mei_me mei lpc_ich shpchp mac_hid tpm_infineon ip6t_REJECT
[20474.916188]  nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables parport_pc ppdev lp parport autofs4 i915 psmouse i2c_algo_bit drm_kms_helper ahci syscopyarea sysfillrect libahci sysimgblt fb_sys_fops drm sdhci_pci e1000e sdhci ptp pps_core video fjes
[20474.916224] CPU: 3 PID: 999 Comm: Xorg Not tainted 4.4.0-36-generic #55-Ubuntu
[20474.916226] Hardware name: TOSHIBA dynabook R731/36EB/Portable PC, BIOS Version 3.60   01/24/2012
[20474.916228]  0000000000000286 00000000aa071208 ffff8800a46dbb08 ffffffff813f13b3
[20474.916231]  ffff8800a46dbb50 ffffffffc00cfb38 ffff8800a46dbb40 ffffffff810810f2
[20474.916234]  ffff88014a605000 0000000000000000 0000000000000000 00000000000e0368
[20474.916237] Call Trace:
[20474.916245]  [] dump_stack+0x63/0x90
[20474.916251]  [] warn_slowpath_common+0x82/0xc0
[20474.916254]  [] warn_slowpath_fmt+0x5c/0x80
[20474.916259]  [] ? finish_wait+0x55/0x70
[20474.916272]  [] drm_wait_one_vblank+0x1b5/0x1c0 [drm]
[20474.916276]  [] ? wake_atomic_t_function+0x60/0x60
[20474.916311]  [] intel_atomic_commit+0x43a/0x6f0 [i915]
[20474.916329]  [] ? drm_atomic_set_crtc_for_connector+0x6f/0xe0 [drm]
[20474.916346]  [] drm_atomic_commit+0x37/0x60 [drm]
[20474.916359]  [] drm_atomic_helper_set_config+0x76/0xb0 [drm_kms_helper]
[20474.916374]  [] drm_mode_set_config_internal+0x62/0x100 [drm]
[20474.916389]  [] drm_mode_setcrtc+0x3cc/0x4f0 [drm]
[20474.916402]  [] drm_ioctl+0x152/0x540 [drm]
[20474.916417]  [] ? drm_mode_setplane+0x1b0/0x1b0 [drm]
[20474.916421]  [] do_vfs_ioctl+0x29f/0x490
[20474.916424]  [] ? __sb_end_write+0x21/0x30
[20474.916428]  [] ? vfs_write+0x15d/0x1a0
[20474.916430]  [] SyS_ioctl+0x79/0x90
[20474.916434]  [] entry_SYSCALL_64_fastpath+0x16/0x71
[20474.916437] ---[ end trace a5e017619a02b625 ]---

If your getting hung by it and dont what to shutdown you can get your system moving again by doing this:

sudo pm-suspend

The full solution is the removal of the offending driver:

sudo apt-get remove xserver-xorg-video-intel

Rasberry pi samba and git server setup

Purchase and assumable physical items

* model 3 pi + case
* 32gb mirco sd card and adapter
* usb portable disk
* a usb hub (the pi lacks sufficient power output to drive a usb powered hdd)

Setup Jesse on the micro sd

Note more info at: https://www.raspberrypi.org/documentation/installation/installing-images/

1) Download the RASPBIAN JESSIE LITE .img
https://www.raspberrypi.org/downloads/raspbian/

2) Get sd card mount point

df -h | egrep "mmc|sdd"

/dev/mmcblk0p1   30G   32K   30G   1% /media/kage/3535-3133

3) umount so the card so it can be imaged

umount /dev/mmcblk0p1

4) write img to sd card

sudo dd bs=4M if=~/Downloads/2016-05-27-raspbian-jessie-lite.img of=/dev/mmcblk0
sync

5) correct sd card partition back to max size

sudo parted /dev/mmcblk0

(parted) p free

Model: SD SL32G (sd/mmc)
Disk /dev/mmcblk0: 31.9GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type     File system  Flags
32.3kB  4194kB  4162kB           Free Space
1      4194kB  70.3MB  66.1MB  primary  fat16        lba
2      70.3MB  31.9GB  31.8GB  primary  ext4
31.9GB  31.9GB  15.0MB           Free Space

(parted) resizepart 2 32G                                              
(parted) p free

Model: SD SL32G (sd/mmc)
Disk /dev/mmcblk0: 31.9GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type     File system  Flags
32.3kB  4194kB  4162kB           Free Space
1      4194kB  70.3MB  66.1MB  primary  fat16        lba
2      70.3MB  31.9GB  31.8GB  primary  ext4

6) Eject card and reinsert. confirm mounting and sizes are ok

setup passwordless ssh for pi user (assuming the machine your on is the accessor)

1) cd to the mounted sd card and into the pi home dir

cd /media/sd-card-uuid/
pushd .
cd home/pi

2) setup .ssh with the working machines publish ssh info

mkdir .ssh
chmod 0700 .ssh
touch .ssh/authorized_keys
chmod 600 .ssh/authorized_keys
cat ~/.ssh/id_rsa.pub  >> .ssh/authorized_keys

setup static ip

1) cd to the mounted sd card and into the /etc/networks area

popd .

2) modify the interface setup

sudo vi etc/network/interface

3) setup a static hard line ip im using 192.168.1.175 feel free to adjust as needed

iface eth0 inet static
   address 192.168.1.175
   netmask 255.255.255.0
   gateway 192.168.1.1
   network 192.168.1.0
   broadcast 192.168.1.255

3b) OR setup the wifi (i would not advise this)

sudo vi etc/wpa_supplicant/wpa_supplicant.conf

# add this at the end 
network={
  ssid="network id"
  psk="network password"
}

finish up raw img adjustments

1) sync and umount card and then eject the card

sync
umount /dev/mmcblk0p1

2) Install sd card in the pi3. You insert the micro card into the card reader on the underside of the pi3 board, face out.
Power it up and check that it seems to booted

3) Now in back your work machine find out where it went

nmap -sP 192.168.1.0/24

4) If it booted at the correct address then great otherwise check your router setups

check access and secure the default password

1) ssh into the pi. Note the default password is "raspberry" but you shouldnt need it your ssh key is the access method.

ssh pi@192.168.1.175

2) Scramble the default password (I use a software vault, keepassx, to generate and store passwords) this way if someone gets in they cant just sudo with the default password and get root as well

passwd

clean up and secure the ssh access

1) Its an image so /etc/ssh identity files are already setup.. thats not very healthy lets rebuild them

sudo rm -f /etc/ssh/*key*
sudo dpkg-reconfigure openssh-server

2) Confirm the ssh is ok (for chmods etc)

ls -al ~/.ssh/authorized_keys

3) Now secure the ssh access to keys only

sudo vi  /etc/ssh/sshd_config

4) Edit/add the following lines

PasswordAuthentication no
 AllowTcpForwarding no
 X11Forwarding no

5) And confirm that ssh still works (with a second window).. Note if you make a mistake you can always turn off the pi, eject the sd card and edit it directly in your working PC.

update machine and fix various other image guff

1) first update the images software.

sudo apt-get update
sudo apt-get upgrade
sudo apt-get dist-upgrade

2) after that I seemed to be getting this mess with the locale being bad..

perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),

3) so fix it with:

sudo dpkg-reconfigure locales

Setup usb disk

for more info refer to: http://www.howtogeek.com/139433/how-to-turn-a-raspberry-pi-into-a-low-power-network-storage-device/

1) find out what usb disk we have attached

sudo blkid

/dev/sda1: LABEL="EC-PHU3" UUID="2E24054E24051B0B" TYPE="ntfs" PARTUUID="8418c874-01"

2) Right looks like a ntfs (but it could have also been vfat etc). So setup ntfs drivers

sudo apt-get install ntfs-3g

3) make the mount point

sudo mkdir       /media/hdd
sudo chmod a+rwx /media/hdd

4) And check that it can actually mount

sudo mount -t auto /dev/sda1 /media/hdd

4a) ok.. stupid things are happening

modprobe: ERROR: ../libkmod/libkmod.c:557 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.11-v7+/modules.dep.bin'
ntfs-3g-mount: fuse device is missing, try 'modprobe fuse' as root

4b) well thats because.. it has the wrong /lib/modules version installed!!.. wtf!

ls /lib/modules/4.4.13-v7+/

4c) So after googling for a while turns out there is a surprising fix... most likely it was something in that first apt-get update cycle caused all the modules to move forward!

sudo reboot

pi@raspberrypi:/lib/modules $ uname -a
Linux raspberrypi 4.4.13-v7+ #894 SMP Mon Jun 13 13:13:27 BST 2016 armv7l GNU/Linux

4d) But something crazy happened to the ntfs-3g had to install it again??? no idea why..

sudo apt-get install ntfs-3g

5) now mount and test access

sudo mount -t ntfs-3g -o uid=pi,gid=pi /dev/sda1 /media/hdd
cd /media/hdd
touch abc.txt
ls 
rm /media/hdd/abc.txt

6) if that worked and u can see the test file move on otherwise start googling..

Setup usb disk at permanent location

1) now we are going to make the usb disk a permanent fixture, umount it

sudo umount /media/hdd

2) update /etc/fstab. We are going make all files owned by user pi and group users, read and writable
extra info at
* https://www.howtoforge.com/reducing-disk-io-by-mounting-partitions-with-noatime
* http://raspi.tv/2012/how-to-mount-and-use-a-usb-hard-disk-with-the-raspberry-pi
* http://www.omaroid.com/fstab-permission-masks-explained/
* note nobootwait doesnt work???

sudo vi etc/fstab

2a) and add the following line

/dev/sda1       /media/hdd    ntfs-3g noatime,umask=0,uid=pi,gid=users,dmask=0007,fmask=0117 0 0

2b) or you can use something like:

/dev/sda1 /media/hdd vfat uid=pi,gid=users,umask=0022,sync,auto,nosuid,rw,nouser 0 0

3) and then mount it

sudo mount /dev/sda1

4) test access

touch /media/hdd/abc.txt
rm /media/hdd/abc.txt

setup samba for remote share

1) ok build samba share location...

mkdir /media/hdd/share

2) install samba

sudo apt-get install samba samba-common-bin libpam-smbpass

3) Configure samba

sudo cp /etc/samba/smb.conf /etc/samba/smb.conf.20160922
sudo vi /etc/samba/smb.conf

4) insert or uncomment the following in the Authorization section

security = user
unix password sync = yes

5) comment out homes,printers sections.. its junk we dont want or need

6) insert "share" section

[Share]
  comment = 1TB_samba
  path = /media/hdd/share
  valid users = remote
  read only = No

7) bounce the samba service

sudo /etc/init.d/samba restart

8) check its setup

testparm -s

9) now in the working pc with what ever file manager u use "browse" your network and locate the pi likely called "raspberrypi" open it and confirm there are no "printers" stuff and "share" is visible

10) try to access the "share" using "guest" access, then as the user "pi" with the default password, your new password and confirm all of this has no access

11) back in the pi. Make the "remote" user and passwd it. Again password vault scramble a password for it.. but make certain its type-able
For more info refer to http://raspi.tv/2012/how-to-create-a-new-user-on-raspberry-pi

sudo adduser remote
sudo usermod -a -G users remote
groups remote
sudo smbpasswd -e remote

12) then bounce samba

sudo /etc/init.d/samba restart

13) then back in the working PC retest the samba share using the user "remote" and there unix password and confirm file/dir create/delete etc. Also check the guest and "pi" users again to be certain..

14) then reboot and check mount stays put and samba works over the reboot

sudo reboot

Setup the git ssh server

1) make the repos storage location

mkdir /media/hdd/repos

2) install git

sudo apt-get install git-core

3) add a git user

sudo adduser git
sudo usermod -a -G users git
groups git

4) switch over to the new "git" user

#cat *pi* users access keys first so u can copy it later for step 6
cat .ssh/authorized_keys

su git
cd ~

5) confirm sudo limits.. (should fail)

sudo ls

6) setup the "git" users ssh access

mkdir .ssh
chmod 0700 .ssh
touch .ssh/authorized_keys
chmod 600 .ssh/authorized_keys
vi ~/.ssh/authorized_keys

7) from your working pc confirm ssh access

ssh git@192.168.1.175

8) link the repos storage to the git users home

ln -s /media/hdd/repos repos

9) create a test repo

cd ~/repos 
git init --bare test.git

9a) Note ignore chmod errors (we have it forced to a certian way in fstab)

error: chmod on /media/hdd/repos/test.git/config.lock failed: Operation not permitted
error: chmod on /media/hdd/repos/test.git/config.lock failed: Operation not permitted

Note to self. There might be an issue with executable scripts loosing x permission due to this...may have to rethink this.

10) Then on your working machine clone and check that the repo worked

git clone git@:~/repos/test.git 
cd test/
ls
touch test.txt
git add test.txt 
git commit -a -m "Initial"
git push

11) then reclone and confirm test.txt is there in the new clone

cd ..
git clone git@192.168.1.4:~/repos/test.git test2
cd test2
ls

setup automatic updating

now we know all the basics are working... lets harden it a bit

1) setup auto updates
for more info see https://unixsysdoc.wordpress.com/2015/05/18/xubuntu-security-and-hardening-part-i-basics/

sudo apt-get install unattended-upgrades
sudo dpkg-reconfigure -plow unattended-upgrades

disable needless hardware

1) disable wifi and bluetooth (as im using the hard line)

sudo touch /etc/modprobe.d/disable_rpi3_wifi_bt.conf
sudo vi /etc/modprobe.d/disable_rpi3_wifi_bt.conf

2) add the lines

##wifi
blacklist brcmfmac
blacklist brcmutil
##blue tooth
blacklist btbcm
blacklist hci_uart

3) and reboot

sudo reboot

4) log back in to the pi and confirm wifi is off

ifconfig

setup the firewall

for more info check; https://www.digitalocean.com/community/tutorials/how-to-setup-a-firewall-with-ufw-on-an-ubuntu-and-debian-cloud-server

1) install ufw.. its an easy to use firewall

sudo apt-get install ufw

2) let in local ssh

sudo ufw allow ssh

3) let in local samba (and bonjour)
for more info check: https://ubuntuforums.org/showthread.php?t=806000

sudo ufw allow proto tcp to any port 135 from 192.168.0.0/16
sudo ufw allow proto udp to any port 137 from 192.168.0.0/16
sudo ufw allow proto udp to any port 138 from 192.168.0.0/16
sudo ufw allow proto tcp to any port 139 from 192.168.0.0/16
sudo ufw allow proto tcp to any port 445 from 192.168.0.0/16
sudo ufw allow proto udp to any port 5353 from 192.168.0.0/16

4) and turn it on

sudo ufw enable

5) then retest all the samba acces and git cloning

Setup the hostname to something better

1) adjust the hostname so you know which machine your in replace "raspberrypi" with the name you want

sudo vi /etc/hostname
sudo vi /etc/hosts

2) and reboot

sudo reboot

Clean up

1) clean up the working pc git clones

rm -rf test
rm -rf test2

2) go back to the pi clean out the test and import/setup your real ones

rm test.git

bonus notes..

1) i trialed wake on lan

sudo apt-get install ethtool
ethtool eth0 | grep wake
ethtool -s eth0 wol g

# it responds with bad news

Cannot get wake-on-lan settings: Operation not permitted

Saturday, August 13, 2016

google interview: efficiently find a number in a partial sorted list

// http://www.impactinterview.com/2009/10/140-google-interview-questions/#software_engineer
// You are given a list of numbers.  When you reach the end of the list you
// will come back to the beginning of the list (a circular list).  Write the
// most efficient algorithm to find the minimum # in this list.  Find any given
// # in the list.  The numbers in the list are always increasing but you
// don’t know where the circular list begins, ie: 38, 40, 55, 89, 6, 13, 20,
// 23, 36.

// First of all note very carefully this is for a *unidirectional circular linked 
// list* if we alter the design of the data structure there are much better ways 
// to do this.. but i believe that is out of bounds as a potential answer.

// hmm..  essentially the find min number algorithm is a component of the find 
// any number algorithm..  Since the question isn't written as find the min, 
// then fix the list start and then find any # ill assume that it was meant as 
// a helper step to get to the find algorithm..  so in the interest of getting 
// it done lets just cut to the find any # algorithm and code a few evolutions 
// showing efficiency improvements..

#include <iostream>
#include <vector>
#include <chrono>
#include <functional>

// ****************************************
// *************** CYCLIC LIST ************
// ****************************************

class List
{
    std::vector<int> data_;
    int offset_;

public:
    class Iterator
    {
        List* list_;
        int index_;

    public:
        Iterator() :
            list_(NULL),
            index_(-1)
        {}

        Iterator(List* list) :
            list_(list),
            index_(list->offset_)
        {}

        // WARNING DO NOT USE in find.. that was not spec'ed
        // this is for testing only
        Iterator(List* list, int offset) :
            list_(list),
            index_(offset)
        {}

        Iterator& operator++()
        {
            index_ = (index_ + 1) % list_->data_.size();
            return *this;
        }

        Iterator operator+(int delta)
        {
            int offset = (index_ + delta) % list_->data_.size();
            while (offset < 0) offset += list_->data_.size();

            return Iterator(list_, offset);
        }

        // WARNING DO NOT UE in find.. that was not spec'ed
        // this is for testing only
        Iterator& operator--()
        {
            index_ = (list_->data_.size() + index_ - 1) % list_->data_.size();
            return *this;
        }

        int operator*()
        {
            return list_->data_[index_];
        }

        bool operator==(const Iterator& other) const
        {
            return other.list_ == list_ and
                other.index_ == index_;
        }

        bool operator!=(const Iterator& other) const
        {
            return not (other == *this);
        }

        bool valid() const
        {
            return list_ != NULL;
        }
    };

    List(std::initializer_list<int> data,
         int offset = 0) :
        data_(data),
        offset_(offset)
    {}

    List(int size) :
        data_(size),
        offset_(rand() % size)
    {
        // random fill
        int value = 0;
        for (int i = 0; i < size; ++i)
        {
            value += 2 + rand() % 5;
            data_[i] = value;
        }
    }

    bool empty() const
    {
        return data_.empty();
    }

    Iterator begin()
    {
        return Iterator(this);
    }

    Iterator random()
    {
        return Iterator(this, rand() % data_.size());
    }
};

// ****************************************
// *************** FIND NAIVE *************
// ****************************************

List::Iterator find_naive(List& data, int target)
{
    if (data.empty()) return List::Iterator();

    List::Iterator start = data.begin();
    List::Iterator i = start;

    do
    {
        if (*i == target) return i;
        ++i;
    }
    while (i != start);

    return List::Iterator();
}

// ****************************************
// ************* FIND NAIVE V2 ************
// ****************************************

// Attempt to simply the code a bit so that we handle the cyclic iteration and
// partial sorted issue cleaner for latter scaling

List::Iterator find_naive2(List& data, int target)
{
    if (data.empty()) return  List::Iterator();

    //termination is an issue because start *is* the end ..
    // simplify this by checking the start..
    List::Iterator end = data.begin();
    if (target == *end) return end;

    // then start at the first item after and now "start" is really the end
    List::Iterator i = end + 1;

    while (i != end and *i != target) ++i;
    if (*i == target) return i;

    return  List::Iterator();
}

// Of course you could try a divide and conquer idea..  but your dealing with a
// list that you want to divide in half.  Dividing it it half means you need to
// step the iterator the mid point..  So assuming you know the size of the list
// the total search on a fully sorted list will *always* cost u n/2 + n/4 + n/8
// ...  moves. Which totals to n-1 (ie if n=8 is 4+2+1=7)..  So the performance
// is actually worst on average for the naive version ... of course this might 
// not be true because the cost of iterators might be very low compared to the 
// compare.. but i dont think it is... so im skipping it..

// ****************************************
// ******** SKIP TO SORTED SECTION ********
// ****************************************

// What we can really do is try to take advantage of the sorted order of data
// and terminate early..  In order to do that what you have to do is skip to
// the correct sorted part for your target and then search... (hence the find
// min number stuff before)

// Keep in mind this gets much less robust to problems with the data..  ie
// if the list isnt sort (and parted) before hand..

List::Iterator find_skip_start(List& data, int target)
{
    // handle the empty dataset case..
    if (data.empty()) return  List::Iterator();

    //termination is an issue because start *is* the end ..
    // simplify this by checking the start..
    List::Iterator end = data.begin();
    if (target == *end) return end;

    // then start at the first item after and now "start" is really the end
    List::Iterator i = end + 1;

    // if the target is in the second part
    if (target < *end)
    {
        // skip to the second half
        int split_edge = *end;
        while (*i >= split_edge and i != end) ++i;
    }

    // then search for the target
    while (*i < target and i != end) ++i;

    if (*i == target) return i;
    return  List::Iterator();
}

// ****************************************
// ******* DOUBLE SKIP IN SECTION *********
// ****************************************

// Now because its sorted you dont have to check items every time you move
// forward.  You can step and then roll back if you went to far..  this makes
// some big assumptions about the cost of iterator movement vs integer
// compares.. my "List" class isnt great but it works out.. and in 
// theory should cut out every second compare..

List::Iterator find_double_skip(List& data, int target)
{
    // handle the empty dataset case..
    if (data.empty()) return  List::Iterator();

    //termination is an issue because start *is* the end ..
    // simplify this by checking the start..
    List::Iterator end = data.begin();
    if (target == *end) return end;

    // then start at the first item after and now "start" is really the end
    List::Iterator i  = end + 1;

    if (target < *end)
    {
        int split_edge = *end;

        // double skip to second half
        List::Iterator iprev;
        while (*i >= split_edge and i != end)
        {
            ++i;
            iprev = i;
            if (i != end) ++i;
        }

        // check if we need to back up the double step.
        if (iprev.valid() and *iprev < split_edge) i = iprev;
    }

    {
        // double skip till target found or exceeded
        List::Iterator iprev;
        while (*i < target and i != end)
        {
            ++i;
            iprev = i;
            if (i != end) ++i;
        }

        // check if found the target at either of the pointers..
        if (iprev.valid() and *iprev == target) return iprev;
        if (*i == target) return i;
    }

    return  List::Iterator();
}

// ****************************************
// ********** N SKIP IN SECTION ***********
// ****************************************

// Optimization..  of course if u can do better with steps of 2 then u can
// generalize it to any number of steps..  for example using steps of 20..  now
// we are really playing balance games..  trading the speeds of various
// comparisons, assignments etc..  Then u have to tune the size of skip for
// your machine(s)

List::Iterator find_n_skip(List& data, int target)
{
    const int skip = 20;

    // handle the empty dataset case..
    if (data.empty()) return  List::Iterator();

    //termination is an issue because start *is* the end ..
    // simplify this by checking the start..
    List::Iterator end = data.begin();
    if (target == *end) return end;

    // then start at the first item after and now "start" is really the end
    List::Iterator i  = end + 1;

    if (target < *end)
    {
        List::Iterator iprev = i;
        int split_edge = *end;
        // skip to second half
        while (*i >= split_edge and i != end)
        {
            ++i;
            iprev = i;
            int steps = skip;
            while (--steps > 0 and i != end) ++i;
        }

        // when found return to the last known ok point and search forward 1 by 1
        i = iprev;
        while (*i >= split_edge and i != end) ++i;
    }

    {
        List::Iterator iprev = i;
        while (*i < target and i != end)
        {
            ++i;
            iprev = i;
            int steps = skip;
            while (--steps > 0 and i != end) ++i;
        }

        // when found return to the last known ok point and search forward 1 by 1
        i = iprev;
        while (*i < target and i != end) ++i;

        if (*i == target) return i;
    }

    return  List::Iterator();
}

void test(std::string dataTag,
          std::string algoTag,
          std::function<List::Iterator (List& , int)> func,
          List& data,
          int target,
          bool hasTarget)
{
    auto start = std::chrono::system_clock::now();
    List::Iterator result = func(data, target);
    auto end   = std::chrono::system_clock::now();

    std::cout << algoTag << " for " << dataTag << ": ";
    if (not result.valid())
        std::cout << (hasTarget ? "FAIL" : "pass");
    else
        std::cout << ((hasTarget and (target == *result)) ? "pass" : "FAIL");

    std::cout << " -- time:"
              << std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count()
              << " nSec" << std::endl;
}

void testSet(std::string tag,
             List& data,
             int target,
             bool hasTarget)
{
    test(tag, "naive ",  find_naive,       data, target, hasTarget);
    test(tag, "naive2",  find_naive2,      data, target, hasTarget);
    test(tag, "skip  ",  find_skip_start,  data, target, hasTarget);
    test(tag, "2xskip",  find_double_skip, data, target, hasTarget);
    test(tag, "nskip ",  find_n_skip,      data, target, hasTarget);
    std::cout << "\n";
}

int main()
{
    srand(std::chrono::system_clock::now().time_since_epoch().count());

    {
        List list({10,11,12,1,2,3,4,5,6,7,8});
        testSet("small     ", list, 4, true);
    }

    {
        List list(1000000);
        testSet("big rand  ", list, *list.random(), true);
        testSet("big nrand ", list, *list.random()-1, false);  // note that  random always is +2 ~ +7 so -1 means i will not be in the data..

        testSet("big early ", list, *(list.begin() + 10), true);    // i hope.. its random so it might not be..
        testSet("big nearly", list, *(list.begin() + 10)-1, false); // check early exit in first half on fail

        testSet("big last  ", list, *(--list.begin()), true);
        testSet("big nlast ", list, -1, false);
    }
}

naive  for small     : pass -- time:3207 nSec
naive2 for small     : pass -- time:2175 nSec
skip   for small     : pass -- time:1996 nSec
2xskip for small     : pass -- time:2151 nSec
nskip  for small     : pass -- time:4232 nSec

naive  for big rand  : pass -- time:14229998 nSec
naive2 for big rand  : pass -- time:13228995 nSec
skip   for big rand  : pass -- time:12786059 nSec
2xskip for big rand  : pass -- time:10874261 nSec
nskip  for big rand  : pass -- time:9004453 nSec

naive  for big nrand : pass -- time:43750054 nSec
naive2 for big nrand : pass -- time:37382249 nSec
skip   for big nrand : pass -- time:17939681 nSec
2xskip for big nrand : pass -- time:15509062 nSec
nskip  for big nrand : pass -- time:12927864 nSec

naive  for big early : pass -- time:738 nSec
naive2 for big early : pass -- time:611 nSec
skip   for big early : pass -- time:632 nSec
2xskip for big early : pass -- time:708 nSec
nskip  for big early : pass -- time:1064 nSec

naive  for big nearly: pass -- time:28980196 nSec
naive2 for big nearly: pass -- time:27551947 nSec
skip   for big nearly: pass -- time:1034 nSec
2xskip for big nearly: pass -- time:719 nSec
nskip  for big nearly: pass -- time:1217 nSec

naive  for big last  : pass -- time:27341796 nSec
naive2 for big last  : pass -- time:27780499 nSec
skip   for big last  : pass -- time:28165415 nSec
2xskip for big last  : pass -- time:24593599 nSec
nskip  for big last  : pass -- time:22455148 nSec

naive  for big nlast : pass -- time:27195820 nSec
naive2 for big nlast : pass -- time:26584976 nSec
skip   for big nlast : pass -- time:8640245 nSec
2xskip for big nlast : pass -- time:7304119 nSec
nskip  for big nlast : pass -- time:6601818 nSec

Tuesday, August 9, 2016

google interview: measure the speed of a context switch

// http://www.impactinterview.com/2009/10/140-google-interview-questions/#software_engineer
// Write a C program which measures the the speed of a context switch on a
// UNIX/Linux system.

// wow.. just wow... the questioner hates you.. so moving on...

// A context switch occurs when the kernel or hardware thinks its time to do
// something else...  you have no control of it..  *period*...  the only true
// way to measure it is to get into the kernel remove *everything* and check it
// directly..  without running any other programs and device drivers etc, i
// mean hell painting the screen, that blinking cursor your looking at while
// typing thats *all* a context switch.

// so write code to measure this.. well.. i dont know the kernel so thats out..
// so i dont have the skills to do the *real* job.. the best i can do is guess..

// basically i can measure the whole process: program executing, ctx switch, do
// something else, ctx switch, complete execution

// so we are going to try to measure something that should have very little 
// variation and should scale well..  ie multiply and accumulate a register.  
// we will do the ops few times and profile them. Then change the total number 
// of times we do it and reprofile..  of course smaller context switches may 
// be getting buried in the testing method etc..  not to mention the timers
// effects etc etc

// what we are looking for is a statistical blip...  the mult accum should just
// shift right when we scale it..  anything that stays still isnt part of our
// test..  its likely to be some part of the kernels context switching or device
// drivers

// Anyway results looked as expected..  the test execution time moved to the
// right when the number of cycles was increased..  there is a set of a large
// reoccurring spikes in the execution times..  but i doubt these are the cost
// of a context switch, more likely some device driver, service, interrupt
// handlers bumping into my execution etc


#include <iostream>
#include <chrono>
#include <map>

int main()
{
    std::map<int,int> counts;
    for (int k = 0; k < 100; ++k)
    {
        std::cout << ".";
        for (int j = 0; j < 1000000; ++j)
        {
            auto start = std::chrono::system_clock::now();
            for (int i = 0; i < 100000; ++i)
            {
                i += i*i;
            }
            auto end = std::chrono::system_clock::now();

            int delta = std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();

            // sigh.. crud but will work for now
            if (counts.find(delta) != counts.end())
                ++(counts[delta]);
            else
                counts[delta] = 1;
        }
    }
    std::cout << "\n";

    for (std::map<int,int>::iterator cit = counts.begin();
         cit != counts.end();
         ++cit)
    {
        std::cout << " " << cit->first << ":" << cit->second << "\n";
    }
}

The results copied out into open office and graphed etc. Note that the "fuzz" in the red section maybe significant also.. but im not really a maths guy so I cant be certain, and my use of stats here is really flimsy so grain of salt people... Update: i just noticed that i didnt align the graphs very well(this wasn't intentional) anyway the mistake is a bit disadvantageous to the results so ill leave it be.

Saturday, August 6, 2016

google interview: detect a cycle in a graph

// detect a cycle in a directed graph

#include <iostream>
#include <vector>
#include <unordered_set>
#include <unordered_map>
#include <memory>
#include <algorithm>
#include <functional>
#include <chrono>

struct Node
{
    static int gID;
    int id_;

    std::vector<Node*> kids_;

    Node() : id_(++gID) {}
    void link(Node& kid) { kids_.push_back(&kid); }
};

int Node::gID = 0;

// *************************************************
// *************** PRE BOARD WORK  *****************
// *************************************************
//
//  Basic loop with Node "R" going into unknown "Graph"
//                  +---+    << check for self link node in path
//    ~~~~~~~~~     |   |
//   (         ) <--R<--+
//    ) Graph (     ^
//   (         ) -?-+
//    ~~~~~~~~~
//
// Recursive case for "A" a child Node of "R"
//                  +---+---+    << check for prior node in path
//    ~~~~~~~~~     |   v   |
//   (         ) <--A<--R<--+
//    ) Graph (     ^   ^
//   (         ) -?-+-?-+
//    ~~~~~~~~~
//
// Recursive case for "B" with child Node "A"
//                  +---+---+--+    << check for *ANY* prior node in path
//    ~~~~~~~~~     |   v   v  |
//   (         ) <--B<--A<--R<-+
//    ) Graph (     ^   ^   ^
//   (         ) -?-+-?-+-?-+
//    ~~~~~~~~~
//
// etc
//
// Therefore first algo is to DFS visit all nodes and check if current nodes
// kids loop back into the path of nodes before the current one.

// *************************************************
// ***************** BASIC RECURSE *****************
// *************************************************

bool hasCycle_recurseImpl(Node* v,
                          std::vector<Node*>& path)
{
    // Mark the current node as part of pathway
    path.push_back(v);

    // then for all kids
    for(Node* i : v->kids_)
    {
        // if the kid is path of the path you have a loop done
        std::vector<Node*>::iterator pit =
            std::find(path.begin(), path.end(), i);

        if (pit != path.end())
            return true;

        // recurse into kid
        if (hasCycle_recurseImpl(i, path) )
            return true;
    }

    // remove the node from path
    path.pop_back();
    return false;
}

bool hasCycle_recurse(Node* root)
{
    std::vector<Node*> path;

    if (root == NULL) return false;

    return hasCycle_recurseImpl(root, path);
}


// *************************************************
// ***************** UNORDERED PATH ****************
// *************************************************

// optimization of path - ordering is of the path is *not* important.  As a
// result we can choose a data structure with O(1) search, insert and delete ie
// a hash

bool hasCycle_unordPathImpl(Node* v,
                            std::unordered_set<Node*>& path)
{
    // Mark the current node as part of pathway
    path.insert(v);

    // then for all kids
    for(Node* i : v->kids_)
    {
        // if the kid is path of the path you have a loop done
        if (path.find(i) != path.end())
            return true;

        // recurse into kid
        if (hasCycle_unordPathImpl(i, path) )
            return true;
    }

    // remove the node from path
    path.erase(v);
    return false;
}

bool hasCycle_unordPath(Node* root)
{
    std::unordered_set<Node*> path;

    if (root == NULL) return false;

    return hasCycle_unordPathImpl(root, path);
}

// *************************************************
// ****************** NO REVISIT  ******************
// *************************************************

// optimization of visiting - never revisit a node.  The logic behind this is
// easy but it took me a bit to actually see it.
//
// If you visit a node then the DFS will terminate on 2 possible conditions.
// 1) you found a cycle,
// 2) you exhausted all possible nodes below this one and didn't end up in a
// cycle.
//
// Now here is what took me a bit to get..  If you ever enter a path that forms
// a cycle you have no possible way to exit unless you found end of the cycle
// that means that you can never see a visited node unless it is about to start 
// a cycle or has been exhausted and will *never* form a cycle even if we add to
// the current path. IE if it could have formed a cycle with the current node 
// then it should have traversed into the current node already before it was
// exhausted.

bool hasCycle_1visitImpl(Node* v,
                         std::unordered_set<Node*>& visited,
                         std::unordered_set<Node*>& path)
{
    if(visited.find(v) == visited.end())
    {
        // Mark the current node as visited and part of pathway
        visited.insert(v);
        path.insert(v);

        // then for all kids
        for(Node* i : v->kids_)
        {
            // if the kid is path of the path you have a loop done
            if (path.find(i) != path.end())
                return true;

            // recurse into kid
            if (hasCycle_1visitImpl(i, visited, path) )
                return true;
        }

        // remove the node from path
        path.erase(v);
    }
    return false;
}

bool hasCycle_1visit(Node* root)
{
    std::unordered_set<Node*> visited;
    std::unordered_set<Node*> path;

    if (root == NULL) return false;

    return hasCycle_1visitImpl(root, visited, path);
}

// *************************************************
// *** CALL STACK, PATH and VISIT DUPLICATION  *****
// *************************************************

// call stack, path hash and visit hash optimization -
// the recursive call stack is basically the same as the path stack and the path
// stack is the same as the visit stack all 3 data structures can be made into
// one..  using an iterative approach that stores the parent node to remove call 
// stack, and a stateful hash entry to double as visit and child idx marking which 
// kid is currently in use when in the path

bool hasCycle_iterative(Node* root)
{
    typedef std::pair<Node*,int>  ParentKidIdx;
    std::unordered_map<Node*, ParentKidIdx> path;

    if (root == NULL) return false;

    // note the oddity with the new nodes current kids we start with the end kid
    // this way it naturally ends in a negative idx when kids are exhusted
    path[root] = ParentKidIdx(NULL, root->kids_.size());
    Node* current  = root;

    while (current != NULL)
    {
        std::unordered_map<Node*, ParentKidIdx>::iterator pit
            = path.find(current);

        if (pit == path.end())
            std::cout << "IMPOSSIBLE! we have a node without a state\n";

        // just for easy of undertanding std::pair's member names suck..
        // and as u can see this is a bit confusing
        Node*  parent = pit->second.first;
        int&   kidIdx = pit->second.second;  // note its *writable*

        // find the next kid that that current node should work on ..
        Node* newKid = NULL;
        while (kidIdx >= 0 and newKid == NULL)
        {
            // try to get the next kid
            --kidIdx;
            if (kidIdx >= 0)
            {
                // extract the kid
                newKid = current->kids_[kidIdx];

                // check if the kid has a state already
                std::unordered_map<Node*, ParentKidIdx>::iterator kit
                    = path.find(newKid);


                if (kit != path.end())
                {
                    // child is has aleady been visited but whats its current
                    // state at??
                    if (kit->second.second >= 0)
                        // it is activite and in the current path..so this is a
                        // cycle
                        return true;

                    // it was visited and exhusted..  waste of time move to
                    // next one
                    newKid = NULL;
                }
                else
                {
                    // kid has never been visied before make its init state
                    // entry and then visited it
                    path[newKid] = ParentKidIdx(current,
                                                newKid->kids_.size());
                }
            }
        }

        if (newKid != NULL)
        {
            // we found a new kid to explore.. jump in to it
            current = newKid;
        }
        else
        {
            // we have exhusted all the current ones kids..  move up to the
            // parent
            current = parent;
        }
    }
    return false;
}


// *************************************************
// ******************* TEST ************************
// *************************************************

void testCase(std::string tag,
              std::function<bool(Node*)> hasCycle,
              Node* root,
              bool expected)
{
    auto start = std::chrono::system_clock::now();
    bool result = hasCycle(root);
    auto end   = std::chrono::system_clock::now();

    std::cout << tag << ": " << (expected == result ? "pass" : "FAIL")
              << " -- time:"
              << std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count()
              << " nSec" << std::endl;
}

void testCaseSet(std::string set,
                 Node* root,
                 bool expected)
{
    std::cout << "\n *** " << set << " ***\n";
    testCase("hasCycle_recurse", hasCycle_recurse,       root, expected);
    testCase("hasCycle_unord  ", hasCycle_unordPath,     root, expected);
    testCase("hasCycle_1visit ", hasCycle_1visit,        root, expected);
    testCase("hasCycle_iter   ", hasCycle_iterative,     root, expected);
}

void test()
{
    {
        testCaseSet("null",
                    NULL, false);
    }

    {
        Node a0;
        testCaseSet("single node",
                    &a0, false);
    }

    {
        Node a0;
        a0.link(a0); // cycle
        testCaseSet("single node cycle",
                    &a0, true);
    }

    {
        Node a0;
        Node a1;
        a0.link(a1);
        testCaseSet("two node",
                    &a0, false);
    }

    {
        Node a0;
        Node a1;

        a0.link(a1);
        a0.link(a1);
        testCaseSet("two node - double link",
                    &a0, false);
    }

    {
        Node a0;
        Node a1;

        a0.link(a1);
        a1.link(a0); // cycle
        testCaseSet("two node cycle",
                    &a0, true);
    }

    {
        // visit breaker
        Node a0;
        Node a1;
        Node a2;

        a0.link(a2);
        a0.link(a1);
        a1.link(a2);

        testCaseSet("triple node double end link - reversed",
                    &a0, false);
    }

    {
        // visit breaker
        Node a0;
        Node a1;
        Node a2;

        a0.link(a1);
        a0.link(a2);
        a1.link(a2);

        testCaseSet("triple node double end link",
                    &a0, false);
    }

    {
        Node a0;
        Node a1;
        Node a2;
        Node a3;

        a0.link(a2);
        a0.link(a1);
        a1.link(a2);
        a2.link(a0); // cycle..
        a2.link(a3);
        a3.link(a3); // cycle

        testCaseSet("some random tree",
                    &a0, true);
    }

    {
        // graph speed breaker wide first node with repeated long tail..
        Node head;
        Node chest;
        std::vector<Node> shoulders(100);
        for (int i = 0; i < shoulders.size(); ++i)
        {
            head.link(shoulders[i]);
            shoulders[i].link(chest);
        }

        std::vector<Node> tail(100);
        chest.link(tail[0]);
        for (int i = 1; i < tail.size(); ++i)
        {
            tail[i-1].link(tail[i]);
        }

        testCaseSet("head shoulder tail",
                    &head, false);

        Node looper;
        tail[tail.size()-1].link(looper);
        looper.link(head);

        testCaseSet("head shoulder tail - looped",
                    &head, true);
    }
    std::cout << std::endl;
}

int main()
{
    test();
}

*** null ***
hasCycle_recurse: pass -- time:2574 nSec
hasCycle_unord  : pass -- time:13314 nSec
hasCycle_1visit : pass -- time:4531 nSec
hasCycle_iter   : pass -- time:4097 nSec

 *** single node ***
hasCycle_recurse: pass -- time:4975 nSec
hasCycle_unord  : pass -- time:12330 nSec
hasCycle_1visit : pass -- time:9873 nSec
hasCycle_iter   : pass -- time:9279 nSec

 *** single node cycle ***
hasCycle_recurse: pass -- time:4614 nSec
hasCycle_unord  : pass -- time:8321 nSec
hasCycle_1visit : pass -- time:9863 nSec
hasCycle_iter   : pass -- time:8335 nSec

 *** two node ***
hasCycle_recurse: pass -- time:19714 nSec
hasCycle_unord  : pass -- time:9582 nSec
hasCycle_1visit : pass -- time:13318 nSec
hasCycle_iter   : pass -- time:11426 nSec

 *** two node - double link ***
hasCycle_recurse: pass -- time:5507 nSec
hasCycle_unord  : pass -- time:11714 nSec
hasCycle_1visit : pass -- time:13890 nSec
hasCycle_iter   : pass -- time:11678 nSec

 *** two node cycle ***
hasCycle_recurse: pass -- time:5299 nSec
hasCycle_unord  : pass -- time:7683 nSec
hasCycle_1visit : pass -- time:12348 nSec
hasCycle_iter   : pass -- time:9779 nSec

 *** triple node double end link - reversed ***
hasCycle_recurse: pass -- time:7694 nSec
hasCycle_unord  : pass -- time:13821 nSec
hasCycle_1visit : pass -- time:18194 nSec
hasCycle_iter   : pass -- time:12800 nSec

 *** triple node double end link ***
hasCycle_recurse: pass -- time:7322 nSec
hasCycle_unord  : pass -- time:13999 nSec
hasCycle_1visit : pass -- time:16998 nSec
hasCycle_iter   : pass -- time:12793 nSec

 *** some random tree ***
hasCycle_recurse: pass -- time:4840 nSec
hasCycle_unord  : pass -- time:7792 nSec
hasCycle_1visit : pass -- time:11853 nSec
hasCycle_iter   : pass -- time:15752 nSec

 *** head shoulder tail ***
hasCycle_recurse: pass -- time:18295450 nSec
hasCycle_unord  : pass -- time:14848133 nSec
hasCycle_1visit : pass -- time:615861 nSec
hasCycle_iter   : pass -- time:512595 nSec

 *** head shoulder tail - looped ***
hasCycle_recurse: pass -- time:131233 nSec
hasCycle_unord  : pass -- time:136192 nSec
hasCycle_1visit : pass -- time:242850 nSec
hasCycle_iter   : pass -- time:176559 nSec

Wednesday, August 3, 2016

Google interview: Implement an AVL tree ...

Seems to me that google is most likely to ask about Tree inserts over erase. Also they might ask for traversal, infix printing etc etc... basically the cheapest way to do this is just code the whole AVL tree eco system period...

AVL trees look complex but they basically are a BST tree with an additional pair of rebalancing rotations that you apply to them after an insert or erase.. this makes them somewhat close to the theoretical log(n) depth of a well balanced BST tree...

There are 2 rotate operations that are depicted as follows. they are reversed depending of the side that are viewing them from (some people call them 4 with i think just over complicates them), personally i know them as inside and outside rotates, you can tell by looking at the position of the nodes grand child.. if its under the grandparent its inside, otherwise its outside.

a) grandchild-outside-child (left side)
   T1, T2, T3 and T4 are subtrees. 
          z                            y 
         / \                         /   \
        y   T4  Right Rotate (z)    x      z
       / \      --------------->  /  \    /  \ 
      x   T3                     T1  T2  T3  T4
     / \
   T1   T2

   b) grandchild-inside-child (left-side)    
        z                            z                            x
       / \                         /   \                        /   \ 
      y   T4  Left Rotate (y)     x    T4   Right Rotate(z)   y      z
     / \      -------------->    /  \       -------------->  / \    / \
   T1   x                       y    T3                     T1  T2 T3  T4
       / \                     / \ 
     T2   T3                 T1   T2

The trick to knowing when to perform the rebalance ops is to check the height balance between the nodes left and right side of the current one. If the height balance is off by more than +/- 1 you need to rotate, the sign tells you the direction. Then to determine if its the inner or outer case you check the balance of the child node.. as you can see from the diagram above the outer case should already have an inbalance in the *same* direction(sign) as the parents inbalance if it does not then you have the inner case and need to adjust it to make the other case before doing the rotate to correct..

#include <stdint.h>
#include <vector>
#include <algorithm>
#include <iostream>
#include <functional>

template <typename T>
class AvlTree
{
public:
    // late binding via constructor - dont pollute type
    typedef std::function<bool(const T&, const T&)> Comparer;

private:
    struct Node
    {
        uint32_t height_;
        Node*    left_;
        Node*    right_;
        T        item_;

        Node(const T& item) :
            height_(1),
            left_(NULL),
            right_(NULL),
            item_(item)
        {}

        ~Node()
        {
            if (left_  != NULL) delete left_;
            if (right_ != NULL) delete right_;
        }

        int height() const
        {
            int leftHeight  = (left_  != NULL) ? left_->height_  : 0;
            int rightHeight = (right_ != NULL) ? right_->height_ : 0;

            return std::max(leftHeight, rightHeight) + 1;
        }

        int balance() const
        {
            int leftHeight  = (left_  != NULL) ? left_->height_  : 0;
            int rightHeight = (right_ != NULL) ? right_->height_ : 0;

            return leftHeight - rightHeight;
        }

    };
public:
    class Iterator
    {
        // infix DFS reshaped into stack for an iterator.. fun but crazy...
        //
        // Ok..  So i got creative here..  rather than use parent pointers in the
        // tree I implemented the iterator as a DFS parent stack..  now you do
        // *not* want to do this in production because its actually a lot more
        // fragile than the parent pointer solution..
        //
        // consider the following trade offs
        // - the find operation needs to create the stack which is costly and
        //  iterator returned from the find will often *not* move making it
        //  needless
        // - the iterator is less robust to tree updates.  with a parent
        //  pointer the iterator just points at one node so changing anything that
        //  is not the current node is generally ok..  However with the stack
        //  implementation any update that effects the parent chain breaks the
        //  iterators stack and makes a large issue

        // TODO can a bi-directional version be made??...

        typedef std::vector<Node*> Stack;

        Stack stack_;

        void enstackLeft(Node* node);
        void next();

    public:
        Iterator(Node* root);   // not publicily usable because Node is not public!
        Iterator();
        Iterator& operator++();
        void      operator++(int);
        const T&  operator*() const;
        const T*  operator->() const;

        T&  operator*() ;
        T*  operator->();

        int height() const; // yep mostly for testing

        bool  operator==(const Iterator& other) const;
        bool  operator!=(const Iterator& other) const;
    };

    class Handle
    {
        // a light weight *immovable* iterator .. so *not* an iterator
        // TODO it *is* possible to lazy covert to a real iterator if ++'ed
        //  but whatever the stacked based iterator is crazy
        Node* node_;
    public:
        Handle() :
            node_(NULL)
        {}

        Handle(Node* node) :
            node_(node)
        {}

        const T& operator*()  const { return node_->item_; }
              T& operator*()        { return node_->item_; }

        const T* operator->() const { return &(node_->item_); }
              T* operator->()       { return &(node_->item_); }

        operator bool () const      { return node_ != NULL; }
    };

private:
    Node*   root_;
    Comparer cmp_;

    // TODO compilation firewalling issues...
    //  ie can this be converted to a base Node without T and a factory to
    //  create and handle Nodes with T.

    void rotateLeft (Node*& node);
    void rotateRight(Node*& node);
    void rebalance  (Node*& node);

    void insertImpl (const T& item,
                     Node*& node);

    void replacementEraseRight(Node*& node,
                               Node*& other);

    void eraseImpl(const T& item,
                   Node*& node);

public:
    AvlTree() :
        root_(NULL),
        cmp_(std::less<T>())
    {}

    AvlTree(Comparer cmp) :
        root_(NULL),
        cmp_(cmp)
    {}

    ~AvlTree()
    {
        if (root_ != NULL) delete root_;
    }

    void insert(const T& item);
    void erase(const T& item);
    Handle find(const T& item);
    void print(std::ostream& os);

    Iterator begin();
    Iterator end();
};

// *************************************************
// ******************** ITERATOR *******************
// *************************************************

template <typename T>
AvlTree<T>::Iterator::Iterator() :
    stack_()
{}

template <typename T>
AvlTree<T>::Iterator::Iterator(AvlTree<T>::Node* root) :
    stack_()
{
    //when we add new nodes we go down the left side and enstack then
    enstackLeft(root);
}

template <typename T>
void AvlTree<T>::Iterator::enstackLeft(typename AvlTree<T>::Node* node)
{
    while (node != NULL)
    {
        stack_.push_back(node);
        node = node->left_;
    }
}

template <typename T>
void AvlTree<T>::Iterator::next()
{
    // node has just completed left side and self visit.. remove me from the stack
    Node* current = stack_.back();
    stack_.pop_back();

    // and enstack add my right branch
    enstackLeft(current->right_);
}

template <typename T>
const T& AvlTree<T>::Iterator::operator*() const
{
    return stack_.back()->item_;
}

template <typename T>
T& AvlTree<T>::Iterator::operator*()
{
    return stack_.back()->item_;
}

template <typename T>
const T* AvlTree<T>::Iterator::operator->() const
{
    return &(stack_.back()->item_);
}

template <typename T>
T* AvlTree<T>::Iterator::operator->()
{
    return &(stack_.back()->item_);
}

template <typename T>
int AvlTree<T>::Iterator::height() const
{
    return stack_.back()->height_;
}


template <typename T>
typename AvlTree<T>::Iterator& AvlTree<T>::Iterator::operator++()
{
    next();
    return *this;
}

// NOTE void on x++ to termiate the post iterator junk
template <typename T>
void AvlTree<T>::Iterator::operator++(int)
{
    // Node* prev = stack_.back();
    next();
    // return ...;
}

template <typename T>
bool AvlTree<T>::Iterator::operator==(const AvlTree<T>::Iterator& other) const
{
    // TODO short cut the last item?? would be quicker
    return other.stack_ == stack_;
}

template <typename T>
bool AvlTree<T>::Iterator::operator!=(const AvlTree<T>::Iterator& other) const
{
    // TODO short cut the last item?? would be quicker
    return not (other == *this);
}

template <typename T>
typename AvlTree<T>::Iterator AvlTree<T>::begin()
{
    return Iterator(root_);
}

template <typename T>
typename AvlTree<T>::Iterator AvlTree<T>::end()
{
    return Iterator();
}

// *************************************************
// ********************** FIND *********************
// *************************************************

// this is a bit of a problem using stack based iterators makes find
// more weighty using nodes exposes the internals
// lightwieght_iterator?? one that cant move but gives you access.. a "handle"
template <typename T>
typename AvlTree<T>::Handle AvlTree<T>::find(const T& item)
{
    Node* node = root_;
    while (node != NULL)
    {
        if      (cmp_(item, node->item_))   // less *than*
        {
            node = node->left_;
        }
        else if (cmp_(node->item_, item))   // greater *than*
        {
            node = node->right_;
        }
        else
        {
            return typename AvlTree<T>::Handle(node);
        }
    }
    return typename AvlTree<T>::Handle();
}

// *************************************************
// ********************** PRINT ********************
// *************************************************

template <typename T>
void AvlTree<T>::print(std::ostream& os)
{
    for (AvlTree<uint32_t>::Iterator it = begin();
         it != end();
         ++it)
    {
        os << *it << "(" << it.height()  << ")" << " ";
    }
}

template <typename T>
std::ostream& operator<<(std::ostream& os, AvlTree<T>& tree)
{
    tree.print(os);
    return os;
}

// *************************************************
// ******************** REBALANCE ******************
// *************************************************

template <typename T>
void AvlTree<T>::rotateLeft(typename AvlTree<T>::Node*& node)
{
    // given the current node make its left the top
    Node* right  = node->right_;
    if (right == NULL) return;  // this will explode.. stop!

    node->right_ = right->left_;
    right->left_  = node;

    // correct height.. *before* adjusting parents link (node)
    node->height_  = node->height();
    right->height_ = right->height();

    // keep in mind that this is using Node*& so the following updates the
    // nodes parent left/right or the root!
    node = right;
}

template <typename T>
void AvlTree<T>::rotateRight(typename AvlTree<T>::Node*& node)
{
    // given the current node make its left the top
    Node* left  = node->left_;
    if (left == NULL) return; // this will explode.. stop!

    node->left_ = left->right_;
    left->right_ = node;

    // correct height,,, *before* adjusting parents link (node)
    node->height_ = node->height();
    left->height_ = left->height();

    // keep in mind that this is using Node*& so the following updates the
    // nodes parent left/right or the root!
    node = left;
}

// attempt to rebalance in a general way for both insert delete..
template <typename T>
void AvlTree<T>::rebalance(typename AvlTree<T>::Node*& node)
{
    if (node == NULL) return;

    // correct node height..
    node->height_ = node->height();

    // and compute AVL rotations
    int balance = node->balance();

    // done if balance is good...
    if (balance > -2 and balance < 2)  return;

    Node* child = NULL;
    if (balance < -1)
    {
        // right side is heavy
        child = node->right_;
    }
    else if (balance > 1)
    {
        // left side is heavy
        child = node->left_;
    }

    int childBalance = child->balance();

    // std::cout << "   ++ childBalance: " << childBalance << "\n";

    // now rotate to fix inbalance
    if (balance > 0)
    {
        // planning to rotate right..
        if (childBalance <= 0)
            //but if the inner child left heavy or balanced correct it
            rotateLeft(node->left_);

        // left side so rotate right
        rotateRight(node);
    }
    else
    {
        // planning to rotate left..
        if (childBalance >= 0)
            //but if the inner child right heavy or balanced correct it
            rotateRight(node->right_);

        rotateLeft(node);
    }
}

// *************************************************
// ********************* INSERT ********************
// *************************************************

template <typename T>
void AvlTree<T>::insertImpl(const T& item,
                            typename AvlTree<T>::Node*& node)
{
    // BST tree insert
    if (node == NULL)
    {
        node = new Node(item);
        return;
    }

    if (cmp_(item, node->item_))
    {
        insertImpl(item, node->left_);
    }
    else
    {
        insertImpl(item, node->right_);
    }

    // now AVL tree rebalance..
    rebalance(node);
}

template <typename T>
void AvlTree<T>::insert(const T& item)
{
    insertImpl(item, root_);
}

// *************************************************
// ********************* ERASE *********************
// *************************************************

template <typename T>
void AvlTree<T>::replacementEraseRight(typename AvlTree<T>::Node*& node,
                                       typename AvlTree<T>::Node*& other)
{
    // we are no longer searching for the erase target..
    // we are now searching for the next object after it
    // the right side min

    if (other->left_ != NULL)
    {
        // go right to the end of the left branching and do replacement erase
        replacementEraseRight(node,other->left_);

        // then rebalance the tree from here up
        rebalance(other);
    }
    else
    {
        // swap replacement with target node so nothing odd happens
        std::swap(node->item_, other->item_);

        // now unlink and erase other from tree
        Node* right = other->right_;
        other->right_ = NULL;
        delete other;

        // relink others child in its place
        other = right;
    }
}

template <typename T>
void AvlTree<T>::eraseImpl(const T& item,
                           typename AvlTree<T>::Node*& node)
{
    // BST tree erase
    // didnt exist in the tree??
    if (node == NULL) return;

    //
    if (cmp_(item, node->item_))          // a < b
    {
        eraseImpl(item, node->left_);
    }
    else if (cmp_(node->item_, item))     // b < a
    {
        eraseImpl(item, node->right_);
    }
    else                                  // b == a
    {
        // match!
        if ((node->right_ == NULL) or (node->right_ == NULL))
        {
            // non special case.. zero/single link can link with parent
            Node* child = (node->right_ == NULL) ? node->left_ : node->right_;

            // unlink node.. and kill it
            node->right_ = NULL;
            node->left_ = NULL;
            delete node;

            // replace with child
            node = child;
        }
        else
        {
            // special case both right and left have stuff.. but we have 1 parent
            // means we have to take a one of the closet (in value) nodes and replace this one
            // we have 2 choices for the replacement the max of left or min of right...
            replacementEraseRight(node, node->right_);

            // TODO.. shouldn't this do a balance check and then go left or right based on the heavy branch??
        }
    }

    // and then AVL tree rebalance..
    rebalance(node);
}

template <typename T>
void AvlTree<T>::erase(const T& item)
{
    eraseImpl(item, root_);
}

// *************************************************
// ********************* AVL MAP *******************
// *************************************************

// TODO cleanup AvlTree to AvlMap coverison class
template <typename Key,
          typename Value>
class AvlMap
{
    typedef std::pair<Key,Value> Pair;
    typedef std::function<bool(const Pair&, const Pair&)> Comparer;
    typedef AvlTree<Pair> Tree;

    AvlTree<Pair> tree_;
public:
    typedef typename Tree::Iterator Iterator;

    AvlMap() :
        tree_([] (const Pair& v1, const Pair& v2) { return v1.first < v2.first; })
    {}

    Value& operator[](const Key& key)
    {
        // hmm not such a great implementation...
        Pair target(key,Value());
        typename AvlTree<Pair>::Handle it = tree_.find(target);
        if (it) return it->second;

        tree_.insert(target);
        return tree_.find(target)->second;
    }

    Iterator begin() { return tree_.begin(); }
    Iterator end()   { return tree_.end();   }
};

// *************************************************
// *********************** TEST ********************
// *************************************************

void test()
{
    // TODO honestly this inst real testing.. need to make it check the results..
    {
        AvlTree<uint32_t> tree;

        tree.begin();
        tree.end();

        tree.find(1);
    }

    {
        AvlTree<uint32_t> tree;

        tree.insert(50); std::cout << tree << "\n";
        tree.insert(25); std::cout << tree << "\n";
        tree.insert(10); std::cout << tree << "\n";
        tree.insert(40); std::cout << tree << "\n";
        tree.insert(30); std::cout << tree << "\n";
        tree.insert(45); std::cout << tree << "\n";
        tree.insert(75); std::cout << tree << "\n";

        AvlTree<uint32_t>::Handle handle = tree.find(75);
        std::cout << "find:" << *handle << "\n";

        tree.erase(50); std::cout << tree << "\n";
        tree.erase(25); std::cout << tree << "\n";
        tree.erase(10); std::cout << tree << "\n";
        tree.erase(40); std::cout << tree << "\n";
        tree.erase(30); std::cout << tree << "\n";
        tree.erase(45); std::cout << tree << "\n";
        tree.erase(75); std::cout << tree << "\n";
    }

    // outer rotate check
    {
        AvlTree<uint32_t> tree;

        tree.insert(15); std::cout << tree << "\n";
        tree.insert(77); std::cout << tree << "\n";
        tree.insert(35); std::cout << tree << "\n";

        // pair erase check
        tree.erase( 1); std::cout << tree << "\n";
        tree.erase(35); std::cout << tree << "\n";
        tree.erase(15); std::cout << tree << "\n";
        tree.erase(77); std::cout << tree << "\n";
        tree.erase( 1); std::cout << tree << "\n";
    }

    {
        AvlTree<uint32_t> tree;

        tree.insert(93); std::cout << tree << "\n";
        tree.insert(86); std::cout << tree << "\n";
        tree.insert(87); std::cout << tree << "\n";

        // pair erase check
        tree.erase(99); std::cout << tree << "\n";
        tree.erase(86); std::cout << tree << "\n";
        tree.erase(87); std::cout << tree << "\n";
        tree.erase(93); std::cout << tree << "\n";
        tree.erase(99); std::cout << tree << "\n";
    }
}

void randomTreeTest()
{
    AvlTree<uint32_t> tree;

    for (int i = 0; i < (4096-1); ++i)
    {
        uint32_t value = std::rand() % 100;
        // std::cout << value << " ";
        tree.insert(value);
    }
    std::cout << "\n";

    // frequency count the nodes at each layer..
    // thought it was ironic to use AvlTree as the map also.. just to prove that its flexible
    AvlMap<int,int> freq;
    for (AvlTree<uint32_t>::Iterator it = tree.begin();
         it != tree.end();
         ++it)
    {
        ++(freq[it.height()]);
    }
    std::cout << "\n";

    // dump it
    for (AvlMap<int,int>::Iterator it = freq.begin();
         it != freq.end();
         ++it)
    {
        std::cout << it->first << ":" << it->second  << " ";
    }
    std::cout << "\n";
}


int main()
{
    test();

    randomTreeTest();
}

And the output looks like:

50(1) 
25(1) 50(2) 
10(1) 25(2) 50(1) 
10(1) 25(3) 40(1) 50(2) 
10(1) 25(3) 30(1) 40(2) 50(1) 
10(1) 25(2) 30(1) 40(3) 45(1) 50(2) 
10(1) 25(2) 30(1) 40(3) 45(1) 50(2) 75(1) 
find:75
10(1) 25(2) 30(1) 40(3) 45(1) 75(2) 
10(1) 30(2) 40(3) 45(1) 75(2) 
30(1) 40(3) 45(1) 75(2) 
30(1) 45(2) 75(1) 
45(2) 75(1) 
75(1) 

15(1) 
15(2) 77(1) 
15(1) 35(2) 77(1) 
15(1) 35(2) 77(1) 
15(1) 77(2) 
77(1) 


93(1) 
86(1) 93(2) 
86(1) 87(2) 93(1) 
86(1) 87(2) 93(1) 
87(2) 93(1) 
93(1) 




1:1992 2:995 3:518 4:266 5:129 6:90 7:45 8:26 9:16 10:8 11:5 12:2 13:2 14:1