Tuesday, June 7, 2016

c++ like template AutoGrad - How to compute the Derivative Of a function using templates

You may have heard of a system called AutoGrad. It is a system that when given an equation computes the derivate of it automatically. Here is an example of how to build the same system using c++ templates.

I was inspired to write this after i realized i messed up the differentiation of the for the neural networks back prop path in my last post.

Clearly you see where im headed with this.. a few more mods to make it work with tensors and ill have full autograd system for c++ that can automatically compute the backdrop of a neural network.. and will also compile in c4driod.. and then toss in some thrust code and it will be CUDA capable able to use PC with GPUs

from smartphone to GPU in a few short steps...

Anyway here is the proof of concept code:

output looks like
x:2 m:3 c:4 p:6 y:10 z:16 q:100
 dm/dx:0 dx/dx:1 dp/dx:1 dy/dx:3 dz/dx:12 dq/dx:60

// compile with "g++ std=c++11 ..."
// or in c4driod on an andriod phone..

#include <tuple>
#include <iostream>
#include <cmath>

template <int VALUE>
struct Const
{
    template< typename... Types >
    static double op(std::tuple<Types...>& params )
    {
        return VALUE;
    }
};

template <int ID>
struct Var
{
    template< typename... Types >
    static double op(std::tuple<Types...>& params )
    {
        return std::get<ID>(params);
    }
};

template <typename L, typename R>
struct OpAdd
{
    template< typename... Types >
    static double op(std::tuple<Types...>& params)
    {
        return L::op(params) + R::op(params);
    }
};

template <typename L, typename R>
struct OpMult
{
    template< typename... Types >
    static double op(std::tuple<Types...>& params)
    {
        return L::op(params) * R::op(params);
    }
};

template <typename L, int POW>
struct OpPower
{
    template< typename... Types >
    static double op(std::tuple<Types...>& params)
    {
        return std::pow(L::op(params), POW);
    }
};

template <typename Num, typename Denum>
struct DerivativeOf {};

template <int V, int ID_D>
struct DerivativeOf<Const<V>, Var<ID_D> >
{
    typedef Const<0> Type;
};

template <int ID_N, int ID_D>
struct DerivativeOf<Var<ID_N>, Var<ID_D> >
{
    typedef Const<0> Type;
};

template <int ID_N>
struct DerivativeOf<Var<ID_N>, Var<ID_N> >
{
    typedef Const<1> Type;
};

template <typename L, typename R, typename D>
struct DerivativeOf<OpAdd<L,R>, D >
{
    typedef OpAdd<typename DerivativeOf<L,D>::Type,
                  typename DerivativeOf<R,D>::Type> Type;
};

template <typename L, typename R, typename D>
struct DerivativeOf<OpMult<L,R>, D >
{
    typedef OpAdd<OpMult<typename DerivativeOf<L, D>::Type, R>,
                  OpMult<L, typename DerivativeOf<R, D>::Type> > Type;
};

template <typename L, int POW>
struct DerivativeOf<OpPower<L,POW>, L >
{
    typedef OpMult<Const<POW>, OpPower<L, POW-1> > Type;
};

template <typename L, int POW, typename D>
struct DerivativeOf<OpPower<L,POW>, D >
{
    typedef OpMult<OpMult<Const<POW>, OpPower<L, POW-1> >,
                   typename DerivativeOf<L, D>::Type>  Type;
};

int main()
{
    typedef Var<0> X;
    typedef Var<1> M;
    typedef Var<2> C;
    typedef OpAdd<X,C> P;                      // p = x + c
    typedef OpAdd<OpMult<M,X>,C> Y;            // y = m*x + c
    typedef OpAdd<OpMult<M,OpPower<X,2>>,C> Z; // z = m*x^2 + c
    typedef OpPower<OpAdd<OpMult<M,X>,C>,2> Q; // q = = y^2 = (m*x + c)^2

    typedef DerivativeOf<M, X>::Type  dM_dX;   // dm/dx = 0
    typedef DerivativeOf<X, X>::Type  dX_dX;   // dx/dx = 1
    typedef DerivativeOf<P, X>::Type  dP_dX;   // dP/dx = 1
    typedef DerivativeOf<Y, X>::Type  dY_dX;   // dY/Dx = m
    typedef DerivativeOf<Z, X>::Type  dZ_dX;   // dZ/dx = 2x
    typedef DerivativeOf<Q, X>::Type  dQ_dX;   // dQ/dx = dq/dy + dy/dx = 2*(m*x + c)*m

    X x;
    M m;
    C c;
    P p;
    Y y;
    Z z;
    Q q;

    dX_dX  dx_dx;
    dM_dX  dm_dx;
    dP_dX  dp_dx;
    dY_dX  dy_dx;
    dZ_dX  dz_dx;
    dQ_dX  dq_dx;

    std::tuple<double,double,double> params = std::make_tuple(2.0,3.0,4.0);

    std::cout << " x:" << x.op(params)
              << " m:" << m.op(params)
              << " c:" << c.op(params)
              << " p:" << p.op(params)
              << " y:" << y.op(params)
              << " z:" << z.op(params)
              << " q:" << q.op(params)
              << "\n";

    std::cout << " dm/dx:"  << dm_dx.op(params)
              << " dx/dx:"  << dx_dx.op(params)
              << " dp/dx:"  << dp_dx.op(params)
              << " dy/dx:"  << dy_dx.op(params)
              << " dz/dx:"  << dz_dx.op(params)
              << " dq/dx:"  << dq_dx.op(params)
              << "\n";
}

Thursday, May 5, 2016

libraryless c++ neural nets

UPDATE: DONT USE THIS

I have noticed that i messed up the backprop.. seems Matrics are not my strong point and do/dw is a killer.. even the wiki page on it gives up https://en.wikipedia.org/wiki/Matrix_calculus#Layout_conventions ... check the "Result of differentiating various kinds of aggregates with other kinds of aggregates" table it has "??" in the vector y to matrix X column



I like every other com-sci guy on the planet is studying Machine learning... But to do this tools are needed..

Libraries, toolkits and frameworks with their sundry list of requirements have always annoyed the hell out of me. These things are never cost-less.. We constantly have burn time to setup a system that is compatible with the megabytes of dependencies that these things require to be installed, then deal with the portability issues that stop them from installing and then over the long term maintain and upgrade them and deal with package drift between the various components to keep it all running and then hope that the package maintainers dont quit updating the entire mess.

Library free raw language implementation have some big plus going for them:
* Firstly any of todays smartphones have enough processing power to do the job of training and working with neural networks.. but the tool chain makes it next to impossible actually do the job..
* Secondly companies are often paranoid about what you can install in workstations.

So in short I have tossed out the ML tool chain and give a shoot at doing it in plain simple c++.

ML at its core is just Matrix math so I have hacked together a Matrix class that can be compiled with c4driod, g++, and clang. Then using this Matrix class build a simple xor model and trained using hand worked out back prop. This is proof of concept that with less than 1000 lines of normal c++ code and a standard c++ compiler your good to go for ML projects.

The whole mess is 728 lines of c++11 code (it needs a fair bit clean up but hey whatever)

Heres the end result of the build and run on my smartphone (a nexus 5x using c4driod)



And the code:
#include <iostream>
#include <vector>
#include <functional>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <cmath>
#include <cstdlib>

// ****************************************************************
// *************************** MATRIX *****************************
// ****************************************************************

template <typename Type>
struct Matrix
{
    //private:
    typedef std::vector<Type> Data;

    std::shared_ptr<Data>    data_;
    std::vector<std::size_t> shape_;

    // friend class MatrixUtils<Type>;
    //public:

    Matrix()
    {}

    Matrix(std::size_t cols,
           std::size_t rows)
    {
        data_.reset(new Data);
        data_->resize(cols*rows);
        shape_.push_back(cols);
        shape_.push_back(rows);
    }

    template <std::size_t N>
    Matrix(Type (&raw)[N],
           std::size_t cols,
           std::size_t rows)
    {
        data_.reset(new Data);
        data_->resize(N);
        data_->assign(raw,raw+N);
        shape_.push_back(cols);
        shape_.push_back(rows);
    }

    Matrix(Type* begin, Type* end,
           std::size_t cols,
           std::size_t rows)
    {
        data_.reset(new Data);
        data_->resize(cols*rows);
        data_->assign(begin,end);
        shape_.push_back(cols);
        shape_.push_back(rows);
    }

    Matrix(std::shared_ptr<Data> data,
           std::size_t cols,
           std::size_t rows) :
        data_(data)
    {
        shape_.push_back(cols);
        shape_.push_back(rows);
    }

    Type& at(std::size_t x, std::size_t y)
    {
        return (*data_)[y*shape_[0] + x];
    }

    Type at(std::size_t x, std::size_t y) const
    {
        return (*data_)[y*shape_[0] + x];
    }
};

// ****************************************************************
// ************************ MATRIX UTILS **************************
// ****************************************************************

template <typename Type>
struct MatrixUtils
{
    static void print(std::ostream& os, const Matrix<Type>& a)
    {
       for (std::size_t y = 0; y < a.shape_[1]; ++y)
        {
            for (std::size_t x = 0; x < a.shape_[0]; ++x)
            {
                os << a.at(x,y) << " ";
            }
            os << "\n";
        }
    }

    static Matrix<Type> matmul(const Matrix<Type>& a,
                               const Matrix<Type>& b)
    {
        if (a.shape_[0] != b.shape_[1])
        {
            std::stringstream ss;
            ss << "Matrix shapes wrong for matmul"
               << " a: " << a.shape_[0] << "x" << a.shape_[1]
               << " b: " << b.shape_[0] << "x" << b.shape_[1];
            throw std::runtime_error(ss.str());
        }

        Matrix<Type> res(b.shape_[0], a.shape_[1]);

        for (std::size_t y = 0; y < res.shape_[1]; ++y)
        {
            for (std::size_t x = 0; x < res.shape_[0]; ++x)
            {
                Type sum = 0;
                for (std::size_t i = 0; i < a.shape_[0]; ++i)
                {
                    sum += a.at(i,y) * b.at(x,i);
                }
                res.at(x,y) = sum;
            }
        }

        return res;
    }

    static Matrix<Type> selrow(std::size_t row,
                               const Matrix<Type>& a)
    {
        // TODO reimpl as an iterator!
        Matrix<Type> r(a.shape_[0],1);

        for (std::size_t x = 0; x < a.shape_[0]; ++x)
        {
            r.at(x,0) = a.at(x,row);
        }

        return r;
    }

    static Matrix<Type> selcol(std::size_t col,
                               const Matrix<Type>& a)
    {
        Matrix<Type> r(1,a.shape_[1]);

        for (std::size_t y = 0; y < a.shape_[1]; ++y)
        {
            r.at(0,y) = a.at(col,y);
        }

        return r;
    }

    static Matrix<Type> transpose(const Matrix<Type>& a)
    {
        Matrix<Type> r(a.shape_[1],a.shape_[0]);

        for (std::size_t y = 0; y < a.shape_[1]; ++y)
        {
            for (std::size_t x = 0; x < a.shape_[0]; ++x)
            {
                r.at(y,x) = a.at(x,y);
            }
        }

        return r;
    }

    static void unifunctor_inplace(std::function<Type (Type)> func,
                                   Matrix<Type>& a)

    {
        typedef typename Matrix<Type>::Data::iterator iterator;
        // TODO this feels llike there should be an std:algo for it
        //  maybe generate ??

        for(iterator it = a.data_->begin();
            it != a.data_->end();
            ++it)
        {
            *it = func(*it);
        }
    }

    static Matrix<Type> unifunctor(std::function<Type (Type)> func,
                                   const Matrix<Type>& a)

    {
        typedef typename Matrix<Type>::Data::iterator iterator;
        // TODO this feels llike there should be an std:algo for it
        //  maybe generate ??

        Matrix<Type> r(a.shape_[0],a.shape_[1]);

        iterator ait = a.data_->begin();
        for(iterator rit = r.data_->begin();
            rit != r.data_->end();
            ++rit)
        {
            *rit = func(*ait);
            ++ait;
        }

        return r;
    }

    static Matrix<Type> bifunctor(std::function<Type (Type,Type)> func,
                                  const Matrix<Type>& a,
                                  const Matrix<Type>& b)

    {
        typedef typename Matrix<Type>::Data::iterator iterator;

        if (a.shape_[0] != b.shape_[0] or
            a.shape_[1] != b.shape_[1])
        {
            std::stringstream ss;
            ss << "Matrix shapes mismatch for bifunctor"
               << " a: " << a.shape_[0] << "x" << a.shape_[1]
               << " b: " << b.shape_[0] << "x" << b.shape_[1];
            throw std::runtime_error(ss.str());
        }

        // TODO this feels llike there should be an std:algo for it
        //  maybe generate ??
        Matrix<Type> r(a.shape_[0], a.shape_[1]);

        iterator ait = a.data_->begin();
        iterator bit = b.data_->begin();
        iterator rit = r.data_->begin();

        while (rit != r.data_->end())
        {
            *rit = func(*ait, *bit);
            ++ait;
            ++bit;
            ++rit;
        }

        return r;
    }

    static void bifunctor_inplace(std::function<Type (Type,Type)> func,
                                  Matrix<Type>& a,
                                  const Matrix<Type>& b)

    {
        typedef typename Matrix<Type>::Data::iterator iterator;

        if (a.shape_[0] != b.shape_[0] or
            a.shape_[1] != b.shape_[1])
        {
            std::stringstream ss;
            ss << "Matrix shapes mismatch for bifunctor"
               << " a: " << a.shape_[0] << "x" << a.shape_[1]
               << " b: " << b.shape_[0] << "x" << b.shape_[1];
            throw std::runtime_error(ss.str());
        }

        // TODO this feels llike there should be an std:algo for it
        //  maybe generate ??
        Matrix<Type> r(a.shape_[0], a.shape_[1]);

        iterator bit = b.data_->begin();
        for(iterator ait = a.data_->begin();
            ait != a.data_->end();
            ++ait)
        {
            *ait = func(*ait, *bit);
            ++bit;
        }
    }

    static Matrix<Type> bifunctor_row(std::function<Type (Type,Type)> func,
                                      const Matrix<Type>& a,
                                      const Matrix<Type>& b)
    {
        typedef typename Matrix<Type>::Data::iterator iterator;

        if (a.shape_[0] != b.shape_[0] or
            b.shape_[1] != 1)
        {
            std::stringstream ss;
            ss << "Matrix shapes mismatch for bifunctor_row"
               << " a: " << a.shape_[0] << "x" << a.shape_[1]
               << " b: " << b.shape_[0] << "x" << b.shape_[1];
            throw std::runtime_error(ss.str());
        }

        // TODO this feels llike there should be an std:algo for it
        //  maybe generate ??
        Matrix<Type> r(a.shape_[0], a.shape_[1]);

        iterator ait = a.data_->begin();
        iterator bit = b.data_->begin();
        iterator rit = r.data_->begin();

        while (rit != r.data_->end())
        {
            if (bit == b.data_->end()) bit = b.data_->begin();

            *rit = func(*ait, *bit);
            ++ait;
            ++bit;
            ++rit;
        }

        return r;
    }

    static Matrix<Type> bifunctor_scaler(std::function<Type (Type,Type)> func,
                                        const Type a,
                                        const Matrix<Type>& b)

    {
        typedef typename Matrix<Type>::Data::iterator iterator;

        // TODO this feels llike there should be an std:algo for it
        //  maybe generate ??
        Matrix<Type> r(b.shape_[0], b.shape_[1]);

        iterator bit = b.data_->begin();
        iterator rit = r.data_->begin();

        while (rit != r.data_->end())
        {
            *rit = func(a,*bit);
            ++bit;
            ++rit;
        }

        return r;
    }

    static Matrix<Type> bifunctor_scaler(std::function<Type (Type,Type)> func,
                                        const Matrix<Type>& a,
                                        const Type b)
    {
        typedef typename Matrix<Type>::Data::iterator iterator;

        // TODO this feels llike there should be an std:algo for it
        //  maybe generate ??
        Matrix<Type> r(a.shape_[0], a.shape_[1]);

        iterator ait = a.data_->begin();
        iterator rit = r.data_->begin();

        while (rit != r.data_->end())
        {
            *rit = func(*ait,b);
            ++ait;
            ++rit;
        }

        return r;
    }

    struct Helpers
    {
        static Type add(Type a, Type b) { return a+b; }
        static Type sub(Type a, Type b) { return a-b; }
        static Type mul(Type a, Type b) { return a*b; }
        static Type div(Type a, Type b) { return a/b; }

        static Type relu(Type a) { return (a < 0) ? 0 : a; }
        static Type tanh(Type a) { return std::tanh(a); }
        static Type rand(Type a) { return static_cast<Type>(std::rand() % 100)/100.0 - 0.5; }
        static Type ones(Type a) { return 1.0; }

        static Type xor_f(Type a, Type b) { return (a*b < 0) ? -1.0 : 1.0; }
    };
};

// ****************************************************************
// ************************ MATRIX OPERATORS **********************
// ****************************************************************

template <typename Type>
std::ostream& operator<<(std::ostream& os, const Matrix<Type>& a)
{
    MatrixUtils<Type>::print(os, a);
    return os;
}

template <typename Type>
Matrix<Type> operator+(const Matrix<Type>& a,
                       const Matrix<Type>& b)
{
    return MatrixUtils<Type>::bifunctor(&MatrixUtils<Type>::Helpers::add,a,b);
}

template <typename Type>
Matrix<Type> operator-(const Matrix<Type>& a,
                       const Matrix<Type>& b)
{
    return MatrixUtils<Type>::bifunctor(&MatrixUtils<Type>::Helpers::sub,a,b);
}

template <typename Type>
Matrix<Type> operator*(const Matrix<Type>& a,
                       const Matrix<Type>& b)
{
    return MatrixUtils<Type>::matmul(a,b);
}

template <typename Type>
Matrix<Type> operator*(Type                a,
                       const Matrix<Type>& b)
{
    return MatrixUtils<Type>::bifunctor_scaler(&MatrixUtils<Type>::Helpers::mul,a,b);
}

template <typename Type>
Matrix<Type> operator*(const Matrix<Type>& a,
                       Type                b)
{
    return MatrixUtils<Type>::bifunctor_scaler(&MatrixUtils<Type>::Helpers::mul,a,b);
}

template <typename Type>
Matrix<Type> operator/(const Matrix<Type>& a,
                       const Matrix<Type>& b)
{
    return MatrixUtils<Type>::bifunctor(&MatrixUtils<Type>::Helpers::div,a,b);
}

template <typename Type>
Matrix<Type> operator+=(Matrix<Type>&       a,
                        const Matrix<Type>& b)
{
    MatrixUtils<Type>::bifunctor_inplace(&MatrixUtils<Type>::Helpers::add,a,b);
    return a;
}

template <typename Type>
Matrix<Type> product(const Matrix<Type>& a,
                     const Matrix<Type>& b)
{
    return MatrixUtils<Type>::bifunctor(&MatrixUtils<Type>::Helpers::mul,a,b);
}

template <typename Type>
Matrix<Type> rowadd(const Matrix<Type>& a,
                    const Matrix<Type>& b)
{
    return MatrixUtils<Type>::bifunctor_row(&MatrixUtils<Type>::Helpers::add,a,b);
}

template <typename Type>
Matrix<Type> transpose(const Matrix<Type>& a)
{
    return MatrixUtils<Type>::transpose(a);
}

template <typename Type>
Matrix<Type> tanh(const Matrix<Type>& a)
{
    return MatrixUtils<Type>::unifunctor(&MatrixUtils<Type>::Helpers::tanh,a);
}

template <typename Type>
Matrix<Type> tanh_derivate(const Matrix<Type>& a)
{
    // dtanh/dx = 1 - (tanh(x)) ^ 2
    Matrix<Type> th = tanh(a);

    return MatrixUtils<Type>::bifunctor_scaler(&MatrixUtils<Type>::Helpers::sub, 1, product(th,th));
}

template <typename Type>
void rand(Matrix<Type>& a)
{
    MatrixUtils<Type>::unifunctor_inplace(&MatrixUtils<Type>::Helpers::rand,a);
}

template <typename Type>
void ones(Matrix<Type>& a)
{
    MatrixUtils<Type>::unifunctor_inplace(&MatrixUtils<Type>::Helpers::ones,a);
}

// ****************************************************************
// ****************************************************************
// ****************************************************************

struct Model
{
    // input   2
    // hidden1 4
    // hidden2 2
    // output  1
    // function tanh
    // learn rate: 0.03
    // batch size 10
    // xor data [-1, 1]
    //  result in less than 140 iterators
    typedef float Type;

    struct Network
    {
        Matrix<Type> x_;
        Matrix<Type> o1_;
        Matrix<Type> h1_;
        Matrix<Type> o2_;
        Matrix<Type> h2_;
        Matrix<Type> o3_;
        Matrix<Type> y_ ;
    };

    struct Params
    {
        Matrix<float> Wh1_;
        Matrix<float> Bh1_;
        Matrix<float> Wh2_;
        Matrix<float> Bh2_;
        Matrix<float> Wo_;
        Matrix<float> Bo_;

        Params() :
            Wh1_(4,2),
            Bh1_(4,1),
            Wh2_(2,4),
            Bh2_(2,1),
            Wo_ (1,2),
            Bo_ (1,1)
        {
            // init wieghts
            rand(Wh1_);
            rand(Bh1_);
            rand(Wh2_);
            rand(Bh2_);
            rand(Wo_ );
            rand(Bo_ );
        }
    };

    Params p_;

    // init weights
    Model() :
        p_()
    {}

    void forward(Network& net, const Matrix<Type>& inputs)
    {
        // forward network params
        // h1 = tanh(Wh1*x  + Bh1)
        // h2 = tanh(Wh2*h1 + Bh2)
        // y  = tanh(Wo *h2 + Bo )
        net.x_ = inputs;

        net.o1_ = rowadd(net.x_  * p_.Wh1_, p_.Bh1_);  net.h1_ = tanh(net.o1_);
        net.o2_ = rowadd(net.h1_ * p_.Wh2_, p_.Bh2_);  net.h2_ = tanh(net.o2_);
        net.o3_ = rowadd(net.h2_ * p_.Wo_ , p_.Bo_ );  net.y_  = tanh(net.o3_);
    }

    Matrix<Type> loss(const Matrix<Type>& expected,
                      const Network& net)
    {
        // you dont have to run this unless your printing stuff.. the back prop needs the "dE/dy" not the "E"
        Matrix<Type> delta = net.y_ - expected;
        return product(delta, delta);
    }

    void backward(const Matrix<Type>& expected,
                  const Network& net,
                  const Type alpha)
    {
        // note tanh(x) = (exp(?x)-?exp(?-x))/?(exp(?x)+?exp(?-x))
        // dtanh/dx = 1 - (tanh(x)) ^ 2

        // forward network params
        // o1 = Wh1*x  + Bh1 ; h1 = tanh(o1)
        // o2 = Wh2*h1 + Bh2 ; h2 = tanh(o2)
        // o3 = Wo *h2 + Bo  ; y  = tanh(o3)
        // E = (e - y)^2

        // dE/dy =  2 * (y - e)
        //       =~     (y - e)    // toss the 2  out as alpha will take it

        // sigma3 = dE/do3
        //        = dE/dy  * dy/do3
        //        = (y - e) * (1 - (tanh(o3))^2)
        // sigma2 = dE/do2
        //        = dE/dy  * dy/do3 * do3/dh2 * dh2/do2
        //        = sigma3          * do3/dh2 * dh2/do2
        //        = sigma3          * Wo      * (1 - (tanh(o2))^2)
        // sigma1 = dE/do1
        //        = dE/dy  * dy/do3 * do3/dh2 * dh2/do2 * do2/dh1 * dh1/do1
        //        = sigma2                              * do2/dh1 * dh1/do1
        //        = sigma2                              * Wh2     * (1 - (tanh(o1))^2)

        // dWo    = -alpha * dE/dWo
        //        = -alpha * dE/dy * dy/do3 * do3/dWo
        //        = -alpha * sigma3         * do3/dWo
        //        = -alpha * sigma3         * h2
        // dWh2   = -alpha * dE/dWh2
        //        = -alpha * dE/dy * dy/do3 * do3/dh2 * dh2/do2 * do2/dWh2
        //        = -alpha * sigma2                             * do2/dWh2
        //        = -alpha * sigma2                             * h1
        // dWh1   = -alpha * dE/dWh1
        //        = -alpha * dE/dy * dy/do3 * do3/dh2 * dh2/do2 * do2/dh1 * dh1/do1 * do1/dWh1
        //        = -alpha * sigma1                                                 * do1/dWh1
        //        = -alpha * sigma1                                                 * x

        // dBo    = -alpha * dE/dBo
        //        = -alpha * dE/dy * dy/do3 * do3/dBo
        //        = -alpha * sigma3         * do3/dBo
        //        = -alpha * sigma3         * 1
        // dBh2   = -alpha * dE/dBh2
        //        = -alpha * dE/dy * dy/do3 * do3/dh2 * dh2/do2 * do2/dBh2
        //        = -alpha * sigma2                             * do2/dBh2
        //        = -alpha * sigma2                             * 1
        // dBh1   = -alpha * dE/dBh1
        //        = -alpha * dE/dy * dy/do3 * do3/dh2 * dh2/do2 * do2/dh1 * dh1/do1 * do1/dBh1
        //        = -alpha * sigma1                                                 * do1/dBh1
        //        = -alpha * sigma1                                                 * 1

        Matrix<Type> sigma3 = product((net.y_ - expected)        , tanh_derivate(net.o3_));
        Matrix<Type> sigma2 = product(sigma3 * transpose(p_.Wo_ ), tanh_derivate(net.o2_));
        Matrix<Type> sigma1 = product(sigma2 * transpose(p_.Wh2_), tanh_derivate(net.o1_));

        Matrix<Type> dWo  = -alpha * (transpose(net.h2_) * sigma3);
        Matrix<Type> dWh2 = -alpha * (transpose(net.h1_) * sigma2);
        Matrix<Type> dWh1 = -alpha * (transpose(net.x_ ) * sigma1);

        Matrix<Type> transposed_unit(sigma3.shape_[1], 1);
        ones(transposed_unit);

        Matrix<Type> dBo  = -alpha * (transposed_unit * sigma3);
        Matrix<Type> dBh2 = -alpha * (transposed_unit * sigma2);
        Matrix<Type> dBh1 = -alpha * (transposed_unit * sigma1);

        p_.Wo_  += dWo ;
        p_.Wh2_ += dWh2;
        p_.Wh1_ += dWh1;

        p_.Bo_  += dBo ;
        p_.Bh2_ += dBh2;
        p_.Bh1_ += dBh1;
    }

    void train(const Matrix<Type>& inputs,
               const Matrix<Type>& expected,
               const Type alpha,
               const int  cycles)
    {
        //gradient descent .. the hard way?
        for (int i = 0; i < cycles; ++i)
        {
            // batching ??
            Network net;

            forward(net, inputs);
            backward(expected, net, alpha);

            if ((i % 10) == 0)
            {
                Matrix<Type> err = loss(expected,net);
                Matrix<Type> transposed_unit(err.shape_[1], 1);
                ones(transposed_unit);

                Matrix<Type>  netErr = transposed_unit * err;
                std::cout << "**********************************************\n";
                std::cout << "step: " << i << " err:" << netErr;

                std::cout << "       Wo_ :\n" << p_.Wo_ ;
                std::cout << "       Wh2_:\n" << p_.Wh2_;
                std::cout << "       Wh1_:\n" << p_.Wh1_;
                std::cout << "       Bo_ :\n" << p_.Bo_ ;
                std::cout << "       Bh2_:\n" << p_.Bh2_;
                std::cout << "       Bh1_:\n" << p_.Bh1_;

            }
        }
    }
};

void basicTest()
{
    int rawA[] = {1,2,3,4};
    int rawB[] = {2,3,4,5};
    int rawC[] = {2,3,4,5,6,7};

    Matrix<int> a(rawA, 2, 2);
    Matrix<int> b(rawB, 2, 2);
    Matrix<int> c(rawC, 3, 2);

    std::cout << a << "\n";
    std::cout << b << "\n";
    std::cout << c << "\n";

    std::cout << (a + b) << "\n";
    try { std::cout << (a + c) << "\n"; }
    catch (std::exception& e) { std::cout << e.what() << "\n"; }

    std::cout << MatrixUtils<int>::matmul(a,c) << "\n";
    try { std::cout << MatrixUtils<int>::matmul(c,a) << "\n"; }
    catch (std::exception& e) { std::cout << e.what() << "\n"; }

    std::cout << "SUCCESS\n";
}

void xorNetTest()
{
    // generate some train data
    Matrix<float> inputs(2,100);
    rand(inputs);

    // compute xor for input data
    Matrix<float> expected
        = MatrixUtils<float>::bifunctor(&MatrixUtils<float>::Helpers::xor_f,
                                        MatrixUtils<float>::selcol(0, inputs),
                                        MatrixUtils<float>::selcol(1, inputs));

    Model model;
    model.train(inputs,expected,0.03,10001);
}

int main()
{
    basicTest();
    xorNetTest();

}

Sunday, December 13, 2015

Varaidic templated template parameters for a loop unroller/code repeater

This basically a demonstration how to use variadic templated template parameter in combination with a variadic template outer handler that locks the typing of the inner part into place with automatic type resolution. This application is basically demonstrating how to create a static loop unroller or code repeater...

// compile: g++ -std=c++11 metafunctor.cpp

#include <iostream>
#include <map>

// **************************************************************
// **********************TEMPLATE FOR****************************
// **************************************************************

template <template<int> class Func, int... IDs> class ForFunctor;

template <template<int> class Func, int ID, int... IDs>
class ForFunctor<Func,ID,IDs...>
{
public:
    template <typename... Params>
    static void with(Params& ...params)
    {
        Func<ID>::op(params...);
        ForFunctor<Func,IDs...>::with(params...);
    }
};

template <template<int> class Func>
class ForFunctor<Func>
{
public:
    template <typename... Params>
    static void with(Params& ...params) {}
};

// **************************************************************
// **********************SOME FUNCTORS****************************
// **************************************************************

template <int ID>
class Init
{
public:
    template <typename OUT>
    static void op(OUT& out)
    {
        out[ID] = ID;
    }
};

template <int ID>
class CopyID
{
public:
    template <typename OUT, typename IN>
    static void op(OUT& out,IN& in)
    {
        out[ID] = in[ID];
    }
};

// **************************************************************
// ***************************MAIN*******************************
// **************************************************************

int main()
{
    typedef std::map<int,int>    A;

    A a;
    A b;

    ForFunctor<Init,1,2,6,3,10>::with(a);
    ForFunctor<CopyID,1,2,6>::with(b,a);

    std::cout << "a: ";
    for (auto value: a) std::cout << value.second << " ";
    std::cout << "\nb: ";
    for (auto value: b) std::cout << value.second << " ";
    std::cout << "\n";
}

Wednesday, April 15, 2015

Creating iomanip for a class - the easy way

Now creating custom io manipulator for your classes can be damn annoying.. The trick to building them is to install and manage a handler for the custom manipulator in the pword array of the ostream. I have hacked this little CustomManip class that allows you to quickly install a lambda function as the a stream manipulator to select from or parameterize the various print methods that you have in play for your class. The only requirement is that each class have at least a " void printDefault(std::ostream& os) const" method that the default print method.

PS: Also there is one little bug with the code.. it does install the streamEvent responder several times.. but what a blog post without a bug or 2..

Anyway heres the code:
//g++ -g --std=c++11 custom_class_manip.cpp

#include <iostream>
#include <ios>
#include <sstream>
#include <functional>

template <typename TYPE>
class CustomManip
{
public:
    typedef std::function<void(std::ostream&, const TYPE&)> ManipFunc;

    // installation helper for a manipulator with params
    class Installer
    {
        ManipFunc func_;

    public:
        Installer(ManipFunc func) :
            func_(func)
        {}

        inline friend
        std::ostream& operator<<(std::ostream& os,
                                 const Installer& installer )
        {
            CustomManip<TYPE>::install(os,
                                       installer.func_);
            return os;
        }
    };

private:
    struct CustomManipHandle
    {
        ManipFunc func_;

        CustomManipHandle()  { std::cout << "new handle at " << this << "\n"; }
        ~CustomManipHandle() { std::cout << "del handle at " << this << "\n"; }
    };

    static int handleIndex()
    {
        // the id for this Custommaniputors params
        // in the os_base parameter maps
        static int index = std::ios_base::xalloc();
        return index;
    }

public:

    // for parameterized manipulator installations
    static Installer make_installer(ManipFunc func)
    {
        return Installer(func);
    }

    // for unparameterized manipulator installations
    static void install(std::ostream& os, ManipFunc func)
    {
        CustomManipHandle* handle =
            static_cast<CustomManipHandle*>(os.pword(handleIndex()));

        // check if its installed on this ostream
        if (handle == NULL)
        {
            // install it
            handle = new CustomManipHandle();
            os.pword(handleIndex()) = handle;

            // install the callback so we can destroy it
            os.register_callback (CustomManip<TYPE>::streamEvent,0);
        }

        handle->func_ = func;
    }

    static void uninstall(std::ios_base& os)
    {
        CustomManipHandle* handle =
            static_cast<CustomManipHandle*>(os.pword(handleIndex()));

        //delete the installed Custommanipulator handle
        if (handle != NULL)
        {
            os.pword(handleIndex()) = NULL;
            delete handle;
        }
    }

    static void streamEvent (std::ios::event ev,
                             std::ios_base& os,
                             int id)
    {
        switch (ev)
        {
            case os.copyfmt_event:
                std::cout << "copyfmt_event\n"; break;
                // does this mean i need to copy my manip as well??
            case os.imbue_event:
                std::cout << "imbue_event\n"; break;
            case os.erase_event:
                std::cout << "erase_event\n";
                uninstall(os);
                break;
        }
    }

    static void print(std::ostream& os, const TYPE& data)
    {
        CustomManipHandle* handle
            = static_cast<CustomManipHandle*>(os.pword(handleIndex()));

        if (handle != NULL)
        {
            handle->func_(os, data);
            return;
        }

        data.printDefault(os);
    }
};

/**********************************
 ****************TEST**************
 **********************************/

class A
{
    int v_;
public:
    A(int v) :
        v_(v)
    {}

    void printDefault(std::ostream& os) const { os << " " << v_ << " default\n"; }
    void printVer1   (std::ostream& os) const { os << " " << v_ << " abc\n"; }
    void printVer2   (std::ostream& os) const { os << " " << v_ << " xyz\n"; }
    void printParam  (std::ostream&     os,
                      const std::string paramA,
                      const int         paramB)  const
    {
        os << " " << v_ << " param: " << paramA << " " << paramB << "\n";
    }
};

std::ostream& operator<<(std::ostream& os, const A& a)
{
    CustomManip<A>::print(os, a);
    return os;
}

class B
{
public:
    void printDefault(std::ostream& os) const { os << " default\n"; }
    void printVer3   (std::ostream& os) const { os << " mno\n"; }
};

std::ostream& operator<<(std::ostream& os, const B& b)
{
    CustomManip<B>::print(os, b);
    return os;
}

std::ostream& manip_default (std::ostream& os)
{
    CustomManip<A>::uninstall(os);
    CustomManip<B>::uninstall(os);
    return os;
}

std::ostream& manip_ver1 (std::ostream& os)
{
    CustomManip<A>::install(os,
                            [](std::ostream& oos, const A& a)
                            {
                                a.printVer1(oos);
                            });
    return os;
}

std::ostream& manip_ver2(std::ostream& os)
{
    CustomManip<A>::install(os,
                            [](std::ostream& oos, const A& a)
                            {
                                a.printVer2(oos);
                            });
    return os;
}

std::ostream& manip_ver3(std::ostream& os)
{
    CustomManip<B>::install(os,
                            [](std::ostream& oos, const B& b)
                            {
                                b.printVer3(oos);
                            });
    return os;
}

CustomManip<A>::Installer manip_param(const std::string str,
                                      const int other)
{
    return CustomManip<A>::make_installer(
        [str, other](std::ostream& oos, const A& a)
        {
            a.printParam(oos, str, other);
        });
}

int main()
{
    A a(1);
    B b;

    std::cout << a
              << b
              << "\n";

    std::cout << manip_ver1 << a
              << manip_ver2 << a
              << manip_ver3 << b
              << manip_default << a
              << manip_ver1 << a
              << manip_param("testing", 4) << a
              << b
              << "\n";

    std::stringstream oss;
    oss << manip_ver1 << a
        << manip_ver2 << a
        << manip_ver3 << b
        << manip_default << a
        << manip_ver1 << a
        << manip_param("testing", 4) << a
        << b
        << "\n";

    std::cout << oss.str();
}

The resulting output looks like
 1 default
 default

new handle at 0x307de0
 1 abc
 1 xyz
new handle at 0x307e50
 mno
del handle at 0x307de0
del handle at 0x307e50
 1 default
new handle at 0x307de0
 1 abc
 1 param: testing 4
 default

new handle at 0x307ef0
new handle at 0x307e90
del handle at 0x307ef0
del handle at 0x307e90
new handle at 0x307e90
 1 abc
 1 xyz
 mno
 1 default
 1 abc
 1 param: testing 4
 default

erase_event
del handle at 0x307e90
erase_event
erase_event

Tuesday, November 18, 2014

How to create a parsing/printing system with 1 line of code per field

Recently had a friend claim that you cant parse and print a blob of data without cutting and pasting stuff every where or using the new c++11 compilers and techniques.

I have in the past(2012) shown techniques to do this kind of parsing and generation in c11-tuples-and-schema-generation.html. So with a few short hours of hacking and here is the proof of concept.. The solution is meta programming black magic but hey u can define the parser and printer (and whatever else u want to add) for "SomeMessage" in basically 1 line of code per field.

// compile with:
// g++ -I"c:\tools\boost_1_49_0" -L"c:\tools\boost_1_49_0\stage\lib" -static self_print_struct.cpp -o self_print_struct.exe

#include <iostream>
#include <boost/mpl/for_each.hpp>
#include <boost/mpl/list.hpp>
#include <boost/mpl/string.hpp>
#include <boost/mpl/range_c.hpp>

template <typename T,
          typename N>
struct Field
{
    typedef T Type;
    typedef N Name;

    static unsigned char* print(std::ostream& os, unsigned char* ptr)
    {
        os << boost::mpl::c_str<N>::value
           << " : " << *(static_cast<T*>(static_cast<void*>(ptr)));
        return ptr + sizeof(Type);
    }
};

template <typename Base>
struct PrintMixin
{
    struct DoPrint
    {
        unsigned char* cursor_;

        DoPrint(unsigned char* cursor) :
           cursor_(cursor)
        {}

        template< typename U > void operator()(U x)
        {
            std::cout << " + ";
            U::print(std::cout, cursor_);
            std::cout << '\n';
            cursor_ += sizeof(typename U::Type);
        }
    };

    static void print(std::ostream& os, unsigned char* ptr)
    {
        boost::mpl::for_each< typename Base::Type >( DoPrint(ptr) );
    }
};

struct SomeMessage : PrintMixin<SomeMessage>
{
    typedef boost::mpl::list<Field<int,  boost::mpl::string<'fiel','d 1'> >,
                             Field<char, boost::mpl::string<'fiel','d 2'> > > Type;
};

struct AnotherMessage : PrintMixin<AnotherMessage>
{
    typedef boost::mpl::list<Field<int,   boost::mpl::string<'this'> >,
                             Field<char,  boost::mpl::string<'that'> >,
                             Field<float, boost::mpl::string<'the ','othe','r'> >
                             > Type;
};

int main()
{
    unsigned char message1[] =
    {
        0x01,0x02,0x03,0x04,
        'a'
    };

    std::cout << "message 1\n";
    SomeMessage::print(std::cout, message1);

    unsigned char message2[] =
    {
        0x15, 0xCD, 0x5B, 0x07,
        'b',
        0x19, 0x04, 0x9e, 0x3f
    };

    std::cout << "message 2\n";
    AnotherMessage::print(std::cout, message2);
}


And the output looks like:
$ self_print_struct.exe
message 1
 + field 1 : 67305985
 + field 2 : a
message 2
 + this : 123456789
 + that : b
 + the other : 1.2345

Monday, July 14, 2014

Runtime value to Compile time value/action conversion via variadic selection tree

With c++11 some of the older macro or table based generator techniques can be tossed out. One of these is the runtime to compile time translation of a value.

For example lets say you have a table of of idents and related actions. 2 possible solution are
* Populate an array with function pointers.
* Generate a switch statement with marcos..

But c++11 has opened a new and what seems to flexible possibility: The ability to use a variaditic template to generated a reduction tree. This seems to have some very interesting possibilities.
* The redux tree can have its action templated... (But I seemed to have a compiler bug that stops it working)
* Its not locked into a single function signature a "Param..." argument pack gives it the flexibility to adapt on both entry and exit of the selection tree.
* good use of inlining lets the tree flatten out and the compiler gets to hit it with opt routines like the macro generated select version.


#include <iostream>
#include <cstring>

// -----------------------------------------------
//                         IDs
// -----------------------------------------------

enum {
    UNKNOWN = 0,
    MIN_TAG = UNKNOWN,
    INT,
    FLOAT,
    MAX_TAG
};

// -----------------------------------------------
//            ID -> Type translation trait
// -----------------------------------------------

template <int Tag> struct Trait {};

template <> struct Trait<UNKNOWN>
{
    typedef void Type;
    static const char* id() { return "unknown";   }
};

template <> struct Trait<INT>
{
    typedef int Type;
    static const char* id() { return "int";   }
};

template <> struct Trait<FLOAT>
{
    typedef float Type;
    static const char* id() { return "float"; }
};

// -----------------------------------------------
//                    Execution point
// -----------------------------------------------

template <int I>
void function(char* data)
{
    typename Trait<I>::Type value =
        (*static_cast<typename Trait<I>::Type*>(
            static_cast<void*>(data)));

    std::cout
        << " type:"  << Trait<I>::id()
        << " value:" << value
        << "\n";
}

template <>
void function<UNKNOWN>(char* data)
{
    std::cout
        << " unknown type number given!"
        << "\n";
}

template <>
void function<MAX_TAG>(char* data) { function<UNKNOWN>(data); }

struct DoAction
{
    template <int I, typename... Params>
    static void call(Params... params)
    {
        function<I>(params...);
    }
};

// -----------------------------------------------
//  SelectTree: runtime -> complie time selector
// -----------------------------------------------

template <typename Action, int min, int max>
struct SelectTree
{
    enum { mid = (max+min)/2 };

    template <typename... Params>
    static void call(int i, Params... params)
    {
        if (i <= mid) SelectTree<Action,min  ,mid>::call(i, params...);
        else          SelectTree<Action,mid+1,max>::call(i, params...);
    }
};

template <typename Action, int value>
struct SelectTree<Action, value, value>
{
    template <typename... Params>
    static void call(int i, Params... params)
    {
        //Action::call<value>(params...);    // compiler bug!
        DoAction::call<value>(params...);
    }
};

// -----------------------------------------------
//                        MAIN
// -----------------------------------------------

int main()
{
    float x = 1.0;

    char data[4];
    std::memcpy(data, &x, sizeof(x));

    int sel = 3;
    std::cin >> sel;

    SelectTree<DoAction,MIN_TAG,MAX_TAG>::call(sel, data);
}


the output looks like
$ a.exe
1
 type:int value:1065353216
$ a.exe
2
 type:float value:1
$ a.exe
-2
 unknown type number given!
$ a.exe
0
 unknown type number given!
$ a.exe
123213
 unknown type number given!

Saturday, July 5, 2014

Async usage of futures and promises in ASIO

One of the new c++11 additions with great potential is futures and promises. However im a heavy asio based async programmer. Most exiting examples on the web deal directly with std::thread and handle the std::promise via std::move. But the asio design basically means you need to use the binding approach to coding. Unfortunately std::bind is great until u need to use a un-copyable move only class like std::promise.

Now you can do something crazy like rewrite the bind so that it can handle movable items, as i did in my post movable-stdbind-with-placeholders. Which is all well and good in theroy BUT there is something to be said about keeping to the standard and well used libs(like boost). Why? Well compiler developers and lib devs like the boost guys are working very hard to improve there stuff. So if you keep to the standard tools and libs you get all the bonuses from their hard work for free when u upgrade. Compiler designers research, profile and add better optimizations and well used libs get more and more eyes on, who in turn find and contribute better code.

And second to all that there is a really trivial and obvious solution to the problem: std::shared_ptr...

So here is how you use std::promise and std::future in lambdas, binds, asio and anything else u cant move into.. I also tossed in an example of how to use non-blocking checks on the futures which is another key tool for using futures in asio code.

#include <iostream>
#include <chrono>
#include <thread>
#include <future>
#include <boost/asio.hpp>

void asyncRun()
{
    std::cout << "Async..." << std::flush;

    boost::asio::io_service io_service;
    std::shared_ptr< std::promise<int> > promise(new std::promise<int>());
    std::future<int> future = promise->get_future();

    io_service.post(
                    [promise]()
                    {
                        std::chrono::milliseconds dura( 2000 );
                        std::this_thread::sleep_for( dura );
                        promise->set_value(9);
                    }
                    );

    std::thread t1( [&io_service]{ io_service.run(); });
    t1.detach();

    std::cout << "Waiting..." << std::flush;
    future.wait();
    std::cout << "Done!\nResults are: "
              << future.get() << '\n';

}


void nonBlockingRun()
{
    std::cout << "Non Blocking..." << std::flush;

    std::promise<int> promise;
    std::future<int> future = promise.get_future();
    std::thread t1( [](std::promise<int> p)
                    {
                        std::chrono::milliseconds dura( 2000 );
                        std::this_thread::sleep_for( dura );
                        p.set_value(9);
                    },
                    std::move(promise) );
    t1.detach();

    std::cout << "Waiting...\n" << std::flush;
    std::future_status status;
    do {
        status = future.wait_for(std::chrono::seconds(0));

        if (status == std::future_status::deferred) {
            std::cout << "+";
        } else if (status == std::future_status::timeout) {
            std::cout << ".";
        }
    } while (status != std::future_status::ready);
    std::cout << "Done!\nResults are: "
              << future.get() << '\n';
}

void blockingRun()
{
    std::cout << "Blocking..." << std::flush;

    std::promise<int> promise;
    std::future<int> future = promise.get_future();
    std::thread t1( [](std::promise<int> p)
                    {
                        std::chrono::milliseconds dura( 2000 );
                        std::this_thread::sleep_for( dura );
                        p.set_value(9);
                    },
                    std::move(promise) );
    t1.detach();

    std::cout << "Waiting..." << std::flush;
    future.wait();
    std::cout << "Done!\nResults are: "
              << future.get() << '\n';
}

int main()
{
    nonBlockingRun();
    blockingRun();
    asyncRun();
}