3. Intermediate

3.1 Low-level concepts

3.1.1 Important compilation flags

TODO: explanation

BOUNDCHECK or nothing
USEFLOAT or USEDOUBLE
LITTLEENDIAN or BIGENDIAN

Default with Pymake and Linux is BOUNDCHECK, USEDOUBLE,
LITTLEENDIAN.
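
Since the explanation of these flags is still to be written, the following is only a guess at how such flags are typically consumed, judging from their names: BOUNDCHECK would enable index checking, and USEFLOAT or USEDOUBLE would select the floating-point type. The class CheckedVec and the helper accessIsRejected are hypothetical and are not part of PLearn; the real type name is likewise an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// In a real build these macros would come from the compiler command line
// (for example -DBOUNDCHECK -DUSEDOUBLE). They are defined here only so
// that the sketch compiles on its own.
#ifndef BOUNDCHECK
#define BOUNDCHECK
#endif
#if !defined(USEFLOAT) && !defined(USEDOUBLE)
#define USEDOUBLE
#endif

// Floating-point type selected at compile time (assumed meaning of the flags).
#ifdef USEFLOAT
typedef float real;
#else
typedef double real;
#endif

// Hypothetical vector class, NOT PLearn's Vec: it checks indices only
// when BOUNDCHECK is defined.
class CheckedVec {
public:
    explicit CheckedVec(std::size_t n) : data_(n, real(0)) {}

    real& operator[](std::size_t i) {
#ifdef BOUNDCHECK
        if (i >= data_.size())
            throw std::out_of_range("CheckedVec: index out of range");
#endif
        return data_[i];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<real> data_;
};

// Returns true if accessing v[i] is rejected by the bound check.
bool accessIsRejected(CheckedVec& v, std::size_t i) {
    try {
        (void)v[i];
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

With the check compiled in, an out-of-range access is reported instead of silently corrupting memory, at the cost of one comparison per access.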

3.1.2 Smart Pointers

Memory management is one of the most error-prone aspects of traditional C and C++ programming. PLearn makes it easier through the use of reference-counted smart pointers.

Traditionally, there are two basic ways to create an object:

on the stack:

void f()
{   // Beginning of scope
    MyClass myinstance; 
        // memory is allocated on the stack,
        // and constructor is called

    myinstance.dosomething();
        // methods and members are called using a dot

}   // when exiting the scope, destructor of object is
    // called automatically and stack memory is freed

by calling new:

void f()
{   // Beginning of scope

    MyClass* ptr = new MyClass(); 
        // memory is allocated on the heap by the "new"
        // operator, which returns a pointer

    ptr->dosomething(); 
        // methods and members are called using "->"

    delete ptr;
        // we have to call "delete" explicitly,
        // because object is NOT automatically destroyed
}       // when leaving the scope

In more complex cases, where several objects may contain pointers to other objects, keeping track of when to delete an object quickly becomes a complex and error-prone bookkeeping task.

PLearn uses reference-counted smart pointers to automate this, so that you don't have to worry about calling delete. The mechanism is based on a smart pointer template (PP, which stands for PLearnPointer) that can be used on any class that derives from SmartPointable. A SmartPointable object contains the count of the number of smart pointers that point to it, and is automatically destroyed when this reference count becomes 0 (i.e. when nothing points to it any longer).

class MyClass: public SmartPointable { /* ... */ };

void f()
{   // Beginning of first scope
    PP<MyClass> ptr = new MyClass();
        // memory is allocated on the heap
        // (reference count for object is 1)

    { // Beginning of second scope
        PP<MyClass> ptr2 = ptr;
            // ptr2 and ptr point to the same object
            // (reference count becomes 2)

    } // Object is not destroyed upon exiting the second scope 
      // (reference count becomes 1)

    ptr->dosomething();
        // methods and members are called using "->"

} // Object is automatically destroyed here 
  // when reference count becomes 0
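
To make the counting visible, here is a minimal sketch of an intrusive reference-counted pointer written from scratch. It is not PLearn's PP implementation: the names Counted, Ref, Tracked and countAfterInnerScope are hypothetical, and the sketch ignores details such as conversions to and from plain pointers.

```cpp
#include <cassert>

// Hypothetical base class: stores how many Ref<> handles point to the object.
class Counted {
public:
    Counted() : refs_(0) {}
    virtual ~Counted() {}
    int refCount() const { return refs_; }
    void addRef() { ++refs_; }
    bool dropRef() { return --refs_ == 0; }  // true when the last reference is gone
private:
    Counted(const Counted&);             // copying disabled to keep the sketch short
    Counted& operator=(const Counted&);
    int refs_;
};

// Hypothetical smart pointer: T must derive from Counted.
template <class T>
class Ref {
public:
    Ref(T* p = 0) : p_(p) { if (p_) p_->addRef(); }
    Ref(const Ref& other) : p_(other.p_) { if (p_) p_->addRef(); }
    ~Ref() { release(); }
    Ref& operator=(const Ref& other) {
        T* q = other.p_;
        if (q) q->addRef();   // add before releasing: safe for self-assignment
        release();
        p_ = q;
        return *this;
    }
    T* operator->() const { return p_; }
private:
    void release() {
        if (p_ && p_->dropRef()) delete p_;   // last owner destroys the object
        p_ = 0;
    }
    T* p_;
};

// Test class that records how many instances are currently alive.
struct Tracked : public Counted {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Replays the two scopes of the example above and returns the
// reference count seen after the inner scope has been left.
int countAfterInnerScope() {
    Ref<Tracked> ptr(new Tracked);      // count is 1
    {
        Ref<Tracked> ptr2 = ptr;        // count is 2
    }                                   // count is back to 1
    return ptr->refCount();
}                                       // count reaches 0: object destroyed
```

Storing the count inside the object (rather than in a separate control block) is what makes it possible to build a new smart pointer from a plain pointer without losing track of the existing owners.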

It is possible to mix traditional pointers to an object with smart pointers, and there are automatic conversions between the two. However, we generally discourage doing this, although it might prove useful in some situations (such as keeping a pointer to the actual specific type of the object rather than its base class). If you do mix them, just remember that the object will get deleted as soon as the last smart pointer pointing to it is gone (when it goes out of scope, for instance), regardless of whether there are still traditional pointers pointing to it (the automatic reference count can only count smart pointers!).

Many base classes in PLearn have an associated smart pointer type with a similar (and usually shorter) name, as shown in the following table. Sometimes this corresponding smart pointer type is a simple typedef to the type PP<baseclass>.

But we also often specialised them (by deriving from PP<baseclass>) to add operators and methods for user convenience, so that, for instance, the element at row i and column j of a VMat m can be accessed as m(i,j), as an alternative to the more verbose m->get(i,j).

base class	smart pointer
VMatrix	VMat
Variable	Var
RandomVariable	RandomVar
Kernel	Ker
CostFunction	CostFunc
StatsIterator	StatsIt
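
The idea of a specialised smart pointer type that adds convenience operators can be sketched as follows. This is only an illustration: std::shared_ptr stands in for PP, and Grid and GridHandle are hypothetical names standing in for a base class such as VMatrix and its smart pointer type such as VMat.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical base class: a dense matrix of doubles.
class Grid {
public:
    Grid(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols, 0.0) {}
    double get(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }
    void put(std::size_t i, std::size_t j, double v) { data_[i * cols_ + j] = v; }
private:
    std::size_t cols_;
    std::vector<double> data_;
};

// Hypothetical smart pointer type: derives from the generic smart pointer
// and adds operator() as syntactic sugar over Grid::get.
class GridHandle : public std::shared_ptr<Grid> {
public:
    GridHandle(std::size_t rows, std::size_t cols)
        : std::shared_ptr<Grid>(std::make_shared<Grid>(rows, cols)) {}

    double operator()(std::size_t i, std::size_t j) const {
        return (*this)->get(i, j);
    }
};
```

With such a handle, m(1,2) and m->get(1,2) read the same element; the first form is simply shorter.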

Several concepts in PLearn can be seen as having two levels of implementation:

A base class and its derived classes form the basic internal working mechanism for the concept which can be extended by deriving new classes. We call this the designer level.
A corresponding smart pointer type for the base class, and a number of utility functions, give a more user-friendly syntax to use the concept. They are mostly wrapping and syntactic sugar around the designer level classes. We call this the user level.

The person who only wishes to use the library typically doesn't need to understand all the details of the designer level hierarchy. Some concepts (such as Var) can be manipulated almost entirely through the smart pointer type and user-level functions, although knowing the most useful methods of the underlying base class and the role of each subclass certainly does no harm.

3.2 How to subclass a PLearn Object

3.2.1 Object

PLearn defines an Object class. There is not much to it. Its role is mainly to standardise the methods for printing, saving to file, and duplicating objects. Not all classes in PLearn are derived from Object (many low-level classes aren't). But all non-strictly-concrete classes (i.e. those with virtual methods) in PLearn derive from Object. This includes the Learner base class.

Object allows an easy support for a number of useful generic facilities:

automatic memory management (through reference counted smart pointers: Object derives from PPointable)
serialization/persistence (read, write, save, load)
runtime type information (classname)
displaying (info, print)
deep copying (deepCopy)
a generic way of setting options (setOption) and a generic build() method (the combination of the two allows for instance to change the object structure and rebuild it at runtime)

3.2.2 Creating a basic class deriving from Object

First, you can use pyskeleton, a Python script that automatically creates the .h and .cc files.

pyskeleton Object Person creates a class called Person derived from Object.

The first thing to do is to fill the .h file.

Example:

...

private:
    typedef Object inherited;

protected:
    // *********************
    // * protected options *
    // *********************

    // ### declare protected option fields 
    // ###  (such as learnt parameters) here
    // ...

public:
    // here we add our own public option fields
    string firstname;
    int age;

Then you just have to fill in the declareOptions method in the .cc file.

void Person::declareOptions(OptionList& ol) 
{
    // ### Declare all of this object's options here.
    // ### For the "flags" of each option, you should typically specify
    // ### one of OptionBase::buildoption, OptionBase::learntoption or
    // ### OptionBase::tuningoption. If you don't provide one of these three,
    // ### this option will be ignored when loading values from a script.
    // ### You can also combine flags, for example with OptionBase::nosave:
    // ### (OptionBase::buildoption | OptionBase::nosave)

    // ### ex:
    declareOption(ol, "firstname", &Person::firstname,
                  OptionBase::buildoption,
                  "Help text describing this option");

    declareOption(ol, "age", &Person::age,
                  OptionBase::buildoption,
                  "Help text describing this option");
// ...

  // Now call the parent class' declareOptions
  inherited::declareOptions(ol);
}

3.2.3 Setting option fields and calling build()

There are several techniques for supporting named parameters and for finishing the construction of an object after its parameters have been set. In PLearn, we typically use public option fields (or sometimes protected fields with setter methods) and a public build() method that does the actual building. Think of those public fields as really nothing but named constructor parameters, and build() as the one and only true constructor.

The building of me in the previous example could then look as follows:

#include "Person.h"

using namespace PLearn;

int main(int argc, char** argv)
{
    Person me; // default constructor can set default values 
               // for the option-fields
               // for ex: suppose default profession is "student"
    me.firstname = "Pascal";
    me.age = 29;
    me.build(); // finalize the building process

    cout<<me.firstname; 
}

Note that there has to be a default (empty) constructor, whose role is also to set the default values of the parameters.

3.2.4 A generic way of setting options from “outside”

Sometimes, you want to set options and build an object from some form of interpreted language environment or from a text description, etc. That is to say from “outside” a C++ program. For this, PLearn provides the setOption method. Suppose Person is a subclass of Object, then we could do the following:

#include <plearn/base/Object.h>
#include "Person.h"

using namespace PLearn;

int main(int argc, char** argv)
{
    Object *o = new Person(); // o is a plain pointer to an
                              // object whose true type is Person
    o->setOption("firstname","Pascal");
    o->setOption("age","29");

    Person *p = dynamic_cast<Person*>(o);

    cout<<p->firstname;
}

Note that setOption takes 2 strings: the name of the option, and its value serialised in string form. Strings are universal because anything can be represented (serialized) as a string. Actually, setOption calls a lower-level method called readOptionVal which reads the option value from a stream (a string stream in this case...) rather than a string. Similarly there is a getOption method which returns a string representation of a named option, and whose implementation simply calls writeOptionVal on a string stream.
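
The layering described above (setOption on top of readOptionVal, getOption on top of writeOptionVal) can be sketched without PLearn as follows. The class Configurable and its declareOption helper are hypothetical: they only mimic the shape of the mechanism, use standard streams instead of PLearn's own stream type, and only handle values that operator>> can read as a single token.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a class with named options.
class Configurable {
    typedef std::function<void(std::istream&)> Reader;
    typedef std::function<void(std::ostream&)> Writer;

public:
    Configurable() {}
    virtual ~Configurable() {}
    // The registered closures point at this object's fields, so copying is disabled.
    Configurable(const Configurable&) = delete;
    Configurable& operator=(const Configurable&) = delete;

    // String-based interface: wraps the value in a string stream.
    void setOption(const std::string& name, const std::string& value) {
        std::istringstream in(value);
        readOptionVal(in, name);
    }
    std::string getOption(const std::string& name) const {
        std::ostringstream out;
        writeOptionVal(out, name);
        return out.str();
    }

    // Stream-based interface: does the actual work.
    void readOptionVal(std::istream& in, const std::string& name) {
        auto it = readers_.find(name);
        if (it == readers_.end())
            throw std::invalid_argument("unknown option: " + name);
        it->second(in);
        if (in.fail())
            throw std::invalid_argument("bad value for option: " + name);
    }
    void writeOptionVal(std::ostream& out, const std::string& name) const {
        auto it = writers_.find(name);
        if (it == writers_.end())
            throw std::invalid_argument("unknown option: " + name);
        it->second(out);
    }

protected:
    // Associates an option name with one of the object's fields.
    template <class T>
    void declareOption(const std::string& name, T* field) {
        readers_[name] = [field](std::istream& in) { in >> *field; };
        writers_[name] = [field](std::ostream& out) { out << *field; };
    }

private:
    std::map<std::string, Reader> readers_;
    std::map<std::string, Writer> writers_;
};

class Person : public Configurable {
public:
    std::string firstname;
    int age;
    Person() : age(0) {
        declareOption("firstname", &firstname);
        declareOption("age", &age);
    }
};
```

Because every option goes through a stream, the same registry serves for setting options from a script, from a file, or from a string typed by the user.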

3.2.5 Building an object from its specification in a file

Building an object from a specification in a file is a natural extension of the setOption/build mechanism. Suppose we now have a file me.psave containing the following text:

Person( firstname="Pascal"; 
        age = 29;
      );

The following code shows a way to build me from its description in the file.

#include <plearn/base/Object.h>
#include "Person.h"

using namespace PLearn;

int main(int argc, char** argv)
{

    Object* o = loadObject("me.psave"); 
    Person *p = dynamic_cast<Person*>(o);

    cout<<p->firstname;

    return 0;
}

There are other ways to do that:

string filename = "me.psave";

// 1) The loadObject function
{
    Object *me;
    me = loadObject(filename);
}

// 2) What loadObject actually does
{
    ifstream in(filename.c_str());
    Object *me = readObject(in);
}

// 3) An alternative (loadObject actually calls Object::read)
{
    Person me; 
    ifstream in(filename.c_str());
    me.read(in);
}

// 4) An alternative using the global generic 
//    PLearn::read function
{
    Object *me;
    ifstream in(filename.c_str());
    ::read(in, me);
}

// 5) What if we have the string representation at hand?
{
    // get the content of the file as a string
    // (function in fileutils.h)
    string description = loadFileAsString(filename);
    Object *me = newObject(description);
}

Naturally, all that these functions do is parse the description in the file, and call readOptionVal (the lower-level equivalent of setOption) for each specified option, before finally calling build().
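
As an illustration of the parsing step only, here is a small sketch that splits a flat description such as the me.psave text into a class name and a list of (option, value) pairs. The function parseDescription and the struct Description are hypothetical, not PLearn's parser; the sketch assumes options separated by ';', no nested objects or arrays, and no ';' or '=' inside quoted strings.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical result of parsing: class name plus raw option values.
struct Description {
    std::string classname;
    std::map<std::string, std::string> options;
};

// Removes leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Parses text of the form  Classname( name1=value1; name2=value2; );
Description parseDescription(const std::string& text) {
    const std::string::size_type npos = std::string::npos;
    std::string::size_type open = text.find('(');
    std::string::size_type close = text.rfind(')');
    if (open == npos || close == npos || close < open)
        throw std::runtime_error("malformed description");

    Description d;
    d.classname = trim(text.substr(0, open));
    const std::string body = text.substr(open + 1, close - open - 1);

    std::string::size_type start = 0;
    while (start < body.size()) {
        std::string::size_type semi = body.find(';', start);
        if (semi == npos) semi = body.size();
        std::string item = trim(body.substr(start, semi - start));
        start = semi + 1;
        if (item.empty()) continue;          // skip trailing whitespace

        std::string::size_type eq = item.find('=');
        if (eq == npos)
            throw std::runtime_error("option without '=': " + item);
        std::string name = trim(item.substr(0, eq));
        std::string value = trim(item.substr(eq + 1));
        // Strip surrounding double quotes from string values.
        if (value.size() >= 2 && value.front() == '"' && value.back() == '"')
            value = value.substr(1, value.size() - 2);
        d.options[name] = value;
    }
    return d;
}
```

A loader would then create an instance of the class named by classname, pass each pair to the option-setting mechanism, and finally call build().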

Note that options may have arbitrarily complex types. They are not limited to strings and numbers; in particular they may themselves be complex objects or arrays of things. For example:

Drawing(
    color = "blue";

    # path is an array of objects
    path = [ Line(x0=0, y0=0, x1=10, y1=20); 
             Line(x0=0, y0=0, x1=10, y1=20, width=2);
             Circle(x=20; y=30; radius=5.3, fill=true);
           ];
    );

Finally, you should use a smart pointer to hold the Person, as seen before in section 3.1.2.

#include <plearn/base/Object.h>
#include "Person.h"

using namespace PLearn;

int main(int argc, char** argv)
{
    Object* o = loadObject("me.psave"); 
    PP<Person> p = dynamic_cast<Person*>(o);

    cout<<p->firstname;

    return 0;
}

3.2.6 Human description versus saved object

The me.psave file in the previous section may have been produced either manually by a human being, or automatically by calling

PLearn::save("me.psave",me);

on a previously constructed Person me object.

The mechanism for building an object is the same in both cases: it automatically calls a series of readOptionVal followed by build(). However the options specified in both cases are not always the same:

A hand-written description file will typically be used to give a small number of options for the initial building of an object (with the other options taking their default value).
A file resulting from a saved object, will typically include everything that is necessary to reconstruct a new instance in the full and exact same state as the instance that was saved. This may include options, such as the learnt synaptic weights of a neural network, that are not given at the time of initial building, but only when reloading a serialised object.

We call the options typically used for initial building build options, and the second type learnt options. Note that the behaviour of the build method may have to be quite different when we are reloading a saved object (and providing it with learnt options) from when we are only doing an initial building (and providing it only with build options). It is natural that our "one and only" constructor may have to behave differently depending on the parameters it is given, but it is important to keep in mind the distinction between build options on one hand, and learnt options that are only present when reloading, on the other hand.

There is a third conceptual category of options, which we call tuning options. They are used mostly to tune the object after an initial building. They often overlap with build options, but not necessarily; the distinction is nevertheless more conceptual than real.
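
One possible way for a build() method to cope with both situations is sketched below. The class TinyLinearModel is hypothetical and does not derive from any PLearn class; it only illustrates a build() that initialises the learnt option when it is absent and leaves it untouched when it was supplied by a reload.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model with one build option and one learnt option.
class TinyLinearModel {
public:
    std::size_t n_inputs;          // build option: given at initial building
    std::vector<double> weights;   // learnt option: only present when reloading

    TinyLinearModel() : n_inputs(0) {}

    // The "one and only" constructor: must work for both an initial
    // building and the reloading of a saved object.
    void build() {
        if (weights.size() != n_inputs) {
            // Initial building (or inconsistent sizes): start from zeros.
            weights.assign(n_inputs, 0.0);
        }
        // Otherwise the learnt weights are consistent with the build
        // options and are kept as they are.
    }
};
```

Here the size of the learnt option is used to tell the two cases apart; a real class might need a more explicit test.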

3.3 Matrix-Vectors Operations with Gradients

3.3.1 Introduction to Var

The class Var is at the heart of PLearn and aims at providing matrix-variables in the mathematical sense. It is built on top of the Mat class, that provides matrix-variables in the more traditional sense of sequential computer languages.

Var should be used for matrix-vector operations when you need the gradients of those operations. Otherwise, use the Mat and Vec classes.

NO NUMERICAL COMPUTATION IS DONE AT THIS LEVEL. The purpose of the Var definitions is only to build the symbolic relationship between mathematical variables.

One can write arbitrarily complex expressions using many implicit or explicit intermediary variables, and predefined functions such as in: $w=exp(-(abs(sqrt(lambda)*v)/3.0))$ . This will construct an internal representation, only with a larger number of intermediate nodes to represent intermediate states (variables) of the calculations.

Each Var contains two Vec fields.

One is called value and holds the current value assigned to that variable, and the other is called gradient and is used to backpropagate gradients with respect to another variable.

Every Var has an fprop() method that updates its value field according to the value field of its direct parents.

Every Var also has a bprop() method that updates the gradient field of its direct parents according to its own gradient field (backpropagation algorithm). Note that it accumulates gradients into its parents' gradient fields.

Example:

#include <plearn/var/Var_all.h>
#include <plearn/math/TMat_maths.h>

using namespace PLearn;

int main(int argc, char** argv)
{
    Var v(3,2); // declares a Variable of size 3x2
    Var lambda(1,1); // declares a scalar Variable

    v->matValue(1,0)=4.0;
    v->matValue(2,0)=5.0;
    v->matValue(0,1)=73.0;
    v->matValue(0,0)=78.0;
    v->matValue(2,1)=5.0;

    cout << v->matValue << endl;

    lambda->value = 2.0;

    Var w = lambda*v;

    w->fprop();

    cout << w->matValue << endl;
    return 0;
}

If the expression to be calculated involves intermediate variables, fprop must be called in a correct order on all those intermediate variables before it can be called on the result variable we are interested in. For example, suppose we have $z=dot(x,tanh(y))$ with $y=x+u$, where $x,u \in R^3$ .

A Var builds a directed acyclic graph whose nodes are Var's, with the following structure:

 x    u                           x
  \   |                           |   
   \  |                           |  
    \ |                           |  
     \|                           |  
     [+]                          |
      |                           |           
      y ---> tanh() --> w  -----> dot----> z

To obtain the correct value of z as a function of x and u, after setting x->value and u->value, we need to perform fprop on all the intermediate nodes as well as z.

#include <plearn/var/Var_all.h>
#include <plearn/math/TMat_maths.h>

using namespace PLearn;

int main(int argc, char** argv)
{
    Var x(3,1);
    Var u(3,1);

    x->matValue(0,0)=1;
    x->matValue(1,0)=2;
    x->matValue(2,0)=3;
    u->matValue(0,0)=4;
    u->matValue(1,0)=5;
    u->matValue(2,0)=6;

    cout<<x->matValue<<endl;
    cout<<u->matValue<<endl;

    Var y = x + u;
        // y is also a 3x1 matrix
    Var w = tanh(y);
    Var z = dot(x,w);
        // z is a scalar variable result of the dot product 
        // of x and tanh(y)

    cout<<z->matValue<<endl;
    y->fprop();
    w->fprop();
    z->fprop();
    cout<<z->matValue<<endl;

    return 0;
}

To simplify the computation of values and gradients in a graph of Var's, we use a VarArray (don't forget the include).

A VarArray is simply an array of Vars with an fprop() method and a bprop() method, which call the fprop() (resp. bprop()) methods of all the elements of the array in the right order (note that a right order for bprop is the reverse of the order for fprop). The propagationPath function finds all the Vars on the paths from the input Vars to the output Var. There may be several input Vars, so they are put in a VarArray. Once the path is obtained, we can propagate values through it with the fprop method:

#include <plearn/var/Var_all.h>
#include <plearn/math/TMat_maths.h>

using namespace PLearn;

int main(int argc, char** argv)
{
    Var x(3,1);
    Var u(3,1);

    x->matValue(0,0)=1;
    x->matValue(1,0)=2;
    x->matValue(2,0)=3;
    u->matValue(0,0)=4;
    u->matValue(1,0)=5;
    u->matValue(2,0)=6;

    cout<<x->matValue<<endl;
    cout<<u->matValue<<endl;

    Var y = x + u;
        // y is also a 3x1 matrix
    Var w = tanh(y);
    Var z = dot(x,w);
        // z is a scalar variable result of the dot product
        // of x and tanh(y)

    cout<<z->matValue<<endl;

    VarArray path = propagationPath(x & u, z);
    path.fprop();

    cout<<z->matValue<<endl;
    return 0;
}
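
The ordering rule (fprop in path order, bprop in reverse order, gradients accumulated into the parents) can be illustrated on scalars with a few lines of standalone C++. Node and Path are hypothetical stand-ins for Variable and VarArray; the path is listed by hand here, whereas PLearn computes it with propagationPath.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical scalar node: one value, one gradient, up to two parents.
struct Node {
    enum Op { Input, Add, Tanh, Mul };
    Op op;
    Node* a;
    Node* b;
    double value;
    double gradient;

    Node(Op o, Node* pa = 0, Node* pb = 0)
        : op(o), a(pa), b(pb), value(0.0), gradient(0.0) {}

    // Updates value from the values of the direct parents.
    void fprop() {
        switch (op) {
        case Input: break;
        case Add:  value = a->value + b->value; break;
        case Tanh: value = std::tanh(a->value); break;
        case Mul:  value = a->value * b->value; break;
        }
    }

    // Accumulates into the gradients of the direct parents.
    void bprop() {
        switch (op) {
        case Input: break;
        case Add:  a->gradient += gradient; b->gradient += gradient; break;
        case Tanh: a->gradient += gradient * (1.0 - value * value); break;
        case Mul:  a->gradient += gradient * b->value;
                   b->gradient += gradient * a->value; break;
        }
    }
};

// Hypothetical path: nodes stored in a valid fprop order.
struct Path {
    std::vector<Node*> nodes;

    void fprop() {
        for (std::size_t i = 0; i < nodes.size(); ++i) nodes[i]->fprop();
    }
    void bprop() {   // reverse order of fprop
        for (std::size_t i = nodes.size(); i-- > 0;) nodes[i]->bprop();
    }
};
```

For the scalar analogue of the example (y = x + u, w = tanh(y), z = x * w), the path is y, w, z. Note that x receives gradient from two children (z directly, and y through w), which is why bprop must accumulate rather than overwrite.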

In the previous example, the VarArray is useful but not essential. Consider the following example:

#include <plearn/var/Var_all.h>
#include <plearn/math/TMat_maths.h>

using namespace PLearn;

int main(int argc, char** argv)
{
    Var x(3,1);
    Var u(3,1);

    x->matValue(0,0)=1;
    x->matValue(1,0)=2;
    x->matValue(2,0)=3;
    u->matValue(0,0)=4;
    u->matValue(1,0)=5;
    u->matValue(2,0)=6;

    cout<<x->matValue<<endl;
    cout<<u->matValue<<endl;

    Var z = dot(x,tanh(x+u));
        // z is a scalar variable result of the dot product
        // of x and tanh(x+u)

    cout<<z->matValue<<endl;

    VarArray path = propagationPath(x & u, z);
    path.fprop();

    cout<<z->matValue<<endl;
    return 0;
}

This example does exactly the same thing as the previous one. But in this case, we have no reference to the intermediate Vars y and w on which to call fprop(). That is why a VarArray can be essential.

You can also use z->fprop_from_all_sources() instead of a VarArray, but this reconstructs the path on each call instead of storing it. It is therefore inefficient when fprop is called many times, and it does not allow gradients to be back-propagated.

Once we have this path, we can also back-propagate gradients. For example, if we set the gradient of z to 1,

#include <plearn/var/Var_all.h>
#include <plearn/math/TMat_maths.h>

using namespace PLearn;

int main(int argc, char** argv)
{
    Var x(3,1);
    Var u(3,1);

    x->matValue(0,0)=1;
    x->matValue(1,0)=2;
    x->matValue(2,0)=3;
    u->matValue(0,0)=4;
    u->matValue(1,0)=5;
    u->matValue(2,0)=6;

    cout<<x->matValue<<endl;
    cout<<u->matValue<<endl;

    Var y = x + u;
        // y is also a 3x1 matrix
    Var w = tanh(y);
    Var z = dot(x,w);
        // z is a scalar variable result of the dot product
        // of x and tanh(y)

    cout<<z->matValue<<endl;

    VarArray path = propagationPath(x & u, z);
    path.fprop();

    cout<<z->matValue<<endl;

    z->gradient = 1.0;
    path.bprop();
    cout << "dz/dx = " << x->gradient << endl;
    cout << "dz/du = " << u->gradient << endl;

    return 0;
}

We obtain the partial derivatives of z with respect to x and u in their gradient field.
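
For reference, the values expected in these gradient fields can be derived by hand with the chain rule. This is plain calculus, independent of PLearn, and assumes the gradients of x and u start at zero:

```latex
z = \sum_{i=1}^{3} x_i \tanh(y_i), \qquad y_i = x_i + u_i

\frac{\partial z}{\partial u_i} = x_i \left(1 - \tanh^2(y_i)\right)

\frac{\partial z}{\partial x_i} = \tanh(y_i) + x_i \left(1 - \tanh^2(y_i)\right)
```

The two terms of the derivative with respect to x correspond to the two routes from x to z in the graph: the direct input of the dot product, and the route through y and tanh.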

3.3.2 Creating

You can create a Var in several ways. The main constructors are:

    Var(int the_length, int width_=1);
    Var(int the_length, int the_width, const char* name);
    Var(const Mat& mat);

The last one is used as in the following example:

#include <plearn/var/Var_all.h>
#include <plearn/math/TMat_maths.h>
using namespace PLearn;

int main(int argc, char** argv)
{
    Mat mx(3,1);
    mx(0,0) = 1;
    mx(1,0)=2;
    mx(2,0)=3;
    Var x(mx);
    cout<<x->matValue<<endl;
    return 0;
}

3.3.3 Manipulating

In the introduction to Var, you saw how to manipulate them.

Don't forget that everything is symbolic (otherwise it will trick you).

You can find numerous kinds of Var in PLearn/plearn/var; some of them have shortcuts through overloaded operators (such as +).

3.3.4 Loading and saving

3.3.4.1 Only the value

With the following method, you can load and save THE VALUE of a var (not the symbolic path).

#include <plearn/vmat/VMat.h>
#include <plearn/db/getDataSet.h>
#include <plearn/var/Var_all.h>
#include <plearn/math/TMat_maths.h>

using namespace PLearn;

int main(int argc, char** argv)
{
    Var y(3,1);
    y->matValue(0,0)=1;
    y->matValue(1,0)=2;
    y->matValue(2,0)=3;
    cout<<y->matValue;

    // save into a pmat file
    y->matValue.save("save.pmat");

    // save into an amat file
    VMat vm(y->matValue);
    vm->saveAMAT("save.amat");


    // load from a file
    VMat vm2 = getDataSet("save.pmat");
        // it could have been "save.amat"
    Var x(vm2.toMat());
    cout<<x->matValue;
    return 0;
}

3.3.4.2 All the var

Variable is a subclass of Object, so you can use the methods of Object as in the following example. Note that this saves the whole Var, including the Vars it depends on.

#include <plearn/var/Var_all.h>
#include <plearn/math/TMat_maths.h>

using namespace PLearn;

int main(int argc, char** argv)
{
    Var x(3,1);
    Var u(3,1);

    x->matValue(0,0)=1;
    x->matValue(1,0)=2;
    x->matValue(2,0)=3;
    u->matValue(0,0)=4;
    u->matValue(1,0)=5;
    u->matValue(2,0)=6;

    cout<<x->matValue<<endl;
    cout<<u->matValue<<endl;

    Var z = dot(x,tanh(x+u));
        // z is a scalar variable result of the dot product
        // of x and tanh(x+u)

    cout<<z->matValue<<endl;

    VarArray path = propagationPath(x & u, z);
    path.fprop();

    cout<<z->matValue<<endl;

    save("z.psave",z);

    Object* o = loadObject("z.psave");

    // There is no PP<> nor * here,
    // because Var is already a PP<Variable>
    Var p = dynamic_cast<Variable*>(o);

    cout<<p->matValue<<endl;

    return 0;
}

3.3.5 Func

In order to make the usage of Var more friendly, you can use Func. The Func class is made for those who want to run fprop on different values of a Var in an elegant way. The two following examples illustrate this: they do exactly the same thing, but the first one without Func and the second one with.

#include <plearn/var/Var_all.h>
#include <plearn/math/TMat_maths.h>

using namespace PLearn;

int main(int argc, char** argv)
{
    Vec a(3),c(1);
    Vec da(3),dc(1);

    // Without Func
    Var x(3,1);
    Var y = dot(x,tanh(x));
        // y is a scalar variable result of the dot product
        // of x and tanh(x)
    VarArray path = propagationPath(x,y);

    a<<"1 2.3 4";

    x->value<<a;
    path.fprop();
    c=y->value;
    cout<<a<<endl;
    cout<<c<<endl;

    a<<"4 8.3 -12";
    x->value<<a;
    path.fprop();
    c=y->value;
    cout<<a<<endl;
    cout<<c<<endl;

    dc<<2.3;
    y->gradient<<dc;
    path.bprop();
    da<<x->gradient;
    cout<<dc<<endl;
    cout<<da<<endl;

    return 0;
}

With Func:

#include <plearn/var/Var_all.h>
#include <plearn/math/TMat_maths.h>

using namespace PLearn;

int main(int argc, char** argv)
{
    Vec a(3),c(1);
    Vec da(3),dc(1);

    // With Func
    Var x(3,1);
    Var result(1,1);
    Func f(x, result ,dot(x,tanh(x)));
        // the output is a scalar variable, result of the dot product
        // of x and tanh(x)

    a<<"1 2.3 4";
    f->fprop(a,c);
    cout<<a<<endl;
    cout<<c<<endl;

    a<<"4 8.3 -12";
    f->fprop(a,c);
    cout<<a<<endl;
    cout<<c<<endl;

    dc<<2.3;
    f->fbprop(a,c,da,dc);
    cout<<dc<<endl;
    cout<<da<<endl;

    return 0;
}
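
The convenience that Func brings (pass input values in, get output values and gradients out, without touching the graph) can be imitated for this particular function in standalone C++. The class DotTanhFunc is hypothetical: it hard-codes y = dot(x, tanh(x)) and its hand-derived gradient instead of walking a graph of Vars, and its method names merely echo those used above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<double> Values;

// Hypothetical function object for y = dot(x, tanh(x)).
class DotTanhFunc {
public:
    // Forward pass: out[0] = sum_i in[i] * tanh(in[i]).
    void fprop(const Values& in, Values& out) const {
        double sum = 0.0;
        for (std::size_t i = 0; i < in.size(); ++i)
            sum += in[i] * std::tanh(in[i]);
        out.assign(1, sum);
    }

    // Forward pass followed by backward pass:
    // d_in[i] = d_out[0] * (tanh(in[i]) + in[i] * (1 - tanh(in[i])^2)).
    void fbprop(const Values& in, Values& out,
                Values& d_in, const Values& d_out) const {
        fprop(in, out);
        d_in.assign(in.size(), 0.0);
        for (std::size_t i = 0; i < in.size(); ++i) {
            const double t = std::tanh(in[i]);
            d_in[i] = d_out[0] * (t + in[i] * (1.0 - t * t));
        }
    }
};
```

The caller only deals with plain arrays of numbers; all the bookkeeping of values and gradients stays inside the function object, which is the point of the Func interface.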

3.4 Online Learning

3.4.1 OnlineLearningModule

The online learning modules can be found in ${PLEARNDIR}/plearn_learners/online.