2. Basics

2.1 PLearn for Matrix and Vector Operations

PLearn has its own vector and matrix data structures. The files PLearn/plearn/math/TVec_{decl,math}.h contain the declaration and implementation of the vector class template TVec, and the matrix class template TMat can be found in the files PLearn/plearn/math/TMat_{decl,math}.h.

2.1.1 Creation and Basic Manipulations

The PLearn vector and matrix data structures are easy to instantiate and support many useful basic operations, such as subvector and submatrix access.

Here is a concrete example of how to use these data structures. The data types Vec and Mat refer to the TVec<real> and TMat<real> classes, where real is a macro that corresponds to either double or float, depending on the compilation options used.

#include <plearn/math/TMat_maths.h>
using namespace PLearn;

int main(int argc, char** argv)
{
    // Example use of a `real' variable
    // a compilation option makes it either a double or a float
    // Please use `real' rather than double or float

    real a=15;
    cout<<"a="<<a<<endl;
    // Output:
    // a=15

    // Vector creation
    Vec b(3);
    b[0] = 2;
    b[1] = 42;
    b[2] = 21;
    cout<<"b="<<b<<endl;
    cout<<"b.length()="<<b.length()<<endl;
    // Output:
    // b=2           42          21
    // b.length()=3

    // Vector manipulations:

    // Subvector access of the last two elements (not a copy!!!)
    Vec b3 = b.subVec(1,2);
    cout<<"b3="<<b3<<endl;
    // Output:
    // b3=42          21

    // Concatenation
    Vec b4 = concat(b,b);
    cout<<"b4="<<b4<<endl;
    // Output:
    // b4=2           42          21          2           42          21

    // Note: "=" operator does not copy!!!
    Vec b5 = b4;
    b5[1] = 100000;
    cout<<"b4="<<b4<<endl;
    cout<<"b5="<<b5<<endl;
    // Output:
    // b4=2           100000      21          2           42          21
    // b5=2           100000      21          2           42          21
    

    // Copy
    b5 = b4.copy();
    b5[1]=1;
    cout<<"b4="<<b4<<endl;
    cout<<"b5="<<b5<<endl;
    // Output:
    // b4=2           100000      21          2           42          21
    // b5=2           1      21          2           42          21


    // Fill all elements with a single value
    Vec b6(b4.length());
    b6.fill(3);
    cout<<"b6="<<b6<<endl;
    // Output:
    // b6=3           3           3           3           3           3

    // Fill with the elements of another vector
    b6 << b4;
    cout<<"b6="<<b6<<endl;
    // Output:
    // b6=2           100000      21          2           42          21

    // Clear
    b6.clear();
    cout<<"b6="<<b6<<endl;
    // Output:
    // b6=0           0           0           0           0           0

    // Resize
    b4.resize(7);
    b4[6] = 6;
    cout<<"b4="<<b4<<endl;
    // Output:
    // b4=2           100000      21          2           42          21          6
    
    b4.resize(4);
    cout<<"b4="<<b4<<endl;
    // Output:
    // b4=2           100000      21          2


    // Matrix creation:
    Mat c(3,2);
    c(1,1)=1.1;
    c(1,0)=4;
    c(2,0)=5;
    c(0,1)=-73.2;
    c(0,0)=78;
    c(2,1)=5.32e-2;
    cout<<"c=\n"<<c<<endl;
    cout<<"c.length()="<<c.length()<<endl;
    cout<<"c.width()="<<c.width()<<endl;
    // Output:
    // c=
    // 78          -73.2
    // 4           1.1
    // 5           0.0532
    //
    // c.length()=3
    // c.width()=2

    // Matrix manipulation:
   
    // Submatrix access (not a copy!!!)...

    // ... of the last two rows and first column
    Mat c3 = c.subMat(1,0,2,1);
    cout<<"c3=\n"<<c3<<endl;
    // Output:
    // c3=
    // 4
    // 5

    // ... of the second column
    Mat c4 = c.column(1);
    cout<<"c4=\n"<<c4<<endl;
    // Output:
    // c4=
    // -73.2
    // 1.1
    // 0.0532

    // ... of the third row
    Mat c5 = c.row(2);
    cout<<"c5=\n"<<c5<<endl;
    // Output:
    // c5=
    // 5           0.0532

    // ... of the third row, as a vector
    Vec b7 = c(2);
    cout<<"b7="<<b7<<endl;
    // Output:
    // b7=5           0.0532

    // Note: "=" operator does not copy!!!
    Mat c6 = c;
    c6(1,1) = 100000;
    cout<<"c=\n"<<c<<endl;
    cout<<"c6=\n"<<c6<<endl;
    // Output:
    // c=
    // 78          -73.2
    // 4           100000
    // 5           0.0532
    //
    // c6=
    // 78          -73.2
    // 4           100000
    // 5           0.0532

    // Copy
    c6 = c.copy();
    c6(1,1) = 1;
    cout<<"c=\n"<<c<<endl;
    cout<<"c6=\n"<<c6<<endl;
    // Output:
    // c=
    // 78          -73.2
    // 4           100000
    // 5           0.0532
    //
    // c6=
    // 78          -73.2
    // 4           1
    // 5           0.0532

    // Fill all elements with a single value
    Mat c7(c.length(),c.width());
    c7.fill(3);
    cout<<"c7=\n"<<c7<<endl;
    // Output:
    // c7=
    // 3           3
    // 3           3
    // 3           3

    // Fill with the elements of another matrix
    c7 << c;
    cout<<"c7=\n"<<c7<<endl;
    // Output:
    // c7=
    // 78          -73.2
    // 4           100000
    // 5           0.0532

    // Fill a row with a row of another matrix
    c7(2) << c(1);
    cout<<"c7=\n"<<c7<<endl;
    // Output:
    // c7=
    // 78          -73.2
    // 4           100000
    // 4           100000

    // Clear
    c7.clear();
    cout<<"c7=\n"<<c7<<endl;
    // Output:
    // c7=
    // 0           0
    // 0           0
    // 0           0

    // Resize
    c7.resize(4,4);
    c7.subMat(0,2,3,2)<<c.subMat(0,0,3,2);
    c7(3,0)=0.01;
    c7(3,1)=0.02;
    c7(3,2)=0.03;
    c7(3,3)=0.04;
    cout<<"c7=\n"<<c7<<endl;
    // Output:
    // c7=
    // 0           0           78          -73.2
    // 0           0           4           100000
    // 0           0           5           0.0532
    // 0.01        0.02        0.03        0.04


    c7.resize(2,3);
    cout<<"c7=\n"<<c7<<endl;
    // Output:
    // c7=
    // 0           0           78
    // 0           0           4

    return 0;
}

For other useful methods of TVec and TMat and more details on their implementation, see the files PLearn/plearn/math/TVec_{decl,math}.h and PLearn/plearn/math/TMat_{decl,math}.h.

2.1.2 Mathematical Manipulations

Though you might want to implement certain mathematical functions or operators yourself, many mathematical manipulations for TVec and TMat are already implemented in PLearn.

In PLearn/plearn/math/TMat_maths_impl.h, many mathematical operators, such as +, -, *, /, +=, -=, *= and /=, are already overloaded. When using +, -, * or /, a new vector/matrix is created to hold the result of the operation; when using +=, -=, *= or /=, the operand on the left is modified in place and no new object is created. Many vector/matrix products are also implemented. Given vectors $x$ and $y$ and matrices $A$, $B$ and $C$, the main ones are product(y, A, x), which computes $y = Ax$; transposeProduct(y, A, x), which computes $y = A^T x$; product(C, A, B), which computes $C = AB$; and externalProduct(C, x, y), which computes the outer product $C = x y^T$.

Moreover, the functions productAcc, transposeProductAcc and externalProductAcc perform the same operations but accumulate the result of the computation in the modified data structure instead of overwriting its initial contents. For example, the computation of $z = Ax + Ay$ can be done by the following calls: product(z, A, x) followed by productAcc(z, A, y).
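Here is a minimal sketch of that computation (the sizes and values are chosen arbitrarily for illustration):

#include <plearn/math/TMat_maths.h>
using namespace PLearn;

int main(int argc, char** argv)
{
    Mat A(2,3);
    A.fill(1.0);

    Vec x(3), y(3), z(2);
    x.fill(1.0);
    y.fill(2.0);

    product(z, A, x);     // z  = A x   (overwrites z)
    productAcc(z, A, y);  // z += A y   (accumulates into z)

    cout<<"z="<<z<<endl;  // each element is 1*3 + 2*3 = 9

    return 0;
}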

Many other standard functions can be found in PLearn/plearn/math/TMat_maths_impl.h. The most popular are probably sign, max, argmax, min, argmin, softmax, exp, abs, log, logadd, sqrt, sigmoid and tanh.
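As a quick illustration, here is a hedged sketch of a few of them, to be placed inside a program like the one above (the exact signatures should be checked in TMat_maths_impl.h; in particular, the two-argument form of softmax is an assumption based on common PLearn usage):

Vec v(3);
v[0] = 1;
v[1] = 3;
v[2] = 2;

real m = max(v);      // largest element: 3
int am = argmax(v);   // index of the largest element: 1

Vec s(3);
softmax(v, s);        // assumed form: fills s with the softmax of v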

Before implementing a given mathematical function on vectors and matrices in PLearn, you can save some time by first looking in PLearn/plearn/math/TMat_maths_impl.h to verify whether it has already been implemented.

In PLearn/plearn/math/TMat_maths_specialisation.h, you can find optimized versions of vector/matrix operations for specific data types, relying on the BLAS library. Also, in PLearn/plearn/math/plapack.h, you can find other specialized functions for vectors and matrices (matrix inverse, eigenvalue and singular value decompositions, linear system solvers, etc.) relying on the LAPACK library.
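For example, an eigen-decomposition of a symmetric matrix can be done along the following lines (a sketch only: eigenVecOfSymmMat lives in plapack.h, but its exact signature and behavior, such as whether the input matrix is overwritten, should be checked in that header):

#include <plearn/math/plapack.h>
using namespace PLearn;

int main(int argc, char** argv)
{
    Mat S(2,2);
    S(0,0) = 2; S(0,1) = 1;
    S(1,0) = 1; S(1,1) = 2;

    Vec eigenvals;
    Mat eigenvecs;
    // assumed: computes the 2 largest eigenvalues (and the corresponding
    // eigenvectors) of the symmetric matrix S
    eigenVecOfSymmMat(S, 2, eigenvals, eigenvecs);

    cout<<"eigenvalues="<<eigenvals<<endl; // expected: 3 and 1
    return 0;
}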

2.1.3 Loading and Saving

You can load and save a Mat with the following code (VMat.h must be included). The .pmat format is PLearn's binary matrix format, while .amat is a plain ASCII format:

#include <plearn/math/TMat_maths.h>
#include <plearn/var/Var_all.h>
#include <plearn/vmat/VMat.h>
#include <plearn/db/getDataSet.h>

using namespace PLearn;

int main(int argc, char** argv)
{
    Mat c(3,2);
    c(1,1)=1.0;
    c(1,0)=4.0;
    c(2,0)=5.0;
    c(0,1)=73.0;
    c(0,0)=78.0;
    c(2,1)=5.0;

    // save into a pmat file
    c.save("save.pmat");

    // save into an amat file
    VMat vm(c);
    vm->saveAMAT("save.amat");

    // load from a file
    VMat vm2 = getDataSet("save.pmat");
        // it could have been "save.amat"
    Mat m = vm2.toMat();
    cout<<m;

    return 0;
}

2.2 How to create a PLearner?

PLearner is the base class for all learners. Here we describe how to create a new PLearner. PLearner is a subclass of Object, so if you want to know more about what you are doing, go to section [*].

2.2.1 What?

A PLearner is an object intended to learn some structure in the data that is provided to it during a training phase, and use this knowledge to do some inference on (usually new) data during a testing phase.

A typical training phase includes providing the training data, by calling setTrainingSet(...) with a VMat (see section 2.2.7), resetting the learner to a fresh state with forget(), and calling train() to bring the learner up to stage == nstages.

At this point, the PLearner is ready to be used on test data. You can compute the output for a new input with computeOutput(...), and compute the associated costs, knowing the target, with computeCostsFromOutputs(...) or computeOutputAndCosts(...).
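Put together, a typical usage from C++ looks like this (a minimal sketch: the learner class and dataset file name are hypothetical, error handling is omitted, and some details, such as providing a statistics collector with setTrainStatsCollector, are left out):

#include <plearn/math/TMat_maths.h>
#include <plearn/db/getDataSet.h>
#include "MyLearner.h" // the learner built in section 2.2.3
using namespace PLearn;

int main(int argc, char** argv)
{
    PP<MyLearner> learner = new MyLearner();

    VMat trainset = getDataSet("train.amat"); // see section 2.2.7
    learner->setTrainingSet(trainset); // also sets the learner's sizes
    learner->forget();                 // (re-)start from a fresh state
    learner->train();                  // bring the learner up to 'nstages'

    Vec input(learner->inputsize());
    Vec output(learner->outputsize());
    // ... fill 'input' with a test point, then:
    learner->computeOutput(input, output);

    return 0;
}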

2.2.2 Where?

If your learner is experimental, at least until it compiles and works perfectly in every situation, you should not commit it to $PLEARNDIR/plearn_learners with the other ones; it is still a good idea, though, to put it under version control and to commit often. When your learner works robustly, you can move it into the corresponding subdirectory of plearn_learners.

If you have an account on LisaPLearn, the best is to use $LISAPLEARNDIR/UserExp/your_login/any_path; if you don't, you can create a subdirectory in $PLEARNDIR/plearn_learners_experimental and use it as a working directory.

2.2.3 How?

Here is a step-by-step example of how to implement MyLearner:

  1. Once you are in your working directory, type:
    $ pyskeleton PLearner MyLearner
    
    where MyLearner is the name of the PLearner you want to create.

    This will create two files, MyLearner.h and MyLearner.cc, from a template. These files contain the prototypes of the methods you need to implement in order to follow the PLearner interface, and some comments to help you fill them in.

  2. Edit MyLearner.h, and add (possibly short) Doxygen documentation about what your learner is supposed to do. Something like:
    namespace PLearn {
    
    /**
     * Learns the meaning of life.
     * This class learns how to find the meaning of life through the application
     * of stochastic methods. Tests can be performed on several 42-dimensional
     * vectors.
     *
     * @todo Make God fit into this framework.
     *
     */
    class MyLearner : public PLearner
    

  3. Declare your public options. These will typically be the hyperparameters of your algorithm, and the options that allow switching between different methods. These are the options the user will need to provide so that your algorithm knows what to do, though they may also change during the learning phase. Don't forget to put comments:
    class MyLearner : public PLearner
    {
        typedef PLearner inherited;
    
    public:
        //#####  Public Build Options  ############################################
    
        //! ### declare public option fields (such as build options) here
        //! Start your comments with Doxygen-compatible comments such as //!
    
        //! Initial parameters, specified by the user
        Vec init_params;
    
        /**
         * Method to use for performing learning.
         * One of:
         *   - "none": use raw data
         *   - "first": first method
         *   - "second": second method
         */
        string learning_method;
    
    You can skip the “public methods” section for the moment.

  4. Declare your protected options. These will typically be parameters learned from your data, or values computed from the public options (cached to avoid recomputing them every time).
    protected:
        //#####  Protected Options  ###############################################
    
        // ### Declare protected option fields (such as learned parameters) here
    
        //! Number of initial parameters
        int nparams;
    
        //! 'learning_method' as number: 0 for none, 1 for first, 2 for second
        int method;
    
        //! Learned parameters: a matrix of size (nparams * inputsize())
        // (inputsize() is a method of PLearner)
        Mat learned_params;
    

  5. Declare any other variables you will need during learning or computations and that you don't want to reallocate each time. These members can be protected or private, depending on whether your subclasses are likely to use them.
        //####  Not Options  ######################################################
        //! Stores intermediate results
        Vec tmp;
    

  6. Edit MyLearner.cc. First, fill in the PLearn documentation of the class. This is usually the same as the Doxygen one.
    namespace PLearn {
    using namespace std;
    
    PLEARN_IMPLEMENT_OBJECT(
        MyLearner,
        "Learns the meaning of life.",
        "This class learns how to find the meaning of life through the"
        " application\n"
        "of stochastic methods. Tests can be performed on several 42-dimensions\n"
        "vectors.\n");
    

  7. Write the default constructor. First, initialize to a default value all the fields that need it (such as int, bool, real...), as well as any others you wish. You can skip some (like the Vec and Mat fields, which have a reasonable default constructor), but you have to initialize the fields in the same order as they were declared.
    MyLearner::MyLearner()
        : learning_method("none"),
          nparams(-1),
          method(0),
          tmp(42)
    {
    }
    

    The comment says you may want to call build_() to finish the construction of the object. For this default constructor, you probably don't want to do it, because build_() will be called anyway after setting the actual option values.

  8. [Optional] Write other constructors. You can write new constructors that would take (for example) as arguments all the parameters needed to build the learner completely. In such a constructor, you may want to call build_() (so you are sure everything is usable right after construction), or do all the building directly in the constructor:
    // In MyLearner.h
    MyLearner( Vec the_init_params, string the_learning_method = "none" );
    
    // In MyLearner.cc
    // If everything is in the constructor:
    MyLearner::MyLearner( Vec the_init_params, string the_learning_method )
        : init_params(the_init_params),
          learning_method(the_learning_method),
          nparams(the_init_params.length()),
          method(-1),
          learned_params(nparams, max(0,inputsize())),
          tmp(42)
    {
        learning_method = lowerstring(learning_method);
        if( learning_method == "none" )
            method = 0;
        else if( learning_method == "first" )
            method = 1;
        else if( learning_method == "second" )
            method = 2;
        else
            PLERROR("MyLearner - learning_method '%s' is unknown.",
                    learning_method.c_str());
    }
    
    // If we prefer to call build():
    MyLearner::MyLearner( Vec the_init_params, string the_learning_method )
        : init_params(the_init_params),
          learning_method(the_learning_method),
          nparams(-1), method(-1)
    {
        // We are not sure inherited::build() has been called, so:
        build();
    }
    

  9. Now, declare (in the sense of PLearn) the options of the learner, as you would for any Object. The options from the “Public Build Options” section will be labeled buildoption, the ones from the “Protected Options” section will be labeled learntoption, and the ones under “Not Options” will not be declared.
    void MyLearner::declareOptions(OptionList& ol)
    {
        // ### Declare all of this object's options here.
        // ### For the "flags" of each option, you should typically specify
        // ### one of OptionBase::buildoption, OptionBase::learntoption or
        // ### OptionBase::tuningoption. If you don't provide one of these three,
        // ### this option will be ignored when loading values from a script.
        // ### You can also combine flags, for example with OptionBase::nosave:
        // ### (OptionBase::buildoption | OptionBase::nosave)
    
        // First, the public build options
        declareOption(ol, "init_params", &MyLearner::init_params,
                      OptionBase::buildoption,
                      "Initial parameters");
    
        declareOption(ol, "learning_method", &MyLearner::learning_method,
                      OptionBase::buildoption,
                      "Method to use for performing learning.\n"
                      "One of:\n"
                      "  - "none": use raw data\n"
                      "  - "first": first method\n"
                      "  - "second": second method\n");
    
        // Then, the learned options
        declareOption(ol, "nparams", &MyLearner::nparams,
                      OptionBase::learntoption,
                      "Number of initial parameters");
    
        declareOption(ol, "method", &MyLearner::method,
                      OptionBase::learntoption,
                      "'learning_method' as a number:\n"
                      "0 for none, 1 for first, 2 for second.\n");
    
        declareOption(ol, "learned_params", &MyLearner::learned_params,
                      OptionBase::learntoption,
                      "Learned parameters: a matrix of size (nparams *"
                      " inputsize())");
    
        // Now call the parent class' declareOptions
        inherited::declareOptions(ol);
    }
    

  10. Now, if you include your header in plearn_inc.h:
    #include <plearn_learners_experimental/some_path/MyLearner.h>
    
    you should be able to compile plearn, and to get help on all the “buildoption” options by typing:
    $ plearn help MyLearner
    
    Options that are not labeled “buildoption”, such as the “learntoption” options defined above, do not appear: it is not necessary to provide them (it could even confuse your learner if their values are inconsistent).
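    For instance, a hypothetical script fragment instantiating this learner could look like the following (this uses PLearn's serialization syntax; the option values are purely illustrative):

    MyLearner(
        init_params = [ 0.5 0.5 0.5 ];
        learning_method = "first";
    )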

  11. Now, let's implement a basic version of the build_() method. It is intended to let you test (and debug) your class in a few easy situations, but you will have to rewrite a more complete version later (see section [*]), so that it works correctly in every case.

    The goal of build() is to ensure that the object is in a consistent state and ready to be used. This method calls inherited::build() (in our case, PLearner::build()), and then build_(), which we have to implement. It is called in various situations, so a correct version of build_() should check everything. For now, we will focus on only one simple scenario, in which the sequence of calls is: the default constructor, then the setting of the build options (e.g., from a script), then build(), then setTrainingSet(...), which sets inputsize_ and calls build() again.

    The first call to build_() can be used to do some initializations that do not fit in the default constructor, or that use the default values of the parent object (PLearner), for example. The first and second calls set the values of the parameters learned from the build options. Here, we resize learned_params only if the parameter inputsize_ is positive (meaning the input size of the learner has been set).

    //! @todo rewrite this method to work in every case
    void MyLearner::build_()
    {
        // ### This method should do the real building of the object,
        // ### according to set 'options', in *any* situation.
        // ### Typical situations include:
        // ###  - Initial building of an object from a few user-specified options
        // ###  - Building of a "reloaded" object: i.e. from the complete set of
        // ###    all serialised options.
        // ###  - Updating or "re-building" of an object after a few "tuning"
        // ###    options have been modified.
        // ### You should assume that the parent class' build_() has already been
        // ### called.
    
        nparams = init_params.length();
    
        if( inputsize_ > 0 )
            learned_params.resize(nparams, inputsize());
    
        learning_method = lowerstring(learning_method);
        if( learning_method == "none" )
            method = 0;
        else if( learning_method == "first" )
            method = 1;
        else if( learning_method == "second" )
            method = 2;
        else
            PLERROR("MyLearner - learning_method '%s' is unknown.",
                    learning_method.c_str());
    }
    

    The member PLearner::inputsize_ is equal to the size of the elements the learner takes as input, or $-1$ if it has not been set (or is variable). There are two ways of having it set: calling setTrainingSet(...) with some VMat (see section [*]), in which case it will be set to the inputsize of that VMat, or having it set as an option (e.g., read from a script).

    In the function above, if inputsize_ is set, no matter how, learned_params will be resized, and that is what we want: always use all the information available.

  12. Since PLearn uses smart pointers (see section [*]), when we make a copy of an Object, it is by default a “shallow” copy, meaning that the pointers are copied, but still point to the same actual data. Each Object (hence each PLearner) class has to implement a method called makeDeepCopyFromShallowCopy that creates a real, “deep” copy of every field we have a pointer to (and recursively).

    In this step, we ensure that every member of our PLearner that is (or contain) a smart pointer (a PP<something>) will be “deepCopied”. Usually, it concerns the members of type TVec<something>, Vec, TMat<something>, Mat and of course PP<something>. If you use classes defined elsewhere, be careful that a class name could be a typedef to PP<something>: for instance a VMat is a PP<VMatrix>, so you would have to call deepCopy on it.

    void MyLearner::makeDeepCopyFromShallowCopy(CopiesMap& copies)
    {
        inherited::makeDeepCopyFromShallowCopy(copies);
    
        // ### Call deepCopyField on all "pointer-like" fields
        // ### that you wish to be deepCopied rather than
        // ### shallow-copied.
        // ### ex:
        // deepCopyField(trainvec, copies);
    
        deepCopyField(init_params, copies);
        deepCopyField(learned_params, copies);
        deepCopyField(tmp, copies);
    }
    
    Don't forget to remove these lines, left by the template, when you are done:
        // ### Remove this line when you have fully implemented this method.
        PLERROR("MyLearner::makeDeepCopyFromShallowCopy not fully (correctly) implemented yet!");
    

  13. We will now implement the methods that are specific to PLearner. Let's begin with outputsize(). Whereas the values of inputsize(), targetsize() and weightsize() will be automatically set from the training set's sizes (see [*]), you have to define the output size of your learner yourself. It can depend on inputsize() and other parameters, but you should not change it during the learning or testing phase.
    int MyLearner::outputsize() const
    {
        // Compute and return the size of this learner's output (which typically
        // may depend on its inputsize(), targetsize() and set options).
        if( method == 0 )
            return 42;
        else if( method == 1 )
            return inputsize()+1;
        else if( method == 2 )
            return inputsize()*2;
        else
        {
            PLERROR("MyLearner::outputsize() - method '%i' is unknown.\n"
                    "Did you call 'build()'?\n", method);
            return 0; // to avoid warning, we must return in every case
        }
    }
    

  14. The method forget() is used to make the PLearner forget everything it learned during the training phase. It can be called when changing the training set (for the first training set, or if what was learned on one dataset is not usable with another one), or if we want to retrain with different parameters, for example. The “learntoption” parameters we have set during build_() are not affected.
    void MyLearner::forget()
    {
        //! (Re-)initialize the PLearner in its fresh state (that state may depend
        //! on the 'seed' option) and sets 'stage' back to 0 (this is the stage of
        //! a fresh learner!)
        /*!
          A typical forget() method should do the following:
          - initialize a random number generator with the seed option
          - initialize the learner's parameters, using this random generator
          - stage = 0
        */
        learned_params.clear();
        stage = 0;
    }
    

  15. The method computeOutput(input, output) computes an output vector from the input part of a data point. It is routinely called once the PLearner has been trained (because that is what you trained it for!), but it can also be used during training. The implementation really depends on what you want your learner to do.

    void MyLearner::computeOutput(const Vec& input, Vec& output) const
    {
        // Compute the output from the input.
        int nout = outputsize();
        output.resize(nout);
    
        if( method == 0 )
        {
            tmp += sum(input);
            output << tmp;
        }
        else if( method == 1 )
        {
            output.subVec(0,input.length()) << input;
            output[ inputsize() ] = sum(tmp);
        }
        else if( method == 2 )
        {
            if( inputsize() != 42 )
                PLERROR("MyLearner::computeOutput: inputsize() is '%d', but\n"
                        "Learning method 'second' only works when inputsize() =="
                        " 42.\n", inputsize());
            output.subVec(0,42) << input;
            output.subVec(42,42) << tmp;
        }
    }
    

  16. The method computeCostsFromOutputs(...) is usually called after computeOutput, because it computes the costs from already computed outputs, knowing the actual target (if any). Note that you can have several costs, each one being a scalar. Typical costs are the squared distance between output and target, the NLL, the hinge loss... It mainly depends on how you measure the performance of your learning algorithm.

    Do not forget the “const” at the end of the declaration line.

    void MyLearner::computeCostsFromOutputs(const Vec& input, const Vec& output,
                                            const Vec& target, Vec& costs) const
    {
        // Compute the costs from *already* computed output.
        costs.resize(1);
        costs[0] = powdistance(output, target, 2); // squared error
    }
    

  17. [Optional] The method computeOutputAndCosts(...), as its name tells, computes both the output vector and the costs, from an input vector and the corresponding target. There is a default implementation in the parent PLearner class that calls computeOutput and then computeCostsFromOutputs, but you may want to reimplement it to improve efficiency.

    That would be the case if computing the output gives you the costs as well. Then, you may want to implement computeOutputAndCosts, and make computeOutput and computeCostsFromOutputs call it.

    You may also want to reimplement the method computeCostsOnly.
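    A hedged sketch of that pattern (assuming, for illustration, that the costs fall out of the output computation, and reusing the squared error cost from the previous step; it also assumes targetsize() has been set):

    void MyLearner::computeOutputAndCosts(const Vec& input, const Vec& target,
                                          Vec& output, Vec& costs) const
    {
        output.resize(outputsize());
        costs.resize(1);
        // ... compute 'output' from 'input', as in computeOutput() ...
        costs[0] = powdistance(output, target, 2);
    }

    void MyLearner::computeOutput(const Vec& input, Vec& output) const
    {
        Vec dummy_target(targetsize()); // the target is not used here
        Vec costs(1);
        computeOutputAndCosts(input, dummy_target, output, costs);
    }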

  18. The method train() performs the actual training. Its implementation will mostly depend on your algorithm. The pseudo-code included in MyLearner.cc will give you an idea of the general structure of a train() method, but your learner may not really fit this mold.

    Vague advice: don't hesitate to add helper functions or routines to prevent train() from becoming too big and monolithic; keeping statistics during training can be useful; and you can use computeOutput, computeOutputAndCosts and computeCostsFromOutputs from inside train() to avoid code duplication (though that may be impossible for some learners).

    void MyLearner::train()
    {
        // The role of the train method is to bring the learner up to
        // stage==nstages, updating train_stats with training costs measured
        // on-line in the process.
    
        static Vec input;  // static so we don't reallocate memory each time...
        static Vec target; // (but be careful that static means shared!)
        static Vec output;
        static Vec train_costs;
    
        input.resize(inputsize());    // the train_set's inputsize()
        target.resize(targetsize());  // the train_set's targetsize()
        output.resize(outputsize());
        real weight;
        int nsamples = train_set->length();
    
        // This generic PLearner method does a number of standard things useful
        // for (almost) any learner, and returns 'false' if no training should
        // take place. See PLearner.h for more details.
        if (!initTrain())
            return;
    
        // learn until we arrive at desired stage
        for( ; stage < nstages ; stage ++ )
        {
            // clear stats of previous epoch
            train_stats->forget();
    
            // loop over all examples in train set
            for( int sample=0 ; sample<nsamples ; sample++ )
            {
                train_set->getExample(sample, input, target, weight);
                computeOutputAndCosts(input, target, output, train_costs);
                // keep statistics of costs
                train_stats->update(train_costs);
    
                // minimize the cost on current sample, modifying learned_params
                // this function is defined elsewhere in this file...
                minimizeByVariationalMethods(input, target, output, train_costs);
            }
            train_stats->finalize(); // finalize statistics for this epoch
        }
    }
    

  19. getTestCostNames() and getTrainCostNames() return, as a vector of strings, the names of the costs computed during testing and during training (respectively). These names are used to interpret the accumulated statistics.

    getTrainCostNames().length() should be equal to the length of the train_costs vector used to update the training statistics, and getTestCostNames().length() should be equal to the length of the costs vector computed in computeCostsFromOutputs. However, the train and test costs are not necessarily the same, and do not necessarily have the same length.

    TVec<string> MyLearner::getTestCostNames() const
    {
        return TVec<string>(1, "squared_distance");
    }
    
    TVec<string> MyLearner::getTrainCostNames() const
    {
        return getTestCostNames();
    }
    

2.2.4 And now?

Now, you can compile plearn and try your learner in different situations, from a program as well as from a script, with different training sets; debug it, and see whether the results look right. It is usually a good idea to try it on a simple “toy” dataset, for which you can do the computations by hand, in order to make sure everything works as intended.
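For instance, a toy XOR dataset can be built by hand and saved, reusing the saving code from section 2.1.3 (a sketch: defineSizes declares the input/target/weight part sizes, and the file name is illustrative):

    Mat toy(4,3); // two input columns and one target column
    toy(0,0)=0; toy(0,1)=0; toy(0,2)=0;
    toy(1,0)=0; toy(1,1)=1; toy(1,2)=1;
    toy(2,0)=1; toy(2,1)=0; toy(2,2)=1;
    toy(3,0)=1; toy(3,1)=1; toy(3,2)=0;

    VMat vm(toy);
    vm->defineSizes(2, 1, 0); // inputsize=2, targetsize=1, weightsize=0
    vm->saveAMAT("toy_xor.amat");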


2.2.5 A build_() that works in every situation

The basic idea is the following (see the sketch after this list):

  1. split the building of the different parts of your PLearner into different functions;

  2. call each of these functions as soon as all the elements it needs are available;

  3. it is better to compute the same thing twice at build time, and lose a bit of efficiency, than to assume that everything has already been set to the right values when one parameter has changed.
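A hedged sketch of this structure, applied to MyLearner (build_method and build_params are hypothetical helper names; their bodies reuse the code from the basic build_() above):

    void MyLearner::build_()
    {
        build_method(); // only needs 'learning_method', always available
        build_params(); // needs 'init_params', and inputsize_ when it is set
    }

    void MyLearner::build_method()
    {
        learning_method = lowerstring(learning_method);
        if( learning_method == "none" )
            method = 0;
        else if( learning_method == "first" )
            method = 1;
        else if( learning_method == "second" )
            method = 2;
        else
            PLERROR("MyLearner - learning_method '%s' is unknown.",
                    learning_method.c_str());
    }

    void MyLearner::build_params()
    {
        nparams = init_params.length();
        if( inputsize_ > 0 )
            learned_params.resize(nparams, inputsize());
    }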

2.2.6 Useful members and methods defined in PLearner class

  1. inputsize(): the size of the input part of a sample (set from the training set, see section 2.2.7)
  2. setTrainingSet(): provides the learner with its training set (a VMat) and sets its sizes accordingly
  3. train_set: the current training set
  4. train_stats: the statistics collector updated with the train costs during train()
  5. stage: the current stage of the learner (0 for a fresh, untrained learner)
  6. nstages: the stage the learner should reach when train() returns


2.2.7 Datasets

In PLearn, the data structure for datasets (training, validation and test sets) corresponds to the VMatrix class. Conceptually, a VMatrix is a matrix where each row is a sample (a training or test example). The length of the VMatrix, that is, the number of samples it contains, is given by the length() method.

A row is divided into three parts, called the input, target and weight parts. Their respective sizes are given by the methods inputsize(), targetsize() and weightsize() of the VMatrix object. The total width of the VMatrix is given by the width() method and should be equal to the sum of the input, target and weight part sizes. The weight size should be either 1 or 0: a sample either has a weight or does not.

There are two ways of accessing a sample of a VMatrix. The method getRow(i, row) can be called, where i is the index of the row and row is a Vec that will be filled with the input, target and weight parts of the $i^{th}$ sample. The length of row must already be set to the width of the VMatrix.

Another possibility is to call getExample(i, input, target, weight), where input and target are Vec objects that will be filled with the input and target parts of the $i^{th}$ sample. The weight variable is a reference to a real, which will be set to the weight of the $i^{th}$ sample. The input and target vectors do not have to be sized according to the VMatrix's input and target part sizes: they will be resized appropriately by the VMatrix (this is possible since they are passed by reference).
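A minimal sketch of both access methods (the dataset file name is hypothetical):

    VMat data = getDataSet("data.amat");

    Vec row(data->width()); // must already have the right length
    Vec input, target;      // will be resized by getExample
    real weight;

    for( int i=0 ; i<data->length() ; i++ )
    {
        data->getRow(i, row);                       // the whole row
        data->getExample(i, input, target, weight); // the three parts
    }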

Usually, instead of manipulating VMatrix objects, you will have access to a VMat object, which can be seen as a pointer to a VMatrix object and should be used as such. The VMat class inherits from the class PP<VMatrix>, that is the class of smart pointers for VMatrix objects. To know more about smart pointers, see section [*].

The VMatrix object is implemented in files PLearn/plearn/vmat/VMatrix.{cc,h}, where you will find many more methods for this class. However, the ones that we described in this section will usually be sufficient for the implementation of a PLearner.

2.2.8 Testing phase

TODO

2.2.9 How to get the dataset?

TODO

2.2.10 How to manage the dataset?

TODO

2.2.11 If you need gradients on a cost function...

Go read section [*] and section [*].