3. Basics

3.1 The plearn Program

The plearn program is located in PLearn/commands. It is used to do one of two things:

run a .plearn script
run a plearn command

PLearn scripts are text files with the .plearn extension that describe a learning experiment to be performed.

PLearn commands are typically small tools for manipulating or examining datasets and result files, but they can also launch more elaborate interactive programs.

The plearn program has a simple yet very useful command-line help system. Type plearn help for an overview.

3.2 Essential Commands

The most basic command is plearn script.plearn, which runs the given script.

The most helpful command is plearn help ClassFoo, which displays the help for the class ClassFoo.

But there are others:

plearn vmat view bidule.vmat to view any .vmat, .pmat or .amat file.

plearn vmat convert truc.pmat truc.amat to convert a file from one data format to another.

plearn learner train, plearn learner test, plearn learner computes_output provide useful shortcuts that avoid writing long .plearn scripts (cf. Tutorial).

For more information, see:

plearn help commands
plearn help vmat
plearn help learner

3.3 Essential Classes

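Here is a list of essential classes, each with the command that displays its help. Entries prefixed with --- are subclasses of the class listed above them.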

plearn help AutoVMatrix
plearn help PTester
plearn help Optimizer
--- plearn help GradientOptimizer
plearn help PLearner
--- plearn help NNet

3.4 The .plearn Object File Format

PLearn uses the same simple file format both to describe experiments to be performed (in .plearn scripts) and to save and restore objects such as a trained neural network (in .psave or .spec files).

Essentially these files contain the specifications of PLearn objects.

This is a typical .plearn script:

PTester( 

learner = NNet
    (
    nhidden =  10 ;
    noutputs = 1 ;
    output_transfer_func = "";
    hidden_transfer_func = "tanh"  ;
    cost_funcs = 1 [ mse ]  ;
    optimizer = GradientOptimizer(
                    start_learning_rate = .01;
                    decrease_constant = 0;
                    );
    batch_size = 1  ;
    initialization_method = "normal_sqrt"  ;
    nstages = 500 ;
    verbosity = 3;
    );

expdir = "tutorial_task2"  ;

splitter = ExplicitSplitter(splitsets = 1 2 [
    AutoVMatrix(
        specification = "reg_train.amat"
        inputsize = 1
        targetsize = 1
        weightsize = 0
    ) 
    AutoVMatrix(
        specification = "reg_test.amat"
        inputsize = 1
        targetsize = 1
        weightsize = 0
    )
    ]
) ;

statnames = ["E[E[train.mse]]" "E[E[test.mse]]" ];

);

Objects are specified by the name of their type, followed by a list of option = value pairs.
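To make the "type name followed by option = value pairs" layout concrete, here is a minimal Python sketch that writes such a specification as text. The helper name render_spec is hypothetical and not part of PLearn; the sketch assumes option values are already formatted as strings and that a nested object is passed as a (type_name, options) tuple.

```python
def render_spec(type_name, options, indent=0):
    """Render an object specification as text (illustrative helper only).

    `options` maps option names to already-formatted value strings,
    or to a nested (type_name, options) tuple for a sub-object.
    """
    pad = " " * indent
    lines = [f"{type_name}("]
    for name, value in options.items():
        if isinstance(value, tuple):  # nested object: (type_name, options)
            value = render_spec(value[0], value[1], indent + 4)
        lines.append(f"{pad}    {name} = {value};")
    lines.append(f"{pad})")
    return "\n".join(lines)


spec = render_spec("AutoVMatrix", {
    "specification": '"train.amat"',
    "inputsize": "2",
    "targetsize": "1",
    "weightsize": "0",
})
print(spec)
```

The printed text has the same shape as the AutoVMatrix blocks in the script above: the type name, an opening parenthesis, one option = value pair per line, and a closing parenthesis.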

Any sequence of spaces, newlines, tabs, commas, or semicolons is considered a separator. Commas and semicolons are therefore only there to ease reading; spaces would work just as well.

Comments start with a # and continue until the end of the line.
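The separator and comment rules can be illustrated with a small Python sketch. This is not PLearn's actual parser: the function name tokenize is hypothetical, and the sketch assumes that # never occurs inside a quoted string and that quoted strings contain no escaped quotes.

```python
import re


def tokenize(text):
    """Split specification text into tokens (simplified sketch)."""
    # Drop comments: everything from '#' to the end of the line.
    text = re.sub(r"#[^\n]*", "", text)
    # A token is a quoted string, an opening/closing symbol or '=',
    # or a run of characters that are neither separators nor symbols.
    return re.findall(r'"[^"]*"|[()\[\]=]|[^\s,;()\[\]="]+', text)


with_punctuation = 'NNet( nhidden = 10; noutputs = 1, )  # a comment'
with_spaces_only = 'NNet( nhidden = 10  noutputs = 1 )'

# Both spellings produce the same token stream.
assert tokenize(with_punctuation) == tokenize(with_spaces_only)
print(tokenize(with_punctuation))
```

Under these assumptions, the two inputs differ only in separators and a trailing comment, so they yield identical tokens.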

The following table summarizes the formats that can be used for the value of an option of a given type:

**Table 3.1:** ASCII formats for given data types
Data type	Format example
Any subclass of Object	ObjectType( option1 = value1, option2 = value2, ... )
integer	-365
floating number	-3.2e-4
string	"any string"
character	'x'
1D sequences	[ 10, 20, 30, 40 ]
	[ 10 20 30 40 ]
	4 [ 10 20 30 40 ]
	4 [ "aa", "bb", "cc", "dd" ]
2D matrices	`3 2 [ 1 2 10 20 30 40 ]`
pairs	(1, "one")
tuples	(1, "one", 3.5)
maps	`{ 1:"one", 2 :"two", 3: "three" }`
pointers to new object	`*1 -> ObjType( ... )`
reference to pointer	`*1;`

Note for strings: unquoted strings, while not recommended, are also supported. They are read until a separator (blank, comma, ...) or an opening or closing symbol (parenthesis, bracket, ...) is met.
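As an illustration of the 2D matrix entry in the table, the following Python sketch reads the text 3 2 [ 1 2 10 20 30 40 ] into a list of rows. It assumes that the two leading integers are the length (number of rows) and then the width (number of columns), and that the values are stored row by row; the row-by-row order is an assumption, not something stated in the table. The function name parse_matrix is hypothetical.

```python
def parse_matrix(text):
    """Parse 'length width [ v1 v2 ... ]' into a list of rows (sketch)."""
    # Commas, semicolons and brackets carry no extra information here,
    # so replace them with spaces before splitting.
    for ch in ",;[]":
        text = text.replace(ch, " ")
    fields = text.split()
    length, width = int(fields[0]), int(fields[1])
    values = [float(v) for v in fields[2:]]
    if len(values) != length * width:
        raise ValueError(f"expected {length * width} values, got {len(values)}")
    # Cut the flat list of values into `length` rows of `width` values each.
    return [values[r * width:(r + 1) * width] for r in range(length)]


rows = parse_matrix("3 2 [ 1 2 10 20 30 40 ]")
print(rows)
```

Under these assumptions the example describes a matrix with 3 rows and 2 columns.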

3.5 The .amat File Format

ASCII data file.

The new format is as follows:

The size of the matrix is indicated by a line starting with #size: and followed by length (number of rows) and width (number of columns).
An optional line starting with #sizes: gives the inputsize, targetsize, weightsize, extrasize.
An optional line starting with #: gives the names of the fields (the columns).
Regular comment lines start with a single #.

Example (only the first two data rows are shown):

# Characteristics of a population of 534
#size: 534 3
#sizes: 2 1 0 0 
#:  age height weight
    33  1.72   71
    25  1.80   80
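The header conventions above can be checked with a short Python sketch that collects the special comment lines of an .amat file. The function name read_amat_header and the returned dictionary layout are hypothetical; this is not the reader PLearn itself uses, only an illustration of the line prefixes described in this section.

```python
def read_amat_header(lines):
    """Collect the header information of an .amat file (sketch)."""
    header = {"size": None, "sizes": None, "fieldnames": None, "comments": []}
    for line in lines:
        line = line.strip()
        if not line.startswith("#"):
            continue  # data row or blank line
        if line.startswith("#size:"):
            # length (rows) and width (columns)
            header["size"] = [int(x) for x in line[len("#size:"):].split()]
        elif line.startswith("#sizes:"):
            # inputsize, targetsize, weightsize, extrasize
            header["sizes"] = [int(x) for x in line[len("#sizes:"):].split()]
        elif line.startswith("#:"):
            header["fieldnames"] = line[len("#:"):].split()
        else:
            # Regular comment line starting with a single '#'.
            header["comments"].append(line[1:].strip())
    return header


sample = """\
# Characteristics of a population of 534
#size: 534 3
#sizes: 2 1 0 0
#:  age height weight
33  1.72   71
25  1.80   80
"""
header = read_amat_header(sample.splitlines())
print(header)
```

For the sample above, the sketch recovers a 534 by 3 matrix with two input columns, one target column, and the field names age, height and weight.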

3.6 The .pmat File Format

PLearn native binary format.

3.7 The .vmat File Format

File containing a description of a virtual dataset.

A .vmat contains the specification of a subclass of VMatrix, in PLearn serialization format. For example:

AutoVMatrix(
	specification = "train.amat"
	inputsize = 2
	targetsize = 1
	weightsize = 0
)