#include <TextSenseSequenceVMatrix.h>
Inherits PLearn::RowBufferedVMatrix.

Public Types | |
| typedef RowBufferedVMatrix | inherited |
Public Member Functions | |
| TextSenseSequenceVMatrix () | |
| Default constructor. After setting all options individually, build() should be called. | |
| TextSenseSequenceVMatrix (VMat that_dvm, int that_window_size, TVec< int > that_res_pos=TVec< int >(0), bool that_rand_syn=false, WordNetOntology *that_wno=NULL) | |
| int | getRestrictedRow (int i, Vec v) const |
| This restricts the extraction of the context to the words that don't have their POS in res_pos and returns the position of the next non-overlapping context. | |
| virtual void | build () |
| Should simply call inherited::build(), then this class's build_(). | |
| virtual void | makeDeepCopyFromShallowCopy (map< const void *, void * > &copies) |
| Transforms a shallow copy into a deep copy. | |
| void | setOntology (WordNetOntology *that_wno) |
| Sets the ontology. | |
| void | setWindowSize (int that_window_size) |
| Sets the number of context words. | |
| void | setWordSequence (VMat that_dvm) |
| Sets the VMatrix of word/sense_tag/POS sequence. | |
| void | setRandomGeneration (bool that_rand_syn) |
| Enables or disables the random generation of contexts and target words. | |
| void | setRestrictedPOS (TVec< int > that_res_pos) |
| Sets the vector of forbidden POS for the context words. | |
| void | setSentenceBoundary (int b) |
| Sets the sentence boundary symbol. | |
| void | setUndefinedPOSId (int pos_id) |
| Sets the undefined pos id. | |
| PLEARN_DECLARE_OBJECT (TextSenseSequenceVMatrix) | |
| Declares name and deepCopy methods. | |
Protected Member Functions | |
| virtual void | getNewRow (int i, const Vec &v) const |
| This is the only method requiring implementation. | |
Static Protected Member Functions | |
| void | declareOptions (OptionList &ol) |
| Declares this class' options. | |
Protected Attributes | |
| VMat | dvm |
| The VMatrix containing the sequence of words or lemmas, with their POS and WordNet (optional) tags. | |
| int | window_size |
| The number of context words. | |
| bool | is_supervised_data |
| Indication that at least some of the words or lemmas are semantically disambiguated. | |
| TVec< int > | res_pos |
| The vector containing the forbidden POS of the words given in the context of a target word. | |
| bool | rand_syn |
| Indication that examples can be randomly generated using random synonym replacements. | |
| TVec< TVec< pair< int, real > > > | word_given_sense_priors |
| Probability of a word given it has some sense. | |
| WordNetOntology * | wno |
| Ontology of the sense tagging. | |
| int | my_current_row_index |
| Index of the current row. | |
| Vec | my_current_row |
| Elements of the current row. | |
| bool | keep_in_sentence |
| Indication that the context must not spread over another sentence. | |
| int | sentence_boundary |
| Sentence boundary symbol. | |
| bool | undefined_pos_set |
| Indication that the undefined pos id is defined. | |
| int | undefined_pos |
| Undefined pos id. | |
Private Member Functions | |
| void | build_ () |
| This does the actual building. | |
| void | permute (Vec v) const |
| This randomly replaces words (target and context) with one of their corresponding synonyms. | |
| void | apply_boundary (const Vec &v) const |
| This applies the sentence boundary. | |
Definition at line 17 of file TextSenseSequenceVMatrix.h.

typedef RowBufferedVMatrix PLearn::TextSenseSequenceVMatrix::inherited
Reimplemented from PLearn::RowBufferedVMatrix. Definition at line 128 of file TextSenseSequenceVMatrix.h. Referenced by TextSenseSequenceVMatrix().

PLearn::TextSenseSequenceVMatrix::TextSenseSequenceVMatrix ()
Default constructor. After setting all options individually, build() should be called.
Definition at line 9 of file TextSenseSequenceVMatrix.cc. References inherited.
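
PLearn::TextSenseSequenceVMatrix::TextSenseSequenceVMatrix (VMat that_dvm, int that_window_size, TVec< int > that_res_pos=TVec< int >(0), bool that_rand_syn=false, WordNetOntology *that_wno=NULL)
Definition at line 67 of file TextSenseSequenceVMatrix.h. References build_(), dvm, is_supervised_data, keep_in_sentence, my_current_row, my_current_row_index, rand_syn, res_pos, undefined_pos_set, window_size, and wno.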

void PLearn::TextSenseSequenceVMatrix::apply_boundary (const Vec &v) const [private]
This applies the sentence boundary.
Definition at line 344 of file TextSenseSequenceVMatrix.cc. References sentence_boundary, undefined_pos, undefined_pos_set, UNDEFINED_SS_ID, UNDEFINED_TYPE, and window_size. Referenced by getNewRow(), and getRestrictedRow().
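The page does not say how the boundary is applied, so the following is only a minimal sketch of one plausible reading: context words that lie on the far side of a sentence boundary symbol are replaced by an undefined id. The function name `maskAcrossBoundary`, the row layout (left half before the target, right half after it), and the plain `std::vector<int>` row are assumptions made for illustration, not the PLearn implementation.

```cpp
#include <cassert>
#include <vector>

// Masks context words that lie beyond a sentence boundary.
// Assumed layout: the first context.size()/2 entries precede the target word,
// the remaining entries follow it. The target itself is not part of `context`.
void maskAcrossBoundary(std::vector<int>& context, int boundary_id, int undefined_id) {
    assert(context.size() % 2 == 0);
    const int size = static_cast<int>(context.size());
    const int half = size / 2;

    // Walk left from the target: once a boundary is met, mask it and
    // everything farther away from the target.
    bool crossed = false;
    for (int i = half - 1; i >= 0; --i) {
        if (context[i] == boundary_id) crossed = true;
        if (crossed) context[i] = undefined_id;
    }

    // Same walk to the right of the target.
    crossed = false;
    for (int i = half; i < size; ++i) {
        if (context[i] == boundary_id) crossed = true;
        if (crossed) context[i] = undefined_id;
    }
}
```

In this sketch the boundary symbol itself is masked as well; whether the real method keeps or replaces it is not stated in this page.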

void PLearn::TextSenseSequenceVMatrix::build () [virtual]
Should simply call inherited::build(), then this class's build_(). This method should be callable again at later times, after modifying some option fields to change the "architecture" of the object. Reimplemented from PLearn::VMatrix. Definition at line 527 of file TextSenseSequenceVMatrix.cc. References build_().
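The build()/build_() convention described here can be illustrated with stand-in classes. This is a sketch under stated assumptions: `BaseMatrix`, `WindowedMatrix`, `build_count`, and `row_buffer` are hypothetical names, and nothing below is taken from the PLearn sources.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a base class; not the real PLearn::VMatrix.
class BaseMatrix {
public:
    virtual ~BaseMatrix() = default;
    // Public entry point: rebuilds the base-class state.
    virtual void build() { build_(); }
    int build_count = 0;  // counts base-level rebuilds, for illustration only
private:
    void build_() { ++build_count; }
};

class WindowedMatrix : public BaseMatrix {
    typedef BaseMatrix inherited;
public:
    int window_size = 0;          // option, set before calling build()
    std::vector<int> row_buffer;  // derived state, sized from the option

    void build() override {
        inherited::build();  // first let the parent rebuild itself
        build_();            // then rebuild this class's own state
    }
private:
    // Non-virtual: rebuilds only what this class owns, from the current options.
    void build_() {
        assert(window_size >= 0);
        row_buffer.assign(window_size + 1, 0);
    }
};
```

Because build_() recomputes derived state from the option fields each time, changing `window_size` and calling build() again leaves the object consistent, which is the property the description asks for.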

void PLearn::TextSenseSequenceVMatrix::build_ () [private]
This does the actual building.
Reimplemented from PLearn::VMatrix. Definition at line 454 of file TextSenseSequenceVMatrix.cc. References PLearn::Set::begin(), dvm, PLearn::Set::end(), PLearn::TVec< TVec< pair< int, real > > >::first(), PLearn::WordNetOntology::getSenseSize(), PLearn::WordNetOntology::getWordsForSense(), PLearn::PP< VMatrix >::isNull(), PLearn::VMat::length(), PLERROR, PLWARNING, rand_syn, PLearn::TVec< TVec< pair< int, real > > >::resize(), PLearn::TVec< VMField >::resize(), PLearn::SetIterator, PLearn::Set::size(), PLearn::TVec< TVec< pair< int, real > > >::size(), PLearn::sum(), PLearn::VMat::width(), window_size, wno, and word_given_sense_priors. Referenced by build(), and TextSenseSequenceVMatrix().

void PLearn::TextSenseSequenceVMatrix::declareOptions (OptionList &ol) [static, protected]
Declares this class' options.
Reimplemented from PLearn::VMatrix. Definition at line 440 of file TextSenseSequenceVMatrix.cc. References PLearn::declareOption(), and PLearn::OptionList.

void PLearn::TextSenseSequenceVMatrix::getNewRow (int i, const Vec &v) const [protected, virtual]
This is the only method requiring implementation.
Implements PLearn::RowBufferedVMatrix. Definition at line 24 of file TextSenseSequenceVMatrix.cc. References apply_boundary(), dvm, getRestrictedRow(), is_supervised_data, keep_in_sentence, PLearn::TVec< T >::length(), PLearn::VMat::length(), my_current_row, my_current_row_index, permute(), PLERROR, rand_syn, res_pos, PLearn::TVec< T >::size(), PLearn::TVec< int >::size(), SYNSETTAG_ID, undefined_pos, undefined_pos_set, UNDEFINED_SS_ID, UNDEFINED_TYPE, PLearn::Vec, PLearn::VMat::width(), and window_size.

int PLearn::TextSenseSequenceVMatrix::getRestrictedRow (int i, Vec v) const
This restricts the extraction of the context to the words that don't have their POS in res_pos and returns the position of the next non-overlapping context.
Definition at line 171 of file TextSenseSequenceVMatrix.cc. References apply_boundary(), PLearn::TVec< int >::contains(), dvm, is_supervised_data, keep_in_sentence, PLearn::TVec< T >::length(), PLearn::VMat::length(), my_current_row, my_current_row_index, permute(), PLERROR, rand_syn, res_pos, PLearn::TVec< T >::size(), SYNSETTAG_ID, undefined_pos, undefined_pos_set, UNDEFINED_SS_ID, UNDEFINED_TYPE, PLearn::VMat::width(), and window_size. Referenced by getNewRow().
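A rough sketch of POS-restricted context extraction follows. It is not the PLearn code: the `Token` struct, the symmetric window (half of `window_size` on each side of the target), the padding with an undefined id, and the interpretation of "next non-overlapping context" as the first position not consumed on the right are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Token { int word; int pos; };  // hypothetical (word id, POS id) pair

// Returns true if `pos` is listed among the forbidden parts of speech.
static bool isRestricted(const std::vector<int>& res_pos, int pos) {
    return std::find(res_pos.begin(), res_pos.end(), pos) != res_pos.end();
}

// Fills `out` with [left context | target | right context], skipping context
// words whose POS is in `res_pos`. Missing context slots keep `undefined_id`.
// Returns the first sequence position not consumed by the right context.
int restrictedContext(const std::vector<Token>& seq, int i, int window_size,
                      const std::vector<int>& res_pos, int undefined_id,
                      std::vector<int>& out) {
    const int n = static_cast<int>(seq.size());
    assert(i >= 0 && i < n);
    assert(window_size >= 0 && window_size % 2 == 0);
    const int half = window_size / 2;

    out.assign(window_size + 1, undefined_id);
    out[half] = seq[i].word;

    // Left side: the nearest admissible word lands next to the target.
    int slot = half - 1;
    for (int j = i - 1; j >= 0 && slot >= 0; --j)
        if (!isRestricted(res_pos, seq[j].pos)) out[slot--] = seq[j].word;

    // Right side: remember how far the scan advanced.
    slot = half + 1;
    int j = i + 1;
    for (; j < n && slot <= window_size; ++j)
        if (!isRestricted(res_pos, seq[j].pos)) out[slot++] = seq[j].word;

    return j;
}
```

Skipped words still advance the scan position, so the returned index accounts for restricted tokens that were passed over while filling the right context.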
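
void PLearn::TextSenseSequenceVMatrix::makeDeepCopyFromShallowCopy (map< const void *, void * > &copies) [virtual]
Transforms a shallow copy into a deep copy.
Reimplemented from PLearn::RowBufferedVMatrix. Definition at line 533 of file TextSenseSequenceVMatrix.cc. References PLearn::deepCopyField(), dvm, and res_pos.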
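The general idea of turning a shallow copy into a deep one with a map of already-made copies can be sketched as below. This is an illustration only: `Buffer`, `Matrix`, and this `deepCopyField` are hypothetical stand-ins, and the map here stores `std::shared_ptr<void>` rather than the raw `void *` of the signature above, purely to keep ownership simple in a self-contained example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// Maps the address of an original object to its (single) deep copy.
typedef std::map<const void*, std::shared_ptr<void> > CopiesMap;

struct Buffer { std::vector<int> data; };  // hypothetical shared storage

// Replaces `field` by a private copy, reusing an earlier copy of the same
// original so that sharing between fields is preserved in the deep copy.
void deepCopyField(std::shared_ptr<Buffer>& field, CopiesMap& copies) {
    if (!field) return;
    CopiesMap::iterator it = copies.find(field.get());
    if (it != copies.end()) {
        assert(it->second != nullptr);  // memoized entries are never null
        field = std::static_pointer_cast<Buffer>(it->second);
        return;
    }
    std::shared_ptr<Buffer> fresh = std::make_shared<Buffer>(*field);
    copies[field.get()] = fresh;
    field = fresh;
}

struct Matrix {
    std::shared_ptr<Buffer> words;
    std::shared_ptr<Buffer> restricted_pos;

    // Called on an object whose fields still alias those of the original.
    void makeDeepCopyFromShallowCopy(CopiesMap& copies) {
        deepCopyField(words, copies);
        deepCopyField(restricted_pos, copies);
    }
};
```

The map ensures that an object reachable through several fields is duplicated once, so the copy has the same internal sharing structure as the original.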

void PLearn::TextSenseSequenceVMatrix::permute (Vec v) const [private]
This randomly replaces words (target and context) with one of their corresponding synonyms.
Definition at line 377 of file TextSenseSequenceVMatrix.cc. References ADJ_TYPE, ADV_TYPE, PLearn::WordNetOntology::getSensesForWord(), PLearn::WordNetOntology::getWord(), PLearn::WordNetOntology::getWordId(), k, NOUN_TYPE, PLearn::TVec< T >::size(), PLearn::TVec< TVec< pair< int, real > > >::size(), PLearn::stemWord(), PLearn::sum(), PLearn::WordNetOntology::temp_word_to_adj_senses, PLearn::WordNetOntology::temp_word_to_adv_senses, PLearn::WordNetOntology::temp_word_to_noun_senses, PLearn::WordNetOntology::temp_word_to_verb_senses, UNDEFINED_TYPE, PLearn::uniform_sample(), VERB_TYPE, window_size, wno, and word_given_sense_priors. Referenced by getNewRow(), and getRestrictedRow().
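Since this method references word_given_sense_priors and a uniform sampler, one plausible ingredient is drawing a replacement word for a sense from a list of (word id, probability) pairs. The sketch below shows that step only, using standard-library types in place of TVec and real; the function names and the use of `std::mt19937` are assumptions, not the PLearn implementation.

```cpp
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// One entry per sense: candidate word ids paired with P(word | sense).
typedef std::vector<std::vector<std::pair<int, double> > > WordGivenSensePriors;

// Picks a word id for `sense` by inverting the cumulative distribution.
// `u` is expected to be a uniform sample in [0, 1).
int pickWordForSense(const WordGivenSensePriors& priors, int sense, double u) {
    assert(sense >= 0 && sense < static_cast<int>(priors.size()));
    assert(!priors[sense].empty());
    double cumulative = 0.0;
    for (const auto& entry : priors[sense]) {
        cumulative += entry.second;
        if (u < cumulative) return entry.first;
    }
    // Guards against rounding when the probabilities sum to slightly less than 1.
    return priors[sense].back().first;
}

// Convenience wrapper that draws u from a caller-supplied generator.
int sampleWordForSense(const WordGivenSensePriors& priors, int sense, std::mt19937& rng) {
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    return pickWordForSense(priors, sense, uniform(rng));
}
```

Separating the deterministic pick from the random draw keeps the selection logic easy to check with fixed values of `u`.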

PLearn::TextSenseSequenceVMatrix::PLEARN_DECLARE_OBJECT (TextSenseSequenceVMatrix)
Declares name and deepCopy methods.

void PLearn::TextSenseSequenceVMatrix::setOntology (WordNetOntology *that_wno)
Sets the ontology.
Definition at line 108 of file TextSenseSequenceVMatrix.h. References setOntology(), and wno. Referenced by setOntology().

void PLearn::TextSenseSequenceVMatrix::setRandomGeneration (bool that_rand_syn)
Enables or disables the random generation of contexts and target words.
Definition at line 117 of file TextSenseSequenceVMatrix.h. References rand_syn, and setRandomGeneration(). Referenced by setRandomGeneration().

void PLearn::TextSenseSequenceVMatrix::setRestrictedPOS (TVec< int > that_res_pos)
Sets the vector of forbidden POS for the context words.
Definition at line 120 of file TextSenseSequenceVMatrix.h. References res_pos, and setRestrictedPOS(). Referenced by setRestrictedPOS().

void PLearn::TextSenseSequenceVMatrix::setSentenceBoundary (int b)
Sets the sentence boundary symbol.
Definition at line 123 of file TextSenseSequenceVMatrix.h. References keep_in_sentence, sentence_boundary, and setSentenceBoundary(). Referenced by setSentenceBoundary().

void PLearn::TextSenseSequenceVMatrix::setUndefinedPOSId (int pos_id)
Sets the undefined pos id.
Definition at line 126 of file TextSenseSequenceVMatrix.h. References setUndefinedPOSId(), undefined_pos, and undefined_pos_set. Referenced by setUndefinedPOSId().

void PLearn::TextSenseSequenceVMatrix::setWindowSize (int that_window_size)
Sets the number of context words.
Definition at line 111 of file TextSenseSequenceVMatrix.h. References setWindowSize(), and window_size. Referenced by setWindowSize().

void PLearn::TextSenseSequenceVMatrix::setWordSequence (VMat that_dvm)
Sets the VMatrix of word/sense_tag/POS sequence.
Definition at line 114 of file TextSenseSequenceVMatrix.h. References dvm, is_supervised_data, and setWordSequence(). Referenced by setWordSequence().

VMat PLearn::TextSenseSequenceVMatrix::dvm [protected]
The VMatrix containing the sequence of words or lemmas, with their POS and WordNet (optional) tags.
Definition at line 25 of file TextSenseSequenceVMatrix.h. Referenced by build_(), getNewRow(), getRestrictedRow(), makeDeepCopyFromShallowCopy(), setWordSequence(), and TextSenseSequenceVMatrix().

bool PLearn::TextSenseSequenceVMatrix::is_supervised_data [protected]
Indication that at least some of the words or lemmas are semantically disambiguated.
Definition at line 29 of file TextSenseSequenceVMatrix.h. Referenced by getNewRow(), getRestrictedRow(), setWordSequence(), and TextSenseSequenceVMatrix().

bool PLearn::TextSenseSequenceVMatrix::keep_in_sentence [protected]
Indication that the context must not spread over another sentence.
Definition at line 43 of file TextSenseSequenceVMatrix.h. Referenced by getNewRow(), getRestrictedRow(), setSentenceBoundary(), and TextSenseSequenceVMatrix().

Vec PLearn::TextSenseSequenceVMatrix::my_current_row [protected]
Elements of the current row.
Definition at line 41 of file TextSenseSequenceVMatrix.h. Referenced by getNewRow(), getRestrictedRow(), and TextSenseSequenceVMatrix().

int PLearn::TextSenseSequenceVMatrix::my_current_row_index [protected]
Index of the current row.
Definition at line 39 of file TextSenseSequenceVMatrix.h. Referenced by getNewRow(), getRestrictedRow(), and TextSenseSequenceVMatrix().

bool PLearn::TextSenseSequenceVMatrix::rand_syn [protected]
Indication that examples can be randomly generated using random synonym replacements.
Definition at line 33 of file TextSenseSequenceVMatrix.h. Referenced by build_(), getNewRow(), getRestrictedRow(), setRandomGeneration(), and TextSenseSequenceVMatrix().

TVec< int > PLearn::TextSenseSequenceVMatrix::res_pos [protected]
The vector containing the forbidden POS of the words given in the context of a target word.
Definition at line 31 of file TextSenseSequenceVMatrix.h. Referenced by getNewRow(), getRestrictedRow(), makeDeepCopyFromShallowCopy(), setRestrictedPOS(), and TextSenseSequenceVMatrix().

int PLearn::TextSenseSequenceVMatrix::sentence_boundary [protected]
Sentence boundary symbol.
Definition at line 45 of file TextSenseSequenceVMatrix.h. Referenced by apply_boundary(), and setSentenceBoundary().

int PLearn::TextSenseSequenceVMatrix::undefined_pos [protected]
Undefined pos id.
Definition at line 49 of file TextSenseSequenceVMatrix.h. Referenced by apply_boundary(), getNewRow(), getRestrictedRow(), and setUndefinedPOSId().

bool PLearn::TextSenseSequenceVMatrix::undefined_pos_set [protected]
Indication that the undefined pos id is defined.
Definition at line 47 of file TextSenseSequenceVMatrix.h. Referenced by apply_boundary(), getNewRow(), getRestrictedRow(), setUndefinedPOSId(), and TextSenseSequenceVMatrix().

int PLearn::TextSenseSequenceVMatrix::window_size [protected]
The number of context words.
Definition at line 27 of file TextSenseSequenceVMatrix.h. Referenced by apply_boundary(), build_(), getNewRow(), getRestrictedRow(), permute(), setWindowSize(), and TextSenseSequenceVMatrix().

WordNetOntology* PLearn::TextSenseSequenceVMatrix::wno [protected]
Ontology of the sense tagging.
Definition at line 37 of file TextSenseSequenceVMatrix.h. Referenced by build_(), permute(), setOntology(), and TextSenseSequenceVMatrix().

TVec< TVec< pair< int, real > > > PLearn::TextSenseSequenceVMatrix::word_given_sense_priors [protected]
Probability of a word given it has some sense.
Definition at line 35 of file TextSenseSequenceVMatrix.h.