**Description:** | SAXVSM combines the SAX representation used in BOP with the vector space model commonly used in Information Retrieval. The key differences between BOP and SAXVSM are that SAXVSM forms word distributions over classes rather than over individual series, and weights these by term frequency/inverse document frequency ($tf\cdot idf$). For SAXVSM, term frequency $tf$ refers to the number of times a word appears in a class and document frequency $df$ is the number of classes in which a word appears. $tf\cdot idf$ is then defined as
$$ tfidf(tf, df) =
\begin{cases}
\log{(1+tf)}\cdot \log\left(\frac{c}{df}\right) & \text{if } df > 0 \\
0 & \text{otherwise}
\end{cases} $$
where $c$ is the number of classes. SAXVSM is described formally in Algorithm 13. Parameters $l$, $\alpha$ and $w$ are set through cross-validation on the training data. Predictions are made using a 1-NN classification based on the word frequency distribution of the new case and the $tf\cdot idf$ vectors of each class, with cosine similarity as the similarity measure. |
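The weighting and prediction steps above can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the function names, the dictionary-based vocabulary layout, and the example data are all hypothetical.

```python
import math

def tfidf(tf, df, c):
    # tf: times the word occurs in a class's bag of words;
    # df: number of classes containing the word; c: total number of classes.
    if df > 0:
        return math.log(1 + tf) * math.log(c / df)
    return 0.0

def cosine_similarity(u, v):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def predict(query_freqs, class_weights, vocab):
    # query_freqs: word -> count for the new series' SAX words (hypothetical layout).
    # class_weights: class label -> (word -> tf*idf weight).
    # Returns the label of the class whose tf*idf vector is most
    # cosine-similar to the query's word frequency vector (1-NN over classes).
    q = [query_freqs.get(w, 0) for w in vocab]
    best_label, best_sim = None, -1.0
    for label, weights in class_weights.items():
        v = [weights.get(w, 0.0) for w in vocab]
        sim = cosine_similarity(q, v)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label
```

For example, a query dominated by the word `"aba"` would be assigned to whichever class carries the larger $tf\cdot idf$ weight on `"aba"`; with $df = c$ (a word present in every class) the idf term $\log(c/df)$ is zero, so such uninformative words contribute nothing to the similarity.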