Acronym: SAXVSM
Type: Dictionary
Year: 2013
Publication: ICDM

Description: SAXVSM combines the SAX representation used in BOP with the vector space model commonly used in information retrieval. The key difference between BOP and SAXVSM is that SAXVSM forms word distributions over classes rather than over individual series, and weights them by term frequency/inverse document frequency ($tf\cdot idf$). For SAXVSM, term frequency $tf$ is the number of times a word appears in a class, and document frequency $df$ is the number of classes in which a word appears. $tf\cdot idf$ is then defined as
$$ tfidf(tf, df) = \begin{cases} \log(1+tf)\cdot \log\left(\frac{c}{df}\right) & \text{if } df > 0 \\ 0 & \text{otherwise} \end{cases} $$
where $c$ is the number of classes. SAXVSM is described formally in Algorithm 13.
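As an illustration of the weighting defined above, the following is a minimal Python sketch, not the authors' implementation. It assumes the natural logarithm (the text does not state the base) and assumes that each class has already been reduced to a bag of SAX word counts; the names `class_tfidf_vectors` and `class_bags` and the toy words are hypothetical.

```python
import math
from collections import Counter


def tfidf(tf, df, c):
    """Weight of one word in one class: log(1 + tf) * log(c / df), or 0 if df is 0."""
    if df <= 0:
        return 0.0
    # Assumption: natural logarithm; the source does not specify the base.
    return math.log(1 + tf) * math.log(c / df)


def class_tfidf_vectors(class_bags):
    """Turn {class label: Counter of word counts} into {class label: {word: weight}}."""
    c = len(class_bags)  # number of classes
    # df[word] = number of classes in which the word occurs at least once
    df = Counter()
    for bag in class_bags.values():
        df.update(word for word, count in bag.items() if count > 0)
    return {
        label: {word: tfidf(count, df[word], c) for word, count in bag.items()}
        for label, bag in class_bags.items()
    }


# Hypothetical toy data: two classes with a few SAX words each.
class_bags = {
    "A": Counter({"abc": 3, "bbc": 1}),
    "B": Counter({"bbc": 2, "cca": 4}),
}
weights = class_tfidf_vectors(class_bags)
```

In this toy example the word `bbc` occurs in both classes, so $\log(c/df) = \log(1) = 0$ and its weight vanishes in every class; only class-specific words keep a non-zero weight.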
Parameters $l$, $\alpha$ and $w$ are set through cross-validation on the training data. Predictions are made with a 1-NN classifier that compares the word frequency distribution of the new case against the $tf\cdot idf$ vector of each class, using cosine similarity as the measure.
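The prediction step can be sketched in Python as follows. This is a hedged illustration under stated assumptions rather than the published code: vectors are stored as sparse dictionaries keyed by SAX word, the helper names `cosine_similarity` and `predict` are hypothetical, the class weights are made-up numbers, and returning 0 for an all-zero vector is a convention chosen here.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as {word: value} dicts."""
    dot = sum(value * v.get(word, 0.0) for word, value in u.items())
    norm_u = math.sqrt(sum(value * value for value in u.values()))
    norm_v = math.sqrt(sum(value * value for value in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def predict(word_counts, class_weights):
    """Return the label of the class vector most similar to the new case (1-NN)."""
    return max(
        class_weights,
        key=lambda label: cosine_similarity(word_counts, class_weights[label]),
    )


# Hypothetical tf-idf vectors for two classes (illustrative values only).
class_weights = {
    "A": {"abc": 0.96, "bbc": 0.0},
    "B": {"cca": 1.12, "bbc": 0.0},
}
# Word frequencies of a new, unlabelled series.
new_case = Counter({"abc": 2, "bbc": 1})
label = predict(new_case, class_weights)
```

Because there is one vector per class rather than one per training series, the nearest-neighbour search needs only $c$ similarity computations per prediction.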
Source Code: SAX and Vector Space Model Code
Published Results:

| Dataset | Result |
|---|---|
| Adiac | 0.381 |
| Beef | 0.033 |
| CBF | 0.002 |
| FaceAll | 0.207 |
| Fish | 0.017 |
| GunPoint | 0.007 |
| Lightning2 | 0.196 |
| Lightning7 | 0.301 |
| OliveOil | 0.1 |
| OSULeaf | 0.107 |
| SwedishLeaf | 0.251 |
| SyntheticControl | 0.01 |
| TwoPatterns | 0.004 |
| Wafer | 0.0006 |
| Yoga | 0.164 |

Recreated Results:

| Dataset | Result |
|---|---|
| Adiac | 0.4574 |
| ArrowHead | 0.7789 |
| Beef | 0.4960 |
| BeetleFly | 0.9085 |
| BirdChicken | 0.8860 |
| Car | 0.8613 |
| CBF | 0.9575 |
| ChlorineConcentration | 0.6572 |
| CinCECGtorso | 0.7299 |
| Coffee | 0.9382 |
| Computers | 0.6656 |
| CricketX | 0.7155 |
| CricketY | 0.6601 |
| CricketZ | 0.7188 |
| DiatomSizeReduction | 0.8433 |
| DistalPhalanxOutlineCorrect | 0.7370 |
| DistalPhalanxOutlineAgeGroup | 0.7743 |
| DistalPhalanxTW | 0.6301 |
| Earthquakes | 0.7376 |
| ECG200 | 0.8354 |
| ECG5000 | 0.9145 |
| ECGFiveDays | 0.9182 |
| ElectricDevices | 0.6984 |
| FaceAll | 0.9645 |
| FaceFour | 0.9434 |
| FacesUCR | 0.9227 |
| FiftyWords | 0.4407 |
| Fish | 0.9405 |
| FordA | 0.8232 |
| FordB | 0.7427 |
| GunPoint | 0.9591 |
| Ham | 0.8025 |
| HandOutlines | 0.8778 |
| Haptics | 0.4223 |
| Herring | 0.5938 |
| InlineSkate | 0.4066 |
| InsectWingbeatSound | 0.5360 |
| ItalyPowerDemand | 0.8169 |
| LargeKitchenAppliances | 0.8507 |
| Lightning2 | 0.7439 |
| Lightning7 | 0.5955 |
| Mallat | 0.8506 |
| Meat | 0.9540 |
| MedicalImages | 0.4778 |
| MiddlePhalanxOutlineCorrect | 0.7072 |
| MiddlePhalanxOutlineAgeGroup | 0.6203 |
| MiddlePhalanxTW | 0.5393 |
| MoteStrain | 0.8125 |
| NonInvasiveFatalECGThorax1 | 0.5404 |
| NonInvasiveFatalECGThorax2 | 0.6108 |
| OliveOil | 0.8463 |
| OSULeaf | 0.8598 |
| PhalangesOutlinesCorrect | 0.6834 |
| Phoneme | 0.1342 |
| Plane | 0.9799 |
| ProximalPhalanxOutlineCorrect | 0.7986 |
| ProximalPhalanxOutlineAgeGroup | 0.7960 |
| ProximalPhalanxTW | 0.6670 |
| RefrigerationDevices | 0.6292 |
| ScreenType | 0.5238 |
| ShapeletSim | 0.8086 |
| ShapesAll | 0.6872 |
| SmallKitchenAppliances | 0.5824 |
| SonyAIBORobotSurface1 | 0.7543 |
| SonyAIBORobotSurface2 | 0.8271 |
| StarlightCurves | 0.8412 |
| Strawberry | 0.9698 |
| SwedishLeaf | 0.7056 |
| Symbols | 0.8711 |
| SyntheticControl | 0.8691 |
| ToeSegmentation1 | 0.9279 |
| ToeSegmentation2 | 0.9208 |
| Trace | 0.9924 |
| TwoLeadECG | 0.9155 |
| TwoPatterns | 0.8890 |
| UWaveGestureLibraryX | 0.5334 |
| UWaveGestureLibraryY | 0.4384 |
| UWaveGestureLibraryZ | 0.4809 |
| UWaveGestureLibraryAll | 0.8006 |
| Wafer | 0.9964 |
| Wine | 0.8781 |
| WordSynonyms | 0.4546 |
| Worms | 0.5932 |
| WormsTwoClass | 0.7210 |
| Yoga | 0.8359 |

Algorithm: