PRML report2 (2008/10/28)

PRML, www.kameda-lab.org 2008/10/28

How to Submit


Report 2A

Build and test an SVM classifier for the dataset given below.

Dataset

Standard set
Training: Set-A/Set-B, for #10, #100, and #1000.
Test: Set-A/Set-B, for #10, #100, and #1000 respectively.

Data in CSV format


[2A-1] SVM

Prepare an SVM that lets you choose between linear and non-linear classification, with or without a soft margin, and among several kernels (for the non-linear SVM).
You do not need to write your own code; you may use any library or software.
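As one possible starting point, the sketch below wraps scikit-learn's `SVC` so that the kernel and the margin type can be switched. The choice of scikit-learn, the helper name `make_svm`, the toy data, and the use of a very large `C` to approximate a hard margin are all assumptions made for illustration; any other library that offers the same options is equally acceptable.

```python
from sklearn.svm import SVC


def make_svm(kernel="linear", soft_margin=True, C=1.0, **kernel_params):
    """Return an untrained SVC (hypothetical helper, not part of the assignment).

    kernel:      "linear" for a linear SVM; "poly", "rbf", or "sigmoid"
                 for a non-linear SVM.
    soft_margin: if False, approximate a hard margin with a very large C
                 (assumption: SVC has no exact hard-margin mode).
    """
    penalty = C if soft_margin else 1e6
    return SVC(kernel=kernel, C=penalty, **kernel_params)


# Toy, linearly separable data, made up for illustration only.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]]
y = [0, 0, 0, 1, 1, 1]

linear_hard = make_svm("linear", soft_margin=False).fit(X, y)
rbf_soft = make_svm("rbf", soft_margin=True, C=1.0, gamma=0.5).fit(X, y)

# Classify one point near each cluster.
queries = [[0.5, 0.5], [3.5, 3.5]]
pred_linear = list(linear_hard.predict(queries))
pred_rbf = list(rbf_soft.predict(queries))
```

Keeping the kernel name and the margin type as arguments makes it straightforward to loop over many SVM configurations later.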

You need to write about the following topics:


[2A-2] Training SVM

Build the 48 kinds of SVM (shown later) from the training set and show the recognition results.
The recognition results should report the True-Positive, True-Negative, False-Positive, and False-Negative counts.

True / False: the symbol assigned to the data.
Positive / Negative: the result output by the SVM.
For the 100 samples:

            False               True                Sum
Negative    False-Negative      True-Negative       FN + TN = N = ? (expected to be 100)
Positive    False-Positive      True-Positive       FP + TP = P = ? (expected to be 100)
Total       FN + FP = F = 100   TN + TP = T = 100   200
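The four counts can be tallied with a few lines of code. The sketch below follows the naming of the table above, where the first word is the symbol assigned to the data and the second word is the SVM output; note that this differs from the usual confusion-matrix terminology. The function name `count_outcomes` and the sample values are hypothetical, and the SVM output is assumed to be encoded as a boolean (True for Positive).

```python
def count_outcomes(labels, predictions):
    """Count the four outcomes using the naming of the table above.

    labels:      True/False symbol assigned to each sample.
    predictions: True for an SVM output of Positive, False for Negative.
    The first letter of each key is the label, the second the SVM output.
    """
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for label, positive in zip(labels, predictions):
        key = ("T" if label else "F") + ("P" if positive else "N")
        counts[key] += 1
    return counts


# Made-up example: six samples, three labeled True and three labeled False.
labels = [True, True, True, False, False, False]
predictions = [True, True, False, False, False, True]
counts = count_outcomes(labels, predictions)

# Row and column sums as in the table.
N = counts["FN"] + counts["TN"]  # samples the SVM called Negative
P = counts["FP"] + counts["TP"]  # samples the SVM called Positive
F = counts["FN"] + counts["FP"]  # samples labeled False
T = counts["TN"] + counts["TP"]  # samples labeled True
```

Checking that F and T match the known set sizes is a quick sanity test before reporting the results.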

SVM (48 kinds)


[2A-3] SVM test

Test the 48 kinds of SVM obtained in [2A-2] on the test set.
Show the results as True-Positive, True-Negative, False-Positive, and False-Negative counts for each of them.

Write a discussion: the test results may be worse than the training results. Is this true? Check it and discuss.
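One simple way to check this is to reduce each result table to a single agreement rate and compare training against test. The sketch below assumes the report's naming (symbol first, SVM output second) and assumes that Positive corresponds to True, so the agreeing cells are True-Positive and False-Negative. The function name `accuracy` and all counts are made up for illustration and are not results from the actual data sets.

```python
def accuracy(counts):
    """Fraction of samples whose SVM output agrees with the assigned symbol.

    With the report's naming (label first, SVM output second), the
    agreeing cells are True-Positive and False-Negative.
    """
    total = sum(counts[key] for key in ("TP", "TN", "FP", "FN"))
    return (counts["TP"] + counts["FN"]) / total


# Made-up counts for illustration only; not results from the actual data.
train = {"TP": 98, "TN": 2, "FP": 1, "FN": 99}
test = {"TP": 90, "TN": 10, "FP": 12, "FN": 88}

# A positive gap means the SVM does worse on the test set than on training.
gap = accuracy(train) - accuracy(test)
```

Computing this gap for every SVM configuration and every set size shows whether the claim holds uniformly or only for some settings.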


Report 2B

Build and test an SVM classifier for the dataset given below.

Dataset

Challenge set
Training: Set-A/Set-B, for #10, #100, and #1000.
Test: Set-A/Set-B, for #10, #100, and #1000 respectively.

Data in CSV format


[2B-1] Training SVM

Choose the best way to build the best SVM (linear or non-linear, with or without a soft margin). This time you are asked to build only one (the best) SVM.

Show the SVM you used and the results (again as True-Positive, True-Negative, False-Positive, and False-Negative counts) for #10, #100, and #1000.


[2B-2] Test SVM

Verify the performance of the SVM from [2B-1] on the test set. Show the results (again as True-Positive, True-Negative, False-Positive, and False-Negative counts) for #10, #100, and #1000.

Write a discussion: examine the data sets and infer the form of the distribution (or the formula) behind them.
Discuss the upper bound / limits of the SVM's performance.


kameda[at]iit.tsukuba.ac.jp