The support vector machine (SVM) is a flexible method for classification or regression, largely because of the many kernels it can use. To apply an SVM, we usually need to specify a kernel, a regularization parameter c, and some kernel parameters such as gamma. Following my previous post on selecting the regularization parameter c, this post uses the SVM procedure and the iris flower data set to discuss kernel selection in SAS.
Exploration of the iris flower data
The iris data set is a classic for classification exercises. If we use the first two components from Principal Component Analysis (PCA) to compress the four predictors (petal length, petal width, sepal length and sepal width) into 2D space, two linear boundaries appear just about able to separate the three species, namely Setosa, Versicolor and Virginica. In general, SASHELP.IRIS is a well-separated data set with respect to the response variable.
****(1). Data exploration of iris flower data set****;
data iris;
   set sashelp.iris;
run;

proc contents data=iris position;
run;

proc princomp data=iris out=iris_pca;
   var Sepal: Petal:;
run;

proc sgplot data=iris_pca;
   scatter x = prin1 y = prin2 / group = species;
run;
PROC SVM with four different kernels
|Kernel method|Option in SAS|Formula|Parameter in SAS|
|---|---|---|---|
|polynomial|polynom|(gamma*u'*v + coef)^degree|K_PAR|
|sigmoid|sigmoid|tanh(gamma*u'*v + coef)|K_PAR; K_PAR2|
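To make the formulas in the table concrete, here is a minimal DATA step sketch that evaluates four textbook kernels for a single pair of points. The two vectors, the data set name kernel_demo, and the values of gamma, coef and degree are all hypothetical and chosen only for illustration. The linear and RBF forms are the standard textbook definitions, and how PROC SVM maps K_PAR and K_PAR2 onto these quantities is not shown here.

```sas
* Illustrative only: evaluate four textbook kernels for two hypothetical points;
data kernel_demo;
   /* two made-up 4-dimensional observations (not actual rows of SASHELP.IRIS) */
   array u[4] _temporary_ (5.1 3.5 1.4 0.2);
   array v[4] _temporary_ (7.0 3.2 4.7 1.4);

   /* assumed kernel parameters, picked arbitrarily */
   gamma  = 1;
   coef   = 1;
   degree = 3;

   /* inner product u'*v and squared Euclidean distance |u - v|^2 */
   dot   = 0;
   dist2 = 0;
   do i = 1 to dim(u);
      dot   = dot + u[i]*v[i];
      dist2 = dist2 + (u[i] - v[i])**2;
   end;

   k_linear  = dot;                          /* u'*v                        */
   k_polynom = (gamma*dot + coef)**degree;   /* (gamma*u'*v + coef)^degree  */
   k_rbf     = exp(-gamma*dist2);            /* exp(-gamma*|u - v|^2)       */
   k_sigmoid = tanh(gamma*dot + coef);       /* tanh(gamma*u'*v + coef)     */
   drop i;
run;

proc print data=kernel_demo noobs;
run;
```

Changing gamma or rescaling the two vectors and rerunning the step is a quick way to see how sensitive each kernel value is to the scale of the inputs.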
PROC SVM in SAS provides a range of kernels to choose from, including LINEAR, POLYNOM, RBF, SIGMOID and TANH. Another nice feature is that it supports cross-validation, including Leave-One-Out Cross-Validation (via the loo option in PROC SVM) and k-Fold Cross-Validation (via the split option in PROC SVM).
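For reference, the leave-one-out classification error can be written in its standard textbook form as below. The notation is an assumption introduced here: n is the number of observations (150 for the iris data), and \hat{f}^{(-i)} denotes the classifier trained on all observations except the i-th. Whether PROC SVM computes the quantity exactly this way internally is not confirmed by this post.

```latex
\mathrm{Err}_{\mathrm{LOO}}
  = \frac{1}{n} \sum_{i=1}^{n}
    \mathbf{1}\!\left[\, \hat{f}^{(-i)}(x_i) \neq y_i \,\right]
```

In words: each observation is held out once, predicted by a model fitted to the remaining n - 1 observations, and the error is the fraction of held-out observations that are misclassified.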
Here the error rate from Leave-One-Out Cross-Validation is used to compare the performance of four common kernels: linear, radial basis function (RBF), polynomial and sigmoid. In this experiment most of the parameters, such as c and gamma, are arbitrarily set to 1. As the bar plot shows, the RBF and linear kernels both give very good results, with RBF slightly better than linear. By contrast, the polynomial and sigmoid kernels perform very poorly. In conclusion, the choice of kernel for an SVM depends on the actual structure of the data set. A non-linear or complicated kernel is not really necessary for an easily classified example like the iris flower data set.
****(2). Cross validation error comparison of 4 kernels****;
proc dmdb batch data=iris dmdbcat=_cat out=_iris;
   var Sepal: Petal:;
   class species;
run;

%let knl = linear;
proc svm data=_iris dmdbcat=_cat kernel=&knl c=1 cv=loo;
   title "The kernel is &knl";
   ods output restab = &knl;
   var Sepal: Petal:;
   target species;
run;

%let knl = rbf;
proc svm data=_iris dmdbcat=_cat kernel=&knl c=1 K_PAR=1 cv=loo;
   title "The kernel is &knl";
   ods output restab = &knl;
   var Sepal: Petal:;
   target species;
run;

%let knl = polynom;
proc svm data=_iris dmdbcat=_cat kernel=&knl c=1 K_PAR=3 cv=loo;
   title "The kernel is &knl";
   ods output restab = &knl;
   var Sepal: Petal:;
   target species;
run;

%let knl = sigmoid;
proc svm data=_iris dmdbcat=_cat kernel=&knl c=1 K_PAR=1 K_PAR2=1 cv=loo;
   title "The kernel is &knl";
   ods output restab = &knl;
   var Sepal: Petal:;
   target species;
run;

data total;
   set linear rbf polynom sigmoid;
   where label1 in ('Kernel Function','Classification Error (Loo)');
   cValue1 = lag(cValue1);
   if missing(nValue1) = 0;
run;

proc sgplot data=total;
   title " ";
   vbar cValue1 / response = nValue1;
   xaxis label = "Selection of kernel";
   yaxis label = "Classification Error by Leave-one-out Cross Validation";
run;