I'm wondering how important the coef0 parameter is for SVCs under the polynomial and sigmoid kernels. As I understand it, it is the intercept term, just a constant as in linear regression that offsets the function from zero. However, to my knowledge, the SVM (scikit uses libsvm) should find this value itself.
What's a good general range to test over (is there one)? For example, with C a generally safe choice is 10^-5 ... 10^5, going up in exponential steps.
But for coef0, the value seems highly data dependent, and I'm not sure how to automate choosing good ranges for each grid search on each dataset. Any pointers?
1 Answer
#1
3
First, the sigmoid function is rarely a valid kernel. In fact, for almost no parameter values is it known to induce a valid kernel (in Mercer's sense).
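To make the Mercer point concrete, here is a minimal sketch in plain Python. It assumes the sigmoid kernel has the form tanh(gamma * <x,y> + coef0), which is how scikit-learn documents it; the helper names `sigmoid_kernel` and `gram_2x2_determinant` are made up for this illustration. A valid kernel must give a positive semidefinite Gram matrix, so a 2x2 Gram matrix with a negative determinant, or a negative diagonal entry, is already a counterexample.

```python
import math


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    """tanh(gamma * <x, y> + coef0) for two equal-length sequences."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(gamma * dot + coef0)


def gram_2x2_determinant(x, y, gamma=1.0, coef0=0.0):
    """Determinant of the 2x2 Gram matrix built from the points x and y."""
    kxx = sigmoid_kernel(x, x, gamma, coef0)
    kyy = sigmoid_kernel(y, y, gamma, coef0)
    kxy = sigmoid_kernel(x, y, gamma, coef0)
    return kxx * kyy - kxy * kxy


# Two one-dimensional points with the "plain" setting gamma=1, coef0=0.
# A negative determinant means the Gram matrix has a negative eigenvalue.
print(gram_2x2_determinant([1.0], [2.0]))

# With a negative coef0, even K(x, x) at the origin is negative.
print(sigmoid_kernel([0.0], [0.0], coef0=-1.0))
```

Both printed values are negative, so neither parameter setting gives a positive semidefinite kernel on these points. This only shows that counterexamples are easy to find, not which settings (if any) are safe.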
Second, coef0 is not an intercept term. It is a parameter of the kernel projection, and it can be used to overcome one important issue with the polynomial kernel. In general, coef0=0 should work fine, but as the degree p -> inf the polynomial kernel increasingly separates pairs of points for which <x,y> is smaller than 1 from pairs <a,b> with a larger value. This happens because powers of values smaller than one get closer and closer to 0, while the same powers of values bigger than one grow to infinity. You can use coef0 to "scale" your data so that there is no such distinction: add 1 - min <x,y>, so that no values are smaller than 1. If you really feel the need to tune this parameter, I would suggest searching in the range [min(1 - min <x,y>, 0), max <x,y>], where the min and max are computed over the whole training set.
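The suggested range can be computed directly from the training set. The following is a rough sketch in plain Python, not a tested recipe: the helper names `coef0_search_range` and `coef0_candidates` are hypothetical, the evenly spaced grid is my own assumption (the answer only gives the endpoints), and the inner products are taken on the raw features, ignoring any gamma factor that the library applies to <x,y>.

```python
from itertools import combinations_with_replacement


def inner(x, y):
    """Plain dot product <x, y> of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def coef0_search_range(X):
    """Return (low, high) = (min(1 - min<x,y>, 0), max<x,y>).

    The min and max run over all pairs of training points,
    including each point paired with itself.
    """
    products = [inner(x, y) for x, y in combinations_with_replacement(X, 2)]
    low = min(1 - min(products), 0)
    high = max(products)
    return low, high


def coef0_candidates(X, steps=5):
    """Evenly spaced coef0 values across the range (spacing is an assumption)."""
    low, high = coef0_search_range(X)
    if steps < 2 or low == high:
        return [low]
    width = (high - low) / (steps - 1)
    return [low + i * width for i in range(steps)]


# Toy training set: three points in two dimensions.
X_train = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]]
print(coef0_search_range(X_train))
print(coef0_candidates(X_train))
```

The resulting list could then be passed as the `coef0` entry of the parameter grid given to scikit-learn's `GridSearchCV` with `SVC(kernel="poly")`. Note that the number of pairs grows quadratically with the training set size, so for large datasets computing the range on a random subsample may be more practical.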