I am using a linear SVM from scikit-learn (LinearSVC) for a binary classification problem. I understand that LinearSVC can give me the predicted labels and the decision scores, but I want probability estimates (confidence in the label). I want to keep using LinearSVC because of its speed (compared to sklearn.svm.SVC with a linear kernel). Is it reasonable to use a logistic function to convert the decision scores to probabilities?
import sklearn.svm as suppmach
# Fit model (the L1 penalty requires dual=False with the default squared hinge loss):
svmmodel = suppmach.LinearSVC(penalty='l1', C=1, dual=False)
svmmodel.fit(x_train, y_train)
predicted_test = svmmodel.predict(x_test)
predicted_test_scores = svmmodel.decision_function(x_test)
I want to check whether it makes sense to obtain probability estimates simply as [1 / (1 + exp(-x))], where x is the decision score.
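Concretely, the naive transform I have in mind is:

import numpy as np
# Squash the decision scores through a plain (uncalibrated) logistic function
prob_estimates = 1.0 / (1.0 + np.exp(-predicted_test_scores))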
Alternatively, are there other classifier options that I can use to do this efficiently?
Thanks.
3 Answers
#1
6
I took a look at the APIs in the sklearn.svm.* family. All of the models below, e.g.,
- sklearn.svm.SVC
- sklearn.svm.NuSVC
- sklearn.svm.SVR
- sklearn.svm.NuSVR
have a common interface that supplies a
probability: boolean, optional (default=False)
parameter to the model. If this parameter is set to True, libsvm will train a probability transformation model on top of the SVM's outputs, based on the idea of Platt scaling. The form of the transformation is similar to a logistic function, as you pointed out, but two specific constants A and B are learned in a post-processing step. Also see this * post for more details.
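For reference, Platt scaling fits a sigmoid of the form P(y=1 | f) = 1 / (1 + exp(A*f + B)), where f is the decision score and the scalars A and B are chosen by maximum likelihood on held-out predictions. The naive transform in the question is the special case A = -1, B = 0.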
I actually don't know why this post-processing is not available for LinearSVC. Otherwise, you would just call predict_proba(X) to get the probability estimates.
Of course, if you just apply a naive logistic transform, it will not perform as well as a calibrated approach like Platt scaling. If you understand the underlying algorithm of Platt scaling, you can probably write your own or contribute to the scikit-learn SVM family. :) Also feel free to use the four SVM variants above that support predict_proba. A rough sketch of rolling your own is below.
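A minimal sketch of hand-rolled Platt-style calibration, assuming the x_train, y_train and x_test arrays from the question (it fits a plain logistic regression on held-out decision scores and omits Platt's target smoothing, which scikit-learn's CalibratedClassifierCV, see the next answer, handles for you):

from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hold out a calibration split so A and B are not fit on the SVM's own training data
x_fit, x_cal, y_fit, y_cal = train_test_split(x_train, y_train, test_size=0.2)

svm = LinearSVC(C=1).fit(x_fit, y_fit)

# Fit the sigmoid 1 / (1 + exp(A*f + B)) on the held-out decision scores f
platt = LogisticRegression()
platt.fit(svm.decision_function(x_cal).reshape(-1, 1), y_cal)

prob_test = platt.predict_proba(svm.decision_function(x_test).reshape(-1, 1))[:, 1]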
#2
47
scikit-learn provides CalibratedClassifierCV, which can be used to solve this problem: it lets you add probability output to LinearSVC or any other classifier that implements the decision_function method:
from sklearn.svm import LinearSVC
from sklearn.calibration import CalibratedClassifierCV

svm = LinearSVC()
clf = CalibratedClassifierCV(svm)
clf.fit(X_train, y_train)
y_proba = clf.predict_proba(X_test)
The user guide has a nice section on that. By default, CalibratedClassifierCV + LinearSVC will get you Platt scaling, but it also provides other options (an isotonic regression method), and it is not limited to SVM classifiers.
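For instance, if you have enough data for isotonic regression, switching the calibration method is one argument away (method and cv are documented CalibratedClassifierCV parameters):

clf = CalibratedClassifierCV(svm, method='isotonic', cv=5)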
#3
13
If you want speed, then just replace the SVM with sklearn.linear_model.LogisticRegression. That uses the exact same training algorithm as LinearSVC, but with log-loss instead of hinge loss.
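A minimal sketch, reusing the x_train, y_train and x_test names from the question:

from sklearn.linear_model import LogisticRegression

# Same regularized linear model, but trained with log-loss,
# so predict_proba comes for free and is well founded
logreg = LogisticRegression(C=1)
logreg.fit(x_train, y_train)
prob_test = logreg.predict_proba(x_test)[:, 1]  # P(class 1) for each test point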
Using [1 / (1 + exp(-x))] will produce probabilities, in a formal sense (numbers between zero and one), but they won't adhere to any justifiable probability model.