cross_val_score(model_name, x_samples, y_labels, cv=k)
Purpose: checks how stable a given model is on a given training set and outputs k prediction accuracy scores, one per fold.
K-fold cross-validation (k-fold)
The initial training samples are split into k folds: (k-1) folds are used as the training set and the remaining fold is used as the evaluation set. In this way the classifier is trained k times in total, producing k results.
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression()
# X: features, y: targets, cv: k
cross_val_score(clf, X, y, cv=5)
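For intuition, here is a minimal sketch of the fold-by-fold procedure written out explicitly with KFold. This is only an illustration, not the exact internals of cross_val_score (which uses stratified folds for classifiers); X and y are the same placeholder arrays as above.

from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression

kf = KFold(n_splits=5)
for train_idx, test_idx in kf.split(X):
    # (k-1) folds train the classifier, the remaining fold evaluates it
    clf = LogisticRegression()
    clf.fit(X[train_idx], y[train_idx])
    print(clf.score(X[test_idx], y[test_idx]))  # one accuracy score per fold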
Model training, prediction, and evaluation
def svm_model():
    from sklearn.metrics import accuracy_score
    from sklearn.metrics import precision_score, recall_score, f1_score
    from sklearn.svm import SVC
    import joblib
    # Train the model
    clf = SVC(kernel='linear')
    clf.fit(x_train_samples, y_train_labels)
    # Save the model
    joblib.dump(clf, './model/svm_mode.pkl')
    # Evaluate the model
    predict_labels = clf.predict(x_test_samples)
    Accuracy = accuracy_score(y_test_labels, predict_labels)
    Precision = precision_score(y_test_labels, predict_labels, pos_label=0)
    Recall = recall_score(y_test_labels, predict_labels, pos_label=0)
    F1_scores = f1_score(y_test_labels, predict_labels, pos_label=0)
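To reuse the stored model later, it can be loaded back with joblib. A brief sketch, assuming the same placeholder file path and test arrays as above:

import joblib

clf = joblib.load('./model/svm_mode.pkl')     # restore the persisted SVC
predict_labels = clf.predict(x_test_samples)  # predict on new samples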
That completes the whole process. Note that a call to K-fold cross-validation via cross_val_score outputs only one score per fold (for a classifier this is the accuracy by default; other metrics are not reported unless a different scoring parameter is passed). So the suggestion is to use the train_test_split() function to split the data into a training set and a validation set in the early stages, and later evaluate the model with whichever metrics the task actually requires.
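For reference, a minimal sketch of the recommended train_test_split() workflow. X and y are placeholder arrays; test_size=0.2 and random_state=0 are example values, not values from the original text.

from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hold out 20% of the data as a validation set
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = SVC(kernel='linear')
clf.fit(x_train, y_train)
pred = clf.predict(x_test)

# Evaluate with whichever metrics the task requires
print(accuracy_score(y_test, pred))
print(precision_score(y_test, pred, pos_label=0))
print(recall_score(y_test, pred, pos_label=0))
print(f1_score(y_test, pred, pos_label=0))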