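If you read the previous post and could not get xgboost installed, feel free to ask me. This post walks through a small xgboost test example. The example is adapted from someone else's, but I have added quite a lot to it, and I hope it helps more people!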
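import sys, os
# Make the xgboost Python wrapper importable (adjust to your own install path)
sys.path.append('E:\\xgboost-master\\xgboost-master\\wrapper')
import numpy as np
import scipy.sparse
import xgboost as xgb
# Load the training and test sets from LibSVM-format text files
dtrain = xgb.DMatrix('E:\\my-train.txt')
dtest = xgb.DMatrix('E:\\my-test.txt')
# Binary classification with logistic output; 'silent':1 suppresses training messages
param = {'max_depth':6, 'eta':0.3, 'silent':1, 'objective':'binary:logistic'}
# Report the error on both sets after every boosting round
watchlist = [(dtest,'eval'), (dtrain,'train')]
num_round = 20
bst = xgb.train(param, dtrain, num_round, watchlist)
# Predict on the test set; each value is the predicted probability of label 1
preds = bst.predict(dtest)
labels = dtest.get_label()
# Error rate: fraction of rows whose thresholded prediction (> 0.5) differs from the label
print('error=%f' % (sum(1 for i in range(len(preds)) if int(preds[i]>0.5)!=labels[i]) / float(len(preds))))
# Save the trained model so it can be reloaded later for prediction
bst.save_model('C:\\xgb.model')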
I put xgboost on the E: drive and ran a simple test with two files:
my-train.txt
1 1:1 2:1 3:1 4:1
1 1:2 2:2 3:2 4:2
1 1:3 2:3 3:3 4:3
1 1:4 2:4 3:4 4:4
1 1:5 2:5 3:5 4:5
1 1:6 2:6 3:6 4:6
0 1:62 2:32 3:24 4:26
0 1:39 2:73 3:93 4:35
0 1:41 2:43 3:42 4:43
0 1:5 2:35 3:52 4:53
0 1:64 2:16 3:46 4:36
my-test.txt
1 1:5 2:5 3:5 4:5
1 1:6 2:6 3:6 4:6
0 1:62 2:32 3:24 4:26
0 1:39 2:73 3:93 4:35
0 1:41 2:43 3:42 4:43
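Both files above use the LibSVM text format: each line is a label followed by `index:value` pairs. As a rough illustration of how such a file maps to a matrix shape, here is a minimal pure-Python sketch. The helper names `parse_libsvm_line` and `describe` are my own and are not part of xgboost; the sketch assumes the column count is the largest feature index plus one, because the indices in these files start at 1 and column 0 is left empty.

```python
def parse_libsvm_line(line):
    """Split one 'label index:value ...' line into (label, {index: value})."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        index, value = item.split(":")
        features[int(index)] = float(value)
    return label, features


def describe(lines):
    """Return (rows, columns, entries) for a list of LibSVM-format lines."""
    rows = [parse_libsvm_line(line) for line in lines if line.strip()]
    entries = sum(len(features) for _, features in rows)
    max_index = max(index for _, features in rows for index in features)
    # Feature indices start at 1 here, so column 0 exists but holds no entries.
    return len(rows), max_index + 1, entries


# The contents of my-train.txt, copied from above.
train_lines = [
    "1 1:1 2:1 3:1 4:1",
    "1 1:2 2:2 3:2 4:2",
    "1 1:3 2:3 3:3 4:3",
    "1 1:4 2:4 3:4 4:4",
    "1 1:5 2:5 3:5 4:5",
    "1 1:6 2:6 3:6 4:6",
    "0 1:62 2:32 3:24 4:26",
    "0 1:39 2:73 3:93 4:35",
    "0 1:41 2:43 3:42 4:43",
    "0 1:5 2:35 3:52 4:53",
    "0 1:64 2:16 3:46 4:36",
]

print(describe(train_lines))  # (11, 5, 44)
```

For my-train.txt this gives 11 rows, 5 columns, and 44 entries.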
Output:
Error when loading sklearn/plotting. Please install scikit-learn
error=0.000000
11x5 matrix with 44 entries is loaded from E:\my-train.txt
5x5 matrix with 20 entries is loaded from E:\my-test.txt
[0] eval-error:0.000000 train-error:0.000000
[1] eval-error:0.000000 train-error:0.000000
[2] eval-error:0.000000 train-error:0.000000
[3] eval-error:0.000000 train-error:0.000000
[4] eval-error:0.000000 train-error:0.000000
[5] eval-error:0.000000 train-error:0.000000
[6] eval-error:0.000000 train-error:0.000000
[7] eval-error:0.000000 train-error:0.000000
[8] eval-error:0.000000 train-error:0.000000
[9] eval-error:0.000000 train-error:0.000000
[10] eval-error:0.000000 train-error:0.000000
[11] eval-error:0.000000 train-error:0.000000
[12] eval-error:0.000000 train-error:0.000000
[13] eval-error:0.000000 train-error:0.000000
[14] eval-error:0.000000 train-error:0.000000
[15] eval-error:0.000000 train-error:0.000000
[16] eval-error:0.000000 train-error:0.000000
[17] eval-error:0.000000 train-error:0.000000
[18] eval-error:0.000000 train-error:0.000000
[19] eval-error:0.000000 train-error:0.000000
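The `error=` value in the output is the fraction of test rows whose prediction, thresholded at 0.5, differs from the true label. Here is a minimal sketch of that calculation as a standalone function. The name `error_rate` and the sample numbers are made up for illustration; they are not from xgboost or from the run above.

```python
def error_rate(preds, labels, threshold=0.5):
    """Fraction of predictions that disagree with the labels after thresholding."""
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    wrong = sum(1 for p, y in zip(preds, labels) if int(p > threshold) != int(y))
    return wrong / float(len(preds))


# Hypothetical probabilities and labels, chosen so that exactly one row is wrong.
example_preds = [0.9, 0.8, 0.2, 0.6, 0.1]
example_labels = [1, 1, 0, 0, 0]

print('error=%f' % error_rate(example_preds, example_labels))  # error=0.200000
```

Only the fourth row (0.6 against label 0) is misclassified, so the error is 1/5.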
Prediction code: load the model and the data first, then run the prediction.
#! /usr/bin/env python
#coding=utf-8
import sys, os
sys.path.append('E:\\xgboost-master\\xgboost-master\\wrapper')
import numpy as np
import scipy.sparse
import xgboost as xgb
# Load the new data and the model saved by the training script
dtest2 = xgb.DMatrix('E:\\my-test2.txt')
bst2 = xgb.Booster(model_file='C:\\xgb.model')
# Predict: one probability per row
preds2 = bst2.predict(dtest2)
print(preds2)
# Write the thresholded prediction to a file
outing = open('C:\\Result.txt', 'w')
outing.write(str(int(preds2[0]>0.5))) # only the first prediction is written
outing.close()
my-test2.txt:
1 1:15 2:15 3:15 4:15
1 1:6 2:6 3:6 4:6
1 1:16 2:16 3:16 4:16
0 1:62 2:32 3:24 4:26
0 1:39 2:73 3:93 4:35
0 1:411 2:43 3:42 4:43
Output:
[ 0.26937994 0.77472818 0.26937994 0.26937994 0.26937994 0.26937994]
6x5 matrix with 24 entries is loaded from E:\my-test2.txt
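The prediction script writes only the first result to the file. As a sketch of how every row could be thresholded and written out, the snippet below starts from the six probabilities printed above instead of calling xgboost. The output path in the temp directory is my own choice for illustration, and the labels are copied from my-test2.txt.

```python
import os
import tempfile

# The six probabilities printed by the prediction script above.
preds2 = [0.26937994, 0.77472818, 0.26937994, 0.26937994, 0.26937994, 0.26937994]
# The true labels, taken from the first column of my-test2.txt.
labels2 = [1, 1, 1, 0, 0, 0]

# Threshold every probability at 0.5 to get a 0/1 label.
predicted = [int(p > 0.5) for p in preds2]

# Hypothetical output location; the original script writes to C:\Result.txt.
out_path = os.path.join(tempfile.gettempdir(), "Result.txt")
with open(out_path, "w") as out:
    for label in predicted:
        out.write("%d\n" % label)

# Count the rows where the thresholded prediction differs from the label.
wrong = sum(1 for p, y in zip(predicted, labels2) if p != y)

print(predicted)  # [0, 1, 0, 0, 0, 0]
print(wrong)      # 2
```

Against the labels in my-test2.txt, two of the six rows (the first and the third) are misclassified. That is not surprising: the model was trained on only 11 rows.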