xgboost error: Error in xgb.setinfo(dmat, names(p), p[[1]]): The length of labels must equal to the number of rows in the input data

I am using xgboost in R. I had a matrix and created an xgb.DMatrix from it without problems, but when I reduce the number of columns in the data, I get the following error: Error in xgb.setinfo(dmat, names(p), p[[1]]) : The length of labels must equal to the number of rows in the input data

Here is the R code:

xgbmat1=xgb.DMatrix(Matrix(data.matrix(ctt1)),label=as.matrix(as.numeric(data$V2))-1)
xgbmat1=xgb.DMatrix(Matrix(data.matrix(ctt1[,nr])),label=as.matrix(as.numeric(data$V2))-1)

The first call works fine, though; only the second one, with the reduced columns, fails.

dim(ctt1[,nr])

[1] 6401 1048

dim(ctt1)

[1] 6401 5901

3 Solutions

#1

It turns out that after removing some columns, some rows contain only 0s and so cannot contribute to the model.
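
A quick way to check for this situation is sketched below in base R. The matrix `toy` and the column selection `keep` are hypothetical stand-ins for `ctt1` and `nr`; the data is made up for illustration only.

```r
# Hypothetical stand-in for ctt1: 4 rows, 3 columns.
toy <- matrix(c(1, 0, 0,
                0, 2, 0,
                0, 0, 3,
                0, 0, 4), nrow = 4, byrow = TRUE)

# Hypothetical stand-in for nr: keep only the first two columns.
keep <- c(1, 2)
toy_sub <- toy[, keep, drop = FALSE]

# Rows that contain no non-zero value, before and after subsetting.
zero_rows_full <- which(rowSums(toy != 0) == 0)
zero_rows_sub  <- which(rowSums(toy_sub != 0) == 0)

print(zero_rows_full)  # no all-zero rows in the full matrix
print(zero_rows_sub)   # rows that became all-zero after dropping columns
```

If the second result is non-empty, the reduced matrix has rows that carry no information, which matches the situation described in this answer.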

#2

For sparse matrices, the xgboost R interface uses the CSC format creation method. The problem currently is that this method automatically determines the number of rows from the existing non-sparse values, so any completely sparse rows at the end are not counted. A similar loss of completely sparse columns at the end can happen with the CSR sparse format. For more details, see xgboost issue #1223 and the Wikipedia article on sparse matrix formats.
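
The effect can be illustrated with the Matrix package alone. This is only a sketch of the inference described above, not xgboost's actual code: computing the row count as "largest stored row index plus one" is an assumption made here to mimic that behavior, and all variable names are hypothetical.

```r
library(Matrix)

# A 4 x 3 sparse matrix in CSC form whose last two rows are entirely zero.
sp <- sparseMatrix(i = c(1, 2), j = c(1, 3), x = c(1, 2), dims = c(4, 3))

# CSC storage keeps only the non-zero entries:
#   sp@i  zero-based row index of each stored value
#   sp@p  column pointers
#   sp@x  the stored values
true_rows <- nrow(sp)

# Assumed inference: guess the row count from the stored entries only.
inferred_rows <- max(sp@i) + 1L

print(true_rows)      # declared number of rows
print(inferred_rows)  # rows visible from the non-zero entries alone
```

Here the trailing all-zero rows leave no trace in the stored entries, so a label vector of length `true_rows` would be longer than the inferred row count, which is the mismatch the error message reports.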

#3

In my case, I fixed this error by changing the assignment operation:

labels <- df_train$target_feature
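
The answer does not say what the assignment was changed from. One possible cause, offered here only as an assumption, is that the labels were previously a one-column data frame rather than a plain vector, in which case `length()` reports the number of columns instead of the number of rows. A minimal base R sketch with made-up data:

```r
# Hypothetical training data; the column names follow the answer above.
df_train <- data.frame(x = c(0.1, 0.2, 0.3),
                       target_feature = c(0, 1, 0))

# `$` extracts a plain numeric vector, one element per row.
labels_vec <- df_train$target_feature

# Single-bracket indexing keeps a one-column data frame instead.
labels_df <- df_train["target_feature"]

print(length(labels_vec))  # number of rows
print(length(labels_df))   # number of columns, not rows
```

Checking that `length(labels)` equals the number of rows of the feature matrix before building the DMatrix catches this kind of mismatch early.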
