I'm generating a feature dataset for machine learning, and I have a 2D NumPy array X where X.shape = (n, d): n samples, d features.
Now I generate a new one-hot-encoded feature f, where f.shape = (n, 1, k): n samples, k labels.
What would be the best way for me to add this new feature to my existing feature dataset?
1 solution
#1
The second dimension of the one-hot array is redundant, so you can drop it and use f as a 2D array of shape (n, k). You would do something like:
new_data = np.concatenate((X, f.squeeze()), axis=1)
where the squeeze() function removes all length-1 dimensions from your array (i.e. f.squeeze().shape == (n, k)). If n or k could itself be 1, use f.squeeze(axis=1) instead, which removes only the second dimension.
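For a quick check of the shapes, here is a minimal sketch of the same idea. The sizes, the random X, and the labels array are made up for illustration and are not from the question; it uses squeeze(axis=1) so that only the middle axis is dropped.

```python
import numpy as np

# Hypothetical sizes and data, chosen only for illustration.
n, d, k = 4, 3, 5
X = np.random.rand(n, d)                 # existing features, shape (n, d)
labels = np.array([0, 2, 4, 1])          # hypothetical class labels, one per sample
f = np.eye(k)[labels].reshape(n, 1, k)   # one-hot feature, shape (n, 1, k)

# Drop only the redundant middle axis; axis=1 stays correct even if n == 1 or k == 1.
f2d = f.squeeze(axis=1)                  # shape (n, k)

# Append the one-hot columns to the right of the existing features.
new_data = np.concatenate((X, f2d), axis=1)
print(new_data.shape)  # (4, 8)
```

np.hstack((X, f2d)) should give the same result for 2D inputs, so pick whichever reads better in your code.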
Cheers