Append inverse-scaled NN output and model.predict_classes output to a CSV

I have trained a neural network and want to append the predicted classes to the inverse-scaled test data, so I can compare the predictions against the original feature values. However, when I run the code, the following line:

Xtest["prediciton"] = pred

throws the following error:

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

I believe this is because Xtest becomes an np.ndarray after the following line:

Xtest = scaler.inverse_transform(Xtest)

Here's the full code:

import keras
import numpy
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
import pandas as pd
import numpy as np
import matplotlib
from matplotlib import style
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from IPython.core.display import display
from sklearn.preprocessing import MinMaxScaler

matplotlib.style.use('ggplot')

data_num = pd.read_csv('mult_test.csv')
print(data_num.head(n=10))

scaler = MinMaxScaler(feature_range=(0, 1))
features = data_num.drop(['Label1'], axis=1, errors='ignore')
features = pd.DataFrame(scaler.fit_transform(features))
scale_num_data = pd.concat([data_num['Label1'], features], axis=1)


dtrain, dtest = train_test_split(scale_num_data, test_size=0.25, random_state=570)
X = dtrain.drop(['Label1'], axis=1, errors='ignore')
y = dtrain['Label1']
Xtest = dtest.drop(['Label1'], axis=1, errors='ignore')
Xtest.to_csv('X_test_1.csv')
ytest = dtest['Label1']


model = Sequential([
    Dense(10, input_shape=(4, ), activation='relu'),
    Dense(32, activation='relu'),
    Dense(10, activation='softmax')
])

model.summary()
model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=10, batch_size=10, shuffle=True)


scores = model.evaluate(Xtest, ytest, batch_size=5)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))


pred = model.predict_classes(Xtest)
Xtest = scaler.inverse_transform(Xtest)
Xtest["prediciton"] = pred
Xtest.to_csv("Xtest_predict.csv")

Thank you for your help, guys!

1 Solution

#1

By default, scikit-learn's transformers don't preserve pandas DataFrames. When you pass one through transform or inverse_transform, an np.ndarray comes out.
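A minimal sketch of that behaviour, assuming a small made-up frame (the column names `a` and `b` are hypothetical and not from the question's data):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy frame standing in for the question's feature columns
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})

scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(df)            # DataFrame in, ndarray out
restored = scaler.inverse_transform(scaled)  # still an ndarray

print(type(scaled).__name__)
print(type(restored).__name__)
```

Both results are plain NumPy arrays with default settings, so the index and column labels of `df` are gone after either call.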

Your code then tries to assign to Xtest["prediciton"], but a string key is not a valid index for an np.ndarray, hence the IndexError.
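The error can be reproduced without any model at all. This is only an illustration on a hypothetical zero-filled array, not the question's data:

```python
import numpy as np

# Hypothetical array standing in for the output of scaler.inverse_transform
arr = np.zeros((3, 2))

message = None
try:
    # A string key works on a DataFrame, but not on a plain ndarray
    arr["prediction"] = np.array([0, 1, 2])
except IndexError as exc:
    message = str(exc)
    print(message)
```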

To fix this, build a new DataFrame from the result of scaler.inverse_transform instead of assigning that result back to Xtest:

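# inverse_transform returns a 2-D np.ndarray, so wrap it in a DataFrame first;
# reusing Xtest's index and columns keeps the rows aligned with the test split.
results = pd.DataFrame(
    scaler.inverse_transform(Xtest), index=Xtest.index, columns=Xtest.columns
)
# Predict on the still-scaled Xtest, then attach the classes as a new column.
results["predictions"] = model.predict_classes(Xtest)
results.to_csv("Xtest_predict.csv")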
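For anyone who wants to try the pattern without training a network, here is a self-contained sketch. The data, the column names `f0` to `f3`, and the random "probabilities" are all made up for illustration; `np.argmax(..., axis=1)` stands in for the class prediction, which is also the usual replacement in Keras versions that no longer ship `predict_classes`.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: four feature columns, matching input_shape=(4,) in the question
rng = np.random.default_rng(0)
raw = pd.DataFrame(rng.uniform(0, 100, size=(20, 4)),
                   columns=["f0", "f1", "f2", "f3"])

scaler = MinMaxScaler(feature_range=(0, 1))
scaled = pd.DataFrame(scaler.fit_transform(raw),
                      columns=raw.columns, index=raw.index)
_, Xtest = train_test_split(scaled, test_size=0.25, random_state=570)

# Stand-in for model.predict(Xtest): fake probabilities over 10 classes
probs = rng.uniform(size=(len(Xtest), 10))
pred = np.argmax(probs, axis=1)

# Keep Xtest untouched; put the inverse-scaled values in a new DataFrame
results = pd.DataFrame(scaler.inverse_transform(Xtest),
                       columns=Xtest.columns, index=Xtest.index)
results["prediction"] = pred
```

Because `results` reuses the index of `Xtest`, each row can be traced back to the original row in `raw`, and `results.to_csv("Xtest_predict.csv")` would write features and predictions side by side.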
