I am trying to write an SVM following this tutorial but using my own data. https://pythonprogramming.net/preprocessing-machine-learning/?completed=/linear-svc-machine-learning-testing-data/
I keep getting this error:
ValueError: The number of classes has to be greater than one; got 1
My code is:
import numpy as np
import pandas as pd
from sklearn import svm

header1 = ["Number of Sides", "Standard Deviation of Number of Sides/Perimeter",
           "Standard Deviation of the Angles", "Largest Angle"]
header2 = ["Label"]
features = header1
features1 = header2

def Build_Data_Set():
    data_df = pd.DataFrame.from_csv("featureVectors.csv")
    #data_df = data_df[:3]
    X = np.array(data_df[features].values)
    data_df2 = pd.DataFrame.from_csv("labels.csv")
    y = np.array(data_df2[features1].replace("Circle",0).replace("Triangle",1)
                 .replace("Square",2).replace("Parallelogram",3)
                 .replace("Rectangle",4).values.tolist())
    return X,y

def Analysis():
    test_size = 4
    X,y = Build_Data_Set()
    print(len(X))
    clf = svm.SVC(kernel = 'linear', C = 1.0)
    clf.fit(X[:-test_size],y[:-test_size])
    correct_count = 0
    for x in range(1, test_size+1):
        if clf.predict(X[-x])[0] == y[-x]:
            correct_count += 1
    print("Accuracy:", (correct_count/test_size) * 100.00)
The feature array used for X looks like this:
[[4, 0.001743713493735165, 0.6497055601752815, 90.795723552739275],
[4, 0.0460937435599832, 0.19764217920409227, 90.204147248752378],
[1, 0.001185534503063044, 0.3034913722821194, 60.348908179729023],
[1, 0.015455289770298222, 0.8380914254332884, 109.02120657826231],
[3, 0.0169961646358455, 0.2458746325894564, 136.83829993466398]]
The label array used for y looks like this:
['Square', 'Square', 'Circle', 'Circle', 'Triangle']
I have only used 5 sets of data so far because I knew the program wasn't working.
I have attached pictures of the values in their csv files in case that helps.
Printing X.shape and y.shape and showing the full error
1 answer
Looks to me like the problem is this line:
clf.fit(X[:-test_size],y[:-test_size])
Since X has 5 rows and you've set test_size to 4, X[:-test_size] gives only one row (the first one). If this confuses you, read up on Python's slice notation: "Explain Python's slice notation".
So there is only one class in the training set ("Square" in this case). I wonder if you meant X[:test_size], which would give the first 4 rows. Either way, try training on a bigger data set.
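To make the slicing concrete, here is a small sketch using only the five labels from the question. The variable names (train_neg, train_pos) are just for illustration and do not come from the original code.

```python
import numpy as np

# Labels copied from the question
y = np.array(['Square', 'Square', 'Circle', 'Circle', 'Triangle'])
test_size = 4

# y[:-test_size] means "everything except the last 4 items": one label here
train_neg = y[:-test_size]
# y[:test_size] means "the first 4 items"
train_pos = y[:test_size]

print(train_neg, np.unique(train_neg).size)  # a single class
print(train_pos, np.unique(train_pos).size)  # two classes
```

Note that with 5 rows, training on the first 4 while testing on the last 4 would overlap (rows 1 to 3, zero-indexed, would be in both sets), so a smaller test_size or more data is probably the better fix.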
I can reproduce your error with the following:
import numpy as np
from sklearn import svm

X = np.array([[4, 0.001743713493735165, 0.6497055601752815, 90.795723552739275],
              [4, 0.0460937435599832, 0.19764217920409227, 90.204147248752378],
              [1, 0.001185534503063044, 0.3034913722821194, 60.348908179729023],
              [1, 0.015455289770298222, 0.8380914254332884, 109.02120657826231],
              [3, 0.0169961646358455, 0.2458746325894564, 136.83829993466398]])
y = np.array(['Square', 'Square', 'Circle', 'Circle', 'Triangle'])
print(X.shape)  # (5, 4)
print(y.shape)  # (5,)

clf = svm.SVC(kernel='linear', C=1.0)
test_size = 4
# Only one row (and so only one class) is left for training, so this raises:
# ValueError: The number of classes has to be greater than one; got 1
clf.fit(X[:-test_size], y[:-test_size])
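For comparison, here is a rough sketch of a split that does fit on the same five rows. It assumes you hold out only the last row (test_size = 1, my choice, not from the question) and adds a guard that checks the training labels before fitting. The names X_train, y_train, X_test, y_test and n_classes are mine.

```python
import numpy as np
from sklearn import svm

X = np.array([[4, 0.001743713493735165, 0.6497055601752815, 90.795723552739275],
              [4, 0.0460937435599832, 0.19764217920409227, 90.204147248752378],
              [1, 0.001185534503063044, 0.3034913722821194, 60.348908179729023],
              [1, 0.015455289770298222, 0.8380914254332884, 109.02120657826231],
              [3, 0.0169961646358455, 0.2458746325894564, 136.83829993466398]])
y = np.array(['Square', 'Square', 'Circle', 'Circle', 'Triangle'])

test_size = 1  # assumption: hold out a single row, leaving 4 for training
X_train, y_train = X[:-test_size], y[:-test_size]
X_test, y_test = X[-test_size:], y[-test_size:]

# Guard: SVC needs at least two distinct classes in the training labels
n_classes = np.unique(y_train).size
if n_classes < 2:
    raise ValueError("training set contains only %d class(es)" % n_classes)

clf = svm.SVC(kernel='linear', C=1.0)
clf.fit(X_train, y_train)

# predict expects a 2D array; X[-test_size:] keeps the row dimension
predictions = clf.predict(X_test)
accuracy = np.mean(predictions == y_test) * 100.0
print("Accuracy:", accuracy)
```

The held-out row is a Triangle, which the classifier never saw during training, so it cannot be predicted correctly here. That is another sign the data set is too small. Also, in recent scikit-learn versions a single row such as X[-x] has to be passed as X[-x].reshape(1, -1), because predict rejects 1D input.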