I am trying to compare each element of just_test_data against every element of just_train_data and return the lowest distance, then do the same for the next element of just_test_data, and so on until every element of just_test_data has been processed.
The error I keep getting is on this line, the first time I try to run the loop:
    step_1 = (abs(just_test_data[i] - just_train_data[n]) ** 2)
    IndexError: arrays used as indices must be of integer (or boolean) type
    import numpy as np
    testing_data = np.genfromtxt("C:\Users\zkrumlinde\Desktop\Statistical Programming\Week 3\iris-testing-data.csv", delimiter= ',')
    training_data = np.genfromtxt("C:\Users\zkrumlinde\Desktop\Statistical Programming\Week 3\iris-training-data.csv", delimiter= ',')
    #create 4 arrays, the first two with the measurements of training and testing data
    #the last two have the labels of each line
    just_test_data = np.array(testing_data[:, 0:4])
    just_train_data = np.array(training_data[:, 0:4])
    testing_labels = np.array(testing_data[:, 4])
    training_labels = np.array(training_data[:, 4])
    n = 0
    while n < len(just_train_data):
        for i in just_test_data:
            old_distance = 'inf'
            step_1 = (abs(just_test_data[i] - just_train_data[n]) ** 2)
            step_2 = sum(step_1)
            new_distance = np.sqrt(step_2)
            if new_distance < old_distance:
                old_distance = new_distance
                index = n
            n = n + 1
    print(training_labels[index])
2 Answers
#1
1
By using for i in just_test_data you're iterating over the elements (the rows) of the just_test_data array, not over an index between 0 and the array length.
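To make the difference concrete, here is a minimal sketch. The array sample_data is a made-up stand-in for just_test_data (it is not the asker's CSV), and the other names are only for illustration. It shows that looping over a 2-D array yields float rows, that using such a row as an index raises the IndexError from the question, and that range(len(...)) or enumerate gives integer positions instead.

```python
import numpy as np

# Hypothetical stand-in for just_test_data: 3 samples, 4 measurements each.
sample_data = np.array([[5.1, 3.5, 1.4, 0.2],
                        [6.2, 2.9, 4.3, 1.3],
                        [7.7, 3.0, 6.1, 2.3]])

# Iterating over the array yields rows (float arrays), not positions.
rows = [row for row in sample_data]

# Using a float row as an index raises IndexError, as in the question.
try:
    sample_data[rows[0]]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

# Iterating over range(len(...)) yields integer positions instead.
positions = list(range(len(sample_data)))

# enumerate gives the position and the row together in one pass.
pairs = [(i, row) for i, row in enumerate(sample_data)]
```

If the loop body only needs the row and never the position, iterating directly and using the row itself (rather than indexing with it) also works.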
Also, it seems that your n = n + 1 line is not indented correctly.
Here's my guess for an updated version of your code:
    import numpy as np
    # raw strings (r"...") keep the backslashes in the Windows paths from being read as escapes
    testing_data = np.genfromtxt(r"C:\Users\zkrumlinde\Desktop\Statistical Programming\Week 3\iris-testing-data.csv", delimiter=',')
    training_data = np.genfromtxt(r"C:\Users\zkrumlinde\Desktop\Statistical Programming\Week 3\iris-training-data.csv", delimiter=',')
    # create 4 arrays, the first two with the measurements of training and testing data
    # the last two have the labels of each line
    just_test_data = np.array(testing_data[:, 0:4])
    just_train_data = np.array(training_data[:, 0:4])
    testing_labels = np.array(testing_data[:, 4])
    training_labels = np.array(training_data[:, 4])
    n = 0
    while n < len(just_train_data):
        # i is now an integer position, so just_test_data[i] is a valid lookup
        for i in range(len(just_test_data)):
            old_distance = float('inf')  # a float, so the comparison below is numeric
            step_1 = (abs(just_test_data[i] - just_train_data[n]) ** 2)
            step_2 = sum(step_1)
            new_distance = np.sqrt(step_2)
            if new_distance < old_distance:
                old_distance = new_distance
                index = n
        # advance to the next training row once per pass of the while loop
        n = n + 1
    print(training_labels[index])
#2
1
When you say for i in just_test_data:, i will be the element itself, not the index.
You probably want something like for i in range(len(just_test_data)). This makes i a number from 0 to len(just_test_data) - 1.
Edit: a few weird things in your code:
    step_1 = (abs(just_test_data[i] - just_train_data[n]) ** 2)
    step_2 = sum(step_1)
    new_distance = np.sqrt(step_2)
For single numbers this just returns abs(just_test_data[i] - just_train_data[n]); for two rows of measurements it is the Euclidean distance between them. Are you meaning to add a ton of step_1 values up and then eventually take the sqrt? You need to check your indents.
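As a quick check of what those three lines compute, here is a small sketch with made-up rows a and b (hypothetical values, not taken from the iris files). For rows of measurements the result matches np.linalg.norm of the difference, and for single numbers it collapses to the absolute difference.

```python
import numpy as np

a = np.array([5.0, 3.0, 1.0, 0.0])  # hypothetical test row
b = np.array([2.0, 7.0, 1.0, 0.0])  # hypothetical training row

step_1 = np.abs(a - b) ** 2   # element-wise squared differences: 9, 16, 0, 0
step_2 = np.sum(step_1)       # sum of the squares: 25
distance = np.sqrt(step_2)    # square root of the sum: 5

# The same Euclidean distance in a single call.
same_distance = np.linalg.norm(a - b)

# With single numbers, square-then-sqrt is just the absolute difference.
scalar_distance = np.sqrt(np.sum(np.abs(3.0 - 7.0) ** 2))
```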
old_distance = 'inf' is a string (pretty sure). You are probably looking for either np.inf or float('inf'). Also, because you set this inside the for loop, it gets reset for every i. You probably want it above 'for i in just_test_data:'.
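Both points can be seen in a short sketch. The distances list is invented for illustration, and the variable names are hypothetical. On Python 3, comparing a number with the string 'inf' raises a TypeError, and resetting the running minimum inside the loop means the last item always wins.

```python
# 'inf' in quotes is a str; on Python 3 comparing it with a number fails.
try:
    1.5 < 'inf'
    comparison_error = None
except TypeError as exc:
    comparison_error = type(exc).__name__

distances = [2.5, 0.7, 1.9]  # made-up distances for illustration

# Reset inside the loop: every distance beats infinity,
# so the index of the last item is always the one kept.
for n, d in enumerate(distances):
    best = float('inf')
    if d < best:
        best = d
        index_reset_each_time = n

# Initialised once before the loop: the running minimum survives
# from one iteration to the next, so the true minimum is found.
best = float('inf')
for n, d in enumerate(distances):
    if d < best:
        best = d
        index_kept = n
```

Here index_reset_each_time ends up pointing at the last item, while index_kept points at the smallest distance.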
A quick pass at your code:
    min_distance = np.inf
    for n in range(len(just_train_data)):
        step_2 = 0
        for i in range(len(just_test_data)):
            step_1 = (just_test_data[i] - just_train_data[n]) ** 2
            # step_1 is a row of squared differences; np.sum reduces it to one number
            step_2 += np.sum(step_1)
        distance = np.sqrt(step_2)
        if distance < min_distance:
            min_distance = distance
            index = n
    print(training_labels[index])
This compares a point in just_train_data to all the points in just_test_data to compute a distance. It then prints the training label of the point with the smallest of these distances.
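If the goal is what the question describes (for each test row, find the closest training row and report its label), one possible reading is a nearest-neighbour lookup. The sketch below is only that, a sketch: the arrays are small invented stand-ins for the CSV data, and nearest_labels is a hypothetical helper name, not something from the original code.

```python
import numpy as np

# Made-up stand-ins for just_train_data, training_labels and just_test_data.
train_points = np.array([[5.0, 3.4, 1.5, 0.2],
                         [6.0, 2.8, 4.5, 1.4],
                         [7.0, 3.1, 6.0, 2.2]])
train_labels = np.array([1.0, 2.0, 3.0])
test_points = np.array([[5.1, 3.5, 1.4, 0.3],
                        [6.9, 3.0, 5.8, 2.1]])


def nearest_labels(test, train, labels):
    """Return, for each test row, the label of the closest training row."""
    predictions = []
    for test_row in test:
        # Broadcasting subtracts test_row from every training row at once;
        # axis=1 sums the squared differences within each row.
        distances = np.sqrt(np.sum((train - test_row) ** 2, axis=1))
        # argmin gives the position of the smallest distance.
        predictions.append(labels[np.argmin(distances)])
    return np.array(predictions)


predicted = nearest_labels(test_points, train_points, train_labels)
```

The outer loop runs over the test rows and the minimum is taken over the training rows, which is the reverse of the nesting in the code above.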