I have the code below where a simple rule based classification data set is formed:
# # Data preparation
data = data.frame(A = round(runif(100)), B = round(runif(100)), C = round(runif(100)))
# Y - is the classification output column
data$Y = ifelse(data$A == 1 & data$B == 1 & data$C == 0, 1,
         ifelse(data$A == 0 & data$B == 1 & data$C == 1, 1,
         ifelse(data$A == 0 & data$B == 0 & data$C == 0, 1, 0)))
# Shuffling the data set
data = data[sample(rownames(data)), ]
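As a quick sanity check (an illustrative addition, not part of the original code), the rule can be verified by tabulating Y within each input pattern; the mean of Y should be exactly 1 for the three positive combinations and 0 for the rest:

```r
# Mean of Y within each (A, B, C) combination: 1 for the three
# rule-positive patterns, 0 otherwise
aggregate(Y ~ A + B + C, data = data, FUN = mean)
```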
I have divided the data set into training and testing so that I can validate my results on the test set:
# # Divide into train and test
library(caret)
trainIndex = createDataPartition(data[, "Y"], p = .7, list = FALSE, times = 1) # for balanced sampling
train = data[trainIndex, ]
test = data[-trainIndex, ]
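Since createDataPartition stratifies on Y, the class proportions in train and test should be close to those of the full data. A quick check (illustrative, not in the original code):

```r
# Class balance should be similar in both splits
prop.table(table(train$Y))
prop.table(table(test$Y))
```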
I have tried building a simple neural net where the number of neurons in the hidden layer is chosen by looping (as mentioned here):
# # Build a neural net
library(neuralnet)
for(alpha in 2:10)
{
nHidden = round(nrow(train)/(alpha*(3+1)))
nn = neuralnet(Y ~ A + B + C, train, linear.output = F, likelihood = T, err.fct = "ce", hidden = nHidden)
# Calculate Mean Squared Error for Train and Test
trainMSE = mean((round(nn$net.result[[1]]) - train$Y)^2)
testPred = round(compute(nn,test[-length(ncol(test))])$net.result)
testMSE = mean((testPred - test$Y)^2)
print(paste("Train Error: " , round(trainMSE, 4), ", Test Error: ", round(testMSE, 4), ", #. Hidden = ", nHidden, sep = ""))
}
[1] "Train Error: 0, Test Error: 0.6, #. Hidden = 9"
[1] "Train Error: 0, Test Error: 0.6, #. Hidden = 6"
[1] "Train Error: 0, Test Error: 0.6, #. Hidden = 4"
[1] "Train Error: 0, Test Error: 0.6, #. Hidden = 4"
[1] "Train Error: 0.1429, Test Error: 0.8333, #. Hidden = 3"
[1] "Train Error: 0.1429, Test Error: 0.8333, #. Hidden = 2"
[1] "Train Error: 0.0857, Test Error: 0.6, #. Hidden = 2"
[1] "Train Error: 0.1429, Test Error: 0.8333, #. Hidden = 2"
[1] "Train Error: 0.0857, Test Error: 0.6, #. Hidden = 2"
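The hidden-layer sizes printed above follow directly from the heuristic in the loop, N_hidden = N_train / (alpha * (N_inputs + N_outputs)) with 3 inputs and 1 output. Assuming roughly 70 training rows (70% of 100), the sequence can be reproduced in isolation:

```r
# Hidden-layer size heuristic for alpha = 2..10, assuming 70 training
# rows, 3 input neurons and 1 output neuron
sapply(2:10, function(alpha) round(70 / (alpha * (3 + 1))))
# 9 6 4 4 3 2 2 2 2 -- matching the sizes printed above
```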
It was giving poor, overfitted results. However, when I built a simple random forest on the same data set, I got both train and test errors of 0:
# # Build a Random Forest
trainRF = train
trainRF$Y = as.factor(trainRF$Y)
testRF = test
library(randomForest)
rf = randomForest(Y ~ ., data = trainRF, mtry = 2)
# Calculate Mean Squared Error for Train and Test
trainMSE = mean((round(rf$votes[,2]) - as.numeric(as.character(trainRF$Y)))^2)
testMSE = mean((round(predict(rf, testRF, type = "prob")[,2]) - as.numeric(as.character(testRF$Y)))^2)
print(paste("Train Error: " , round(trainMSE, 4), ", Test Error: ", round(testMSE, 4), sep = ""))
[1] "Train Error: 0, Test Error: 0"
Please help me understand why the neural net is failing in a simple case where the random forest works with 100% accuracy.
Note: I have used only one hidden layer (assuming one hidden layer is enough for such a simple classification) and iterated over the number of neurons in the hidden layer.
Also, please correct me if my understanding of the neural network parameters is wrong.
Complete code can be found here
1 Answer

#1
A similar question has been haunting me for some time, so I tried understanding your data and problem and compared them to mine. In the end, though, it's just a small bug in this line:
testPred = round(compute(nn,test[-length(ncol(test))])$net.result)
You select B, C and Y for prediction, instead of A, B and C, because length(ncol(something)) will always return 1. You want only test[-ncol(test)].
> summary(test[-length(ncol(test))])
B C Y
Min. :0.00 Min. :0.0 Min. :0.0000000
1st Qu.:0.00 1st Qu.:0.0 1st Qu.:0.0000000
Median :0.00 Median :0.5 Median :0.0000000
Mean :0.48 Mean :0.5 Mean :0.3766667
3rd Qu.:1.00 3rd Qu.:1.0 3rd Qu.:1.0000000
Max. :1.00 Max. :1.0 Max. :1.0000000
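With that fix, the evaluation line from the question becomes (a sketch, reusing the nn and test objects defined there):

```r
# Drop only the last column (Y) by index, so the net is fed A, B and C
testPred = round(compute(nn, test[-ncol(test)])$net.result)
testMSE = mean((testPred - test$Y)^2)
```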