I am trying to figure out how isolated certain points are within my data set. I am using two measures of isolation: the distance to the nearest neighbor and the number of neighboring sites within a given radius. All my coordinates are in latitude and longitude.
This is what my data looks like:
pond lat long area canopy avg.depth neighbor n.lat n.long n.distance n.area n.canopy n.depth n.avg.depth radius1500
A10 41.95928 -72.14605 1500 66 60.61538462
AA006 41.96431 -72.121 250 0 57.77777778
Blacksmith 41.95508 -72.123803 361 77 71.3125
Borrow.Pit.1 41.95601 -72.15419 0 0 41.44444444
Borrow.Pit.2 41.95571 -72.15413 0 0 37.7
Borrow.Pit.3 41.95546 -72.15375 0 0 29.22222222
Boulder 41.918223 -72.14978 1392 98 43.53333333
I want to put the name of the nearest neighboring pond in the column neighbor, its lat and long in n.lat and n.long, the distance between the two ponds in n.distance, and the area, canopy and avg.depth in each of the appropriate columns.
Second, I want to put the number of ponds within 1500m of the target pond into radius1500.
Does anyone know of a function or package that will help me calculate the distances and counts that I want? If it's an issue, it won't be hard to enter the other data I need by hand, but the nearest neighbor's name and distance, plus the number of ponds within 1500m, are what I really need help with.
Thank you.
2 Solutions
#1
32
The best option is to use the libraries sp and rgeos, which let you construct spatial classes and perform geoprocessing.
library(sp)
library(rgeos)
Read the data and transform them to spatial objects:
mydata <- read.delim('d:/temp/testfile.txt', header=TRUE)
sp.mydata <- mydata
coordinates(sp.mydata) <- ~long+lat
class(sp.mydata)
[1] "SpatialPointsDataFrame"
attr(,"package")
[1] "sp"
Now calculate pairwise distances between points
d <- gDistance(sp.mydata, byid=TRUE)
Find the second shortest distance in each row (the shortest is the zero distance from a point to itself, so the second shortest belongs to the nearest neighbor):
min.d <- apply(d, 1, function(x) order(x, decreasing=FALSE)[2])
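To see why order() is used for the index while sort() is used later for the distance itself, here is a small sketch on a made-up 3x3 distance matrix. The values and the names p1 to p3 are invented for illustration: order() returns positions, which can index rows of the data, whereas sort() returns the distances.

```r
# Toy symmetric distance matrix; values and names are made up for illustration
d <- matrix(c(0, 5, 2,
              5, 0, 4,
              2, 4, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("p1", "p2", "p3"), c("p1", "p2", "p3")))

# Position of the second smallest entry in each row (the smallest is the
# zero distance from a point to itself)
nn.index <- apply(d, 1, function(x) order(x, decreasing = FALSE)[2])

# The second smallest distance itself
nn.dist <- apply(d, 1, function(x) sort(x, decreasing = FALSE)[2])
```

Note that if two points share identical coordinates, the second smallest distance in their rows is also zero, so duplicates are worth checking for first.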
Construct new data frame with desired variables
newdata <- cbind(mydata, mydata[min.d,], apply(d, 1, function(x) sort(x, decreasing=FALSE)[2]))
colnames(newdata) <- c(colnames(mydata), 'neighbor', 'n.lat', 'n.long', 'n.area', 'n.canopy', 'n.avg.depth', 'distance')
newdata
pond lat long area canopy avg.depth neighbor n.lat n.long n.area n.canopy n.avg.depth
6 A10 41.95928 -72.14605 1500 66 60.61538 Borrow.Pit.3 41.95546 -72.15375 0 0 29.22222
3 AA006 41.96431 -72.12100 250 0 57.77778 Blacksmith 41.95508 -72.12380 361 77 71.31250
2 Blacksmith 41.95508 -72.12380 361 77 71.31250 AA006 41.96431 -72.12100 250 0 57.77778
5 Borrow.Pit.1 41.95601 -72.15419 0 0 41.44444 Borrow.Pit.2 41.95571 -72.15413 0 0 37.70000
4 Borrow.Pit.2 41.95571 -72.15413 0 0 37.70000 Borrow.Pit.1 41.95601 -72.15419 0 0 41.44444
5.1 Borrow.Pit.3 41.95546 -72.15375 0 0 29.22222 Borrow.Pit.2 41.95571 -72.15413 0 0 37.70000
6.1 Boulder 41.91822 -72.14978 1392 98 43.53333 Borrow.Pit.3 41.95546 -72.15375 0 0 29.22222
distance
6 0.0085954872
3 0.0096462277
2 0.0096462277
5 0.0003059412
4 0.0003059412
5.1 0.0004548626
6.1 0.0374480316
Edit: if the coordinates are in degrees and you would like distances in metric units rather than degrees, use the package geosphere. Note that distm returns meters by default, so divide by 1000 to get kilometers.
library(geosphere)
d <- distm(sp.mydata)
# rest is the same
This should give better results if the points are scattered across the globe and the coordinates are in degrees.
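The question also asks for the number of ponds within 1500 m, which can be read straight off a distance matrix. A minimal sketch, assuming d holds pairwise distances in meters (as distm returns by default); the small matrix below is made up so that the example runs on its own:

```r
# Made-up pairwise distances in meters, standing in for the output of distm()
d <- matrix(c(   0,  800, 2500, 1400,
               800,    0, 1900, 3000,
              2500, 1900,    0, 1500,
              1400, 3000, 1500,    0),
            nrow = 4, byrow = TRUE)

# Count the entries within the radius in each row, then subtract 1 so that a
# point does not count itself (its own distance is 0)
radius <- 1500
radius1500 <- rowSums(d <= radius) - 1
```

With the real data this would become something like mydata$radius1500 <- rowSums(d <= 1500) - 1. Whether the boundary is inclusive (<=) or strict (<) is a choice to make for your own analysis.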
#2
1
The solution proposed by @Zbynek is quite nice, but if you are looking for the distance between two neighbors in km, as I am, I propose this solution.
# Great-circle (haversine) distance in km between two points given in decimal degrees
earth.dist <- function(lat1, long1, lat2, long2){
  rad <- pi/180            # degrees to radians
  a1 <- lat1 * rad
  a2 <- long1 * rad
  b1 <- lat2 * rad
  b2 <- long2 * rad
  dlat <- b1 - a1
  dlon <- b2 - a2
  a <- (sin(dlat/2))^2 + cos(a1)*cos(b1)*(sin(dlon/2))^2
  c <- 2*atan2(sqrt(a), sqrt(1-a))
  R <- 6378.145            # Earth radius in km
  dist <- R * c
  return(dist)
}
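A quick way to sanity-check a haversine function like this one is a case with a known answer: one degree of latitude along a meridian should come out as R * pi / 180, roughly 111.3 km for the radius used here. The sketch below repeats the function in condensed form so that it runs on its own; the names one.degree and zero are illustrative.

```r
# Haversine distance in km (same formula as above, repeated here so the
# example is self-contained)
earth.dist <- function(lat1, long1, lat2, long2) {
  rad <- pi / 180
  dlat <- (lat2 - lat1) * rad
  dlon <- (long2 - long1) * rad
  a <- sin(dlat / 2)^2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dlon / 2)^2
  6378.145 * 2 * atan2(sqrt(a), sqrt(1 - a))
}

# One degree of latitude along the same meridian
one.degree <- earth.dist(0, 0, 1, 0)

# The distance from a point to itself should be zero
zero <- earth.dist(41.95928, -72.14605, 41.95928, -72.14605)
```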
# Pairwise distance matrix (km), labelled with pond names so that sort()
# keeps track of which pond each distance belongs to
n <- nrow(mydata)
ponds <- as.character(mydata$pond)
Dist <- matrix(0, nrow=n, ncol=n, dimnames=list(ponds, ponds))
for (i in 1:n){
  for (j in 1:n){
    Dist[i,j] <- earth.dist(mydata$lat[i], mydata$long[i], mydata$lat[j], mydata$long[j])
  }
}
DDD <- matrix(0, ncol=5, nrow=ncol(Dist)) ### adjust the number of columns to the number of variables you want
for (i in 1:ncol(Dist)){
  sub <- sort(Dist[,i])[2]          # second smallest: the smallest is the pond itself
  DDD[i,1] <- names(sub)
  DDD[i,2] <- sub
  DDD[i,3] <- colnames(Dist)[i]
  sub_neig_atr <- mydata[mydata$pond == names(sub),]
  DDD[i,4] <- sub_neig_atr$area
  DDD[i,5] <- sub_neig_atr$canopy
  ### You can add any variable you want here
}
DDD <- as.data.frame(DDD)
names(DDD) <- c("neighbor_ID","distance","pond","n.area","n.canopy")
DDD$distance <- as.numeric(DDD$distance)   # the matrix stored every value as character
data <- merge(mydata, DDD, by="pond")
You end up with distances in km if your coordinates are longitude and latitude.
Any suggestions to make it better?
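One possible refinement, offered as a sketch rather than a definitive rewrite: because earth.dist uses only arithmetic and trigonometric functions, it accepts whole vectors, so the double loop can be replaced by a single call over all index pairs. The example below repeats the function in condensed form and uses five ponds from the question so that it runs on its own; the names i, j, nearest and result are illustrative.

```r
# Haversine distance in km; works on vectors because it uses only
# arithmetic and trigonometric functions
earth.dist <- function(lat1, long1, lat2, long2) {
  rad <- pi / 180
  dlat <- (lat2 - lat1) * rad
  dlon <- (long2 - long1) * rad
  a <- sin(dlat / 2)^2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dlon / 2)^2
  6378.145 * 2 * atan2(sqrt(a), sqrt(1 - a))
}

# Five ponds taken from the question
mydata <- data.frame(
  pond = c("A10", "AA006", "Blacksmith", "Borrow.Pit.1", "Borrow.Pit.2"),
  lat  = c(41.95928, 41.96431, 41.95508, 41.95601, 41.95571),
  long = c(-72.14605, -72.121, -72.123803, -72.15419, -72.15413),
  stringsAsFactors = FALSE
)

n <- nrow(mydata)
i <- rep(seq_len(n), times = n)   # row index of each matrix cell
j <- rep(seq_len(n), each = n)    # column index of each matrix cell
Dist <- matrix(earth.dist(mydata$lat[i], mydata$long[i],
                          mydata$lat[j], mydata$long[j]),
               nrow = n, dimnames = list(mydata$pond, mydata$pond))

# Set the diagonal to Inf so that a pond cannot be its own nearest neighbor
diag(Dist) <- Inf
nearest <- apply(Dist, 1, which.min)

result <- data.frame(pond = mydata$pond,
                     neighbor = mydata$pond[nearest],
                     n.distance = Dist[cbind(seq_len(n), nearest)])
```

Setting the diagonal to Inf avoids relying on the "second smallest" trick, and building a data frame directly keeps n.distance numeric instead of passing it through a character matrix.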