I have two CSV files that I want to compare, running a calculation on the rows where four conditions are satisfied.
file1:
SN CY Year Month Day Hour Lat Lon
196101 1 1961 1 14 12 8.3 134.7
196101 1 1961 1 14 18 8.8 133.4
196101 1 1961 1 15 0 9.1 132.5
196101 1 1961 1 15 6 9.3 132.2
196101 1 1961 1 15 12 9.5 132
196101 1 1961 1 15 18 9.9 131.8
196125 1 1961 1 14 12 10.0 136
196125 1 1961 1 14 18 10.5 136.5
file2:
Year Month Day RR Hour Lat Lon
1961 1 14 0 0 14.0917 121.055
1961 1 14 0 6 14.0917 121.055
1961 1 14 0 12 14.0917 121.055
1961 1 14 0 18 14.0917 121.055
1961 1 15 0 0 14.0917 121.055
1961 1 15 0 6 14.0917 121.055
I am trying to calculate the distance between the Lat/Lon points from these two files whenever they have the same Year, Month, Day, and Hour. Here is my code:
jtwc <- read.csv("file1.csv", header = T, sep = ",")
stn <- read.csv("file2.csv", header = T, sep = ",")
dms_to_rad <- function(d, m, s) (d + m / 60 + s / 3600) * pi / 180
great_circle_distance <- function(lat1, long1, lat2, long2) {
  a <- sin(0.5 * (lat2 - lat1))
  b <- sin(0.5 * (long2 - long1))
  12742 * asin(sqrt(a * a + cos(lat1) * cos(lat2) * b * b))
}
jtwc$dist <- great_circle_distance(dms_to_rad(jtwc$Lat, 0, 0), dms_to_rad(jtwc$Lon, 0, 0), dms_to_rad(stn$Lat, 0, 0), dms_to_rad(stn$Lon, 0, 0))
write.csv(stn, file = "dist.csv", row.names = T)
The "SN" column is a unique identifier in file1. What I want to do:
[1] Calculate the distance (jtwc$dist) when the two files have the same Year, Month, Day, and Hour.
[2] If several rows in file1 have the same Year, Month, Day, and Hour but different SN numbers, each of them should use the file2 row with that same Year, Month, Day, and Hour when computing the distance.
The output should look like this:
SN CY Year Month Day Hour Lat Lon dist
196101 1 1961 1 14 12 8.3 134.7 1620.961
196101 1 1961 1 14 18 8.8 133.4 1467.859
196101 1 1961 1 15 0 9.1 132.5 1334.382
196101 1 1961 1 15 6 9.3 132.2 1324.915
196125 1 1961 1 14 12 10.0 136 1687.127
196125 1 1961 1 14 18 10.5 136.5 1724.351
Any suggestions on how to do this correctly?
1 solution
#1
If I understand you correctly, you can try this solution:
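library(tidyverse)

# Helper functions
# Convert degrees/minutes/seconds to radians
dms_to_rad <- function(d, m, s) (d + m / 60 + s / 3600) * pi / 180

# Haversine great-circle distance in km; inputs are in radians
# (12742 km is twice the Earth's mean radius of 6371 km)
great_circle_distance <- function(lat1, long1, lat2, long2) {
  a <- sin(0.5 * (lat2 - lat1))
  b <- sin(0.5 * (long2 - long1))
  12742 * asin(sqrt(a * a + cos(lat1) * cos(lat2) * b * b))
}

# Read both files and collapse the date/time columns into a single join key
dir1 <- 'path_to_your_files'  # path to file1
dir2 <- 'path_to_your_files'  # path to file2
jtwc <- read.csv(dir1) %>%
  unite('key', c('Year', 'Month', 'Day', 'Hour'))
stn <- read.csv(dir2) %>%
  unite('key', c('Year', 'Month', 'Day', 'Hour'))

# Join on the key, drop unmatched rows, convert to radians, compute the distance
stn <- left_join(jtwc, stn, by = 'key') %>%
  drop_na() %>%
  mutate(across(c(Lat.x, Lon.x, Lat.y, Lon.y), ~ dms_to_rad(.x, m = 0, s = 0))) %>%
  mutate(dist = great_circle_distance(Lat.x, Lon.x, Lat.y, Lon.y))
write.csv(stn, file = "dist.csv", row.names = TRUE)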
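Note that the pipeline above replaces Year, Month, Day, and Hour with a single key column and leaves the coordinates in radians. If you would rather keep the layout shown in the question, a variant along the following lines might work. It is only a sketch: the data frames are a few rows typed in from the question's samples so that it runs on its own, and the names deg_to_rad, stn_lat, stn_lon, and result are made up for illustration.

```r
library(dplyr)

# A few sample rows copied by hand from file1 and file2 in the question
jtwc <- data.frame(
  SN = c(196101, 196101, 196125),
  CY = 1,
  Year = 1961, Month = 1,
  Day = c(14, 15, 14),
  Hour = c(12, 0, 12),
  Lat = c(8.3, 9.1, 10.0),
  Lon = c(134.7, 132.5, 136)
)
stn <- data.frame(
  Year = 1961, Month = 1,
  Day = c(14, 14, 15),
  RR = 0,
  Hour = c(12, 18, 0),
  Lat = 14.0917, Lon = 121.055
)

# Hypothetical helper: decimal degrees to radians
deg_to_rad <- function(deg) deg * pi / 180

# Same haversine formula as in the question (result in km, inputs in radians)
great_circle_distance <- function(lat1, long1, lat2, long2) {
  a <- sin(0.5 * (lat2 - lat1))
  b <- sin(0.5 * (long2 - long1))
  12742 * asin(sqrt(a * a + cos(lat1) * cos(lat2) * b * b))
}

result <- jtwc %>%
  # Keep only rows whose Year/Month/Day/Hour also appear in stn;
  # the station coordinates are renamed so they do not clash with Lat/Lon
  inner_join(
    stn %>% select(Year, Month, Day, Hour, stn_lat = Lat, stn_lon = Lon),
    by = c("Year", "Month", "Day", "Hour")
  ) %>%
  # Convert to radians only inside the distance call, so Lat/Lon stay in degrees
  mutate(dist = great_circle_distance(
    deg_to_rad(Lat), deg_to_rad(Lon),
    deg_to_rad(stn_lat), deg_to_rad(stn_lon)
  )) %>%
  select(-stn_lat, -stn_lon)

print(result)
```

Joining on the four columns directly means rows with different SN values but the same timestamp each pick up the same station row, which is what point [2] asks for. On the first sample row the distance should come out close to the 1620.961 listed in the question's expected output.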