This question already has an answer here:
- Dictionary style replace multiple items 7 answers
This might be a common use case. I have done this in Python before, but in this case I have to do it in R. How do I replace rd with road, st with street, etc. in R?
Suppose I have a mapping dictionary like this:
dict = {"st": "street", "rd": "road", "Ln": "Lane", "Pl": "Place"}
In my df,
Address
2/20,Queen St,London,UK
1,King Ln,Paris,France
5,Stuart Pl,Paris,France
How do I get this?
Address
2/20,Queen Street,London,UK
1,King Lane,Paris,France
5,Stuart Place,Paris,France
Thanks.
1 answer
#1
0
You can use the function gsub for that. gsub("Ln", "Lane", addresses), where addresses is a vector with your addresses as strings, replaces all occurrences of "Ln" with "Lane". You can use regex with this, but I don't think that really helps you.
So all you have to do is call that function once for each substitution you want to make, and you're done. R doesn't have dictionaries (as far as I know), so doing it all at once would require another format to store your mappings.
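A minimal sketch of the one-call-per-substitution approach, using the sample addresses from the question. The variable names naive and whole are mine, for illustration only. It also shows a pitfall worth knowing about: a bare pattern such as "St" matches inside longer words too, so wrapping it in \\b (word boundary) is one way to restrict the match to whole words.

```r
# Sample data from the question
addresses <- c("2/20,Queen St,London,UK",
               "1,King Ln,Paris,France",
               "5,Stuart Pl,Paris,France")

# A bare pattern also matches inside longer words ("Stuart" contains "St")
naive <- gsub("St", "Street", addresses)
print(naive[3])
# "5,Streetuart Pl,Paris,France"

# Word boundaries (\\b) restrict each match to the whole abbreviation.
# One gsub call per substitution, each working on the previous result.
whole <- gsub("\\bSt\\b", "Street", addresses)
whole <- gsub("\\bLn\\b", "Lane", whole)
whole <- gsub("\\bPl\\b", "Place", whole)
print(whole)
```

This is fine for a handful of substitutions, but it gets repetitive quickly, which is why storing the mapping in one place is worth the effort.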
To answer your question on how to do it for multiple dictionary entries:
Since we don't have dictionaries in R, we take the next best thing: lists. List entries have a name and an object (value, vector, anything really). We can make the name of the entry the dictionary key, and the value its translation:
dict <- list(St = "Street",
Rd = "Road",
Ln = "Lane",
Pl = "Place")
Taking the addresses in your example:
Adresses <- c("2/20,Queen St,London,UK",
"1,King Ln,Paris,France",
"5,Stuart Pl,Paris,France")
Now we can loop over the entries of the list, build the regular expression for each key (using the \b word-boundary markers, as mentioned by @wibeasley, so that only whole words match), and replace each match with the corresponding entry in the list. Each time we overwrite the Adresses vector with the result, so we are applying all the substitutions sequentially.
# Apply each substitution in turn, matching whole words only
for (i in seq_along(dict)) {
  Adresses <- gsub(paste0("\\b", names(dict)[i], "\\b"), dict[[i]], Adresses)
}
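Since the question mentions a df with an Address column, here is a sketch of how the same loop could be wrapped in a small helper and applied to that column. The function name expand_abbreviations and the data frame below are my own stand-ins, assumed for illustration and not taken from the question.

```r
# Mapping stored as a named list: name = abbreviation, value = full word
dict <- list(St = "Street", Rd = "Road", Ln = "Lane", Pl = "Place")

# Hypothetical data frame standing in for the asker's df
df <- data.frame(Address = c("2/20,Queen St,London,UK",
                             "1,King Ln,Paris,France",
                             "5,Stuart Pl,Paris,France"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: replaces every whole-word key of `mapping` in `x`
expand_abbreviations <- function(x, mapping) {
  for (key in names(mapping)) {
    x <- gsub(paste0("\\b", key, "\\b"), mapping[[key]], x)
  }
  x
}

df$Address <- expand_abbreviations(df$Address, dict)
print(df$Address)
```

The match is case-sensitive, so "St" is replaced but "st" is not. If the abbreviations can appear in a different case, passing ignore.case = TRUE to gsub is one option, at the cost of also matching words you may not have intended.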