r - Replace abbreviations in addresses [duplicate]

Time: 2022-12-27 11:47:48

This question already has an answer here:

This is probably a common use case. I have done it in Python before, but this time I have to do it in R. How do I replace "Rd" with "Road", "St" with "Street", and so on in R?

Suppose I have a mapping dictionary like this,

dict = {"St": "Street", "Rd": "Road", "Ln": "Lane", "Pl": "Place"}

In my df,

Address
2/20,Queen St,London,UK
1,King Ln,Paris,France
5,Stuart Pl,Paris,France

How do I get this,

Address
2/20,Queen Street,London,UK
1,King Lane,Paris,France
5,Stuart Place,Paris,France

Thanks.

1 answer

#1


0  

You can use the function gsub for that. gsub("Ln", "Lane", addresses), where addresses is a character vector holding your addresses, replaces all occurrences of "Ln" with "Lane". The pattern can also be a regular expression, but I don't think you need that for a plain substitution.
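As a minimal sketch of that call, the snippet below uses the three addresses from the question. The variable names (addresses, lane_fixed, naive, bounded) are chosen here for illustration only. It also shows one case where a plain pattern is too greedy: "St" also matches the start of "Stuart", which a \b word-boundary anchor avoids.

```r
# Sample vector built from the addresses in the question
addresses <- c("2/20,Queen St,London,UK",
               "1,King Ln,Paris,France",
               "5,Stuart Pl,Paris,France")

# Plain pattern: replaces every occurrence of "Ln"
lane_fixed <- gsub("Ln", "Lane", addresses)

# Plain "St" also matches inside "Stuart"
naive <- gsub("St", "Street", addresses)

# \\b restricts the match to "St" as a whole word
bounded <- gsub("\\bSt\\b", "Street", addresses)

print(naive[3])    # "5,Streetuart Pl,Paris,France"
print(bounded[3])  # "5,Stuart Pl,Paris,France"
```

Whether the anchors are needed depends on your data; for an abbreviation such as "Ln" that rarely appears inside other words, the plain pattern may be enough.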

So all you have to do is call that function once for each substitution you want to make, and you're done. R doesn't have a built-in dictionary type (as far as I know), so doing it all at once requires another structure to store your mappings.

To answer your question on how to do it for multiple dictionary entries:

Since we don't have dictionaries in R, we take the next best thing: lists. List entries have a name and an object (value, vector, anything really). We can make the name of the entry the dictionary key, and the value its translation:

dict <- list(St = "Street",
             Rd = "Road",
             Ln = "Lane",
             Pl = "Place")
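To show how such a named list behaves like a dictionary, here is a short sketch of the usual lookups. The variable names keys, lane, and has_rd are illustrative only and not part of the original answer.

```r
dict <- list(St = "Street",
             Rd = "Road",
             Ln = "Lane",
             Pl = "Place")

# The entry names play the role of dictionary keys
keys <- names(dict)

# [[ ]] with a name returns the value stored under that key
lane <- dict[["Ln"]]

# Membership test on the keys
has_rd <- "Rd" %in% names(dict)
```

A named character vector, c(St = "Street", ...), would work much the same way here, since every value is a single string.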

Taking the addresses in your example:

Adresses <- c("2/20,Queen St,London,UK",
              "1,King Ln,Paris,France",
              "5,Stuart Pl,Paris,France")

Now we can loop over the entries of the list, build the pattern for each key (wrapping it in \b word-boundary anchors, as mentioned by @wibeasley, so that only whole words match), and replace each match with the corresponding value from the list. Each iteration overwrites the Adresses vector with the result, so the substitutions are applied one after another.

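# Apply each mapping in turn; \\b limits matches to whole words.
# seq_along() is safe even if dict is empty, unlike 1:length(dict).
for (i in seq_along(dict)) {
  Adresses <- gsub(paste0("\\b", names(dict)[i], "\\b"), dict[[i]], Adresses)
}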
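Since the question starts from a data frame rather than a bare vector, the same loop can be wrapped in a function and applied to the Address column. This is only a sketch: the helper name expand_abbreviations and the data frame df are assumptions made for illustration, and the keys are treated as regular expressions, so it assumes they contain only letters.

```r
# Hypothetical helper: applies every mapping in `dict` to a character vector
expand_abbreviations <- function(x, dict) {
  for (key in names(dict)) {
    # Whole-word match on the key, replaced by its value
    x <- gsub(paste0("\\b", key, "\\b"), dict[[key]], x)
  }
  x
}

dict <- list(St = "Street", Rd = "Road", Ln = "Lane", Pl = "Place")

# Data frame mirroring the one in the question
df <- data.frame(Address = c("2/20,Queen St,London,UK",
                             "1,King Ln,Paris,France",
                             "5,Stuart Pl,Paris,France"),
                 stringsAsFactors = FALSE)

df$Address <- expand_abbreviations(df$Address, dict)
print(df)
```

Because the substitutions run in sequence, a replacement value that itself contains another key as a whole word would be expanded again; that does not happen with the four mappings used here.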
