Python Pandas DataFrame replace: removing trailing digits from strings

Date: 2021-06-26 20:23:04

I have a long DataFrame with index values like this:

| burger10  | ...
| pasta25   | ...
| milk      | ...
| yoghurt() | ...

I need to get rid of the trailing digits and parentheses. I am trying to use replace() with a regex, but without success. I tried this:

energy.replace(to_replace='[0-9,\.,\(,\)]+', value='', regex=True, inplace=True)

1 solution

#1

You don't need to escape ( and ) or separate items with commas inside a character class []; characters listed there are taken literally, so a comma would itself be matched. And if you mean trailing, you need the anchor $ to match at the end of the string:

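# regex=True is required on pandas 2.0+, where str.replace defaults to a literal match
energy[0].str.replace(r"[0-9()]+$", "", regex=True)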

#0     burger
#1      pasta
#2       milk
#3    yoghurt
#Name: 0, dtype: object
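
To see what the $ anchor changes, here is a minimal sketch using Python's re module, whose pattern syntax pandas follows when regex matching is enabled. The label "b12vitamin3" is a made-up example, not from the question; it has digits both inside and at the end of the string.

```python
import re

# "b12vitamin3" is a hypothetical label with digits in the middle and at the end.
labels = ["burger10", "b12vitamin3", "yoghurt()"]

# Without the anchor, every run of digits/parentheses is removed, wherever it occurs.
unanchored = [re.sub(r"[0-9()]+", "", s) for s in labels]

# With $, only the run that ends the string is removed.
anchored = [re.sub(r"[0-9()]+$", "", s) for s in labels]

print(unanchored)  # ['burger', 'bvitamin', 'yoghurt']
print(anchored)    # ['burger', 'b12vitamin', 'yoghurt']
```

For the sample data in the question both patterns give the same result; the anchor only matters once a label contains digits or parentheses before its end.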

If the strings are in the index, access them through .index, modify them, and assign the result back to the data frame:

energy.index = energy.index.str.replace(r"[0-9()]+$", "", regex=True)
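
A self-contained sketch of the index case, for anyone who wants to run it end to end. The index labels are the ones from the question; the "kcal" column and its values are invented purely so the frame has some data. Note that DataFrame.replace() operates on cell values rather than on the index, which is likely why the original attempt appeared to do nothing.

```python
import pandas as pd

# Index labels come from the question; the "kcal" column is made up for illustration.
energy = pd.DataFrame(
    {"kcal": [250, 130, 42, 59]},
    index=["burger10", "pasta25", "milk", "yoghurt()"],
)

# DataFrame.replace works on the cell values, so the index labels stay as they are.
untouched = energy.replace(to_replace=r"[0-9()]+$", value="", regex=True)
print(untouched.index.tolist())

# Strip trailing digits/parentheses from the index labels and assign them back.
# regex=True is explicit because pandas 2.0+ treats the pattern literally by default.
energy.index = energy.index.str.replace(r"[0-9()]+$", "", regex=True)
print(energy.index.tolist())
```

The first print still shows the original labels, while the second shows them with the trailing digits and parentheses removed.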
