How do I delete a column from a DataFrame using pandas?

Time: 2021-09-18 01:38:40

I read my data like this:

import pandas as pd
df = pd.read_csv('/path/file.tsv', header=0, delimiter='\t')
print(df)

and get:

          id    text
0    361.273    text1...
1    374.350    text2...
2    374.350    text3...

How can I delete the id column from the above data frame? I tried the following:

import pandas as pd
df = pd.read_csv('/path/file.tsv', header=0, delimiter='\t')
print(df.drop('id', axis=1))

But it raises this exception:

ValueError: labels ['id'] not contained in axis

3 answers

#1


12  

To actually delete the column, del df['id'] or df.drop('id', axis=1) should have worked, provided the name you pass matches the column label exactly.

However, if you don't need to delete the column then you can just select the column of interest like so:

In [54]:

df['text']
Out[54]:
0    text1
1    text2
2    textn
Name: text, dtype: object

If you never wanted the column in the first place, you can pass a list of the columns you do want to read_csv via the usecols parameter:

In [53]:
import io
import pandas as pd
temp="""id    text
363.327    text1
366.356    text2
37782    textn"""
# raw string so that \s is passed to the regex engine unchanged
df = pd.read_csv(io.StringIO(temp), delimiter=r'\s+', usecols=['text'])
df
Out[53]:
    text
0  text1
1  text2
2  textn

Regarding your error: it occurs because 'id' is not in your columns, either because it is spelt differently or because it contains whitespace. To check, look at the output of print(df.columns.tolist()), which lists the column labels and shows any leading or trailing whitespace.
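
The check described above can be sketched as follows. The sample data, including the trailing space in the "id " header, is hypothetical and only stands in for whatever the real file contains; stripping the labels with df.columns.str.strip() is one possible fix, not necessarily the right one for every file.

```python
import io
import pandas as pd

# Hypothetical sample: the header is "id " (with a trailing space),
# so the plain label 'id' does not match any column.
raw = "id \ttext\n361.273\ttext1\n374.350\ttext2\n"
df = pd.read_csv(io.StringIO(raw), sep="\t")

before = df.columns.tolist()
print(before)  # the list repr makes the stray space visible

# Strip surrounding whitespace from every label, then drop by the clean name.
df.columns = df.columns.str.strip()
cleaned = df.drop("id", axis=1)
print(cleaned.columns.tolist())
```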

#2


53  

df.drop(colname, axis=1) (or del df[colname]) is the correct method to use to delete a column.

If a ValueError is raised (recent pandas versions raise a KeyError instead), it means the column name is not exactly what you think it is.

Check df.columns to see what pandas thinks the column names are.
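
One way to act on this is to test for the label before dropping it. The helper below is only an illustrative sketch: drop_if_present is a made-up name, not part of pandas, and the small frame is invented sample data.

```python
import pandas as pd

# Invented sample data, shaped like the frame in the question.
df = pd.DataFrame({"id": [361.273, 374.350], "text": ["text1", "text2"]})

def drop_if_present(frame, colname):
    """Drop `colname` if it is a column label; otherwise return the frame as-is."""
    if colname in frame.columns:
        return frame.drop(colname, axis=1)
    # Show what pandas actually has, so a near-miss is easy to spot.
    print(f"{colname!r} not found; available columns: {list(frame.columns)}")
    return frame

without_id = drop_if_present(df, "id")
unchanged = drop_if_present(df, "ID")  # labels are case-sensitive, so no match
```

Printing the available labels on a miss makes a misspelt or padded name visible without raising an exception.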

#3


28  

The best way to delete a column in pandas is to use drop:

df = df.drop('column_name', axis=1)

where 1 is the axis number (0 for rows and 1 for columns).

To delete the column without having to reassign df you can do:

df.drop('column_name', axis=1, inplace=True)
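
The difference between the two forms can be seen in a small sketch. The frame and the variable names here are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "text": ["a", "b"]})

# Without inplace, drop returns a new DataFrame and leaves df untouched.
result = df.drop("id", axis=1)
cols_after_plain_drop = list(df.columns)

# With inplace=True, df itself is modified and the call returns None.
ret = df.drop("id", axis=1, inplace=True)
cols_after_inplace = list(df.columns)

print(cols_after_plain_drop, cols_after_inplace, ret)
```

Because the in-place call returns None, writing df = df.drop(..., inplace=True) would replace df with None, so the two forms should not be combined.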

Finally, to drop by column number instead of by column label, look the labels up by position first. For example, to delete the 1st, 2nd and 4th columns:

df.drop(df.columns[[0, 1, 3]], axis=1)  # df.columns is zero-based pd.Index 
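
As a small worked sketch of the positional form, using an invented five-column frame: the positions are first turned into labels through df.columns, and those labels are what drop receives.

```python
import pandas as pd

# Invented frame with five columns labelled a..e.
df = pd.DataFrame({"a": [1], "b": [2], "c": [3], "d": [4], "e": [5]})

positions = [0, 1, 3]            # 1st, 2nd and 4th columns (zero-based)
labels = df.columns[positions]   # the labels found at those positions
trimmed = df.drop(labels, axis=1)

print(list(labels), list(trimmed.columns))
```

Since the drop still happens by label, a frame with duplicate column names would lose every column sharing one of the selected labels.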


Exceptions:

If a wrong column number or label is requested, an error is raised. To check the number of columns use df.shape[1] or len(df.columns.values), and to check the column labels use df.columns.values.
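
The checks mentioned above could be wrapped in a small guard, sketched below. drop_by_position is a hypothetical helper, not a pandas function, and rejecting negative positions is a choice made here for simplicity.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "text": ["a", "b"]})

def drop_by_position(frame, positions):
    """Drop columns by zero-based position after validating the positions."""
    positions = list(positions)
    n_cols = frame.shape[1]
    bad = [p for p in positions if not 0 <= p < n_cols]
    if bad:
        raise IndexError(f"positions {bad} are out of range for {n_cols} columns")
    return frame.drop(frame.columns[positions], axis=1)

kept = drop_by_position(df, [0])

try:
    drop_by_position(df, [5])
except IndexError as exc:
    message = str(exc)
    print(message)
```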

This answer was based on @LondonRob's answer and is left here to help future visitors of this page.
