How to find numeric columns in Pandas?

Time: 2022-05-17 04:37:04

Let's say df is a pandas DataFrame. I would like to find all columns of numeric type. Something like:

isNumeric = is_numeric(df)

8 solutions

#1


67  

You could use the select_dtypes method of DataFrame. It takes two parameters, include and exclude. So isNumeric would look like:

numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']

newdf = df.select_dtypes(include=numerics)
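
One caveat with the explicit list above: it only matches the dtypes it names, so columns stored as int8 or as unsigned integers would be skipped. A minimal sketch of the difference, assuming a made-up sample frame (the column names are hypothetical) and using the generic "number" selector that select_dtypes also accepts:

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame; the column names are made up for illustration.
df = pd.DataFrame({
    "small": np.array([1, 2, 3], dtype="int8"),
    "unsigned": np.array([1, 2, 3], dtype="uint16"),
    "ratio": [0.1, 0.2, 0.3],
    "label": pd.Series(["a", "b", "c"], dtype=object),
})

# The explicit list only matches the dtypes it names.
numerics = ["int16", "int32", "int64", "float16", "float32", "float64"]
listed = df.select_dtypes(include=numerics).columns.tolist()

# "number" matches np.number and all of its subclasses.
generic = df.select_dtypes(include="number").columns.tolist()

print(listed)   # only the float64 column
print(generic)  # every numeric column, including int8 and uint16
```

If the narrower list is intentional (for example, to leave out small integer codes), it is still the right tool; the generic selector is just less likely to miss a column by accident.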

#2


36  

You can use the following command to filter only numeric columns:

df._get_numeric_data()

Example

In [32]: data
Out[32]:
   A  B
0  1  s
1  2  s
2  3  s
3  4  s

In [33]: data._get_numeric_data()
Out[33]:
   A
0  1
1  2
2  3
3  4

#3


15  

Simple one-line answer to create a new dataframe with only numeric columns:

df.select_dtypes(include=[np.number])

If you want the names of numeric columns:

df.select_dtypes(include=[np.number]).columns.tolist()

Complete code:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': range(7, 10),
                   'B': np.random.rand(3),
                   'C': ['foo','bar','baz'],
                   'D': ['who','what','when']})
df
#    A         B    C     D
# 0  7  0.704021  foo   who
# 1  8  0.264025  bar  what
# 2  9  0.230671  baz  when

df_numerics_only = df.select_dtypes(include=[np.number])
df_numerics_only
#    A         B
# 0  7  0.704021
# 1  8  0.264025
# 2  9  0.230671

colnames_numerics_only = df.select_dtypes(include=[np.number]).columns.tolist()
colnames_numerics_only
# ['A', 'B']

#4


11  

df.select_dtypes(exclude=['object'])
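
Note that excluding object is not quite the same as selecting numbers: boolean and datetime columns are not object either, so they are kept. A small sketch of the difference, assuming a hypothetical frame with one column of each kind:

```python
import pandas as pd

# Hypothetical frame with one column of each kind.
df = pd.DataFrame({
    "n": [1, 2],
    "flag": [True, False],
    "when": pd.to_datetime(["2022-05-17", "2022-05-18"]),
    "text": pd.Series(["x", "y"], dtype=object),
})

# Everything that is not object: keeps bool and datetime columns too.
not_object = df.select_dtypes(exclude=["object"]).columns.tolist()

# Only NumPy numeric dtypes.
numeric = df.select_dtypes(include="number").columns.tolist()

print(not_object)
print(numeric)
```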

#5


2  

import numpy as np
import pandas as pd

def is_type(df, baseType):
    # Flag each column whose dtype's scalar type is a subclass of baseType.
    test = [issubclass(np.dtype(d).type, baseType) for d in df.dtypes]
    return pd.DataFrame(data=test, index=df.columns, columns=["test"])

def is_float(df):
    # np.float was removed in NumPy 1.24; np.floating is the abstract float type.
    return is_type(df, np.floating)

def is_number(df):
    return is_type(df, np.number)

def is_integer(df):
    return is_type(df, np.integer)
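
As an alternative to the hand-written issubclass test, pandas ships dtype predicates in pandas.api.types. The following is only a sketch of how the same per-column report might be built with them; the sample frame and the name report are assumptions made for illustration:

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype, is_numeric_dtype

# Hypothetical sample frame.
df = pd.DataFrame({
    "A": [7, 8, 9],
    "B": [0.5, 1.5, 2.5],
    "C": pd.Series(["foo", "bar", "baz"], dtype=object),
})

# One row per column, one boolean flag per predicate.
report = pd.DataFrame(
    {
        "is_number": [is_numeric_dtype(t) for t in df.dtypes],
        "is_integer": [is_integer_dtype(t) for t in df.dtypes],
        "is_float": [is_float_dtype(t) for t in df.dtypes],
    },
    index=df.columns,
)

print(report)
```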

#6


2  

Adapting this answer, you could do

df.loc[:, df.applymap(np.isreal).all(axis=0)]

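Here, df.applymap(np.isreal) shows whether each cell in the data frame is numeric, and .all(axis=0) checks whether all values in a column are True, returning a Series of Booleans that can be used to index the desired columns. The .ix indexer has been removed from pandas, so .loc is used instead; on pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map.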
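
A cell-by-cell check like this differs from a dtype check when a column has object dtype but happens to hold only numbers. The sketch below illustrates that, with two substitutions that are not part of the answer above: it tests each value with isinstance(v, numbers.Number) rather than np.isreal, and it loops over columns instead of calling applymap. The sample frame and the helper all_cells_numeric are hypothetical.

```python
import numbers

import pandas as pd

# Hypothetical frame: "a" holds numbers but is stored with object dtype.
df = pd.DataFrame({
    "a": pd.Series([1, 2.5], dtype=object),
    "b": pd.Series(["x", "y"], dtype=object),
    "c": [1.0, 2.0],
})

def all_cells_numeric(col):
    # True only if every value in the column is a number (bools count as numbers).
    return all(isinstance(v, numbers.Number) for v in col)

# One Boolean per column, in column order.
mask = [all_cells_numeric(df[name]) for name in df.columns]

by_value = df.loc[:, mask].columns.tolist()
by_dtype = df.select_dtypes(include="number").columns.tolist()

print(by_value)  # value-based: includes the object column holding numbers
print(by_dtype)  # dtype-based: only columns with a numeric dtype
```

The value-based check visits every cell, so on large frames the dtype-based selection is likely to be much cheaper.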

#7


1  

Here is another simple way to find the numeric columns in a pandas DataFrame:

numeric_clmns = df.dtypes[df.dtypes != "object"].index

#8


1  

Please see the below code:

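import numpy as np
from IPython.display import display

# Summarize the numeric columns, if there are any.
if dataset.select_dtypes(include=[np.number]).shape[1] > 0:
    display(dataset.select_dtypes(include=[np.number]).describe())
# Summarize the object (string) columns; np.object was removed in NumPy 1.24.
if dataset.select_dtypes(include=['object']).shape[1] > 0:
    display(dataset.select_dtypes(include=['object']).describe())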

This way you can check whether the values are numeric, such as float and int, or strings. The second if statement checks for string values, which pandas stores with the object dtype.
