Get a single value using Pandas value_counts()

Time: 2022-05-14 11:10:49

Is there a way to get the count of only one specific item in a column?

To clarify, say I use:


countDat = df['country'].value_counts()

Then I'll get something like:


Australia    35
Brazil       32
USA          93

... and so on


Is there a way to extract only the count for Brazil? I just need the number 32 from countDat.

I know countDat[1] will give the count for Brazil, but is there a way to look it up by the key 'Brazil'?

2 solutions

#1


2  

One way is to drop down to NumPy:

res = (df['country'].values == 'Brazil').sum()

See here for benchmarking results from a similar problem.


You should see better performance if the column is stored as Categorical Data, which has other benefits as well.
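As a rough sketch of the two ideas in this answer, the snippet below counts 'Brazil' with a NumPy comparison and then repeats the count after converting the column to a categorical dtype. The sample data is made up for illustration, and whether the categorical version is actually faster depends on your data, so it is worth timing on your own column.

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
df = pd.DataFrame({'country': ['Australia'] * 35 + ['Brazil'] * 32 + ['USA'] * 93})

# NumPy route: compare the underlying array to the key and sum the booleans.
res = (df['country'].values == 'Brazil').sum()

# Categorical route: store the column as category codes, then compare.
cat = df['country'].astype('category')
res_cat = (cat == 'Brazil').sum()

print(res, res_cat)
```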

#2


1  

Consider the data frame df:

df = pd.DataFrame(dict(country=np.array('AUS BRA USA'.split()).repeat([35, 32, 93])))

and its value counts:

countDat = df['country'].value_counts()

countDat

USA    93
AUS    35
BRA    32
Name: country, dtype: int64

per @cᴏʟᴅsᴘᴇᴇᴅ

df.loc[df.country == 'BRA', 'country'].count()

32

per @DSM

countDat["BRA"]

32
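One caveat with the key lookup: indexing with a label that is not in the counts raises a KeyError. A minimal sketch, assuming you would rather get 0 for a missing country, is to use Series.get with a default. The 'FRA' key below is just a made-up example of a label that is not in the data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(dict(country=np.array('AUS BRA USA'.split()).repeat([35, 32, 93])))
countDat = df['country'].value_counts()

# Plain label lookup; raises KeyError if the label is absent.
bra = countDat['BRA']

# .get returns the supplied default instead of raising for a missing label.
fra = countDat.get('FRA', 0)

print(bra, fra)
```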

Boolean sum

df.country.eq('BRA').sum()

query + len

len(df.query('country == "BRA"'))

groupby + len

len(df.groupby('country').groups['BRA'])
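To check that the approaches in this answer agree, here is a small sketch that runs each one against the same example frame and collects the results. The dictionary labels are only there for the printout; they are not pandas names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(dict(country=np.array('AUS BRA USA'.split()).repeat([35, 32, 93])))

# Each entry computes the number of 'BRA' rows a different way.
counts = {
    'value_counts': df['country'].value_counts()['BRA'],
    'loc + count': df.loc[df.country == 'BRA', 'country'].count(),
    'boolean sum': df.country.eq('BRA').sum(),
    'query + len': len(df.query('country == "BRA"')),
    'groupby + len': len(df.groupby('country').groups['BRA']),
}

for name, n in counts.items():
    print(name, n)
```

Note that value_counts and groupby do work for every country before you pick one out, so if you only ever need a single count, the Boolean sum is probably the most direct option.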
