Is there a way to get the count of only one specific item in a column?
To clarify, say I use:
countDat = df['country'].value_counts()
Then I'll get something like:
Australia 35
Brazil 32
USA 93
... and so on
Is there a way to extract only the count for Brazil? I just need the number 32 extracted from countDat.
I know countDat[1] will give the count for Brazil, but is there a way to look it up by the key 'Brazil'?
2 Answers
#1
2
One way is to drop down to numpy:
res = (df['country'].values == 'Brazil').sum()
See here for benchmarking results from a similar problem.
You should see better performance if the column is stored as Categorical Data, which also has other benefits.
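As a rough illustration of the categorical suggestion, here is a minimal sketch. It borrows the sample data from the second answer; the variable names (cat_country, bra_code, brazil_count) are made up for this example, and no timing claim is implied.

```python
import numpy as np
import pandas as pd

# Sample data borrowed from the second answer: 35 AUS, 32 BRA, 93 USA rows.
df = pd.DataFrame({'country': np.array('AUS BRA USA'.split()).repeat([35, 32, 93])})

# Store the column as a categorical: each row holds a small integer code
# that points into a short list of unique category labels.
cat_country = df['country'].astype('category')

# Find the integer code for 'BRA', then count matching codes with numpy.
bra_code = cat_country.cat.categories.get_loc('BRA')
brazil_count = int((cat_country.cat.codes.to_numpy() == bra_code).sum())

print(brazil_count)
```

The comparison runs over integer codes instead of Python strings, which is one plausible reason the categorical route can be faster on large columns.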
#2
1
Consider the data frame df:
import numpy as np
import pandas as pd
df = pd.DataFrame(dict(country=np.array('AUS BRA USA'.split()).repeat([35, 32, 93])))
and its value counts:
countDat = df['country'].value_counts()
countDat
USA 93
AUS 35
BRA 32
Name: country, dtype: int64
per @cᴏʟᴅsᴘᴇᴇᴅ
df.loc[df.country == 'BRA', 'country'].count()
32
per @DSM
countDat["BRA"]
32
Boolean sum
df.country.eq('BRA').sum()
query + len
len(df.query('country == "BRA"'))
groupby + len
len(df.groupby('country').groups['BRA'])
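One caveat worth sketching: a label lookup such as countDat["BRA"] raises a KeyError when the country never appears in the column. Assuming a default of 0 is acceptable in that case, Series.get can be used instead. The 'FRA' key below is a made-up example of a missing country.

```python
import numpy as np
import pandas as pd

# Same sample data as above: 35 AUS, 32 BRA, 93 USA rows.
df = pd.DataFrame({'country': np.array('AUS BRA USA'.split()).repeat([35, 32, 93])})
countDat = df['country'].value_counts()

# Series.get returns the supplied default instead of raising KeyError
# when the label is not in the index.
bra = countDat.get('BRA', 0)
fra = countDat.get('FRA', 0)  # 'FRA' is hypothetical and absent from the data

print(bra, fra)
```

The boolean-sum variants shown earlier already return 0 for an absent value, so this only matters when indexing into the value_counts result.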