How to perform univariate analysis in Python using plots

Date: 2021-03-17 21:26:40

I have the dataset below and want to perform univariate analysis on Income per Category, as in the sample plot. Note that in the Number column, 1 stands for male and 0 stands for female.
Is there any way to do this?

Income  Population  Number  Category
54        77           1       A
23        88           1       A
44        87           0       B
55        88           0       B
66        89           1       B
73        90           0       A
12        89           1       C
34        9            0       C
54        77           1       A
23        88           1       A
44        87           0       B
55        88           0       B
66        89           1       B
73        90           0       A
12        89           1       C
34        9            0       C


2 answers

#1


I am not sure your question is entirely clear, but the following plots are commonly used for univariate, bivariate and multivariate analysis.

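import seaborn as sns
import numpy as np
import pandas as pd

# every column needs the same number of values; the fifth Number value (1)
# is taken from the fifth row of the question's table
df = pd.DataFrame({'Income': [54, 23, 44, 55, 66],
                   'Population': [77, 88, 87, 88, 89],
                   'Number': [1, 1, 0, 0, 1],  # 1 = male, 0 = female
                   'Category': ['A', 'A', 'B', 'B', 'C']})

### Univariate analysis
# run each call in its own cell/figure, otherwise the plots are drawn on top of each other
sns.histplot(x=df.Income, kde=True)  # numeric (distplot is deprecated in recent seaborn)
sns.boxplot(x=df.Income)  # numeric
sns.histplot(x=df.Population, kde=True)
sns.countplot(x=df.Category)  # categorical
sns.countplot(x=df.Number)

## Bivariate analysis
sns.jointplot(x='Income', y='Population', data=df, kind='scatter')
sns.lmplot(x='Income', y='Population', data=df, hue='Number', fit_reg=False)
sns.countplot(x='Category', hue='Number', data=df)

## Multivariate analysis
# np.int and np.float were removed from NumPy; np.number selects all numeric columns
sns.pairplot(df.select_dtypes(include=[np.number]))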
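
To get closer to what the question asks (Income per Category, with Number read as gender), the 1/0 coding can be mapped to labels before plotting. The following is only a sketch under assumptions: the `Gender` column name, the choice of a box plot plus a count plot, and the output file name `income_by_category.png` are illustrative and not from the question, and the sample plot the question refers to is not available here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# The eight distinct rows of the table in the question
df = pd.DataFrame({
    "Income":     [54, 23, 44, 55, 66, 73, 12, 34],
    "Population": [77, 88, 87, 88, 89, 90, 89, 9],
    "Number":     [1, 1, 0, 0, 1, 0, 1, 0],
    "Category":   ["A", "A", "B", "B", "B", "A", "C", "C"],
})

# Hypothetical helper column: readable labels for the 1/0 coding (1 = male, 0 = female)
df["Gender"] = df["Number"].map({1: "Male", 0: "Female"})

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Distribution of Income within each Category
sns.boxplot(x="Category", y="Income", data=df, ax=axes[0])

# Number of rows per Category, split by gender
sns.countplot(x="Category", hue="Gender", data=df, ax=axes[1])

fig.tight_layout()
fig.savefig("income_by_category.png")  # illustrative file name
```

Mapping to labels first means the legend reads Male/Female instead of 1/0, which is usually easier to interpret than the raw coding.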

#2


If you put the data into a pandas DataFrame, you can easily separate out the values for males and females, e.g. (just using Income and Number):

import pandas as pd
# a dictionary of the data
data = {'Income': [54, 23, 44, 55, 66, 73, 12], 'Number': [1, 1, 0, 0, 1, 0, 1]}
# put the data into a pandas DataFrame
d = pd.DataFrame(data=data)

# get a list of Income for the Males
incomem = d['Income'][d['Number'] == 1].tolist() # you don't really need the tolist() call

# get a list of Income for the Females
incomef = d['Income'][d['Number'] == 0].tolist()

You can then plot these values as a bar graph. The plot.ly package also looks nice for this sort of thing.
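
As a rough illustration of the bar graph step, here is a minimal matplotlib sketch. It assumes that the mean Income per group is the quantity of interest (the answer does not say what to plot), and the output file name `income_by_gender.png` is made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# same data as in the answer above
data = {'Income': [54, 23, 44, 55, 66, 73, 12], 'Number': [1, 1, 0, 0, 1, 0, 1]}
d = pd.DataFrame(data=data)

# Income values for males (Number == 1) and females (Number == 0)
incomem = d['Income'][d['Number'] == 1]
incomef = d['Income'][d['Number'] == 0]

# assumption: compare the two groups by their mean Income
means = [incomem.mean(), incomef.mean()]

fig, ax = plt.subplots()
ax.bar(['Male', 'Female'], means, color=['tab:blue', 'tab:orange'])
ax.set_ylabel('Mean Income')
ax.set_title('Mean Income by gender')
fig.savefig('income_by_gender.png')  # illustrative file name
```

With only a handful of values per group, plotting the raw values (for example with a box plot) may be more informative than a single mean per bar.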
