Multiple plots in Python for a numpy array

Time: 2021-04-04 23:40:43

I have a 2-D numpy array of shape (200, 1500). I want to visualise summary statistics for this data. Because the number of columns is too high, I can't plot all of them. My questions are:

  1. Which summary statistics shall I visualise?

  2. Do I visualise all columns?

  3. I thought of randomly choosing N columns from the data and showing distribution and box plots. The example shown below is for the second column of array X. However, I can't figure out how to show both plots for N columns in a single figure. Can someone help me with this?

    Dist plot

    plt.figure(figsize=(20,4))
    plt.subplot(121)
    ax = sns.distplot(X[:,1])

    Box plot

    plt.subplot(122)
    plt.xlim(X[:,1].min()*1.1, X[:,1].max()*1.1)
    sns.boxplot(x=X[:,1])

1 solution

#1


1  

As @Shiva mentioned, the summary statistics and visualisation approach depends on your problem. The problem formulation determines whether you need mean or median values, standard deviations, eigenvalues, frequency distributions, etc. If you provide more details, the community could offer more specific advice.
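As a starting point for the statistics mentioned above, here is a minimal sketch that computes a few of them per column and plots how each one varies across all 1500 columns, so no column has to be plotted individually. The random `X`, the helper name `column_summary`, and the particular choice of statistics are assumptions for illustration, not something taken from the question.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for the real data (assumption): 200 rows, 1500 columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1500))


def column_summary(X):
    """Return per-column summary statistics as a dict of 1-D arrays."""
    q1, median, q3 = np.percentile(X, [25, 50, 75], axis=0)
    return {
        "mean": X.mean(axis=0),
        "std": X.std(axis=0, ddof=1),  # sample standard deviation
        "median": median,
        "iqr": q3 - q1,                # interquartile range
    }


stats = column_summary(X)

# One histogram per statistic: each panel shows how that statistic
# is distributed across the columns of X.
fig, axes = plt.subplots(1, len(stats), figsize=(16, 3.5))
for ax, (name, values) in zip(axes, stats.items()):
    ax.hist(values, bins=40)
    ax.set_title(f"{name} across columns")
fig.tight_layout()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the figure. Columns whose statistics sit in the tails of these histograms are natural candidates for a closer individual look.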

Nevertheless, there are general-purpose analytical techniques that you could consider. See e.g. this blog post demonstrating various dimensionality reduction techniques, applied to the MNIST data set. Also check out this blog post discussing the application of an autoencoder for this purpose (scroll down). More specific to visualisation, you could browse through the Seaborn examples gallery to see if there are any examples you could apply to your own dataset.
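To make the dimensionality reduction idea concrete, here is a minimal sketch of one such technique, PCA, done with a plain SVD in numpy. It is not the method from the linked blog posts; the random `X` and the helper name `pca_project` are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data (assumption): replace with the real (200, 1500) array.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1500))


def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)  # centre each column
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # row coordinates
    explained = S**2 / np.sum(S**2)                  # variance ratio per component
    return scores, explained[:n_components]


scores, explained = pca_project(X)

# Each point is one of the 200 rows, placed by its first two components.
fig, ax = plt.subplots(figsize=(5, 5))
ax.scatter(scores[:, 0], scores[:, 1], s=12)
ax.set_xlabel(f"PC1 ({explained[0]:.1%} of variance)")
ax.set_ylabel(f"PC2 ({explained[1]:.1%} of variance)")
ax.set_title("Rows projected onto the first two principal components")
fig.tight_layout()
```

If the first two components explain only a small share of the variance, the scatter plot is a weak summary and a non-linear technique may be worth trying instead.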

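As for the concrete question of showing both plots for N columns in a single figure, one possible approach is to create a grid of axes with `plt.subplots` and draw into each axis explicitly. The sketch below uses plain matplotlib; the random `X`, the helper name `plot_random_columns`, and its parameters are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data (assumption): replace with the real (200, 1500) array.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1500))


def plot_random_columns(X, n_cols=4, seed=0):
    """Draw a histogram and a box plot for n_cols randomly chosen columns."""
    rng = np.random.default_rng(seed)
    cols = rng.choice(X.shape[1], size=n_cols, replace=False)
    # One row of axes per chosen column: left = histogram, right = box plot.
    fig, axes = plt.subplots(n_cols, 2, figsize=(12, 2.5 * n_cols), squeeze=False)
    for (ax_hist, ax_box), c in zip(axes, cols):
        ax_hist.hist(X[:, c], bins=30)
        ax_hist.set_title(f"Column {c}: distribution")
        ax_box.boxplot(X[:, c])
        ax_box.set_title(f"Column {c}: box plot")
    fig.tight_layout()
    return fig, cols


fig, cols = plot_random_columns(X, n_cols=4)
```

The same layout should work with Seaborn by passing the target axis, for example `sns.histplot(X[:, c], ax=ax_hist)` and `sns.boxplot(x=X[:, c], ax=ax_box)`. Call `plt.show()` or `fig.savefig(...)` to view the result.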
