Why does pyplot.plot() create an additional Rectangle with width=1, height=1?

Time: 2020-11-28 23:39:00

I'm creating a simple bar plot from a DataFrame. (The plot method on Series and DataFrame is just a simple wrapper around pyplot.plot)

import pandas as pd
import matplotlib as mpl

df = pd.DataFrame({'City': ['Berlin', 'Munich', 'Hamburg'],
               'Population': [3426354, 1260391, 1739117]})
df = df.set_index('City')

ax = df.plot(kind='bar')

This is the generated plot

Now I want to access the individual bars. And what I've noticed is that there is an additional bar (Rectangle) with width=1, height=1

rects = [rect for rect in ax.get_children() if isinstance(rect, mpl.patches.Rectangle)]
for r in rects:
   print(r)

output:

Rectangle(xy=(-0.25, 0), width=0.5, height=3.42635e+06, angle=0)
Rectangle(xy=(0.75, 0), width=0.5, height=1.26039e+06, angle=0)
Rectangle(xy=(1.75, 0), width=0.5, height=1.73912e+06, angle=0)
Rectangle(xy=(0, 0), width=1, height=1, angle=0)

I would expect only three rectangles here. What is the purpose of the fourth?

2 Answers

#1

The fourth Rectangle is the bounding box of the Axes (the subplot) itself.
This is an artifact of the way Matplotlib handles bounding boxes; it is not specific to pandas. For example, plotting with plain pyplot:

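import matplotlib as mpl
import matplotlib.pyplot as plt

# df is the DataFrame from the question
f, ax = plt.subplots()
ax.bar(range(3), df.Population.values)
# collect every Rectangle among the children of the Axes
rects = [rect for rect in ax.get_children() if isinstance(rect, mpl.patches.Rectangle)]
for r in rects:
    print(r)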

Still results in four Rectangles:

Rectangle(-0.4,0;0.8x3.42635e+06)
Rectangle(0.6,0;0.8x1.26039e+06)
Rectangle(1.6,0;0.8x1.73912e+06)
Rectangle(0,0;1x1)

There's a line in the Matplotlib tight layout docs which appears to refer to this extra Rectangle (and to explain why its coordinates are (0,0),(1,1)). It describes a rect parameter:

...which specifies the bounding box that the subplots will be fit inside. The coordinates must be in normalized figure coordinates and the default is (0, 0, 1, 1).

There's probably a more official section of the Matplotlib documentation that describes this architecture more thoroughly, but I find those docs difficult to navigate; this is the best I could come up with.
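
To make the origin of the extra Rectangle concrete, here is a minimal sketch. It assumes plain Matplotlib with the non-interactive Agg backend and hard-codes the population figures from the question as a list; both are choices made only for this illustration. It checks that the one Rectangle that is not a bar is the background patch of the Axes, ax.patch, which Matplotlib positions in axes-relative coordinates, hence xy=(0, 0), width=1, height=1.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

# Population figures copied from the question (hard-coded for illustration)
populations = [3426354, 1260391, 1739117]

fig, ax = plt.subplots()
ax.bar(range(3), populations)

# Every Rectangle among the children of the Axes: three bars plus one extra
rects = [c for c in ax.get_children() if isinstance(c, Rectangle)]

# Rectangles that are not bars, i.e. not listed in ax.patches
extra = [r for r in rects if r not in ax.patches]

# The single leftover Rectangle is the Axes background patch
print(len(rects), len(extra), extra[0] is ax.patch)
print(ax.patch.get_xy(), ax.patch.get_width(), ax.patch.get_height())
```

If this holds on your Matplotlib version, filtering with `r is not ax.patch` would also drop the extra Rectangle, though using ax.patches directly is simpler.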

#2

You would not want to mess with all the children of the axes to get those of interest. If there are only bar plots in the axes, ax.patches gives you the rectangles in the axes.
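
As a rough illustration of that point, the sketch below assumes a plain Matplotlib bar chart (Agg backend, data hard-coded from the question; these are illustrative choices, not part of the original answer) and reads the bar heights straight from ax.patches, without filtering ax.get_children().

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Data copied from the question (hard-coded for illustration)
cities = ["Berlin", "Munich", "Hamburg"]
populations = [3426354, 1260391, 1739117]

fig, ax = plt.subplots()
ax.bar(cities, populations)

# ax.patches lists only the patches added to the Axes (here: the three bars);
# the Axes background, spines, and text artists are not part of it.
heights = [int(p.get_height()) for p in ax.patches]
print(len(ax.patches), heights)
```

This only works as a shortcut while the Axes contains nothing but bars; other patch artists added to the same Axes would show up in ax.patches as well.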

Concerning the labeling of the bars, the linked article may not be the best choice. It argues for calculating the distance of the label manually, which is not really useful. Instead, you would just offset the annotation by some points relative to the bar top, using the argument textcoords="offset points" of plt.annotate.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'City': ['Berlin', 'Munich', 'Hamburg'],
               'Population': [3426354, 1260391, 1739117]})
df = df.set_index('City')

ax = df.plot(kind='bar')


def autolabel(rects, ax):
    for rect in rects:
        x = rect.get_x() + rect.get_width()/2.
        y = rect.get_height()
        ax.annotate("{}".format(y), (x,y), xytext=(0,5), textcoords="offset points",
                    ha='center', va='bottom')

autolabel(ax.patches,ax)

ax.margins(y=0.1)
plt.show()

Finally, note that using the shapes in the plot to create the annotations may still not be the optimal choice. Instead, why not use the data itself?

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'City': ['Berlin', 'Munich', 'Hamburg'],
               'Population': [3426354, 1260391, 1739117]})

ax = df.plot(x = "City", y="Population", kind='bar')

def autolabel(s, ax=None, name=""):
    x = s.name
    y = s[name]
    ax.annotate("{}".format(y), (x,y), xytext=(0,5), textcoords="offset points",
                ha='center', va='bottom')

df.apply(autolabel, axis=1, ax=ax, name="Population")

ax.margins(y=0.1)
plt.show()

This produces the same plot as above.
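
For comparison, newer Matplotlib releases (3.4 and later, as far as I know) ship Axes.bar_label, which attaches labels to a bar container directly. The sketch below is an alternative to the two approaches above, not part of the original answer; it assumes plain Matplotlib with the Agg backend and the question's data hard-coded.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Data copied from the question (hard-coded for illustration)
cities = ["Berlin", "Munich", "Hamburg"]
populations = [3426354, 1260391, 1739117]

fig, ax = plt.subplots()
bars = ax.bar(cities, populations)  # returns a BarContainer

# One label per bar; padding is the gap above the bar top, in points,
# playing the same role as xytext=(0, 5) with textcoords="offset points".
labels = ax.bar_label(bars, fmt="%d", padding=5)

ax.margins(y=0.1)  # leave headroom so the top label is not clipped
texts = [t.get_text() for t in labels]
print(texts)
```

With a pandas plot, the same call should work on the container stored in ax.containers[0], assuming the Axes holds a single bar series.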
