
Time: 2022-10-18 22:02:18

I can create a simple bar chart in matplotlib from a simple dictionary:


    import matplotlib.pyplot as plt

    D = {u'Label1': 26, u'Label2': 17, u'Label3': 30}

    # list() keeps this working on Python 3, where values()/keys() are views
    plt.bar(range(len(D)), list(D.values()), align='center')
    plt.xticks(range(len(D)), list(D.keys()))
    plt.show()

But how do I draw a line plot from the text and numeric data of this dictionary?


    T_OLD = {'10': 'need1', '11': 'need2', '12': 'need1', '13': 'need2', '14': 'need1'}

Like the picture below: [image: plot with text (non-numeric) data on the Y axis and numeric data on the X axis]


3 solutions



You may use numpy to convert the dictionary to an array with two columns, which can be plotted.


    import matplotlib.pyplot as plt
    import numpy as np

    T_OLD = {'10': 'need1', '11': 'need2', '12': 'need1', '13': 'need2', '14': 'need1'}

    # two rows: the keys and the values
    x = np.array(list(zip(*T_OLD.items())))
    # sort by numeric key, since dictionary order is not guaranteed
    x = x[:, np.argsort(x[0].astype(float))].T
    # numeric columns: the key, and 1 if the value is "need2", else 0
    keys = x[:, 0].astype(float)
    needs = (x[:, 1] == "need2").astype(int)

    # plot the two columns
    plt.plot(keys, needs)
    # set the labels accordingly
    plt.gca().set_yticks([0, 1])
    plt.gca().set_yticklabels(['need1', 'need2'])
    plt.show()


The following version is independent of the actual content of the dictionary; the only assumption is that the keys can be converted to floats.


    import matplotlib.pyplot as plt
    import numpy as np

    T_OLD = {'10': 'run', '11': 'tea', '12': 'mathematics', '13': 'run', '14': 'chemistry'}

    x = np.array(list(zip(*T_OLD.items())))
    u, ind = np.unique(x[1, :], return_inverse=True)
    x[1, :] = ind
    x = x.astype(float)[:, np.argsort(x[0])].T

    # plot the two columns of the array
    plt.plot(x[:, 0], x[:, 1])
    # set the labels accordingly
    plt.gca().set_yticks(range(len(u)))
    plt.gca().set_yticklabels(u)
    plt.show()
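The encoding step in this version can also be sketched in plain Python, which may make it easier to see what ends up on each axis. This is only an illustration, not the answer's own code: the helper name `encode_labels` is hypothetical, and it assumes, as above, that the keys can be converted to floats.

```python
T_OLD = {'10': 'run', '11': 'tea', '12': 'mathematics', '13': 'run', '14': 'chemistry'}


def encode_labels(mapping):
    """Return x values, integer y positions and the sorted unique labels."""
    # Sort the (key, value) pairs by the numeric value of the key
    items = sorted(mapping.items(), key=lambda kv: float(kv[0]))
    # Sorted unique labels, mirroring what np.unique returns
    labels = sorted(set(mapping.values()))
    # Each label is plotted at its index in the sorted label list
    position = {label: i for i, label in enumerate(labels)}
    xs = [float(k) for k, _ in items]
    ys = [position[v] for _, v in items]
    return xs, ys, labels


xs, ys, labels = encode_labels(T_OLD)
print(labels)  # ['chemistry', 'mathematics', 'run', 'tea']
print(ys)      # [2, 3, 1, 2, 0]
```

The lists `xs` and `ys` can be passed to `plt.plot()`, and `labels` to `set_yticklabels()` after `set_yticks(range(len(labels)))`.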




Use numeric values for your y-axis ticks, and then map them to desired strings with plt.yticks():


    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    # example data (newer pandas releases spell the hourly alias 'h')
    times = pd.date_range(start='2017-10-17 00:00', end='2017-10-17 5:00', freq='H')
    data = np.random.choice([0, 1], size=len(times))
    data_labels = ['need1', 'need2']

    fig, ax = plt.subplots()
    ax.plot(times, data, marker='o', linestyle="None")
    # one tick per category: positions 0 and 1 get the two labels
    plt.yticks([0, 1], data_labels)
    plt.xlabel("time")
    plt.show()


Note: It's generally not a good idea to use a line graph to represent categorical changes over time (e.g. from need1 to need2). A line gives the visual impression of a continuum between time points, which may not be accurate. Here, I changed the plotting style to points instead of lines. If for some reason you need the lines, just remove linestyle="None" from the call to ax.plot().


(per comments)


To make this work with a y-axis category set of arbitrary length, use ax.set_yticks() and ax.set_yticklabels() to map to y-axis values.


For example, given a set of potential y-axis labels, let N be the size of a subset of labels (here we'll set it to 4, but it could be any size).


Then draw a random sample of y values and plot it against time, labeling the y-axis ticks based on the full set of labels. Note that we still call set_yticks() first with numerical positions, and then replace those with our category labels using set_yticklabels().


    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    labels = np.array(['A', 'B', 'C', 'D', 'E', 'F', 'G'])
    N = 4  # subset size mentioned in the text; not used below

    # example data
    times = pd.date_range(start='2017-10-17 00:00', end='2017-10-17 5:00', freq='H')
    data = np.random.choice(np.arange(len(labels)), size=len(times))

    fig, ax = plt.subplots(figsize=(15, 10))
    ax.plot(times, data, marker='o', linestyle="None")
    ax.set_yticks(np.arange(len(labels)))
    ax.set_yticklabels(labels)
    plt.xlabel("time")
    plt.show()
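Because the example above draws random data, every run looks different. For a reproducible picture, here is a sketch with a fixed sample in place of the random draw. The values in `codes` are made up for illustration, and plain `datetime` objects stand in for the pandas date range.

```python
from datetime import datetime, timedelta

import matplotlib.pyplot as plt

labels = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
# Hypothetical fixed sample: one category index per hour
codes = [0, 3, 3, 6, 1, 4]
times = [datetime(2017, 10, 17) + timedelta(hours=h) for h in range(len(codes))]

fig, ax = plt.subplots()
# Markers only, so no continuum between categories is suggested
ax.plot(times, codes, marker='o', linestyle='None')
# Numeric tick positions first, then the category names at those positions
ax.set_yticks(range(len(labels)))
ax.set_yticklabels(labels)
ax.set_xlabel('time')
# plt.show()  # uncomment to display the figure
```

All seven labels appear on the axis even though the sample only uses five of them, because the ticks are set from the full label list rather than from the data.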



This gives the exact desired plot:


    import matplotlib.pyplot as plt
    from collections import OrderedDict

    T_OLD = {'10': 'need1', '11': 'need2', '12': 'need1', '13': 'need2', '14': 'need1'}
    T_SRT = OrderedDict(sorted(T_OLD.items(), key=lambda t: t[0]))

    # Python 2.7: map() returns a list here
    plt.plot(map(int, T_SRT.keys()), map(lambda x: int(x[-1]), T_SRT.values()), 'r')

    plt.ylim([0.9, 2.1])
    ax = plt.gca()
    ax.set_yticks([1, 2])
    ax.set_yticklabels(['need1', 'need2'])

    plt.title('T_OLD')
    plt.xlabel('time')
    plt.ylabel('need')

    plt.show()

For Python 3.X, the plotting line needs to explicitly convert the map() output to lists:


plt.plot(list(map(int, T_SRT.keys())), list(map(lambda x: int(x[-1]), T_SRT.values())),'r')

because in Python 3.X map() returns an iterator, whereas in Python 2.7 it returns a list.
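A small Python 3 sketch of that difference, using the same dictionary. The variable names are illustrative only, and the list comprehensions at the end are simply an alternative spelling of list(map(...)).

```python
T_OLD = {'10': 'need1', '11': 'need2', '12': 'need1', '13': 'need2', '14': 'need1'}
items = sorted(T_OLD.items(), key=lambda t: int(t[0]))

# In Python 3, map() builds a lazy iterator rather than a list
lazy = map(int, (k for k, _ in items))
first_pass = list(lazy)   # consumes the iterator
second_pass = list(lazy)  # nothing left: an iterator can be walked only once

print(first_pass)   # [10, 11, 12, 13, 14]
print(second_pass)  # []

# Building real lists up front avoids the problem entirely
xs = [int(k) for k, _ in items]
ys = [int(v[-1]) for _, v in items]
print(ys)           # [1, 2, 1, 2, 1]
```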


The plot uses the dictionary keys converted to ints and the last character of need1 or need2, also converted to an int. This relies on the particular structure of your data; if the values were need1 and need3, it would need a couple more operations.


After plotting and changing the axes limits, the program simply modifies the tick labels at y positions 1 and 2. It then also adds the title and the x and y axis labels.


The important part is that the dictionary/input data has to be sorted. One way to do it is to use OrderedDict. Here T_SRT is an OrderedDict object sorted by the keys in T_OLD.
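One detail worth checking when sorting: with key=lambda t: t[0] the keys are compared as strings. That is fine for T_OLD, whose keys all have two digits, but keys of different lengths sort differently as text than as numbers. The sketch below uses made-up data with a one-digit key to show the difference; sorting on int(t[0]) is one possible adjustment, not part of the original answer.

```python
from collections import OrderedDict

# Hypothetical data with a one-digit key to expose the difference
data = {'10': 'need1', '9': 'need2', '11': 'need1'}

by_text = OrderedDict(sorted(data.items(), key=lambda t: t[0]))
by_number = OrderedDict(sorted(data.items(), key=lambda t: int(t[0])))

print(list(by_text))    # ['10', '11', '9']
print(list(by_number))  # ['9', '10', '11']
```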


The output is:



This is a more general case for more values/labels in T_OLD. It assumes that the label is always 'needX', where X is any number. The same can readily be done for any string preceding the number, though it would require more processing:


    import matplotlib.pyplot as plt
    from collections import OrderedDict
    import re

    T_OLD = {'10': 'need1', '11': 'need8', '12': 'need11', '13': 'need1', '14': 'need3'}
    T_SRT = OrderedDict(sorted(T_OLD.items(), key=lambda t: t[0]))

    x_val = list(map(int, T_SRT.keys()))
    y_val = list(map(lambda x: int(re.findall(r'\d+', x)[-1]), T_SRT.values()))

    plt.plot(x_val, y_val, 'r')

    plt.ylim([0.9*min(y_val), 1.1*max(y_val)])
    ax = plt.gca()
    y_axis = list(set(y_val))
    ax.set_yticks(y_axis)
    ax.set_yticklabels(['need' + str(i) for i in y_axis])

    plt.title('T_OLD')
    plt.xlabel('time')
    plt.ylabel('need')

    plt.show()

This solution finds the number at the end of the label using re.findall, to allow for multi-digit numbers. The previous solution just took the last character of the string because the numbers were single digits. It still assumes that the number used for the plotting position is the last number in the string, hence the [-1]. Again, for Python 3.X the map() output is explicitly converted to a list, a step that is not necessary in Python 2.7.


The labels are now generated by first selecting the unique y-values using set and then building each label by concatenating the string 'need' with the corresponding integer.


The limits of the y-axis are set to 0.9 times the minimum value and 1.1 times the maximum value. The rest of the formatting is as before.
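The data preparation described here (number extraction, unique ticks, labels and axis limits) can be checked on its own, without drawing anything. The sketch below is one way to do that; the helper name `trailing_number` is hypothetical, and the ticks are sorted here only to make the printed result predictable.

```python
import re

T_OLD = {'10': 'need1', '11': 'need8', '12': 'need11', '13': 'need1', '14': 'need3'}


def trailing_number(label):
    """Return the last run of digits in label as an int."""
    return int(re.findall(r'\d+', label)[-1])


# y values in order of increasing numeric key
y_val = [trailing_number(T_OLD[k]) for k in sorted(T_OLD, key=int)]
# One tick per distinct value, with its matching 'needX' label
ticks = sorted(set(y_val))
tick_labels = ['need' + str(i) for i in ticks]
# Padding below the minimum and above the maximum
y_limits = (0.9 * min(y_val), 1.1 * max(y_val))

print(y_val)        # [1, 8, 11, 1, 3]
print(tick_labels)  # ['need1', 'need3', 'need8', 'need11']
```

Note that 'need11' is read as 11, not 1, which is the case the single-character approach could not handle.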


The result for this test case is:




