Using SciPy to interpolate data into a quadratic fit

Time: 2021-05-28 21:25:48

I have a set of data in which, when plotted, most points congregate at the left of the x axis:

import matplotlib.pyplot as plt

plt.plot(x, y, marker='o')
plt.title('Original')
plt.show()

[Figure: original graph]

I want to use SciPy to interpolate the data and then fit a quadratic curve to it. I am avoiding fitting a quadratic curve directly, without interpolation, because the resulting curve would be biased towards the mass of data at one end of the x axis. I tried this using

import numpy as np
from scipy.interpolate import interp1d

f = interp1d(x, y, kind='quadratic')

# Array with points in between min(x) and max(x) for interpolation
x_interp = np.linspace(min(x), max(x), num=np.size(x))

# Plot graph with interpolation
plt.plot(x_interp, f(x_interp), marker='o')
plt.title('Interpolated')
plt.show()

and got INTERPOLATED GRAPH.

However, what I intend to get is something like this: EXPECTED GRAPH

What am I doing wrong?

My values for x can be found here and values for y here. Thank you!

1 solution

Solution 1

I'm pretty sure this does what you want. It fits a second degree (quadratic) polynomial to your data, then plots that function on an evenly spaced array of x values ranging from the minimum to the maximum of your original x data.

import numpy as np

new_x = np.linspace(min(x), max(x), num=np.size(x))
# Least-squares quadratic fit; coefficients are ordered highest power first
coefs = np.polyfit(x, y, 2)
new_line = np.polyval(coefs, new_x)

Plotting it returns:

plt.scatter(x,y)
plt.scatter(new_x,new_line,c='g', marker='^', s=5)
plt.xlim(min(x)-0.00001,max(x)+0.00001)
plt.xticks(rotation=90)
plt.tight_layout()
plt.show()

[Figure: the original data with the fitted quadratic curve]
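For context on why the interp1d attempt in the question looks nothing like a single parabola: kind='quadratic' builds a quadratic spline that passes through every sample, whereas np.polyfit finds one least-squares parabola for the whole data set. The sketch below illustrates the difference. The original x and y values are not available here, so the arrays are made-up stand-ins, clustered towards the left of the x axis.

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical stand-in data (not the asker's values): x is clustered near
# the left end, y is roughly quadratic with a small wiggle added.
x = np.linspace(0.0, 1.0, 30) ** 2
y = 3.0 * x**2 - 2.0 * x + 1.0 + 0.05 * np.sin(40.0 * x)

# Quadratic spline interpolation: piecewise quadratics through every sample,
# so the wiggle in the data is reproduced rather than smoothed out.
f = interp1d(x, y, kind='quadratic')
interp_error = np.max(np.abs(f(x) - y))

# Least-squares fit: a single parabola for all points, which cannot follow
# the wiggle and therefore leaves a visible residual.
coefs = np.polyfit(x, y, 2)
fit_error = np.max(np.abs(np.polyval(coefs, x) - y))

print("max |spline - data| at the samples:", interp_error)
print("max |parabola - data| at the samples:", fit_error)
```

The spline error at the samples should be at rounding level, while the parabola leaves a clearly non-zero residual. That is the expected behaviour of a fit, and it is why plotting the interpolant on a new grid still follows all the noise in the data.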

If that wasn't what you meant...

However, from your question, it seems like you might be trying to force all your original y-values onto evenly spaced x-values (if that's not your intention, let me know, and I'll just delete this part).

This is also possible. There are lots of ways to do it, but I've done it here with pandas:

import pandas as pd

# Sort by x, then pair the sorted y-values with evenly spaced x-values
xy_df = pd.DataFrame({'x_orig': x, 'y_orig': y})
sorted_x_y = xy_df.sort_values('x_orig')
sorted_x_y['new_x'] = np.linspace(min(x), max(x), np.size(x))

plt.figure(figsize=[5, 5])
plt.scatter(sorted_x_y['new_x'], sorted_x_y['y_orig'])
plt.xlim(min(x) - 0.00001, max(x) + 0.00001)
plt.xticks(rotation=90)
plt.tight_layout()
plt.show()

This looks pretty different from your original data, which is why I think it might not be exactly what you're looking for.
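If the aim really is the two-step procedure described in the question (interpolate first, then fit, so that the dense cluster does not dominate), one possible sketch is to resample the data onto an evenly spaced grid and run the quadratic fit on the resampled points. This is not part of the approach above: the use of linear interpolation via np.interp is an assumption made for illustration, and the data are made-up stand-ins for the asker's values.

```python
import numpy as np

# Hypothetical stand-in data, clustered towards the left end of the x axis.
x = np.linspace(0.0, 1.0, 30) ** 2
y = 3.0 * x**2 - 2.0 * x + 1.0 + 0.05 * np.sin(40.0 * x)

# Step 1: resample onto an evenly spaced grid with linear interpolation.
# np.interp expects increasing x, so sort the samples first.
order = np.argsort(x)
x_even = np.linspace(x.min(), x.max(), num=x.size)
y_even = np.interp(x_even, x[order], y[order])

# Step 2: fit a quadratic to the resampled points, so that every part of
# the x range contributes the same number of points to the least-squares fit.
coefs_even = np.polyfit(x_even, y_even, 2)

# For comparison: a quadratic fitted directly to the clustered data.
coefs_direct = np.polyfit(x, y, 2)

print("fit on resampled data:", coefs_even)
print("fit on original data: ", coefs_direct)
```

Whether the resampled fit is preferable depends on the data: the interpolated points in sparse regions are derived from only a few real measurements, so they add weight without adding information.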
