I am using Newton's method to generate fractals that visualise the roots and the number of iterations taken to find the roots.
我用牛顿的方法来生成分形,它可以直观地观察根的根,以及找到根的迭代次数。
I am not happy with the speed taken to complete the function. Is there are a way to speed up my code?
我对完成这个功能的速度不满意。有没有办法加快我的代码?
def f(z):
return z**4-1
def f_prime(z):
'''Analytic derivative'''
return 4*z**3
def newton_raphson(x, y, max_iter=20, eps = 1.0e-20):
z = x + y * 1j
iter = np.zeros((len(z), len(z)))
for i in range(max_iter):
z_old = z
z = z-(f(z)/f_prime(z))
for k in range(len(z[:,0])): #this bit of the code is slow. Can I do this WITHOUT for loops?
for j in range(len(z[:,1])):
if iter[k,j] != 0:
continue
if z[k,j] == z_old[k,j]:
iter[k,j] = i
return np.angle(z), iter #return argument of root and iterations taken
n_points = 1000; xmin = -1.5; xmax = 1.5
xs = np.linspace(xmin, xmax, n_points)
X,Y = np.meshgrid(xs, xs)
dat = newton_raphson(X, Y)
3 个解决方案
#1
1
Possible optimisations in addition to vectorising the loops
可能的优化除了矢量化的循环。
- Save the result of
z[mask]
to avoid repeatedly using fancy indexing (~80% faster) - 保存z[mask]的结果,避免重复使用花哨的索引(快80%)
- Combine
z - f(z) / f_prime(z)
into one function (~25% in this case) - 将z - f(z) / f_prime(z)组合成一个函数(在本例中为~25%)
- Use lower precision (
dtype=np.complex64
) if the larger error is acceptable (~90%) - 如果较大的误差可以接受(~90%),请使用较低的精度(dtype=np.complex64)
- Use broadcasting instead of meshgrid to create the z plane (negligible effect, but generally a good idea).
- 使用广播代替网格创建z平面(可以忽略的效果,但通常是个好主意)。
def iterstep(z):
return 0.75*z + 0.25 / z**3
def newton_raphson(Re, Im, max_iter):
z = Re + 1j * Im[:, np.newaxis]
itercount = np.zeros_like(z, dtype=np.int32)
mask = np.ones_like(z, dtype=np.bool)
for i in range(max_iter):
z_old = z.copy()
z[mask] = iterstep(z[mask])
mask = (z != z_old)
itercount[mask] = i
return np.angle(z), itercount
axis = np.linspace(-1.5, 1.5, num=1000, dtype=np.complex64)
angle, itercount = newton_raphson(Re=axis, Im=axis, max_iter=20)
它是什么样子
#2
2
You can simply vectorize the loops for fairly large speed gains:
你可以简单地对循环进行矢量量化,以获得相当大的速度增益:
def newton_raphson(x, y, max_iter=20, eps = 1.0e-20):
z = x + y * 1j
nz = len(z)
iters = np.zeros((nz, nz))
for i in range(max_iter):
z_old = z
z = z-(f(z)/f_prime(z))
mask = (iters == 0) & (z == z_old)
iters[mask] = i
return np.angle(z), items
Your presented equations are fairly simple; however, I would assume that your f
and f_prime
functions are significantly more complex. Further speedups can likely be found in those equations rather than in the presented problem.
你提出的方程相当简单;但是,我假设f和f_prime函数要复杂得多。在这些方程中可能会发现进一步的速度,而不是在出现的问题中。
I would also avoid the use of iter
as a variable name as it is a built in python function.
我还将避免使用iter作为变量名,因为它是在python函数中构建的。
#3
1
This should be what you are looking for and be working for all shapes of numpy arrays.
这应该是您正在寻找的,并且正在为所有numpy数组的形状工作。
def newton_numpy(x, y, max_iter=1000, eps = 1e-5):
z = x + y * 1j
# deltas is to store the differences between to iterations
deltas = np.ones_like(z)
# this is to store how many iterations were used for each element in z
needed_iterations = np.zeros_like(z, dtype=int)
it = 0
# we loop until max_iter or every element converged
while it < max_iter and np.any(deltas > eps):
z_old = z.copy()
# recalculate only for values that have not yet converged
mask = deltas > eps
z[mask] = z[mask] - (f(z[mask]) / f_prime(z[mask]))
needed_iterations[mask] += 1
deltas[mask] = np.abs(z_old[mask] - z[mask])
it += 1
return np.angle(z), needed_iterations
With this code:
这段代码:
n_points = 25
xmin = -1.5
xmax = 1.5
xs = np.linspace(xmin, xmax, n_points)
X, Y = np.meshgrid(xs, xs)
ang, iters = newton_numpy(X, Y, eps=1e-5, max_iters=1000)
It takes 0.09 seconds and a mean of 85 iterations per element.
它需要0.09秒,每个元素的平均迭代次数为85次。
#1
1
Possible optimisations in addition to vectorising the loops
可能的优化除了矢量化的循环。
- Save the result of
z[mask]
to avoid repeatedly using fancy indexing (~80% faster) - 保存z[mask]的结果,避免重复使用花哨的索引(快80%)
- Combine
z - f(z) / f_prime(z)
into one function (~25% in this case) - 将z - f(z) / f_prime(z)组合成一个函数(在本例中为~25%)
- Use lower precision (
dtype=np.complex64
) if the larger error is acceptable (~90%) - 如果较大的误差可以接受(~90%),请使用较低的精度(dtype=np.complex64)
- Use broadcasting instead of meshgrid to create the z plane (negligible effect, but generally a good idea).
- 使用广播代替网格创建z平面(可以忽略的效果,但通常是个好主意)。
def iterstep(z):
return 0.75*z + 0.25 / z**3
def newton_raphson(Re, Im, max_iter):
z = Re + 1j * Im[:, np.newaxis]
itercount = np.zeros_like(z, dtype=np.int32)
mask = np.ones_like(z, dtype=np.bool)
for i in range(max_iter):
z_old = z.copy()
z[mask] = iterstep(z[mask])
mask = (z != z_old)
itercount[mask] = i
return np.angle(z), itercount
axis = np.linspace(-1.5, 1.5, num=1000, dtype=np.complex64)
angle, itercount = newton_raphson(Re=axis, Im=axis, max_iter=20)
它是什么样子
#2
2
You can simply vectorize the loops for fairly large speed gains:
你可以简单地对循环进行矢量量化,以获得相当大的速度增益:
def newton_raphson(x, y, max_iter=20, eps = 1.0e-20):
z = x + y * 1j
nz = len(z)
iters = np.zeros((nz, nz))
for i in range(max_iter):
z_old = z
z = z-(f(z)/f_prime(z))
mask = (iters == 0) & (z == z_old)
iters[mask] = i
return np.angle(z), items
Your presented equations are fairly simple; however, I would assume that your f
and f_prime
functions are significantly more complex. Further speedups can likely be found in those equations rather than in the presented problem.
你提出的方程相当简单;但是,我假设f和f_prime函数要复杂得多。在这些方程中可能会发现进一步的速度,而不是在出现的问题中。
I would also avoid the use of iter
as a variable name as it is a built in python function.
我还将避免使用iter作为变量名,因为它是在python函数中构建的。
#3
1
This should be what you are looking for and be working for all shapes of numpy arrays.
这应该是您正在寻找的,并且正在为所有numpy数组的形状工作。
def newton_numpy(x, y, max_iter=1000, eps = 1e-5):
z = x + y * 1j
# deltas is to store the differences between to iterations
deltas = np.ones_like(z)
# this is to store how many iterations were used for each element in z
needed_iterations = np.zeros_like(z, dtype=int)
it = 0
# we loop until max_iter or every element converged
while it < max_iter and np.any(deltas > eps):
z_old = z.copy()
# recalculate only for values that have not yet converged
mask = deltas > eps
z[mask] = z[mask] - (f(z[mask]) / f_prime(z[mask]))
needed_iterations[mask] += 1
deltas[mask] = np.abs(z_old[mask] - z[mask])
it += 1
return np.angle(z), needed_iterations
With this code:
这段代码:
n_points = 25
xmin = -1.5
xmax = 1.5
xs = np.linspace(xmin, xmax, n_points)
X, Y = np.meshgrid(xs, xs)
ang, iters = newton_numpy(X, Y, eps=1e-5, max_iters=1000)
It takes 0.09 seconds and a mean of 85 iterations per element.
它需要0.09秒,每个元素的平均迭代次数为85次。