Sympy / mpmath / gmpy error when using multiprocessing

Date: 2021-02-10 20:23:22

EDIT: This is a sympy bug. I have moved the discussion to https://github.com/sympy/sympy/issues/7457

I have a Python program that uses sympy to perform some core functionality that involves taking the intersection of a line and a shape. This operation needs to be performed several thousand times, and is quite slow when using the default sympy pure Python modules.

I attempted to speed this up by installing gmpy 2.0.3 (I have also tried with gmpy 1.5). This does lead to the code speeding up somewhat, but when using multiprocessing to gain a further speed-up, the program crashes with a TypeError.

Exception in thread Thread-3:
Traceback (most recent call last):
  File "C:\python27\lib\threading.py", line 810, in __bootstrap_inner
    self.run()
  File "C:\python27\lib\threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "C:\python27\lib\multiprocessing\pool.py", line 376, in _handle_results
    task = get()
  File "C:\python27\lib\site-packages\sympy\geometry\point.py", line 91, in __new__
    for f in coords.atoms(Float)]))
  File "C:\python27\lib\site-packages\sympy\simplify\simplify.py", line 3839, in nsimplify
    return _real_to_rational(expr, tolerance)
  File "C:\python27\lib\site-packages\sympy\simplify\simplify.py", line 3781, in _real_to_rational
    r = nsimplify(float, rational=False)
  File "C:\python27\lib\site-packages\sympy\simplify\simplify.py", line 3861, in nsimplify
    exprval = expr.evalf(prec, chop=True)
  File "C:\python27\lib\site-packages\sympy\core\evalf.py", line 1300, in evalf
    re = C.Float._new(re, p)
  File "C:\python27\lib\site-packages\sympy\core\numbers.py", line 673, in _new
    obj._mpf_ = mpf_norm(_mpf_, _prec)
  File "C:\python27\lib\site-packages\sympy\core\numbers.py", line 56, in mpf_norm
    rv = mpf_normalize(sign, man, expt, bc, prec, rnd)
TypeError: ('argument is not an mpz', <class 'sympy.geometry.point.Point'>, (-7.07106781186548, -7.07106781186548))

The program works fine when run in a single process using gmpy and when run without gmpy using multiprocessing.Pool.

Has anyone run into this sort of problem before? The program below reproduces this problem:

import functools  # needed by thread_function when extra_kwargs is given
import multiprocessing

import numpy
import sympy

def thread_function(func, data, output_progress=True, extra_kwargs=None, num_procs=None):
    if extra_kwargs:
        func = functools.partial(func, **extra_kwargs)

    if not num_procs:
        num_procs = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=num_procs)
    results = pool.map_async(func, data.T)
    pool.close()

    pool.join()
    return results.get()

def test_fn(data):
    x = data[0]
    y = data[1]
    circle = sympy.Circle((0,0), 10)
    line = sympy.Line(sympy.Point(0,0), sympy.Point(x,y))
    return line.intersection(circle)[0].evalf()

if __name__ == '__main__':
    data = numpy.vstack((numpy.arange(1, 100), numpy.arange(1, 100)))

    print(thread_function(test_fn, data))  # <--- this line causes the problem
#    print([test_fn(data[:, i]) for i in range(data.shape[1])])  # <--- this one runs without errors

1 answer

#1


I've verified that gmpy objects are picklable and that mpmath.mpf objects that use gmpy are also picklable.
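
A check of this kind can be sketched as a pickle round trip. This is only an illustration, not the exact test I ran: `survives_pickle` is a hypothetical helper, and gmpy2 and mpmath are treated as optional so the snippet still runs where they are not installed.

```python
import pickle


def survives_pickle(obj):
    """Return True if obj comes back from a pickle round trip with the same type and value."""
    clone = pickle.loads(pickle.dumps(obj, protocol=2))
    return type(clone) is type(obj) and clone == obj


# A plain Python integer serves as the baseline.
candidates = [10 ** 20]

# gmpy2 and mpmath are optional here; each is checked only if it is installed.
try:
    import gmpy2
    candidates.append(gmpy2.mpz(10 ** 20))
except ImportError:
    pass

try:
    import mpmath
    candidates.append(mpmath.mpf("1.5"))
except ImportError:
    pass

for obj in candidates:
    print("%s: %s" % (type(obj).__name__, survives_pickle(obj)))
```

A round trip like this only shows that the objects themselves can be pickled. It does not say anything about how a sympy object that contains them is rebuilt on the receiving side.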

The error occurs when the man argument to mpf_normalize() is not a gmpy object. If I force man to be an mpz, then I no longer get an error. But the answer is different from the single process version.

Single process version:

Point(-223606797749979/50000000000000, -223606797749979/25000000000000)

Multiple process version:

Point(-7.07106781186548, -7.07106781186548)

Both the coordinate types used in Point() (rational vs. float) and the values themselves differ (-223606797749979/50000000000000 is approximately -4.47213595499958, not -7.07106781186548).
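
The arithmetic can be checked with the standard library alone. The sketch below is illustrative: `origin_line_circle_hit` is a hypothetical helper that intersects a line through the origin with a circle of radius 10 centred at the origin, which is the geometry of the question's `test_fn`. The directions (-1, -1) and (-1, -2) are assumptions chosen to match the two printed points.

```python
from fractions import Fraction
import math


def origin_line_circle_hit(x, y, radius=10.0):
    """Point where the ray from the origin through (x, y) meets a circle centred at the origin."""
    norm = math.hypot(x, y)
    return (radius * x / norm, radius * y / norm)


# The rational point printed by the single-process run.
single = (Fraction(-223606797749979, 50000000000000),
          Fraction(-223606797749979, 25000000000000))

print([float(c) for c in single])       # decimal value of the rational point
print(origin_line_circle_hit(-1, -1))   # direction (1, 1), as in the multi-process output
print(origin_line_circle_hit(-1, -2))   # direction (1, 2), as in the single-process output
```

To the printed precision, the rational point lies on a line with direction (1, 2) and the float point on a line with direction (1, 1), so the two runs were not evaluating the same input.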

I'm still researching and will update this answer if I discover the root cause.

Update #1: The differing values were caused by an error in the example code. The threaded function was passed different values than the non-threaded version.

I'm still tracking down why multiprocessing triggers the exception. I've reduced the problem to the following example:

import functools  # needed by thread_function when extra_kwargs is given
import multiprocessing

import numpy
import sympy

def thread_function(func, data, output_progress=True, extra_kwargs=None, num_procs=None):
    if extra_kwargs:
        func = functools.partial(func, **extra_kwargs)

    if not num_procs:
        num_procs = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=num_procs)
    results = pool.map_async(func, data)
    pool.close()

    pool.join()
    return results.get()

def test_fn(data):
    return sympy.Point(0,1).evalf()

if __name__ == '__main__':
    test_size = 10
    print([test_fn(None) for i in range(1, test_size)])  # <--- this one runs without errors
    print(thread_function(test_fn, [None] * (test_size - 1)))  # <--- this line causes the problem
