Cropping an image in Django causes a large memory increase

Posted: 2021-01-25 21:23:46

I've recently run into a problem with my Django project and memory usage on WebFaction.

Here are the two processes running in memory for this project on WebFaction:

30396  4-20:20:00 13486 
30404  4-20:20:00 13487 

After the view runs, one of the processes grows substantially:

69720  4-20:20:22 13486 
30404  4-20:20:22 13487 

As you can see, the first process more than doubled in memory usage! As this function will be used often, I need to figure out what is happening. I believe I've narrowed it down to the following view (it's a three-step process: upload an image, add details, crop a thumbnail).

Here is the view below. It gets a photo object, loads the image from the file, gets the box coordinates that the user submitted, and then creates a 200 × 200 image. This newly created image is written back to disk with a .thumbnail suffix in the filename, and the photo object is saved.

import os

from PIL import Image
from django.contrib.auth.decorators import login_required
from django.shortcuts import get_object_or_404


@login_required
def upload3(request, photo_pk):
    photo = get_object_or_404(Photo, pk=photo_pk, user=request.user)
    if request.method == "POST":
        form = upload3Form(request.POST)
        if form.is_valid():
            im = Image.open(photo.image.path)
            try:
                box = (form.cleaned_data['x1'], form.cleaned_data['y1'],
                       form.cleaned_data['x2'], form.cleaned_data['y2'])
            except KeyError:
                # crop() expects numeric coordinates, not strings
                box = (0, 0, 1000, 1000)
            cropped = im.crop(box)
            cropped.thumbnail((200, 200), Image.ANTIALIAS)
            result = os.path.splitext(photo.image.path)
            cropped.save(result[0] + '.thumbnail' + result[1])
            photo.status = 3
            photo.save()

Any ideas of what I may be doing wrong would be greatly appreciated.

Update 1: The images used for testing are all JPEGs, with dimensions around 3600 × 2700 and roughly 2 MB each.

2 Answers

#1 (2 votes)

The 2 MB figure is for the compressed JPEG; uncompressed, a 3600 × 2700 truecolor image takes about 38 MB (9,720,000 pixels at 4 bytes per pixel), which is close to the memory increase you are seeing.
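As a sanity check, the arithmetic behind that estimate can be reproduced:

```python
# Decoded-size estimate for the question's test images: the 2 MB JPEG
# figure is the *compressed* size; in memory the pixels are raw.
width, height = 3600, 2700
bytes_per_pixel = 4  # 32-bit truecolor (e.g. RGBA/RGBX)

decoded_bytes = width * height * bytes_per_pixel
print(f"{decoded_bytes / 1e6:.1f} MB")  # → 38.9 MB
```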

This is a known problem with PIL: one could build a "pixel bomb" by sending you a 40000 × 40000 black PNG. Always check the dimensions before loading (or protect the code with a try/except block that handles MemoryError). Also see whether using the im.tile attribute to process the image chunk by chunk gives you a lower memory footprint.

It may be worth checking some alternatives that are said to handle memory better when dealing with larger images:

  • GDAL (Geospatial Data Abstraction Library)
  • OIIO (OpenImageIO)
  • Mahotas (NumPy)

[ update ]

Do you know if there's a way in PIL to release the objects from memory? In theory that would be best for this view, since I need it to keep working as it does, just handling the image better.

  • In order to avoid memory spikes you can detect huge images and try to process them in chunks using im.tile instead of im.crop (unfortunately operating at a lower level).
  • You can delete the intermediate image objects as soon as possible in order to get shorter spikes (using the gc module you can force the garbage collector to clean up).

#2 (0 votes)

After a lot of digging and dead-ends I tried something not suggested anywhere and it worked.

For each object that held a PIL image, I had to delete the object once I was done with it. For example:

im = Image.open(photo.image.path)
try:
    box = (form.cleaned_data['x1'], form.cleaned_data['y1'],
           form.cleaned_data['x2'], form.cleaned_data['y2'])
except KeyError:
    # crop() expects numeric coordinates, not strings
    box = (0, 0, 1000, 1000)
cropped = im.crop(box)
newimage = cropped.resize((form.cleaned_data['dw'], form.cleaned_data['dh']), Image.ANTIALIAS)
del im
del cropped

So once I'm done with an object, I call del on it. That seems to have fixed the problem: memory no longer increases, and I couldn't be happier.

