How to validate the image format in a Django ImageField

Time: 2022-07-03 00:25:04

Our project uses Python 2.7, PIL 1.1.7 and Django 1.5.1. There is an ImageField which works OK for many image formats, including bmp, gif, ico, pnm, psd, tif and pcx. However the requirement is to only allow png or jpg images. How can it be done?

Update: I know I can validate the file extension and the HTTP Content-Type header, but neither method is reliable. What I'm asking is whether there's a way to check that the uploaded file's content is actually png/jpg.

3 solutions

#1


5  

You don't say whether you're using a Django form to upload the image; I assume so, since the validation is carried out in the form field.

What you could do is create a subclass of django.forms.fields.ImageField to extend the functionality of to_python.

The file type check that Django currently carries out in to_python looks like this:

Image.open(file).verify()

Your subclass could look something like this:

from io import BytesIO

from django.core.exceptions import ValidationError
from django.forms import ImageField


class DmitryImageField(ImageField):

    def to_python(self, data):
        # The parent class runs Django's own Image.open(file).verify() check.
        f = super(DmitryImageField, self).to_python(data)
        if f is None:
            return None

        try:
            from PIL import Image
        except ImportError:
            import Image

        # We need to get a file object for PIL. We might have a path or we might
        # have to read the data into memory.
        if hasattr(data, 'temporary_file_path'):
            image_file = data.temporary_file_path()
        elif hasattr(data, 'read'):
            image_file = BytesIO(data.read())
        else:
            image_file = BytesIO(data['content'])

        try:
            im = Image.open(image_file)
        except ImportError:
            # Under PyPy, it is possible to import PIL. However, the underlying
            # _imaging C module isn't available, so an ImportError will be
            # raised. Catch and re-raise.
            raise
        except Exception:  # Python Imaging Library doesn't recognize it as an image
            raise ValidationError(self.error_messages['invalid_image'])

        # Check the format outside the try block; otherwise the generic
        # "except Exception" handler would swallow this more specific error.
        if im.format not in ('BMP', 'PNG', 'JPEG'):
            raise ValidationError("Unsupported image type. Please upload bmp, png or jpeg.")

        if hasattr(f, 'seek') and callable(f.seek):
            f.seek(0)
        return f

You may notice that this duplicates most of the code in ImageField.to_python. You might prefer to subclass FileField and use that instead of ImageField, rather than subclassing ImageField and repeating much of its work. In that case, make sure to call im.verify() before the format check.

EDIT: I should point out that I've not tested this subclass.

#2


1  

You will probably want to use the os module for this. From the Python docs:

os.path.splitext(path) Split the pathname path into a pair (root, ext) such that root + ext == path, and ext is empty or begins with a period and contains at most one period. Leading periods on the basename are ignored; splitext('.cshrc') returns ('.cshrc', ''). Changed in version 2.6: Earlier versions could produce an empty root when the only period was the first character.

Example:

import os

# Split the filename into its root and its extension (including the dot).
file_name, file_extension = os.path.splitext('yourImage.png')

print(file_name)       # yourImage
print(file_extension)  # .png

Once the extension is separated from the filename, a simple string comparison tells you whether it is one of the allowed formats.
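
The comparison described above could be sketched roughly as follows. The helper name has_allowed_extension and the ALLOWED_EXTENSIONS set are illustrative assumptions, not part of the original answer, and the check inspects only the filename, never the file content.

```python
import os

# Hypothetical whitelist (an assumption): the question only wants png and jpg.
ALLOWED_EXTENSIONS = {'.png', '.jpg', '.jpeg'}


def has_allowed_extension(filename):
    """Return True if the filename ends in a whitelisted extension."""
    ext = os.path.splitext(filename)[1]
    # Compare case-insensitively so that 'PHOTO.JPG' is accepted too.
    return ext.lower() in ALLOWED_EXTENSIONS
```

Lowercasing before the lookup is a design choice for this sketch; uploads often arrive with upper-case extensions, and a case-sensitive comparison would reject them.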

#3


1  

You can use python-magic, a ctypes wrapper around libmagic, the library used by the file command on Linux.

From its documentation:

>>> import magic
>>> magic.from_file("testdata/test.pdf")
'PDF document, version 1.2'
>>> magic.from_buffer(open("testdata/test.pdf").read(1024))
'PDF document, version 1.2'
>>> magic.from_file("testdata/test.pdf", mime=True)
'application/pdf'

However, this method only looks at MIME information. You can still upload an invalid PNG that carries the correct MIME type, or embed unauthorized data in the file's metadata.
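
To make that limitation concrete, here is a minimal standard-library sketch of a signature check on the first bytes of a file, which is the general kind of test that content sniffing relies on. It is not python-magic's implementation; the function name sniff_image_type is hypothetical, and only the PNG signature and the JPEG start-of-image marker are examined.

```python
# Leading bytes defined by the PNG and JPEG file formats.
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'
JPEG_SOI = b'\xff\xd8\xff'


def sniff_image_type(header):
    """Return 'png', 'jpeg', or None based on the first bytes of a file."""
    if header.startswith(PNG_SIGNATURE):
        return 'png'
    if header.startswith(JPEG_SOI):
        return 'jpeg'
    # Anything else (gif, bmp, arbitrary data) is not accepted.
    return None
```

In a Django field this could presumably be called as sniff_image_type(data.read(8)) followed by data.seek(0). Because it reads only the header, a truncated or corrupt file with the right first bytes still passes, so it complements a full decode with PIL rather than replacing it.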
