How can I change a file's MIME type from the terminal?

Time: 2022-11-08 02:52:21

What I'm looking for is a counterpart to file -I (Darwin; -i on Linux).

For example, given:

$ file -I filename.pdf
filename.pdf: application/octet-stream; charset=binary

I would like to be able to do something like this:

$ [someCommand] filename.pdf application/pdf

The result would be that filename.pdf would then be typed as application/pdf.

The reason for the question is that sometimes web servers use the wrong MIME type, which results in programs refusing to open the file. (Most often text/plain, in my experience.)

I've been searching man, the web and this site for about two and a half hours. Tried everything from hex dumps to xattr to text editors.

Your help would very much be appreciated.

Chris

3 Answers

#1


5  

The thing about MIME types is they're almost entirely fictional.

MIME and HTTP ask us to pretend that all of our files have a piece of metadata identifying the "content type". When we send files around the network, the "content type" metadata goes with them, so nobody ever misinterprets the content of a file.

The truth is this metadata doesn't exist. By the time MIME was invented, it was really too late to convince any OS vendors to adopt a new type system for files. Unix had settled on magic numbers, DOS had settled on 3-letter filename suffixes, and classic MacOS had its creator codes and type codes. (MacOS type codes came closest to the MIME model, since they really were separate from both the filename and the content. But type codes were only 4 characters long, so MIME types wouldn't fit.)

Nobody stores MIME-compatible content types in their filesystem. When a MIME message composer or HTTP server wants to send a file, it decides the file type in the traditional way (filename suffix and/or magic number) and maps the result to a MIME type.
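
As a rough illustration of the suffix-to-type mapping described above, here is a minimal sketch using Python's standard mimetypes module. The helper name guess_from_name is made up for this example; the point is only that the lookup never opens the file, so renaming the file changes the answer.

```python
import mimetypes


def guess_from_name(filename: str) -> str:
    """Return the MIME type implied by the filename suffix alone."""
    # guess_type() looks only at the name; the file need not even exist.
    mime_type, _encoding = mimetypes.guess_type(filename)
    # Fall back to the generic binary type when the suffix is unknown.
    return mime_type or "application/octet-stream"


print(guess_from_name("filename.pdf"))
print(guess_from_name("filename"))
```

A sender that works this way will label the same bytes differently depending on what the file happens to be called.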

In contrast to the theory (where MIME eliminates file type guessing), MIME as implemented in practice has moved the "guess file type based on filename suffix and/or magic number" logic from the receiver of the file to the sender. As you have noticed, the sender doesn't usually do a better job than the receiver would have done if forced to figure it out for itself. Frequently in the case of a web server, the server's eagerness to slap a Content-type on a file makes things worse. There's no reason for a web server to know anything about the format of files it serves when it is only being used to distribute them and has no need to interpret their contents.

The file command guesses file type by reading the content and looking for magic numbers and strings. The -I option doesn't change that. It just chooses a different output format.
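
To make the idea of content sniffing concrete, here is a small Python sketch of the same approach. It is not how file is implemented: the SIGNATURES table and the sniff_mime_type name are hypothetical, and the real magic database covers far more formats and checks more than the first few bytes.

```python
import tempfile
from pathlib import Path

# Hypothetical, deliberately tiny signature table for illustration only.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"PK\x03\x04", "application/zip"),
]


def sniff_mime_type(path) -> str:
    """Guess a MIME type from the leading bytes of the file, ignoring its name."""
    with open(path, "rb") as fh:
        head = fh.read(16)
    for magic, mime_type in SIGNATURES:
        if head.startswith(magic):
            return mime_type
    # Nothing matched: report the generic binary type.
    return "application/octet-stream"


with tempfile.TemporaryDirectory() as tmp:
    # The misleading .txt suffix does not matter; only the content is read.
    sample = Path(tmp) / "renamed.txt"
    sample.write_bytes(b"%PDF-1.4\n% stand-in bytes, not a complete PDF\n")
    print(sniff_mime_type(sample))
```

Under this model, the only way to change the reported type is to change the bytes, which is why there is no command that simply sets it.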

To change the Content-Type header that a web server sends for a specific file, you should be looking in your web server's configuration manual. There's nothing you can do to the file itself.
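
What that configuration looks like depends entirely on the server. As one hedged example, assuming you control a small server built on Python's standard http.server module, the suffix table is the extensions_map attribute of the request handler. The class name FixedTypeHandler and the serve helper below are made up for this sketch.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class FixedTypeHandler(SimpleHTTPRequestHandler):
    # Copy the stock suffix table and pin the entry we care about,
    # so .pdf files are always sent as application/pdf.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".pdf": "application/pdf",
    }


def serve(port: int = 8000) -> None:
    """Serve the current directory on localhost until interrupted."""
    ThreadingHTTPServer(("127.0.0.1", port), FixedTypeHandler).serve_forever()
```

Calling serve() starts the server; it is left uncalled here so the snippet can be imported without blocking. Other servers expose the same idea through their own configuration files.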

#2


2  

If you have a PDF and the file --mime-type command reports application/octet-stream rather than application/pdf, the file is corrupted.

PDF readers will open it and ignore the problem, but if you upload the file to a web application, the application will detect the MIME type as application/octet-stream. That can be a problem, mainly if you validate the MIME type (I sometimes run into this in my own application).

For a quick fix, rewrite the file with Ghostscript like this:

gs -o new.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress old.pdf

#3


1  

It's a bit of a category mistake to talk about ‘the MIME type of a file’ – ‘files’ don't have MIME types; only octet streams have them (I'm not necessarily disagreeing with @wumpus-q-wumbley's description of MIME types as ‘fictional’, but this is another way of thinking about it).

MIME stands for Multipurpose Internet Mail Extensions, as originally described in RFC 2045, and MIME types were originally intended to describe what a receiver is supposed to do with the bunch of bytes soon to follow down the wire, in the rest of the email message. They were very naturally repurposed in (for example) the HTTP protocol, to tell a client how to interpret the bytes of the HTTP response whose header carries the MIME type.

The fact that the file command can display a MIME type suggests the further extension of the idea, to act as the key which lets a windowing system look up the name of an application which should be used to open the file.

Thus, if ‘the MIME type of a file’ means anything, it means ‘the MIME type which a web server would prefix to this file if it were to be delivered in response to an HTTP request’ (or something like that). Thought of like that, it's clear that the MIME type is part of the web server's configuration, and not anything intrinsic to the file – a single file might be delivered with various MIME types depending on the URL which retrieves it, and details of the request and configuration. Thus an XHTML file might be delivered as text/html or application/xml or application/octet-stream depending on the details of the HTTP request, the directory the file's located in, or indeed the phase of the moon (the latter would be an unhelpful server configuration).

A web server might have a number of mechanisms for deciding on this MIME type, which might include a lookup table based on any file extension, a .htaccess file, or indeed the output of the file command.

So the answer to your question is: it depends.

  • If what you want to do is change how a web server delivers this file, then you need to look at either your web server documentation, or the contents of your system's /etc/mime.types file (if your system uses that and if the server is configured to fall back on that).
  • If what you want to do is change the application which opens a given (type of) file, then your OS/window-manager documentation should help.
  • If you need to change the output of the file command specifically, for some other reason, then man file is your friend, and you'll probably need to grub around in the magic numbers file, reasonably carefully.
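
For the first case, it can help to see what a mime.types-style table actually contains. The sketch below uses Python's standard mimetypes.read_mime_types on a small temporary file written in the usual "type, then suffixes" layout. The load_type_map helper and the text/x-example entry are invented for illustration; on a real system you might point it at /etc/mime.types, assuming that file exists.

```python
import mimetypes
import tempfile
from pathlib import Path


def load_type_map(mime_types_path) -> dict:
    """Return {'.ext': 'type/subtype'} from a mime.types-style file, or {} if unreadable."""
    # read_mime_types() returns None when the file cannot be opened.
    return mimetypes.read_mime_types(str(mime_types_path)) or {}


with tempfile.TemporaryDirectory() as tmp:
    table = Path(tmp) / "mime.types"
    # Each line: a MIME type followed by the suffixes that map to it.
    table.write_text(
        "# type            suffixes\n"
        "application/pdf   pdf\n"
        "text/x-example    example ex\n",
        encoding="utf-8",
    )
    type_map = load_type_map(table)
    print(type_map[".pdf"])
    print(type_map[".example"])
```

Editing such a table only affects software that consults it; whether your web server does is a question for its documentation.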
