Server-side includes and character encoding

Date: 2021-12-19 16:18:15

I created a static website in which each page has the following structure:

  1. Common stuff like header, menu, etc.
  2. Page-specific stuff in the main content div
  3. Footer

In this website, all the common content was duplicated in each page. To improve maintainability, I refactored the pages to use server-side includes (SSI) so that the common content is no longer duplicated. The structure of each page is now:

  1. SSI for common stuff like header, menu, etc.
  2. Page-specific stuff in the main content div
  3. SSI for footer

In the refactored site, for some reason the French characters no longer display properly in the page-specific content area, though they display fine in the content included via SSIs.

The included header specifies the character set as:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

If I open one of the main content pages in a browser, it tells me that the character encoding is ISO-8859-1. I've tried adding a .htaccess file to the folder with the lines:

AddDefaultCharset UTF-8
AddCharset UTF-8 .shtml
AddCharset UTF-8 .html

But still those pesky French accents aren't displaying properly on the version of the site that uses SSIs.

3 solutions

#1


You are serving your pages as UTF-8, which is good, but at least some of the page content is being pulled in from files that are not actually saved as UTF-8. SSI just inserts the raw bytes; it doesn't attempt to re-encode the includes so that their charset matches the file they're being included into.
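To make the raw-byte behaviour concrete, here is a minimal Python sketch. It is not how the server is implemented; it only imitates the effect of joining two files without re-encoding. The fragment contents and the assumption that the include is UTF-8 while the page body is ISO-8859-1 are made up for illustration.

```python
# Hypothetical fragments: the include is saved as UTF-8,
# the page-specific body is saved as ISO-8859-1.
header_bytes = "<h1>Présentation</h1>\n".encode("utf-8")
body_bytes = "<p>Déjà vu, café</p>\n".encode("iso-8859-1")

# SSI-style assembly: the raw bytes are joined with no re-encoding.
response = header_bytes + body_bytes

# A browser that trusts "charset=UTF-8" decodes every byte as UTF-8,
# so the ISO-8859-1 accents in the body become replacement characters.
as_utf8 = response.decode("utf-8", errors="replace")

# A browser that decodes as ISO-8859-1 instead shows the body correctly
# but turns the UTF-8 accents in the header into two-character mojibake.
as_latin1 = response.decode("iso-8859-1")

print(as_utf8)
print(as_latin1)
```

Whichever single encoding the browser picks, one of the two fragments comes out wrong, which is why the files themselves have to agree.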

You need to go through all your HTML and include files in a text editor and make sure each one is saved as UTF-8.
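If there are too many files to check by hand, the same check can be scripted. The following is a rough sketch, assuming the pages use the .html and .shtml extensions mentioned in the question and that any file that fails the check is currently ISO-8859-1 (it could just as well be Windows-1252, so inspect a file before converting it). The function names are made up for this example.

```python
from pathlib import Path


def find_non_utf8(root, suffixes=(".html", ".shtml")):
    """Return paths under root whose bytes are not valid UTF-8."""
    bad = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            try:
                path.read_bytes().decode("utf-8")
            except UnicodeDecodeError:
                bad.append(path)
    return bad


def convert_to_utf8(path, source_encoding="iso-8859-1"):
    """Rewrite one file as UTF-8, assuming it is currently in source_encoding."""
    text = Path(path).read_bytes().decode(source_encoding)
    Path(path).write_bytes(text.encode("utf-8"))
```

Calling find_non_utf8 on the site folder lists the suspects, and convert_to_utf8 can then be run on each one after taking a backup. Pure-ASCII files pass the check, since ASCII is a subset of UTF-8.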

As John mentioned, you can avoid encoding issues by using character references for all non-ASCII characters, but it's a tremendous pain.
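For anyone who does want to go the character-reference route, the conversion does not have to be done by hand. A small sketch using Python's built-in xmlcharrefreplace error handler (the helper name is made up here):

```python
def to_char_refs(text):
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(to_char_refs("Déjà vu au café"))  # D&#233;j&#224; vu au caf&#233;
```

The output is pure ASCII, so it displays the same whether the page is read as UTF-8 or ISO-8859-1, at the cost of much less readable source files.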

#2


Your HTML document is using UTF-8 encoding; try these character codes for your accented letters: http://www.tony-franks.co.uk/UTF-8.htm

#3


I had the same problem as you and finally found a solution that fixed it.

See the related question "UTF8 makes an extra line on my site".

Save all your files as UTF-8 without BOM (http://en.wikipedia.org/wiki/Byte_order_mark).
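If your editor has no "UTF-8 without BOM" option, the BOM can also be stripped after the fact. A minimal sketch, assuming the files are already UTF-8 and only the three leading BOM bytes need to go (the function name is hypothetical):

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from a file; return True if one was removed."""
    data = Path(path).read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        Path(path).write_bytes(data[len(codecs.BOM_UTF8):])
        return True
    return False
```

Since an include is inserted as raw bytes, a BOM at the start of an included file would presumably land in the middle of the assembled page, which may be where the extra line comes from.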
