UTF-8 URIs blow up Apache and mod_rewrite

Date: 2021-05-29 11:18:32

I have Apache with mod_rewrite, and whenever I enter a URI with an accented character in it, Apache gives me a "Page Not Found" error.

The URI is: /places/tags/Café

My page encoding is UTF-8. My database connection and tables are UTF-8. My Apache default charset (AddDefaultCharset) is UTF-8. Yes, Apache has language packs, but I believe they're there for page content, not URIs.

We'd prefer not to have the URL percent-encoded or turned into HTML entities, and stripping out the special characters isn't practical at the moment across our 100 million rows of data.

Any help would be greatly appreciated.

1 answer

#1


It turns out I had a bad Apache rewrite rule. I had been using ([a-zA-Z0-9_-]), and non-ASCII characters such as é are not part of a-zA-Z. I changed the rule to (.), which matches any character (ASCII, UTF-8, or otherwise). It appears to work fine.
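For anyone who wants to see why the character class fails, here is a rough Python sketch of the same matching, not the actual Apache config. It assumes the pattern is matched against the percent-decoded path as raw bytes, so é arrives as the two UTF-8 bytes C3 A9. The /places/tags/ prefix, the anchors, and the + quantifier are my own illustrative additions, since the full rule isn't shown above, and the [^/]+ variant is just a tighter alternative I'm suggesting, not something from the original rule.

```python
import re
from urllib.parse import unquote_to_bytes

# The path as it travels on the wire: "é" is percent-encoded UTF-8 (C3 A9).
raw_path = "/places/tags/Caf%C3%A9"

# Decode the percent-escapes to raw bytes, approximating the decoded
# URL-path that a rewrite pattern is matched against.
decoded = unquote_to_bytes(raw_path)

# ASSUMPTION: prefix, anchors and "+" are illustrative; only the character
# classes themselves come from the rule quoted in the answer.
ascii_only = re.compile(rb"^/places/tags/([a-zA-Z0-9_-]+)$")
any_char = re.compile(rb"^/places/tags/(.+)$")
# Hypothetical tighter variant: anything except a slash.
no_slash = re.compile(rb"^/places/tags/([^/]+)$")

# The ASCII-only class stops at byte 0xC3, so the anchored match fails.
ascii_match = ascii_only.match(decoded)

# "." accepts any byte (except newline), so the whole tag is captured.
any_match = any_char.match(decoded)
tag = any_match.group(1).decode("utf-8")

print("ascii-only matched:", ascii_match is not None)
print("any-char captured:", tag)
print("no-slash matched:", no_slash.match(decoded) is not None)
```

One thing to keep in mind: a bare dot also matches slashes, so a pattern like that will happily swallow extra path segments. If that matters for your URLs, something like the no-slash variant may be a safer starting point.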
