WordPress: special characters in file names
Question:
I named a file Glacière_Service-de-lEducation-Ambassade-Chine_map.png.
The full path should be http://example.com/.../Glacie%CC%80re_Service-de-lEducation-Ambassade-Chine_map.png (here è is encoded as e followed by %CC%80).
However, the path is interpreted as http://example.com/.../Glaci%C3%A8re_Service-de-lEducation-Ambassade-Chine_map.png (here è = %C3%A8), so the image is not displayed after the post is published.
Why does è have two different encodings?
Answer:
Note the difference:
     ↓
Glacie%CC%80re_Service-de-lEducation-Ambassade-Chine_map.png
Glaci%C3%A8re_Service-de-lEducation-Ambassade-Chine_map.png
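The first name uses the decomposed form of è (the letter e followed by U+0300 COMBINING GRAVE ACCENT, UTF-8 bytes CC 80), while the second uses the precomposed form (U+00E8 LATIN SMALL LETTER E WITH GRAVE, UTF-8 bytes C3 A8). Both render identically but are different code point sequences. Read about Normalization Forms in Unicode® Standard Annex #15: Unicode Normalization Forms.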
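As a minimal sketch of that equivalence, the two percent-encoded names from the question can be decoded and compared before and after normalization. The variable names here are illustrative only, and the snippet says nothing about what WordPress itself does internally.

```python
import unicodedata
from urllib.parse import unquote

# The two percent-encoded file names from the question.
decomposed = unquote("Glacie%CC%80re_Service-de-lEducation-Ambassade-Chine_map.png")
precomposed = unquote("Glaci%C3%A8re_Service-de-lEducation-Ambassade-Chine_map.png")

# As raw code point sequences the two names differ ...
same_raw = decomposed == precomposed

# ... but they are canonically equivalent: both normalize to the same NFC string.
same_nfc = (unicodedata.normalize("NFC", decomposed)
            == unicodedata.normalize("NFC", precomposed))

print(len(decomposed), len(precomposed))
print(same_raw, same_nfc)
```

The decomposed name is one code point longer, which is why a byte-for-byte path lookup can fail even though both names look the same on screen.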
Unfortunately, I don't speak PHP; however, the following Python example may help:
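import unicodedata, urllib
from urllib import parse

# Precomposed character: U+00E8 LATIN SMALL LETTER E WITH GRAVE
x = unicodedata.lookup('Latin Small Letter E With Grave')
print(x, len(x))
# Decompose it into 'e' followed by U+0300 COMBINING GRAVE ACCENT
y = unicodedata.normalize('NFKD', x)
print(y, len(y))
# Print each code point, its percent-encoding, and its Unicode name
for char in (x + ' ' + y):
    print(char, urllib.parse.quote(char, safe='/'), unicodedata.name(char, '?'))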
Result:
==> python
Python 3.5.1 (v3.5.1:37a07cee5969, Dec 6 2015, 01:54:25) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata,urllib
>>> from urllib import parse
>>>
>>> x = unicodedata.lookup('Latin Small Letter E With Grave')
>>> print(x, len(x))
è 1
>>>
>>> y = unicodedata.normalize('NFKD', x)
>>> print(y, len(y))
è 2
>>>
>>> for char in (x + ' ' + y):
...     print(char, urllib.parse.quote(char, safe='/'), unicodedata.name(char, '?'))
...
è %C3%A8 LATIN SMALL LETTER E WITH GRAVE
%20 SPACE
e e LATIN SMALL LETTER E
̀ %CC%80 COMBINING GRAVE ACCENT
>>>
>>>
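A screenshot of the result was added because I cannot prevent NFKC normalization of the decomposed è string in the code sample above; see the output of print(y, len(y)), where the string has length 2 (e followed by a combining grave accent) even if it is displayed as a single precomposed è.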
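One possible workaround, assuming the mismatch comes from the file being stored under a decomposed name while the URL uses the precomposed one, is to bring names to a single normalization form before percent-encoding them. The sketch below is in Python rather than PHP; `to_nfc_url_name` is a hypothetical helper and the shortened file names are made up for illustration.

```python
import unicodedata
from urllib.parse import quote


def to_nfc_url_name(filename: str) -> str:
    """Hypothetical helper: normalize a file name to NFC, then percent-encode it."""
    composed = unicodedata.normalize("NFC", filename)
    # safe="" also encodes "/", since this is a single path segment.
    return quote(composed, safe="")


# Illustrative names: the same visible text in decomposed and precomposed form.
nfd_name = "Glacie\u0300re_map.png"   # e + U+0300 COMBINING GRAVE ACCENT
nfc_name = "Glaci\u00e8re_map.png"    # U+00E8 LATIN SMALL LETTER E WITH GRAVE

print(to_nfc_url_name(nfd_name))  # Glaci%C3%A8re_map.png
print(to_nfc_url_name(nfc_name))  # Glaci%C3%A8re_map.png
```

This only makes the encoded names agree; the file on disk would still have to be stored (or renamed) in the same form for the URL to resolve.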