WordPress: special characters in file names

Question:

I named a file Glacière_Service-de-lEducation-Ambassade-Chine_map.png.

The full path should be http://example.com/.../Glacie%CC%80re_Service-de-lEducation-Ambassade-Chine_map.png (here è is encoded as e%CC%80, the letter e followed by a combining grave accent).

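However, the path is interpreted as http://example.com/.../Glaci%C3%A8re_Service-de-lEducation-Ambassade-Chine_map.png, so the image is not displayed after the post is published (here è is encoded as %C3%A8, a single precomposed character).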

Why are there two different encodings?

Note the difference:

     ↓
Glacie%CC%80re_Service-de-lEducation-Ambassade-Chine_map.png 
Glaci%C3%A8re_Service-de-lEducation-Ambassade-Chine_map.png 

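Read about normalization forms in Unicode® Standard Annex #15: Unicode Normalization Forms. The first URL contains the decomposed form of è (e followed by U+0300 COMBINING GRAVE ACCENT, as in NFD), while the second contains the precomposed form (U+00E8, as in NFC).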
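To check which form each of the two names from the question is in, something like the following sketch could be used. It only relies on the Python standard library; the helper name form_of is made up for this illustration and is not part of any library.

```python
import unicodedata
from urllib.parse import unquote

# The two percent-encoded file names quoted in the question.
decomposed = unquote("Glacie%CC%80re_Service-de-lEducation-Ambassade-Chine_map.png")
composed = unquote("Glaci%C3%A8re_Service-de-lEducation-Ambassade-Chine_map.png")

def form_of(text):
    """Hypothetical helper: list which of NFC / NFD the text already satisfies."""
    return [f for f in ("NFC", "NFD") if unicodedata.normalize(f, text) == text]

print(form_of(decomposed), len(decomposed))
print(form_of(composed), len(composed))

# The two strings look identical on screen but are not equal code point by code point...
print(decomposed == composed)
# ...until both are brought to the same normalization form.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

The decomposed name is one code point longer than the composed one, which is why a byte-for-byte path lookup can fail even though both names render the same.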

Unfortunately, I don't speak PHP; however, the following Python example may help:

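import unicodedata, urllib
from urllib import parse

# Precomposed form: a single code point, U+00E8
x = unicodedata.lookup('Latin Small Letter E With Grave')
print(x, len(x))

# Decomposed form: 'e' followed by U+0300 COMBINING GRAVE ACCENT
y = unicodedata.normalize('NFKD', x)
print(y, len(y))

# Percent-encode each character and print its Unicode name
for char in (x + ' ' + y):
    print(char, urllib.parse.quote(char, safe='/'), unicodedata.name(char, '?'))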

Result:

==> python 
Python 3.5.1 (v3.5.1:37a07cee5969, Dec 6 2015, 01:54:25) [MSC v.1900 64 bit (AMD64)] on win32 
Type "help", "copyright", "credits" or "license" for more information. 
>>> import unicodedata,urllib 
>>> from urllib import parse 
>>> 
>>> x = unicodedata.lookup('Latin Small Letter E With Grave') 
>>> print(x, len(x)) 
è 1 
>>> 
>>> y = unicodedata.normalize('NFKD', x) 
>>> print(y, len(y)) 
è 2 
>>> 
>>> for char in (x + ' ' + y): 
...     print(char, urllib.parse.quote(char, safe='/'),unicodedata.name(char, '?'))
... 
è %C3%A8 LATIN SMALL LETTER E WITH GRAVE 
    %20 SPACE 
e e LATIN SMALL LETTER E 
̀ %CC%80 COMBINING GRAVE ACCENT 
>>> 
>>> 

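I have added a screenshot of the result because I can't prevent NFKC normalization of the decomposed è in the "è 2" line of the code example above; see the output of print(y, len(y)).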

(screenshot: python example)
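One way to avoid the mismatch, sketched here in Python rather than PHP, is to normalize the file name to a single form before percent-encoding it. The helper quote_normalized is hypothetical, and choosing NFC as the default is an assumption: the post does not say which form the file is actually stored under on the server, so the form would have to match whatever the file system holds.

```python
import unicodedata
from urllib.parse import quote

def quote_normalized(name, form="NFC"):
    """Hypothetical helper: normalize a file name to one Unicode form, then percent-encode it."""
    return quote(unicodedata.normalize(form, name), safe="/")

# The same file name from the question, spelled two ways.
composed = "Glaci\u00e8re_Service-de-lEducation-Ambassade-Chine_map.png"     # è as U+00E8
decomposed = "Glacie\u0300re_Service-de-lEducation-Ambassade-Chine_map.png"  # e + U+0300

# With a fixed target form, both spellings give the same URL component.
print(quote_normalized(composed))
print(quote_normalized(decomposed))

# Asking for NFD instead yields the e%CC%80 spelling for both inputs.
print(quote_normalized(composed, "NFD"))
```

Whichever form is chosen, applying it consistently both when the file is saved and when the URL is built should keep the two sides in agreement.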