Convert text to hex on Mac
Question:
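I am trying to convert text to hex characters in order to build a dmg on Mac.
I am getting burned by the fact that, for characters above 127, a hex value does not point to the same character on Mac and on Windows. It also seems that the basic JavaScript functions only give the "Windows" translation.
I need the "Mac" translation to hex...
So far, I do this: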
// Read the localized data, then hex-encode its label
const fileData = await parseJson(readFile(item.file, "utf-8"))
const buttonsStr = labelToHex(fileData.lang)

function labelToHex(label: string) {
  return hexEncode(label).toString().toUpperCase()
}

function hexEncode(str: string) {
  let i
  let result = ""
  for (i = 0; i < str.length; i++) {
    result += unicodeToHex(str.charCodeAt(i))
  }
  return result
}

// Keeps only the last two hex digits of the UTF-16 code unit
function unicodeToHex(unicode: number) {
  const hex = unicode.toString(16)
  return ("0" + hex).slice(-2)
}
If I pass: Françaiséàè
I get: 46 72 61 6E E7 61 69 73 E9 E0 E8
But when I read it back on the Mac, I get: FranÁaisÈ‡Ë
I was expecting to get: 46 72 61 6E 8D 61 69 73 8E 88 8F
so that reading it back gives: Françaiséàè
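One way to check the two byte sequences above is to decode them with a Mac OS Roman table. The sketch below is only an illustration: `MAC_ROMAN_DEC` is a hand-written excerpt covering just the bytes that appear in this question, and `decodeMacRomanHex` is a hypothetical helper, not part of any library.

```typescript
// Hypothetical, hand-written excerpt of the Mac OS Roman decoding table.
// It lists only the non-ASCII bytes that appear in this question.
const MAC_ROMAN_DEC: Record<number, string> = {
  0x88: "à", 0x8d: "ç", 0x8e: "é", 0x8f: "è",
  0xe0: "‡", 0xe7: "Á", 0xe8: "Ë", 0xe9: "È",
}

// Decode a space-separated hex string byte by byte.
// Bytes below 0x80 are plain ASCII; bytes missing from the excerpt become "?".
function decodeMacRomanHex(hex: string): string {
  let out = ""
  for (const part of hex.trim().split(/\s+/)) {
    const byte = parseInt(part, 16)
    if (byte < 0x80) {
      out += String.fromCharCode(byte)
    } else {
      const ch = MAC_ROMAN_DEC[byte]
      out += ch === undefined ? "?" : ch
    }
  }
  return out
}

// The bytes produced by charCodeAt, read as Mac OS Roman
const actual = decodeMacRomanHex("46 72 61 6E E7 61 69 73 E9 E0 E8") // "FranÁaisÈ‡Ë"
// The bytes that were expected, read as Mac OS Roman
const wanted = decodeMacRomanHex("46 72 61 6E 8D 61 69 73 8E 88 8F") // "Françaiséàè"
```

The first sequence holds the Latin-1 values that `charCodeAt` returns for these characters, which is why it only reads correctly where the bytes are interpreted as Latin-1 or Windows-1252.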
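This matches these charts:
https://academic.evergreen.edu/projects/biophysics/technotes/program/ascii_ext-mac.htm https://academic.evergreen.edu/projects/biophysics/technotes/program/ascii_ext-pc.htm
Still, I have not been able to find an npm package that converts to hex depending on the OS. Or is there some obscure JS function I have not discovered yet?
I am running out of ideas, and am about to just do: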
function unicodeToHex(unicode: number) {
  if (unicode < 128) {
    const hex = unicode.toString(16)
    return ("0" + hex).slice(-2)
  }
  if (unicode === 233) { return "8E" } // é
  if (unicode === 224) { return "88" } // à
  if (unicode === 232) { return "8F" } // è
  return "3F" // ?
}
but I would really like to avoid that...
Answer
I found a way to do this thanks to the codepage package, as @Keith mentioned:
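const cptable = require("codepage")

// Note: debug() is assumed to be a logger defined elsewhere (for example via the debug package)
function hexEncode(str: string, lang: string, langWithRegion: string) {
  let code
  let hex
  let i
  const macCodePages = getMacCodePage(lang, langWithRegion)
  let result = ""
  for (i = 0; i < str.length; i++) {
    try {
      code = getMacCharCode(str, i, macCodePages)
      if (code === undefined) {
        hex = "3F" // ? for characters missing from the code page
      } else {
        hex = code.toString(16)
        // Pad to an even number of digits so codes below 0x10 still fill a whole byte
        if (hex.length % 2 === 1) {
          hex = "0" + hex
        }
      }
      result += hex
    } catch (e) {
      debug("there was a problem while trying to convert a char to hex: " + e)
      result += "3F" // ?
    }
  }
  return result
}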
// Pick the classic Mac OS code page(s) to try for a given language
function getMacCodePage(lang: string, langWithRegion: string) {
  switch (lang) {
    case "ja": // Japanese
      return [10001] // Apple Japanese
    case "zh": // Chinese
      if (langWithRegion === "zh_CN") {
        return [10008] // Apple Simplified Chinese (GB 2312)
      }
      return [10002] // Apple Traditional Chinese (Big5)
    case "ko": // Korean
      return [10003] // Apple Korean
    case "ar": // Arabic
    case "ur": // Urdu
      return [10004] // Apple Arabic
    case "he": // Hebrew
      return [10005] // Apple Hebrew
    case "el": // Greek
    case "elc": // Greek
      return [10006] // Apple Greek
    case "ru": // Russian
    case "be": // Belarusian
    case "sr": // Serbian
    case "bg": // Bulgarian
    case "uz": // Uzbek
      return [10007] // Apple Macintosh Cyrillic
    case "ro": // Romanian
      return [10010] // Apple Romanian
    case "uk": // Ukrainian
      return [10017] // Apple Ukrainian
    case "th": // Thai
      return [10021] // Apple Thai
    case "et": // Estonian
    case "lt": // Lithuanian
    case "lv": // Latvian
    case "pl": // Polish
    case "hu": // Hungarian
    case "cs": // Czech
    case "sk": // Slovak
      return [10029] // Apple Macintosh Central Europe
    case "is": // Icelandic
    case "fo": // Faroese
      return [10079] // Apple Icelandic
    case "tr": // Turkish
      return [10081] // Apple Turkish
    case "hr": // Croatian
    case "sl": // Slovenian
      return [10082] // Apple Croatian
    default:
      return [10000] // Apple Macintosh Roman
  }
}
// Returns the code of str[i] in a Mac code page, or undefined if no page has it
function getMacCharCode(str: string, i: number, macCodePages: any) {
  let code = str.charCodeAt(i)
  let j
  if (code < 128) {
    // ASCII is used unchanged, so the char code is kept as is
  } else if (code < 256) {
    // Code page 10000 = Mac OS Roman
    code = cptable[10000].enc[str[i]]
  } else {
    // Try each candidate code page until one knows the character
    for (j = 0; j < macCodePages.length; j++) {
      code = cptable[macCodePages[j]].enc[str[i]]
      if (code !== undefined) {
        break
      }
    }
  }
  return code
}
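The lookup-with-fallback idea in the code above can be tried without installing anything. The following is a minimal sketch under stated assumptions: `ENC_EXCERPT` is a hand-written stand-in for `cptable[10000].enc` that contains only four letters (the real table in the codepage package covers the whole code page), and `encodeToMacHex` is a hypothetical name.

```typescript
// Stand-in for cptable[10000].enc: a hand-written excerpt, NOT the real table.
const ENC_EXCERPT: Record<string, number> = {
  "ç": 0x8d,
  "é": 0x8e,
  "à": 0x88,
  "è": 0x8f,
}

function encodeToMacHex(str: string): string {
  let result = ""
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i)
    // ASCII maps to itself; everything else goes through the table
    const code: number | undefined = unit < 0x80 ? unit : ENC_EXCERPT[str.charAt(i)]
    if (code === undefined) {
      result += "3F" // "?" for characters the table does not know
      continue
    }
    let hex = code.toString(16).toUpperCase()
    // Pad to an even number of digits so every code fills whole bytes
    if (hex.length % 2 === 1) {
      hex = "0" + hex
    }
    result += hex
  }
  return result
}

const french = encodeToMacHex("Françaiséàè") // "4672616E8D6169738E888F"
const mixed = encodeToMacHex("a€\n")          // "613F0A"
```

The second call shows the two edge cases: a character missing from the table falls back to `3F`, and a code below 0x10 (the newline) is padded to two digits.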
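charCodeAt returns a Unicode value, which involves a lot more than your 1-byte trim can handle. If you just want the hex value of a string in Node, try this -> `new Buffer('Françaiséàè').toString('hex')` = '4672616ec3a761697320c3a9c3a0c3a8', i.e. 16 bytes for your 11-character string. – Keith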
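To illustrate the point about the 1-byte trim, here is a small sketch that reuses the trimming logic from the question's `unicodeToHex`. The helper name `trimmedHex` is hypothetical; the point is only that two different characters can collapse to the same two hex digits.

```typescript
// Same trimming logic as the question's unicodeToHex:
// keep only the last two hex digits of the UTF-16 code unit.
function trimmedHex(ch: string): string {
  const hex = ch.charCodeAt(0).toString(16)
  return ("0" + hex).slice(-2)
}

const eAcute = trimmedHex("é")  // U+00E9 -> "e9"
const notSign = trimmedHex("¬") // U+00AC -> "ac"
const euro = trimmedHex("€")    // U+20AC -> "ac", the high byte 0x20 is lost
```

For characters up to U+00FF the result happens to equal the Latin-1 byte, which is consistent with the "Windows" values seen in the question. Anything above that is silently truncated.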
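That seems logical, but it still does not work in my case. I get garbage instead of éàè: √ß√˘ß©ß®_ which makes me think it might be because the dmg does not know it is UTF-8? – ether
'the dmg does not know it is utf8' is quite likely... If so, there are tools to convert UTF-8 to a chosen code page, so if you can find out which code page the dmg file uses, you could use something like -> https://www.npmjs.com/package/codepage – Keith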
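Garbage containing "√" followed by symbols such as "ß" or "©" is what one would expect if UTF-8 bytes were being interpreted as Mac OS Roman. The sketch below reproduces that effect under stated assumptions: `utf8Bytes` is a hand-rolled encoder that only handles code points below U+0800, `MAC_ROMAN_EXCERPT` is a hand-written excerpt with just the bytes needed here, and both names are hypothetical.

```typescript
// Encode a single character below U+0800 as UTF-8 by hand
// (enough for accented Latin letters; larger code points are rejected).
function utf8Bytes(ch: string): number[] {
  const cp = ch.charCodeAt(0)
  if (cp < 0x80) {
    return [cp]
  }
  if (cp < 0x800) {
    return [0xc0 | (cp >> 6), 0x80 | (cp & 0x3f)]
  }
  throw new Error("code point outside the range handled by this sketch")
}

// Hand-written excerpt of the Mac OS Roman decoding table (only the bytes used below).
const MAC_ROMAN_EXCERPT: Record<number, string> = {
  0xa0: "†", 0xa7: "ß", 0xa8: "®", 0xa9: "©", 0xc3: "√",
}

// Encode the text as UTF-8, then read each byte back as Mac OS Roman.
function misreadAsMacRoman(text: string): string {
  let out = ""
  for (let i = 0; i < text.length; i++) {
    for (const byte of utf8Bytes(text.charAt(i))) {
      if (byte < 0x80) {
        out += String.fromCharCode(byte)
      } else {
        const ch = MAC_ROMAN_EXCERPT[byte]
        out += ch === undefined ? "?" : ch
      }
    }
  }
  return out
}

const garbled = misreadAsMacRoman("çéàè") // "√ß√©√†√®"
```

Each accented letter becomes two UTF-8 bytes, and each of those bytes shows up as its own Mac OS Roman character, which supports the idea that the text has to be converted to the target code page before it is hex-encoded.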