Node.js - Segfault：11与相当大的阵列

问题描述：

这里的第二篇文章，这真的让我挠了挠头。我有一个处理数组的函数来试图找到类似的数据。该数组包含1410个元素，我认为它很多，但Node或我的计算机不应该能够处理的东西。Node.js - Segfault：11与相当大的阵列

我的代码给出了“Segmentation Fault：11”错误，我发现它是与内存访问问题有关的，所以我甚至希望尽可能测试我的Mac的RAM，但一切正常。 segfault使调试非常困难，这就是我来到这里的原因。

那里事情错的代码中的位置：

return matchings.map(matchArray => { 
    const namesList = matchArray 
    .map(matchItem => matchItem.name) 
    .sort((a, b) => a.localeCompare(b)) 

    const symbolsList = matchArray 
    .map(matchItem => matchItem.symbol) 
    .sort((a, b) => a.localeCompare(b)) 

    return { 
    name: common.getMode(namesList), 
    symbol: common.getMode(symbolsList), 
    matches: matchArray 
    } 
}).sort((a, b) => a.name.localeCompare(b.name))

哪里matchings是我说的数组。 common.getMode(array)有这样的代码：

array.sort() 

const stats = { 
    top: { 
    name: '', 
    freq: 0 
    }, 
    current: { 
    name: array[0], 
    freq: 1 
    } 
} 

for (let idxName = 1; idxName < array.length; idxName++) { 
    const currentName = array[idxName] 
    const lastName = array[idxName - 1] 

    if (currentName === lastName) { 
    stats.current.freq++ 
    } else { 
    if (stats.current.freq > stats.top.freq) { 
     stats.top.name = stats.current.name 
     stats.top.freq = stats.current.freq 
    } 
    stats.current = { 
     name: currentName, 
     freq: 1 
    } 
    } 
} 

if (stats.current.freq > stats.top.freq) { 
    stats.top.name = stats.current.name 
    stats.top.freq = stats.current.freq 
} 

return stats.top.name

值得一提的是，与小尺寸〜1000的数组进行时，代码工作正常，这使我相信这不是我的代码。网上关于Node的Segfault 11的内容也很少，这没有帮助。

任何想法非常感谢！

感谢您的回复。非工作阵列的JSON在这里：[链接]（https://pastebin.com/SnVJM7xN），工作在这里：[link]（https://pastebin.com/GUYMMs6S）。有没有什么办法可能是因为等待承诺超时，因为该函数需要大约8秒才会发生段错误？ –

答

TL; DR使用tail call optimization消除来自call stack的压力。

编辑（有解释）

见Understanding Javascript Functions Execution...了解call stack，memory heap和queue之间的差异。尽管对象和变量存在于heap陆地中，但函数调用在call stack中引用，第二个数据集耗尽（16'000个堆栈帧）;所以你的算法无法跟上，因为它无法继续分配新的函数调用。

见this StackOverflow answer，它指向有关call stack和this one too进一步的信息，它指向的方式来获得对heap数据。

原来的答复

我可以完全离开，但我很好奇，看看是否转换您循环，以递归可以帮助内存应付。我会在我的盒子上尝试它，但设置一切都很麻烦。

你可以试试吗？它使用扩展运算符和数组解构，因此您可能需要将babel-preset-stage-0添加到您的项目以及.babelrc文件中。

的Javascript

let common = {}; 
common.getMode = (arr, compare_fn) => { 
    const compare = !!compare_fn ? compare_fn : (a, b) => a.localeCompare(b) 
    arr.sort(compare) 

    const stats = { 
    top: { 
     name: '', 
     freq: 0 
    }, 
    current: { 
     name: arr[0], 
     freq: 1 
    } 
    } 

    for (let i=1, imax = arr.length ; i < imax ; ++i) { 
    const currentName = arr[i] 
    const lastName = arr[i - 1] 

    if (currentName === lastName) { 
     stats.current.freq++ 
    } else { 
     if (stats.current.freq > stats.top.freq) { 
     stats.top.name = stats.current.name 
     stats.top.freq = stats.current.freq 
     } 
     stats.current = { 
     name: currentName, 
     freq: 1 
     } 
    } 
    } 

    if (stats.current.freq > stats.top.freq) { 
    stats.top.name = stats.current.name 
    stats.top.freq = stats.current.freq 
    } 

    return stats.top.name 
}; 

const build_prop_list = (prop, input_array, output_array = []) => { 
    if(input_array.length == 0) return output_array; 
    else { 
    const [current, ...tail] = input_array; 
    const new_array = [...output_array, current[prop]]; 
    return build_prop_list(prop, tail, new_array); 
    } 
} 

const work = (input_array, output_array = []) => { 
    if(input_array.length == 0) return output_array.sort((a, b) => a.name.localeCompare(b.name)); 
    else { 
    const [matchArray, ...tail] = input_array; 

    const namesList = build_prop_list("name", matchArray); 
    const symbolsList = build_prop_list("symbol", matchArray); 

    const new_element = { 
     name: common.getMode(namesList), 
     symbol: common.getMode(symbolsList), 
     matches: matchArray 
    }; 

    const new_array = [...output_array, new_element]; 
    return work(tail, new_array); 
    } 
} 

let result = work(insert_your_json_here);

编辑

您也可以申请尾部调用优化您for环路common.getMode(...)。虽然第一次迭代的行为不同，因为lastName未引用数组的姓（索引：-1），但是第一个。看看它是否符合你的需求，并且你会优化你的代码。
这应取代common.getMode(...)中的for循环。

const feed = (input_array) => { 
    if(input_array.length == 0) return; 
    const [lastName, currentName, ...tail] = input_array; 

    if (currentName === lastName) { 
     stats.current.freq++ 
    } else { 
     if (stats.current.freq > stats.top.freq) { 
     stats.top.name = stats.current.name 
     stats.top.freq = stats.current.freq 
     } 
     stats.current = { 
     name: currentName, 
     freq: 1 
     } 
    } 

    return feed(tail); 
    }(arr);

你似乎已经解决了，非常感谢。是否有节点造成这种情况的限制？我正在监视内存使用情况，同时运行我的代码，内存似乎很好。我认为Node应该能够处理这个问题，不是吗？ –

这叫做尾呼优化。基本上，你从堆栈增长到与它的参数数量成比例增长，到堆栈应该增长到一个固定的大小（取决于你在这个函数中分支多少次或者调用其他函数）。我会尽力为你提供一些文献。我不认为它依赖于Node。这是一个会在任何语言和计算机上出现的问题，恕我直言。你也可以在'common.getMode（...）'中优化你的循环，但是第一次迭代的行为会改变。我也会编辑它。 –

请参阅[ECMAScript 6中的尾部呼叫优化]（http://2ality.com/2015/06/tail-call-optimization.html）和[JS ES6递归尾部呼叫优化]（https://medium.com/@ mlaythe/JS-ES6递归尾叩优化-feaf2dada3f6）。这个想法是不要在你的函数结尾分支到另一个函数或者评估中，并且使用递归为你做循环。如果它对你有帮助，你也可以提出答案。 –

Node.js - Segfault：11与相当大的阵列

相关推荐