How to fix a NameNode that will not start in Hadoop

The Hadoop cluster had always started up fine before, but today, when I went to use Hive, I found that the NameNode would not start. Checking the log:
2020-04-14 22:25:31,793 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Remote journal 192.168.52.110:8485 failed to write txns 1152-1152. Will try to write to this JN again after the next log roll.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.qjournal.protocol.JournalOutOfSyncException): IPC serial 477 from client /192.168.52.100 was not higher than prior highest IPC serial 496
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkSync(Journal.java:485)
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:439)
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:457)
at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:352)
at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:149)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)

at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1409)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy11.journal(Unknown Source)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolTranslatorPB.journal

A Baidu search suggested that my hdfs-site.xml configuration was wrong, but since the cluster had been running fine right up until today, that could not be the cause.

Then I read the log again more carefully and noticed it kept pointing at the JournalNodes. The JournalNodes are what keep the NameNode's edit-log metadata in sync, and because my virtual machines are themselves quite unstable, the synchronization can occasionally get out of step. Here the log shows that the serial number being written (477) is lower than the highest serial the JournalNode has already accepted (496); in other words, the NameNode is trying to push stale, outdated data on top of newer data, which is why the write is rejected.
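
Before changing anything, it is worth confirming the state of the JournalNodes directly on each journal host. This is only a minimal check, and it assumes the default log location under $HADOOP_HOME/logs (adjust the path to your installation):

jps | grep JournalNode                                      # the JournalNode process should be running on every journal host
tail -n 50 $HADOOP_HOME/logs/hadoop-*-journalnode-*.log     # look for the same JournalOutOfSyncException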

So I first went to the 192.168.52.100 (cdh01) machine and located the JournalNode data directory.
It is configured in hdfs-site.xml by the following property:

<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/home/hadoop/data/journaldata/jn</value>
</property>
Inside that directory there are two entries, current and in_use.lock, and both need to be deleted.
Run: rm -rf current and rm -rf in_use.lock
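
For reference, the whole fix on cdh01 looks roughly like the sequence below. This is only a sketch: it assumes the stock sbin/hadoop-daemon.sh scripts are on the PATH (a Cloudera-Manager-managed cluster restarts roles differently), and depending on the layout the two entries may sit in a per-nameservice subdirectory of the journal directory:

hadoop-daemon.sh stop journalnode                 # stop the out-of-sync JournalNode first
cd /home/hadoop/data/journaldata/jn               # the dfs.journalnode.edits.dir from above
rm -rf current in_use.lock                        # remove the stale edits and the lock file
hadoop-daemon.sh start journalnode                # bring the JournalNode back up
hadoop-daemon.sh start namenode                   # then start the NameNode again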

After that, checking again, the NameNode came up fine.
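
A quick way to double-check, where nn1 is only a placeholder for whatever NameNode ID is configured under dfs.ha.namenodes.<nameservice> in hdfs-site.xml:

jps | grep -E 'NameNode|JournalNode'              # both processes should show up
hdfs haadmin -getServiceState nn1                 # should print active or standby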