Fully Distributed Deployment of HBase 2.2.2 on Hadoop 3.2.1

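In "HBase in Pseudo-Standalone Mode with an Independently Deployed ZooKeeper" we deployed HBase in standalone mode, which does not meet enterprise requirements. Here we move on to a fully distributed deployment.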

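HBase and Hadoop have version dependencies (RPC between HBase and Hadoop requires an exact version match), so the HBase version must be chosen according to the Hadoop version in use. See the Hadoop version support matrix on the official site at https://hbase.apache.org/book.html. As of December 1, 2019, the version support matrix between HBase and Hadoop was as follows: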

[Figure: Hadoop version support matrix]

We use the latest binary release, 2.2.2, directly. After downloading it, extract it with tar zvxf hbase-2.2.2-bin.tar.gz.

1. Edit hbase-env.sh on every instance
Stop HBase from managing its own ZooKeeper:

# Tell HBase whether it should manage it's own instance of ZooKeeper or not.
export HBASE_MANAGES_ZK=false

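2. Edit hbase-site.xml on every instance
Set hbase.cluster.distributed=true to enable distributed deployment, point hbase.rootdir at HDFS (the HDFS address here is the fs.defaultFS value in etc/hadoop/core-site.xml), and point hbase.zookeeper.quorum at our own ZooKeeper cluster. Note that the sample below still carries a file:/// value for hbase.rootdir and a single 127.0.0.1 quorum entry; in a fully distributed cluster these should be the hdfs:// URI and the real ZooKeeper hosts: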

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>file:///root/hbase-uat/data</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>127.0.0.1</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>12181</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/root/hbase-uat/zookeeper</value>
    </property>
    <property>
        <name>zookeeper.znode.parent</name>
        <value>/hbase-uat</value>
    </property>
</configuration>
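
Because the prose calls for an HDFS root directory while the listing shows a local file:/// path, a quick consistency check on hbase-site.xml can be handy before starting the cluster. The following is only a minimal sketch in Python, not part of HBase: the function names load_properties and check_distributed, the SAMPLE excerpt, and the rule "a file:/// rootdir or a loopback quorum is suspicious in distributed mode" are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Trimmed excerpt of the hbase-site.xml shown above (illustrative only).
SAMPLE = """<configuration>
    <property><name>hbase.rootdir</name><value>file:///root/hbase-uat/data</value></property>
    <property><name>hbase.cluster.distributed</name><value>true</value></property>
    <property><name>hbase.zookeeper.quorum</name><value>127.0.0.1</value></property>
</configuration>"""


def load_properties(xml_text):
    """Return the <name>/<value> pairs of a Hadoop-style configuration as a dict."""
    root = ET.fromstring(xml_text.strip())
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_distributed(props):
    """Return warnings for settings that look inconsistent with distributed mode."""
    warnings = []
    if props.get("hbase.cluster.distributed", "false").lower() != "true":
        return warnings  # nothing to check in standalone mode

    # Hypothetical rule: a missing scheme or file:// means a local filesystem.
    scheme = urlparse(props.get("hbase.rootdir", "")).scheme
    if scheme in ("", "file"):
        warnings.append("hbase.rootdir points at a local filesystem")

    # Hypothetical rule: a loopback quorum entry is unreachable from other nodes.
    quorum = [h.strip() for h in props.get("hbase.zookeeper.quorum", "").split(",")]
    if any(h in ("127.0.0.1", "localhost") for h in quorum):
        warnings.append("hbase.zookeeper.quorum contains a loopback address")
    return warnings


for warning in check_distributed(load_properties(SAMPLE)):
    print(warning)
```

Run against the excerpt, the sketch reports both the local root directory and the loopback quorum entry. To check a real file, read its contents and pass the text to load_properties.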

3. Edit the regionservers file on every instance
This is a plain text file with one hostname per line. Add the region servers to it:

hadoop-master
hadoop-slave1
hadoop-slave2

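4. Start the cluster
With the configuration complete, start the cluster with bin/start-hbase.sh. Whichever server you run this command on becomes the master node. Using jps, we can see that server 1 has started the HMaster and HRegionServer processes and that servers 2 and 3 have each started an HRegionServer process. After a short while, however, HMaster exits, and the log shows the following error: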

2019-12-03 21:37:46,032 ERROR [master/hadoop-master:16000:becomeActiveMaster] master.HMaster: Failed to become active master
java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync for proper operation during component failures, but the underlying filesystem does not support doing so. Please check the config value of 'hbase.procedure.store.wal.use.hsync' to set the desired level of robustness and ensure the config value of 'hbase.wal.dir' points to a FileSystem mount that can provide it.
        at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.rollWriter(WALProcedureStore.java:1092)
        at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.recoverLease(WALProcedureStore.java:424)
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:576)
        at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1528)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:938)
        at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2112)
        at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:580)
        at java.lang.Thread.run(Thread.java:748)
2019-12-03 21:37:46,032 ERROR [master/hadoop-master:16000:becomeActiveMaster] master.HMaster: ***** ABORTING master hadoop-master,16000,1575380253796: Unhandled exception. Starting shutdown. *****

After some searching, I found many posts online saying that the following configuration needs to be added to hbase-site.xml:

<property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
</property>

Adding it did indeed resolve the problem.

The state of the HBase cluster can be viewed through its web pages. The default Web UI port of HMaster is 16010:

[Figure: HBase HMaster WebUI]

The default Web UI port of the RegionServer is 16030:

[Figure: HBase RegionServer WebUI]

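However, these posts are all vague about why this setting is needed, and the official documentation explicitly states that the parameter applies only to the local filesystem: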

[Figure: Master fails to become active due to lack of hsync for filesystem]

This matches the explanation given at https://*.com/questions/48709569/hbase-error-illegalstateexception-when-starting-master-hsync:

[Figure: the HBase 2.2.2 binary distribution uses Hadoop 2.x dependencies]

Checking the hbase/lib directory confirms that it does indeed ship the Hadoop 2.8.5 libraries:

[Figure: the HBase 2.2.2 binary distribution uses Hadoop 2.x dependencies by default (2.8.5)]
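
The same inspection of hbase/lib can be scripted by reading the version out of the Hadoop jar file names. This is a rough Python sketch under stated assumptions: it assumes the jars follow the usual hadoop-<module>-<version>.jar naming, the function names hadoop_versions and scan_lib are hypothetical, and the file names in the example are illustrative rather than a listing of the actual distribution.

```python
import os
import re
from collections import Counter

# Matches names such as hadoop-common-2.8.5.jar or hadoop-hdfs-2.8.5-tests.jar.
# Jars like hbase-hadoop-compat-*.jar are HBase modules and are ignored
# because they do not start with "hadoop-".
_HADOOP_JAR = re.compile(
    r"^hadoop-[a-z0-9-]*?-(\d+(?:\.\d+)+)(?:-[A-Za-z0-9.]+)?\.jar$"
)


def hadoop_versions(jar_names):
    """Count the Hadoop versions found in a list of jar file names."""
    counts = Counter()
    for name in jar_names:
        match = _HADOOP_JAR.match(name)
        if match:
            counts[match.group(1)] += 1
    return counts


def scan_lib(lib_dir):
    """Count the Hadoop versions among the jars in an HBase lib directory."""
    return hadoop_versions(os.listdir(lib_dir))


# Illustrative file names only, not a real directory listing.
example = [
    "hadoop-common-2.8.5.jar",
    "hadoop-hdfs-2.8.5-tests.jar",
    "hbase-hadoop-compat-2.2.2.jar",
]
print(dict(hadoop_versions(example)))
```

For a real installation, scan_lib would be pointed at the lib directory of the extracted HBase release. A result containing only 2.x versions would be consistent with the observation above.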

I decided to try compiling from source. See the separate article "HBase 2.2.2 on Hadoop 3.2.1 Source Compilation" for details.

References:
1. https://hbase.apache.org/book.html
2. https://*.com/questions/48709569/hbase-error-illegalstateexception-when-starting-master-hsync
3. https://www.cndba.cn/dave/article/3321
4. https://blog.csdn.net/gdeasy/article/details/103136090
5. https://blog.51cto.com/caiyuanji/2132738
6. https://www.cnblogs.com/barneywill/p/10283076.html