HBase and why you should care about a distributed mode
I just threw HBase onto my Hadoop stack and immediately ran into trouble with ZooKeeper:
Could not start ZK at requested port of 2181
Had I bothered to check the docs, I’d have found out that HBase runs in two modes: standalone and distributed. Standalone mode is, quite logically, the default. In that mode HBase starts its own local ZooKeeper on the default port 2181, which fails when another process, typically an already running ZooKeeper, occupies that port.
To avoid the error above, HBase needs to be told to start in (at least) pseudo-distributed mode. In the configuration directory ($HBASE_HOME/conf), create or modify the file hbase-site.xml and add (or update) the following property:
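A quick way to confirm the diagnosis before starting HBase is to check whether anything is already listening on port 2181. The snippet below is only a minimal sketch: the function name port_in_use is made up for illustration, and it assumes the existing ZooKeeper is reachable on 127.0.0.1. It can tell you that the port is taken, not which process holds it.

```python
import socket


def port_in_use(host: str = "127.0.0.1", port: int = 2181, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    port_in_use is a hypothetical helper, not part of HBase or ZooKeeper.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use():
        print("Port 2181 is already taken; standalone HBase cannot bind its ZooKeeper there")
    else:
        print("Port 2181 is free")
```

If the check reports the port as taken, starting HBase in standalone mode will most likely hit the same "Could not start ZK" error.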
<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
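Distributed mode makes HBase run against an existing Hadoop instance and keeps it from starting the embedded ZooKeeper that standalone mode uses. Depending on your setup, HBase may still try to manage a ZooKeeper of its own in distributed mode; to make it use an already running one, also set HBASE_MANAGES_ZK=false in $HBASE_HOME/conf/hbase-env.sh.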
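If you would rather script the "create or modify, add or update" step than edit the file by hand, something along these lines should work. Treat it as a sketch under a few assumptions: set_hbase_property is a name I made up, ET.indent needs Python 3.9 or newer, and ElementTree drops comments and the xml-stylesheet line when it rewrites an existing hbase-site.xml, so back the file up first.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def set_hbase_property(site_file: Path, name: str, value: str) -> None:
    """Create site_file if it is missing, then add or update one <property>.

    set_hbase_property is a hypothetical helper, not an HBase tool.
    Caveat: comments and processing instructions in an existing file are lost.
    """
    if site_file.exists():
        tree = ET.parse(str(site_file))
        root = tree.getroot()
    else:
        root = ET.Element("configuration")
        tree = ET.ElementTree(root)

    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            # Property already present: update its value in place
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # Property not found: append a new <property> element
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value

    ET.indent(tree)  # pretty-print; available from Python 3.9
    tree.write(str(site_file), encoding="utf-8", xml_declaration=True)
```

Calling set_hbase_property(Path(hbase_home) / "conf" / "hbase-site.xml", "hbase.cluster.distributed", "true"), with hbase_home set to your $HBASE_HOME, should leave the file with the property shown above. Restart HBase afterwards so the change takes effect.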