HBase and why you should care about distributed mode

I’ve just added HBase to my Hadoop stack and immediately ran into trouble with ZooKeeper:

Could not start ZK at requested port of 2181

Had I bothered checking the docs, I’d have found out that HBase runs in two modes: standalone and distributed. Standalone mode is, quite logically, the default. In that mode HBase runs its own local ZooKeeper on the default port 2181, so startup fails if something else (such as an existing ZooKeeper) is already bound to that port.

To avoid the error above, HBase needs to be told to start in (at least) pseudo-distributed mode. In the configuration directory ($HBASE_HOME/conf), create or modify the file hbase-site.xml and add (or update) the following property:

<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>

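In distributed mode HBase runs against an existing Hadoop (HDFS) instance and no longer starts the in-process ZooKeeper that standalone mode uses. Note that the start scripts can still launch a ZooKeeper managed by HBase. This is controlled by the HBASE_MANAGES_ZK variable in conf/hbase-env.sh, which defaults to true, so set it to false if HBase should rely solely on a ZooKeeper that is already running.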
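
For reference, here is a rough sketch of a slightly fuller hbase-site.xml for a pseudo-distributed setup that points HBase at an existing HDFS and ZooKeeper. The property names are standard HBase settings, but the values are assumptions for illustration only: the HDFS address hdfs://localhost:9000 and the ZooKeeper host localhost are placeholders, and yours will likely differ.

```xml
<configuration>
  <!-- Run the HBase daemons as separate processes instead of a single JVM -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- Assumption: the HDFS NameNode listens on localhost:9000; match this to your fs.defaultFS -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <!-- Assumption: the existing ZooKeeper runs on this host -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
  <!-- Default ZooKeeper client port; change it if your ZooKeeper listens elsewhere -->
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
```

Only hbase.cluster.distributed was needed to get past the error in my case. The other properties are worth setting explicitly once HBase is meant to share HDFS and ZooKeeper with the rest of the stack.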
