
hadoop

Install Hadoop MapReduce Next Generation.

10,227 total downloads

9,307 downloads of the latest version

Quality score: 1.5

Version information

  • 0.0.22 (latest), released Feb 3rd 2015
  • 0.0.21
  • 0.0.20
  • 0.0.19
  • 0.0.18
  • 0.0.17
  • 0.0.16
  • 0.0.15

Start using this module

  • r10k or Code Manager
  • Bolt
  • Manual installation
  • Direct download

Add this module to your Puppetfile:

mod 'wtanaka-hadoop', '0.0.22'
Learn more about managing modules with a Puppetfile

Add this module to your Bolt project:

bolt module add wtanaka-hadoop
Learn more about using this module with an existing project

Manually install this module globally with the Puppet module tool:

puppet module install wtanaka-hadoop --version 0.0.22

Direct download is not typically how you would use a Puppet module to manage your infrastructure, but you may want to download the module in order to inspect the code.


Documentation

wtanaka/hadoop — version 0.0.22 Feb 3rd 2015

Puppet module for deploying Hadoop MapReduce Next Generation on a cluster

This module deploys Hadoop MapReduce Next Generation on a cluster of machines. It has been tested with Apache Hadoop 2.2.0 in a Puppet agent/master environment. It is based on bcarpio/hadoop 0.0.3.

Building:

(cd ..; puppet module build puppet-hadoop)

Usage:

Install this module on your Puppet master node:

sudo puppet module install wtanaka-hadoop

In site.pp, define:

node 'your hadoop slave nodes' {
    include java
    include hadoop::cluster::slave
}

node 'your hadoop master node' {
    include java
    include hadoop::cluster::master
}

To run Hadoop in pseudo-distributed mode, put the following code in a .pp file (e.g. hadoop.pp).

include java
include hadoop::cluster::pseudomode

Then apply it:

sudo puppet apply hadoop.pp

Hadoop Distribution:

This Puppet module automatically downloads the Apache Hadoop distribution from a pre-defined Apache mirror site. If you would like to use a faster mirror, modify the URL in init.pp.
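As a rough sketch, the relevant part of init.pp might look like the following; the variable name and default mirror URL shown here are illustrative, so check the module's own init.pp for the actual ones:

# init.pp (illustrative sketch; the real variable name and default may differ)
class hadoop (
  # Point this at an Apache mirror near you for a faster download.
  $hadoop_url = 'http://archive.apache.org/dist/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz',
) {
  # ... the rest of the class fetches and unpacks the tarball from $hadoop_url
}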

Parameters:

Some parameters can be modified in params.pp. You should change parameters such as 'master', 'resourcemanager' and 'slaves' to reflect your Hadoop cluster settings.
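As a rough illustration (only the names 'master', 'resourcemanager' and 'slaves' come from the module; the values below are placeholders), the params.pp entries might look like this:

# params.pp (illustrative values only; edit the shipped params.pp to match your cluster)
class hadoop::params {
  # Host running the Hadoop master daemons (e.g. the HDFS NameNode)
  $master          = 'hadoop-master.example.com'
  # Host running the YARN ResourceManager (often the same as the master)
  $resourcemanager = 'hadoop-master.example.com'
  # Hosts acting as slaves (DataNode / NodeManager)
  $slaves          = ['hadoop-slave1.example.com', 'hadoop-slave2.example.com']
}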

SSH keys:

Remember to generate your SSH keys and put them in files/ssh/.

Note: Since deploying the master runs Hadoop scripts that launch Hadoop services on the slave nodes, deploy the Hadoop slaves first. Once all slaves are deployed, deploy the master node.