graphlab

Deploy an OpenMPI cluster and set up Graphlab

Version information

  • 0.0.4 (latest)
  • 0.0.3
  • 0.0.2
  • 0.0.1
released May 31st 2013

Start using this module

  • r10k or Code Manager
  • Bolt
  • Manual installation
  • Direct download

Add this module to your Puppetfile:

mod 'viirya-graphlab', '0.0.4'

Add this module to your Bolt project:

bolt module add viirya-graphlab

Manually install this module globally with Puppet module tool:

puppet module install viirya-graphlab --version 0.0.4

Direct download is not typically how you would use a Puppet module to manage your infrastructure, but you may want to download the module in order to inspect the code.


Documentation

viirya/graphlab — version 0.0.4 May 31st 2013

Puppet module for OpenMPI cluster and Graphlab

This Puppet module helps set up an OpenMPI cluster and install Graphlab.

Dependency

The Puppet module for Java, 'viirya/java'.

Usage

After installing this module on the Puppet master node, define the following in site.pp:

node 'your cluster slave nodes' {
    include java
    include graphlab::cluster::slave
}

node 'your cluster master node' {
    include java
    include graphlab::cluster::master
}

Download the Graphlab release file and put it under files/ in the module directory.

Remember to modify the necessary parameters in manifests/params.pp, such as 'version', 'master', and 'slaves'. If you use the Graphlab release graphlabapi_v2.1.4679.tar.gz, set 'version' to 'v2.1.4679'.
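
The relationship between the release file name and the 'version' parameter can be sketched in shell. This is only an illustration: the helper name graphlab_version_from_tarball is hypothetical, and it assumes every release file follows the graphlabapi_<version>.tar.gz pattern seen in the example above.

```shell
# Hypothetical helper: derive the 'version' value for manifests/params.pp
# from a release file name of the form graphlabapi_<version>.tar.gz.
graphlab_version_from_tarball() {
    name=$(basename "$1")
    case "$name" in
        graphlabapi_*.tar.gz)
            version=${name#graphlabapi_}   # strip the fixed prefix
            version=${version%.tar.gz}     # strip the archive suffix
            printf '%s\n' "$version"
            ;;
        *)
            echo "unexpected file name: $name" >&2
            return 1
            ;;
    esac
}

# Prints v2.1.4679
graphlab_version_from_tarball files/graphlabapi_v2.1.4679.tar.gz
```

The printed value is what would go into 'version'; 'master' and 'slaves' still have to be set by hand.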

This module installs Graphlab under /opt/graphlab/graphlabapi. Configure and compile Graphlab on the master node with:

cd /opt/graphlab/graphlabapi
./configure
cd release/
make -j4
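
The job count 4 above is just one choice. As a rough sketch, the count could instead be taken from the number of online CPUs; the helper name make_jobs is hypothetical, and it assumes getconf supports _NPROCESSORS_ONLN on the master node (it does on typical Linux systems).

```shell
# Hypothetical helper: pick a job count for make from the online CPU
# count, falling back to the value 4 used above when it is unavailable.
make_jobs() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=4
    case "$n" in
        ''|*[!0-9]*) n=4 ;;   # not a plain number: use the fallback
    esac
    printf '%s\n' "$n"
}

# Only prints the command; run it from /opt/graphlab/graphlabapi/release.
echo "make -j$(make_jobs)"
```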

On the master node, a file named 'nodes' listing the OpenMPI slave nodes is created in the home directory of the Graphlab user 'hduser'. A symlink named 'graphlab' that points to /opt/graphlab is also created in that home directory.
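
To see how many hosts the 'nodes' file describes, something like the following could be used. It is a sketch under assumptions: the exact layout the module writes is not documented here, so the helper (hypothetically named count_hosts) only relies on the general Open MPI hostfile convention of one host per line with optional '#' comments, and the host names are placeholders.

```shell
# Hypothetical helper: count host entries in an Open MPI hostfile,
# skipping blank lines and lines that start with '#'.
count_hosts() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | wc -l | tr -d ' '
}

# Demo on a temporary sample file with placeholder host names, so the
# real ~/nodes created by the module is left untouched.
sample=$(mktemp)
printf '%s\n' '# placeholder hosts' 'slave1.example.com' '' 'slave2.example.com' > "$sample"
count_hosts "$sample"   # prints 2
rm -f "$sample"
```

On the master node the same function could be pointed at ~/nodes.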

On slave nodes, the directory 'graphlab' under the user's home directory is a mount point for the master node's Graphlab installation (/opt/graphlab), mounted over NFS. All slave nodes can therefore access MPI programs and data that you put under /opt/graphlab on the master node.
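
One way to confirm the NFS mount on a slave node is to look for it in the kernel's mount table. The sketch below makes several assumptions: the helper name is_nfs_mounted is hypothetical, the slave runs Linux with /proc/mounts available, and the home directory of 'hduser' is /home/hduser, which the module description does not state.

```shell
# Hypothetical helper: succeed if mount point $2 is listed with an NFS
# file system type (nfs, nfs4) in a mounts table in /proc/mounts format ($1).
is_nfs_mounted() {
    awk -v mp="$2" '$2 == mp && $3 ~ /^nfs/ { found = 1 } END { exit !found }' "$1"
}

# Assumed path; adjust if hduser's home directory is elsewhere.
if [ -r /proc/mounts ] && is_nfs_mounted /proc/mounts /home/hduser/graphlab; then
    echo "graphlab is mounted over NFS"
else
    echo "graphlab is not an NFS mount on this host"
fi
```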

Test

To test whether the cluster and Graphlab are configured correctly, run the following on the master node as the Graphlab user 'hduser':

# Download test data

mkdir /opt/graphlab/smallnetflix
cd /opt/graphlab/smallnetflix
wget http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.train
wget http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.validate

# Go back to the home dir

cd

# Run Graphlab program

mpiexec -n 14 -hostfile nodes ./graphlab/graphlabapi/release/toolkits/collaborative_filtering/als --matrix ./graphlab/smallnetflix --max_iter=3 --ncpus=1
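
Before launching the run, it may help to confirm that both downloads completed. The following pre-flight check is a sketch only; the helper name check_dataset is hypothetical, and it tests nothing beyond the two files being present and non-empty.

```shell
# Hypothetical pre-flight check: both smallnetflix files must exist and
# be non-empty in the directory that is passed to --matrix.
check_dataset() {
    for f in smallnetflix_mm.train smallnetflix_mm.validate; do
        if [ ! -s "$1/$f" ]; then
            echo "missing or empty: $1/$f" >&2
            return 1
        fi
    done
}

if check_dataset /opt/graphlab/smallnetflix; then
    echo "dataset looks complete"
else
    echo "dataset incomplete; re-run the wget commands above"
fi
```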