Version information

  • 0.9.3 (latest)
  • 0.9.2
  • 0.9.1
  • 0.9.0
released Jun 4th 2015

Start using this module

Add this module to your Puppetfile:

mod 'cesnet-pig', '0.9.3'

Add this module to your Bolt project:

bolt module add cesnet-pig

Manually install this module globally with Puppet module tool:

puppet module install cesnet-pig --version 0.9.3

Tags: hadoop, pig

Documentation

cesnet/pig — version 0.9.3 Jun 4th 2015

####Table of Contents

  1. Overview
  2. Module Description - What the module does and why it is useful
  3. Setup - The basics of getting started with pig
  4. Usage - Configuration options and additional functionality
  5. Reference - An under-the-hood peek at what the module is doing and how
  6. Development - Guide for contributing to the module

##Overview

Installs Apache Pig, a platform for analyzing large data sets.

##Module Description

This module installs Apache Pig, a platform for analyzing large data sets. By default, Pig expects a Hadoop client to be set up locally.

Supported platforms:

  • Fedora 21: native packages (tested on Pig 0.13.0)
  • Debian 7/wheezy: Cloudera distribution (tested on CDH 5.3.0, Pig 0.12.0)
  • Ubuntu 14/trusty: Cloudera distribution (tested on CDH 5.3.0, Pig 0.12.0)
  • RHEL 6, CentOS 6, Scientific Linux 6: Cloudera distribution (tested on CDH 5.4.2, Pig 0.12.0)

##Setup

###What the cesnet-pig module affects

  • Packages: installs the Pig packages

###Setup Requirements

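Be aware of:

  • Hadoop client: by default, Pig expects a Hadoop client to be set up locally (see Module Description).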

###Beginning with pig

Example:

include pig

##Usage

By default, Pig runs its operations on Hadoop, the same as when launched with -x mapreduce:

pig -x mapreduce

Pig can also be launched in local mode:

pig -x local

To use Pig with HBase, add the following to your Pig scripts (replace <ZooKeeper_version> and <HBase_version> with the installed versions):

REGISTER /usr/lib/zookeeper/zookeeper-<ZooKeeper_version>.jar
REGISTER /usr/lib/hbase/hbase-<HBase_version>-security.jar

To use Pig with DataFu, add the following to your Pig scripts (replace <DataFu_version> with the installed version):

REGISTER /usr/lib/pig/datafu-<DataFu_version>.jar

###Classes

  • config
  • init
  • install
  • params

###Module Parameters

####datafu_enabled

Whether to also install the DataFu collection of Pig User-Defined Functions (UDFs). Default: true.
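
As a minimal sketch, the parameter could be set with a resource-like class declaration. This assumes datafu_enabled is accepted directly by the main pig class, which the parameter listing suggests but does not state explicitly:

```puppet
# Install Pig without the DataFu UDF collection.
# Assumption: datafu_enabled is a parameter of the main pig class.
class { 'pig':
  datafu_enabled => false,
}
```

A declaration like this would be used in place of include pig, since Puppet allows a class to be declared with parameters only once per catalog.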

##Development