CESNET

Version information

  • 0.9.3 (latest)
  • 0.9.2
  • 0.9.1
  • 0.9.0
released Jun 4th 2015
This version is compatible with: CentOS, Debian, Fedora, RedHat, Scientific, Ubuntu

Tags: hadoop, pig

Documentation

cesnet/pig — version 0.9.3 Jun 4th 2015

####Table of Contents

  1. Overview
  2. Module Description - What the module does and why it is useful
  3. Setup - The basics of getting started with pig
  4. Usage - Configuration options and additional functionality
  5. Reference - An under-the-hood peek at what the module is doing and how
  6. Development - Guide for contributing to the module

##Overview

Installs Apache Pig, a platform for analyzing large data sets.

##Module Description

This module installs Apache Pig, a platform for analyzing large data sets. By default, Pig expects a Hadoop client to be set up locally.

Supported platforms:

  • Fedora 21: native packages (tested on Pig 0.13.0)
  • Debian 7/wheezy: Cloudera distribution (tested on CDH 5.3.0, Pig 0.12.0)
  • Ubuntu 14/trusty: Cloudera distribution (tested on CDH 5.3.0, Pig 0.12.0)
  • RHEL 6, CentOS 6, Scientific Linux 6: Cloudera distribution (tested on CDH 5.4.2, Pig 0.12.0)

##Setup

###What cesnet-pig module affects

  • Packages: installs pig packages

###Setup Requirements

Be aware of:

  • Hadoop client: by default, Pig expects a Hadoop client to be set up locally.

###Beginning with pig

Example:

include pig

##Usage

By default, Pig runs its operations on Hadoop, the same as when launched with -x mapreduce:

pig -x mapreduce

To run Pig locally, without Hadoop, launch it this way:

pig -x local

To use Pig with HBase, add the following to your Pig scripts (replace <ZooKeeper_version> and <HBase_version> with the installed versions):

REGISTER /usr/lib/zookeeper/zookeeper-<ZooKeeper_version>.jar
REGISTER /usr/lib/hbase/hbase-<HBase_version>-security.jar

To use Pig with DataFu, add the following to your Pig scripts (replace <DataFu_version> with the installed version):

REGISTER /usr/lib/pig/datafu-<DataFu_version>.jar
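The version placeholders in the REGISTER lines above can be looked up on the machine instead of typed by hand. The following is a minimal shell sketch, assuming the jars live under the paths shown above; `register_line` is a hypothetical helper and is not part of this module.

```shell
#!/bin/sh
# Hypothetical helper: print a Pig REGISTER statement for the first
# existing file that matches a glob pattern. Returns 1 if nothing matches.
# The pattern is expanded unquoted on purpose, so paths containing
# whitespace are not supported by this sketch.
register_line() {
    pattern=$1
    for jar in $pattern; do
        if [ -f "$jar" ]; then
            printf 'REGISTER %s\n' "$jar"
            return 0
        fi
    done
    echo "no jar matches $pattern" >&2
    return 1
}

# Example calls (paths taken from this README):
#   register_line '/usr/lib/zookeeper/zookeeper-*.jar'
#   register_line '/usr/lib/pig/datafu-*.jar'
```

A glob may match more than one file (for example several jars sharing a prefix); the sketch prints only the first match in the shell's sort order, so narrow the pattern if that is not the jar you want.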

###Classes

  • config
  • init
  • install
  • params

###Module Parameters

####datafu_enabled true

Also installs DataFu, a collection of Pig user-defined functions (UDFs).
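To override the parameter, the class can be declared with an explicit value instead of `include pig`. The following is a minimal sketch, assuming the module is already installed on the node; the manifest file name is hypothetical, and the `puppet apply` step is left as a comment because it needs Puppet and root privileges.

```shell
#!/bin/sh
# Write a small manifest that declares the pig class with DataFu disabled.
# The file name is an arbitrary choice for this sketch.
manifest="${TMPDIR:-/tmp}/pig-no-datafu.pp"

cat > "$manifest" <<'EOF'
class { 'pig':
  datafu_enabled => false,
}
EOF

echo "manifest written to $manifest"

# To apply it on a node where the module is installed:
#   puppet apply "$manifest"
# Add --noop to preview the changes without applying them.
```

A class declared this way must not also be declared a second time with different parameters elsewhere in the catalog, otherwise Puppet reports a duplicate declaration.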

##Development